23 February 2007 Changes (HHH)
------------------------------

1. Action: girl at end but not girls?
   Added:  BadURL_WordEnds[i++]="girls";
   Reason: faceoffgirls.com

   If I get too many false positives we will downgrade it from URL to
   HOST, but it stays. See the count...

      5263 girls_Parts.txt
      2176 girls_Starts_and_Ends.txt
      2360 girls_Passed_All_Rules.txt
      -------------------------------
      9799 total

2. Action: Bad HOST "women" block rule
   Added:  BadHostParts[i++] = "women";
   Reason: island-women.com - it stings (MVPHosts)

   MVPHosts does NOT block porn - Mike Burgess only blocks abusive hosts.

      1867 women_Parts.txt
       294 women_Starts_and_Ends.txt
       698 women_Passed_All_Rules.txt
      -------------------------------
      2859 total

3. Action: Bad URL "jiz" rule
   Added:  BadURL_Parts[i++] = "jiz";
   Reason: look4jizz.com - it stings (MVPHosts)

        31 jiz_Parts.txt
        12 jiz_Starts_and_Ends.txt
       122 jiz_Passed_All_Rules.txt
      --------------------
       165 total

   These are mostly "jizz", but in looking at the lists we'd have:

      jizface.com
      jizqueen.com
      jiztore.vsni.com
      jizvideo.com
      manjiz.com

   More to the point, I can't think of a word in English or any Romance
   language that contains this pattern.

4. Action: Bad URL "hunk" rule
   Added:  BadURL_Parts[i++] = "[^c]hunk";
   Reason: todayshunks.com - it stings (MVPHosts)

        96 hunk_Parts.txt
        11 hunk_Starts_and_Ends.txt
       141 hunk_Passed_All_Rules.txt
      -----------------------------
       248 total

   I will do the "chunk" hosts in the hosts file. I thought about "thunk"
   but that was invariably a *hothunk*.com in the host names, which will
   cascade down into the URL. I don't see any problems in it applying to
   the entire URL.

5. Action: Altered rule
   From:   BadDomains[i++] = ".sextracker.com";
   To:     BadHostParts[i++] = "sextracker";
   Reason: There are sextracker.de, sextracker.nl, and I don't know how
           many others. This will stop ALL of them.

6. Action: Added Bad Domains rule
   Added:  BadDomains[i++] = ".sexybabesx.com";
   Reason: // FOR MVPS HOSTS

   Technically speaking, a rule like this is NOT needed.
   I am trying to drop some hints to Mike Burgess that this is a MUCH
   better way of stopping things - e.g. - he can take what we provided
   and make his own specialized proxy. Without ALL of the rules, he would
   NOT have to worry about false positives either. He has over 125 of
   these. Even if he doesn't block porn, this rule will block all of them
   and any new ones that appear.

7. Action: Added Bad Domains rule
   Added:  BadDomains[i++] = ".searchmiracle.com";
   Reason:

8. Action: Added GoodDomain
   Added:  GoodDomains[i++]="gnome.org";
   Reason: live.gnome.org/Seahorse/SessionIntegration

9. Action:  Removed "jezebel" rule
   Removed: BadURL_Parts[i++] = "jezebel";
   Reason:  not enough - will use hosts if necessary

         3 jezebel_Parts.txt
         1 jezebel_Starts_and_Ends.txt
        10 jezebel_Passed_All_Rules.txt
      -------------------------------
        14 total

10. Action:  Removed "nookey" rule
    Removed: BadURL_Parts[i++] = "nookey";
    Reason:  not enough - will use hosts if necessary

         0 nookey_Parts.txt
         0 nookey_Starts_and_Ends.txt
         2 nookey_Passed_All_Rules.txt
      -----------------------------
         2 total

11. Action:  Removed "nooky" rule
    Removed: BadURL_Parts[i++] = "nooky";
    Reason:  not enough - will use hosts if necessary

         0 nooky_Parts.txt
         0 nooky_Starts_and_Ends.txt
         6 nooky_Passed_All_Rules.txt
      ----------------------------
         6 total

12. Action:  Removed "peepshow" rule
    Removed: BadURL_Parts[i++] = "peepshow";
    Reason:  NONE - all of the other rules handle it

        86 peepshow_Parts.txt
         0 peepshow_Starts_and_Ends.txt
         0 peepshow_Passed_All_Rules.txt
      --------------------------------
        86 total

13. Action:  Removed "penile" rule
    Removed: BadURL_Parts[i++] = "penile";
    Reason:  Not enough. Most names are either dead or parked. The rest I
             put in the hosts file but with a constant barrage of ads on
             TV and SPAM in the email box ... hosts file entries already
             added.

14. Action:  Removed "hairless" rule
    Removed: BadURL_Parts[i++] = "hairless";
    Reason:  not enough - will use hosts if necessary

        48 hairless_Parts.txt
         1 hairless_Starts_and_Ends.txt
         5 hairless_Passed_All_Rules.txt
      --------------------------------
        54 total

15. Action: Added rules for "youngrepublicans"
    Added:  GoodDomains[i++] = "youngrepublicans.com";
    Reason: defeat "young" rule, which needs to be looked at

16. Action: Added "slave" rule
    Added:  BadURL_Parts[i++] = "slave";
    Reason: slave-labor-inc.com

    We may need to drop to HOST level, but we NEED THIS rule!

       162 slave_Parts.txt
        55 slave_Starts_and_Ends.txt
       194 slave_Passed_All_Rules.txt
      ------------------------------
       411 total


23 February 2007 UNresolved False Positives (HHH)
-------------------------------------------------

1. Pattern: "lips"
   Rules:   BadHostWordStarts[i++]="lips";
            BadURL_WordEnds[i++]="[^c]lips";
   Reason:  creativosparc.ads.uigc.net/RealMedia/ads/Creatives/\
            OasDefault/BR_20061201_BUSCAPE-BOND/br_20061201_\
            buscape-bond-BP-hometheaterphilips_pop.gif

   My initial hunch is to just downgrade the rules. The pattern is too
   short. Here is what happens if I remove the rules for the hosts ...

   Both rules removed:
   ===================
       454 lips_Parts.txt
       110 lips_Starts_and_Ends.txt
       263 lips_Passed_All_Rules.txt
      -----------------------------
       827 total

   Start rule removed:
   ===================
       454 lips_Parts.txt
       206 lips_Starts_and_Ends.txt
       167 lips_Passed_All_Rules.txt
      -----------------------------
       827 total

   End rule removed:
   =================
       454 lips_Parts.txt
       148 lips_Starts_and_Ends.txt
       225 lips_Passed_All_Rules.txt
      -----------------------------
       827 total

2. Pattern: "hot" OLD, NOW "hot[^em]"
   Rules:   BadHostWordStarts[i++]="hot[^em]"
   Reason:  hotmail.com

   I added the exclusion for e & m for hotel AND hotmail. That means I
   now need to scope out whether the exclusion rules are still needed for
   hotmail, and what hosts need to be added.

3. Pattern: "oral"
   Rules:   BadURL_WordStarts[i++]="oral";
            BadURL_WordEnds[i++]="oral";
   Reason:  Nov 28 20:20:53
            www.wliw.org/productions/images/doral_logo.jpg
            Sun Dec 17 20:24:37
            byub.org/programaz/images/byuphilharmonicchoral.jpg
            Jan 15 08:09:51
            iowa.brickriver.com/files/oZone_Objects_XNCYVE/\
            070112_Moral_Witness_PWMXY96T.jpg
            hostsfile.mine.nu/img/coral.gif

   The START rule is okay; it is the END rule that kills us.
   Suggested: "[^cdhm]oral"?

   Here is the moral count for the hosts (immoral):

         8 moral_Parts.txt
        11 moral_Starts_and_Ends.txt
         8 moral_Passed_All_Rules.txt
      -----------------------------
        27 total

   From a hosts perspective we can drop the END rule.


23 February 2007 RESOLVED False Positives (HHH)
-----------------------------------------------

1. Pattern: "hard"
   Rules:   BadURL_WordStarts[i++]="hard[(b|c|e|p|s)]";
            BadURL_WordEnds[i++]="[^cs]hard";
            // Changed from "hard" to "[^cs]hard"
   Reason:  digg.com/security/Marcus_Ranum_on_hard_disk_encryption

   So far this is the ONLY one I have encountered.
   Good enough until somebody complains.
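
Note on the bracket patterns: the sketch below checks how the character
classes used in these rules behave on the logged samples. It assumes each
rule string is compiled as a plain JavaScript RegExp and tested anywhere in
the lower-cased host or URL. The helper name matchesRule is hypothetical,
"hardcore" and "hard|x" are made-up sample strings, and the real
WordStarts/WordEnds handling in the script may add boundary logic that is
not shown here.

```javascript
// Hypothetical helper, not part of the rule file: compile a rule string as a
// RegExp and test it anywhere in the lower-cased text (an assumption).
function matchesRule(rule, text) {
  return new RegExp(rule).test(text.toLowerCase());
}

// "[^c]lips" only excludes "clips"; the "i" in "philips" still lets it match.
const philips = "br_20061201_buscape-bond-BP-hometheaterphilips_pop.gif";
console.log(matchesRule("[^c]lips", philips));             // true

// "hot[^em]" skips hotel and hotmail because the next letter is e or m.
console.log(matchesRule("hot[^em]", "hotmail.com"));       // false

// Suggested "[^cdhm]oral" against the four logged false positives.
const oralSamples = [
  "www.wliw.org/productions/images/doral_logo.jpg",
  "byub.org/programaz/images/byuphilharmonicchoral.jpg",
  "070112_Moral_Witness_PWMXY96T.jpg",
  "hostsfile.mine.nu/img/coral.gif",
];
const stillBlocked = oralSamples.filter((u) => matchesRule("[^cdhm]oral", u));
console.log(stillBlocked.length);                           // 0

// Inside [...] the characters "(", "|" and ")" are literals, so
// "hard[(b|c|e|p|s)]" accepts the same letters as "hard[bceps]" plus those
// three punctuation characters.
console.log(matchesRule("hard[(b|c|e|p|s)]", "hardcore")); // true
console.log(matchesRule("hard[(b|c|e|p|s)]", "hard|x"));   // true
console.log(matchesRule("hard[bceps]", "hard|x"));         // false
```

Under these assumptions the suggested "[^cdhm]oral" clears all four logged
"oral" hits, while "[^c]lips" still catches the philips URL. If the script
anchors WordEnds rules differently, the results may not carry over.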