11 May 2007 Changes (HHH)
-------------------------

1.  Action: "free" counter rules
    Added:
        GoodDomains[i++] = "freeos.com";
        GoodDomains[i++] = ".sourceforge.net";
    Reason: countermand "free" rule

2.  Action: Totally removed "tender" rule.
    Removed:
        BadURL_Parts[i++] = "tender";
    Reason:
        Sun Apr 1 00:12:33: i.cnn.net/v5cache/TCM/Images/\
            Dynamic/i6/tendertrap_tr_104x78_112620021151.gif
        Sun Apr 15 02:33:47: realmusic.com/images/c_chapp2_tender.jpg

          56 tender_Parts.txt
           2 tender_Starts_and_Ends.txt
          13 tender_Passed_All_Rules.txt
        ------------------------------
          71 total

        There is also "extender" and some other patterns that will nail
        you with this pattern. Not only that, but the word itself has so
        many different meanings that it has gotta go!
        Latin -> French -> English

3.  Action: "cool" rules
    Added:
        GoodDomains[i++] = ".coolclips.com";
    Reason: countermand "cool" rule. I can already see that the "cool",
        "free", "hot", and "live" rules are going to be commented out.

4.  Action: "hittail.com"
    Added:
        BadDomains[i++] = ".hittail.com";
    Reason: // INFO GATHER RULE
        This is a DNS wildcard domain. Please report any hittails you
        see to me. I suspect there are FAR more hittails than Mike
        Burgess has.

5.  Action: "boys" or "boy" or "boyz"
    Added:
        BadHostParts[i++] = "boys";
        BadHostParts[i++] = "boyz";
        BadHostWordStarts[i++] = "boy";
        BadHostWordEnds[i++] = "boy";
    Reason: 941 / 1734 of them, boy - boys = 793 of 793, boyz = 122
        Actually, I had problems with "cuterussianboys.com", which Mike
        Burgess recently added, so I SNOOPED! The Boy Scouts of America
        is www.scouting.org and www.scoutstuff.org. I think they already
        thought about this! Too bad we didn't.

         976 boys_Parts.txt
         155 boys_Starts_and_Ends.txt
         941 boys_Passed_All_Rules.txt
        ------------------------------
        2072 total

        1716 boy_Parts.txt
         272 boy_Starts_and_Ends.txt
        1734 boy_Passed_All_Rules.txt
        -----------------------------
        3722 total

        But after I added those rules:

        2943 boy_Parts.txt
         715 boy_Starts_and_Ends.txt
          64 boy_Passed_All_Rules.txt
        3722 total

6.
    Action: "vegas"
    Added:
        GoodDomains[i++] = ".vegas.com";
    Reason: // COUNTER "vegas" rule
        I realize that they will have to type www.vegas.com to get
        there, but if I shortened the rule to "vegas.com", that would
        allow the following through:

        afterhoursvegas.com
        asslass-lasvegas.com
        assvegas.com
        babydollsoflasvegas.com
        captainvegas.com
        clublasvegas.com
        diva-lasvegas.com
        getbookedlasvegas.com
        girlcolasvegas.com
        girlsinvegas.com
        grouplasvegas.com
        online-lasvegas.com
        partygirlsvegas.com
        redvegas.com
        rosvegas.com
        roxylasvegas.co
        sandyvegas.com
        thegirlsoflasvegas.com
        toonvegas.com
        veronicavegas.com

        Get the idea?

7.  Action: "virgin" rule
    From:
        BadURL_Parts[i++] = "virgin";
    To:
        BadURL_Parts[i++] = "virgin[^i]";
    Reason: "virginia". The number one use of the word is "virgins",
        and the second rule stops that. That gives me the following
        count:

         889 virgin_Parts.txt
          59 virgin_Starts_and_Ends.txt
          31 virgin_Passed_All_Rules.txt
        -------------------------------
         979 total

        I can handle the hosts with "virgini", but more to the point, NO
        exclusions are needed for "virginia", which was a BIG problem.

        Speaking of Virginia, the restaurant named Blue Iguana in
        Fairfax, VA has bad reports, as does the one in Tempe, AZ.
        THERE IS NO REAL TEX-MEX FOOD IN THE WASHINGTON, DC AREA.
        I know. I have been there. The Blue Iguana in Fairfax sounds
        like a Friday singles bar. The Blue Iguana in Salt Lake City has
        TOP RATE FOOD! Well, it USED to have good food.

        Tempe, AZ:   http://phoenix.citysearch.com/review/40847615
        Fairfax, VA: http://www.theblueiguana.com/location.html
        SLC, UT:     http://www.blueiguanarestaurant.net/
            Food     - excellent
            Ambience - excellent
            Pricing  - exorbitant
        http://tinyurl.com/2wu8kx (La Frontera)
            Excellent prices, go there if starved and LOCK YOUR CAR!
            My relatives said they were scared going there, but you
            should have nothing to fear but fear itself.

8.
    Action: "800" rules
    Removed:
        GoodDomains[i++] = "1-800-flowers.com";
        GoodDomains[i++] = "800flowers.com";
    Reason: Since we have removed the BLOCKING "800" rule, we don't
        need these any more. I realize dialers are a problem in Europe,
        but they don't use the "800" or "888" numbers. Neither do the
        call-sites for Porn in the United States use them any more.
        They shifted to using the 900 or 700 numbers.

9.  Action: "pee" start rule
    From:
        BadURL_WordStarts[i++] = "pee";
    To:
        BadURL_WordStarts[i++] = "pee[^nrv]";
    Reason:
        Fri May 11 20:31:38:
            www.sarc.com/avcenter/venc/data/w32.peerload.a.html

10. Action: "secret" rule
    From:
        BadURL_Parts[i++] = "secret";
    To:
        BadHostParts[i++] = "secret";
    Reason:
        Tue Apr 3 09:32:36: ndc.shockwave.com/images/picons/highlight/\
            hideandsecret_highlight.png
        Wed Apr 25 20:49:17 www.uen.org/kulc/images/series_images/\
            SEDE-secretsdead.jpg
        Thu Apr 26 21:51:47: www.uen.org/kulc/images/series_images/\
            SEDE-secretsdead.jpg
        Fri May 11 07:10:18: www.pbs.org/wnet/secrets/

11. Action: "hotbar.com"
    Added:
        BadDomains[i++] = ".hotbar.com";
    Reason: www.estationary.com is an alias to vip-farm0v.hotbar.com
        Further, they are adding others so this is a
        // INFO GATHER RULE

11 May 2007 UNresolved False Positives (HHH)
--------------------------------------------

NONE

11 May 2007 RESOLVED False Positives (HHH)
------------------------------------------

1.  Pattern: "tender"
    Rules:
        BadURL_Parts[i++] = "tender";
    Reason:
        Sun Apr 1 00:12:33: i.cnn.net/v5cache/TCM/Images/\
            Dynamic/i6/tendertrap_tr_104x78_112620021151.gif
        Sun Apr 15 02:33:47: realmusic.com/images/c_chapp2_tender.jpg

          56 tender_Parts.txt
           2 tender_Starts_and_Ends.txt
          13 tender_Passed_All_Rules.txt
        ------------------------------
          71 total

        There is also "extender" and some other patterns that will nail
        you with this pattern. Not only that, but the word itself has so
        many different meanings that it has gotta go!
        Latin -> French -> English
    Solution: Removed the rule.
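
The pattern changes in the "vegas", "virgin", and "pee" entries of the
Changes list can be checked with a small JavaScript sketch. The matching
code itself is not part of this changelog, so the three helpers below
(matchesPart, matchesWordStart, matchesDomain) are hypothetical, and
their semantics are assumptions: a "Parts" rule is treated as a regex
fragment that may match anywhere, a "WordStarts" rule must begin at the
start of the text or right after a non-alphanumeric character, and a
domain rule is a plain suffix of the host name. The two ".example" hosts
are made up for illustration.

```javascript
// Hypothetical helpers: the real rule engine is not shown in this
// changelog, so the matching semantics below are assumptions.

// Assumed: a "Parts" rule is a regex fragment that may match anywhere.
function matchesPart(pattern, text) {
  return new RegExp(pattern, "i").test(text);
}

// Assumed: a "WordStarts" rule must begin at the start of the text or
// immediately after a non-alphanumeric character.
function matchesWordStart(pattern, text) {
  return new RegExp("(^|[^a-z0-9])" + pattern, "i").test(text);
}

// Assumed: a domain rule is a literal suffix of the host name.
function matchesDomain(suffix, host) {
  return host.toLowerCase().endsWith(suffix);
}

// "virgin[^i]" needs a character other than "i" after "virgin".
console.log(matchesPart("virgin[^i]", "www.virgins.example"));  // true
console.log(matchesPart("virgin[^i]", "www.virginia.example")); // false

// "pee[^nrv]" no longer fires on the "peerload" false positive.
const url = "www.sarc.com/avcenter/venc/data/w32.peerload.a.html";
console.log(matchesWordStart("pee", url));       // true  (old rule)
console.log(matchesWordStart("pee[^nrv]", url)); // false (new rule)

// The leading dot keeps look-alike hosts out of the good list.
console.log(matchesDomain(".vegas.com", "www.vegas.com"));    // true
console.log(matchesDomain(".vegas.com", "girlsinvegas.com")); // false
console.log(matchesDomain("vegas.com", "girlsinvegas.com"));  // true
```

Under these assumptions, "virgin[^i]" also stops matching a URL that
ends in "virgin", because the character class needs one more character
to consume. Likewise the bare host "vegas.com" does not end with
".vegas.com", which is why visitors have to type www.vegas.com.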