September 15 Changes (HHH)
--------------------------

----------------------------------------------------------------------
1. Action: Downgraded scope of "live" at the START
   From:   BadURL_WordStarts[i++]="live";
   To:     BadHostWordStarts[i++]="live";
   Reason: There were just too many false positives in URLs.
           (see next one for some counts)

2. Action: Downgraded scope of "live" at the END
   From:   BadURL_WordEnds[i++]="live";
   To:     BadHostWordEnds[i++]="live";
           (also, the URL in the comment above it should be changed to HOST)
           I think every place we say HOST or URL WE
   Reason: There were just too many false positives in URLs.

           live at neither Start nor End: (497, 291, 222)
           live only at Start:            (497, 291, 120)
           live only at End:              (497, 291, 126)
           live at both Start and End:    (497, 291, 44)
----------------------------------------------------------------------
3. Action: Downgraded scope of "strip" at the START
   From:   BadURL_WordStarts[i++]="strip";
   To:     BadHostWordStarts[i++]="strip";
   Reason: There were just too many false positives in URLs.
           (see next one for some counts)

4. Action: REMOVED "strip" FROM the END
   From:   BadURL_WordEnds[i++]="strip";
   To:     * NOTHING *
   Reason: There were just too many false positives in URLs.

           strip at neither Start nor End: (105, 50, 50)
           strip only at Start:            (105, 50, 18)
           strip only at End:              (105, 50, 50)
           strip at both Start and End:    (105, 50, 18)

           I could have just downgraded "strip" at the end, but it isn't
           worth it. It doesn't contribute anything to reducing hosts.
----------------------------------------------------------------------
5. Action: "sinful" rule added
   Rule:   BadURL_Parts[i++] = "sinful";
   Reason: Handles 20 hosts, probably even more URLs.
           I saw no reason why I could NOT apply it to an entire URL.
           It is usually at the start, but why not go all the way?

6. Action: "3x" rule re-added
   Rule:   BadHostParts[i++] = "3x";
   Reason: This is just an alias for "xxx".

7. Action: "free" rule added
   Rule:   BadHostParts[i++] = "free";
   Reason: 1,688 hosts made it past all of our rules.
           Is that a good enough reason?
           Here is the count before we added the rule:

           Parts removes:          3432
           Starts / Ends removes:   588
           Passes:                 1688
           ----------------------------
           Total:                  5708

           If people want a host with "free" in it, then we are going to
           have to add some of them. Either that or take a long hard look
           at rules to manage.
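
   Sketch: The scope changes above (URL rule versus HOST rule) can be
   illustrated with a minimal JavaScript example. This is only a sketch
   under stated assumptions, not the actual matching code: the array
   names come from the entries above, but the helper functions
   (words, hasPart, hasWordStart, hasWordEnd, isBlocked) and the
   definition of a "word" as a run of letters and digits are
   hypothetical, and the arrays hold only the rules named in this log.

```javascript
// Rule tables named after the arrays in this log. The contents are an
// illustrative subset, not the full rule set.
const BadURL_Parts = ["sinful"];             // substring anywhere in the URL
const BadHostParts = ["3x", "free"];         // substring anywhere in the host
const BadHostWordStarts = ["live", "strip"]; // word prefix, host only
const BadHostWordEnds = ["live"];            // word suffix, host only

// Assumption: a "word" is a maximal run of letters and digits.
function words(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// True when any rule string occurs anywhere in the text.
function hasPart(text, parts) {
  const lower = text.toLowerCase();
  return parts.some((p) => lower.includes(p));
}

// True when any word in the text begins with a rule string.
function hasWordStart(text, starts) {
  return words(text).some((w) => starts.some((s) => w.startsWith(s)));
}

// True when any word in the text ends with a rule string.
function hasWordEnd(text, ends) {
  return words(text).some((w) => ends.some((e) => w.endsWith(e)));
}

// Hypothetical helper: a URL is blocked when any rule matches.
// Host-scoped rules see only the hostname, never the path.
function isBlocked(url) {
  const host = new URL(url).hostname;
  return (
    hasPart(url, BadURL_Parts) ||
    hasPart(host, BadHostParts) ||
    hasWordStart(host, BadHostWordStarts) ||
    hasWordEnd(host, BadHostWordEnds)
  );
}
```

   With host-only scope, "live" or "strip" in a path (for example
   http://news.example.com/live/scores) no longer triggers a match,
   while a host such as livecams.example.com still does. The "sinful"
   rule stays URL-wide, so it matches in the path as well.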