Paul Graham on Filters that Fight Back

The inimitable Paul Graham has published his latest installment on anti-spam filters: Filters that Fight Back.

He summarizes today’s state of affairs, then plays out the next ply or so:
Spammers are trying to foil learning filters with chaff of various kinds. Once they get good at it (here’s a counter I haven’t seen yet: pick up the chaff during your web scraping), the spam text itself will have to look increasingly bland and indistinguishable, and the distinguishing features will no longer be embedded in the message but will lie one or two HTTP GETs away instead.

So PG expands on the auto-retrieval of web content as part of filtering.
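The essay itself gives no code, but the idea of auto-retrieval as part of filtering is simple to sketch: pull the URLs out of a suspected spam, fetch whatever they point to, and hand the fetched text to the same statistical filter that scored the message. A minimal sketch follows; the function names are mine, and the `fetch` parameter is injectable purely so the retrieval step can be stubbed out.

```python
import re
from urllib.request import urlopen

URL_RE = re.compile(r'https?://[^\s"\'<>]+')

def extract_urls(message_text):
    """Pull candidate URLs out of a suspected-spam message body."""
    return URL_RE.findall(message_text)

def fetch_linked_text(message_text, fetch=None, max_bytes=65536):
    """Retrieve the pages a suspected spam links to, so the filter can
    score the fetched text too. `fetch` defaults to a plain HTTP GET
    but can be replaced for testing."""
    if fetch is None:
        fetch = lambda url: urlopen(url, timeout=10).read(max_bytes).decode("utf-8", "replace")
    pages = {}
    for url in extract_urls(message_text):
        try:
            pages[url] = fetch(url)
        except OSError:
            pages[url] = ""  # unreachable pages contribute no text
    return pages
```

In practice the fetched text would simply be tokenized and scored like the message body, which is what moves the arms race one GET away from the message itself.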

But, in my view, he enters an area fraught with peril for both technical and legal reasons:

a “punish” mode which, if turned on, would retrieve whatever’s at the end of every URL in a suspected spam n times, where n could be set by the user.
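Mechanically, the quoted “punish” mode amounts to nothing more than a repeated GET against each URL, which is exactly why it is perilous: it is a distributed request flood in miniature. A hypothetical sketch (names and structure mine) of just how little machinery it takes:

```python
from urllib.request import urlopen

def punish(urls, n, fetch=None):
    """Re-retrieve each URL in a suspected spam n times, per the quoted
    'punish' mode. Returns the number of requests attempted.
    `fetch` is injectable so the sketch can be exercised offline."""
    if fetch is None:
        fetch = lambda url: urlopen(url, timeout=10).read(1024)
    attempts = 0
    for url in urls:
        for _ in range(n):
            attempts += 1
            try:
                fetch(url)
            except OSError:
                pass  # dead links still cost the sender an attempt
    return attempts
```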

While auto-retrieval will become part of the landscape as part of the machinery of automated personal assistants, it will be tricky to implement without unwanted side-effects. Spammers will try to create new legal cover by including “shrink-wrap consent” triggered by auto-retrievers. The mere suggestion of “hack-back” intent creates a legal vulnerability as well.
