Create checksums based on URL content?

Paul Wright
Sun Jan 4 22:56:08 UTC 2004

On Sat, 3 Jan 2004, Vernon Schryver wrote:

> The DCC is just past the edge of a steep and slippery slope.  At
> the top of the slope is detecting entirely identical copies of bulk
> mail.  At the bottom is detecting characteristics at best distantly
> related to "unsolicited and bulk" such as the average number of
> syllables or naughty words like "remove".

True. However, much of the spam I've seen that uses random English
words as personalisation could be dealt with by modifying the fuzzing
process. To my mind, those modifications aren't going further in the
direction of SpamAssassin; they're just making better bulkiness
detectors. Whether you want to get further into that arms race is
another question.

> There are good uses for other mechanisms including
>   - IP and domain name blacklists,
>   - blacklist naughty URLs or words,

Yes. All the stuff which is getting through the DCC so far is either
from machines in the Spamhaus XBL or advertises sites hosted on machines
in the Spamhaus SBL. I think I need another layer of filtering.
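
A rough sketch of what such an extra layer might look like, assuming the
usual DNSBL convention of reversing the IPv4 octets and appending the
list's zone. The function names are hypothetical, the zones are the ones
Spamhaus publishes for the SBL and XBL, and none of this is something the
DCC itself does:

```python
import ipaddress
import socket

# Zone names as published by Spamhaus for the two lists mentioned above.
SBL_ZONE = "sbl.spamhaus.org"
XBL_ZONE = "xbl.spamhaus.org"


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the query name resolves, i.e. the address is listed.

    Hypothetical helper: it needs network access, and it treats any
    resolver failure as "not listed", which a real filter should not do.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False
```

For a sending relay at 192.0.2.1, dnsbl_query_name("192.0.2.1", XBL_ZONE)
gives the name to query. For the SBL case, the host in the advertised URL
would first have to be resolved to an address.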

Paul Wright
