local vs. global counts for checksums

Vernon Schryver vjs@calcite.rhyolite.com
Wed Mar 23 19:41:27 UTC 2011

> From: Matus UHLAR - fantomas <uhlar@fantomas.sk>

> however I also think that there are different kinds/levels of bulkiness that
> could have different scores and/or different ways to get handled.

If you said that there are various kinds of mail that are usually bulk
and that have various probabilities of being unsolicited, I could agree.
But I think it is wrong to talk about different kinds of bulkiness
other than numbers of copies.  Precise language is important, because
sloppy language causes sloppy thinking, and sloppy thinking causes bad
results.  This week I had a discussion with a spammer who insisted
that his burst of around 100,000 identical messages was not "bulk mail"
because he claimed he wasn't selling anything.

> Rejecting clear spam (SA score >10) while keeping the rest for later recheck
> or delivering suspicious mail to spam folder is OK I think.
> However since I don't plan to reject all bulk messages, I keep spamassassin
> to work with the scores. 

SpamAssassin, like every other spam filter, is imperfect.  If you
set the scoring so that SA can ever detect anything, then some
legitimate email will have a score >10 or whatever threshold you
choose.  In your configuration, such legitimate mail (false
positives) will disappear into blackholes.

On the other hand, if you did all SA scanning during the SMTP
transaction, you could reject instead of accept any mail that you
might eventually not deliver.  That prevents blackholes.  SpamAssassin
can be run in popular MTAs including sendmail and postfix so that
the SA tests are completed before the end of the transaction, and
so you could give a 5yz response to any email not delivered.
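
To make the reject-instead-of-accept idea concrete, here is a minimal
sketch in Python.  It is not how any particular MTA or SpamAssassin
glue is written.  The function name, the reply texts, and the default
threshold of 10 (taken from the ">10" in the quoted message) are
assumptions for illustration only.

```python
def final_reply(score, threshold=10.0):
    """Pick the reply to the end of DATA from a spam score.

    Hypothetical helper: the point is only that the decision is made
    while the SMTP client is still connected, so a message that will
    not be delivered is refused (5yz) instead of accepted and dropped.
    """
    if score > threshold:
        # The sender's MTA sees the rejection and can report it, so a
        # false positive does not vanish silently.
        return "550 5.7.1 message rejected as probable spam"
    # Accepted mail must then really be delivered somewhere.
    return "250 2.0.0 message accepted for delivery"
```

With this shape there is no third outcome in which mail is accepted
and later discarded, which is what creates the blackhole.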

> the same mail for multiple users can be scanned two times: first with global
> set of rules at SMTP level, second time with per-user filters (and their
> whitelists).

As I think dccifd+postfix and dccm+sendmail demonstrate, there is no
technical reason that absolutely prevents doing both global and
per-user scanning in the MTA during the original SMTP transaction.
(You can deal with the single response code to the DATA command by
temporarily rejecting second and later Rcpt_To mailboxes that have
whitelists or other settings that differ from the first Rcpt_To

Vernon Schryver    vjs@rhyolite.com
