DCC clients changing Subject header survey

Vernon Schryver vjs@calcite.rhyolite.com
Thu Sep 26 20:12:51 UTC 2002

> From: Brandon Long <blong@fiction.net>

> ...
> But having SpamAssassin mark messages as "many" that are only seen by
> one SpamAssassin and just might be due to something being caught by its
> rules doesn't mean its bulk.  For instance, I have friends who will mail
> perhaps 10 people on a message (members of a sports team, for instance).
> I would normally never have to worry about white listing that, since my
> threshold for 'bulk' is on the order of 50 or 100... but if that message
> runs foul of some random SpamAssassin rule and gets marked 'many', now I
> won't see the message. 

If that mail really is not bulk, then how can some random SpamAssassin
rule see it to mis-mark it as "many"?  I think only if one of your
10 correspondents uses random SpamAssassin rules.

This notion is why it is possible to claim that the DCC has essentially
no false positives.  Either mail is private and so won't be mis-marked,
or it is not private and at least arguably "bulk".
Of course, that claim is an overstatement for two reasons.  First, "bulk"
is not necessarily the opposite of "private."  Second, for a while
the Fuz2 checksum was counting some rather small messages such as those
consisting of "please subscribe" and some random cybercrud.

By the way, 50 or 100 for "bulk" is probably too high for a DCC server
that does not see at least 50,000 reports/day.  The bulk threshold
needs to be scaled to the size of the sample of a bulk spew that the
DCC server will see.
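
As a rough illustration of that scaling, here is a minimal Python sketch.
It assumes the threshold should shrink linearly with report volume,
anchored at 50 for a server seeing 50,000 reports/day; the linear rule,
the floor of 2, and the function and parameter names are all assumptions
made for illustration, not anything the DCC itself implements.

```python
def scaled_bulk_threshold(reports_per_day, base_threshold=50,
                          base_reports_per_day=50_000, minimum=2):
    """Scale a 'bulk' threshold to the server's daily report volume.

    Assumption: the threshold shrinks linearly with the size of the
    sample the server sees, is never raised above base_threshold,
    and never drops below a small floor (minimum).
    """
    if reports_per_day < 0:
        raise ValueError("reports_per_day must be non-negative")
    # Fraction of the reference volume this server sees, capped at 1.
    fraction = min(1.0, reports_per_day / base_reports_per_day)
    return max(minimum, round(base_threshold * fraction))
```

Under these assumptions a server seeing 5,000 reports/day would use a
threshold of 5 instead of 50.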

>                         I'm not running SpamAssassin by choice - because
> its false positive rate is too high, but having others use SpamAssassin
> to mark a message as bulk will increase my false positive rate.

Only for mail that is not private.

>                                                                  If I
> have to whitelist all of my 'friends' on the off chance that they might
> Cc someone running SpamAssassin, that seems to defeat the purpose of
> running a program that counts the number of times a message is received.
> Basically, having a 'many' means that I am subject to everyone else's
> concept of 'bulk' instead of deciding that on my own.  People who
> automatically upgrade 20 -> many 'break' my limit of 100.

That's a point, but it also applies to any of your friends who upgrade
1->many.  Since I've encountered people who do that (or most commonly,
would if they understood how), I'm sure you know some as well.  My
solution is to cut them out of my circle of correspondents.

This is all probably moot.  Even if it were a good idea, it would
probably be too hard for SpamAssassin to delay the dcc_check (or
check_dcc?) until after all of the other checks and then vary the
args depending on the score at that point.
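
For concreteness, the deferred check being discussed might look like the
following Python sketch.  It is hypothetical throughout: the function
name, the score cutoff of 5.0, and the idea of passing only `-t` are
assumptions for illustration, and nothing here reflects how SpamAssassin
actually invokes the DCC.

```python
def dcc_args_for_score(score, spam_cutoff=5.0):
    """Pick dccproc arguments after the other checks have been scored.

    Hypothetical policy: report the message as "many" only when the
    other rules already consider it spam; otherwise report it once.
    """
    count = "many" if score >= spam_cutoff else "1"
    # Only builds the argument list; it does not run dccproc.
    return ["dccproc", "-t", count]
```

The objection quoted above still applies to such a policy: any reporter
whose rules fire on non-private mail raises the count seen by everyone
else.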

For people misapplying `dccproc -t many` or `repeat 1000 dccproc -t 1`
to random mail, the cat is out of the bag, the horse is gone, the milk is
spilled, and the water is over the dam or under the bridge.

Vernon Schryver    vjs@rhyolite.com
