DCC clients changing Subject header survey

Brandon Long blong@fiction.net
Thu Sep 26 19:22:58 UTC 2002

On 09/26/02 Vernon Schryver uttered the following other thing:
[ essentially, you have to whitelist bulk mail, which I understand ]
> That is why the DCC servers only tell about "bulk."  To determine
> "spam", you MUST add something such as a local white-list.  That's
> why the DCC source now includes those prototype point-and-click
> white-listing CGI scripts.  There are ISPs using the DCC with global
> white-lists and DCC rejection thresholds of 20 or 50.  I don't
> understand how they can do that, but I'm not in charge of or a user
> of their operations and so have no standing to offer an opinion.
> SpamAssassin users say their rules yield a tolerable false positive
> rate and perhaps making a DCC determination of "bulk" be only half of
> a SpamAssassin threshold is a good choice for them.  It wouldn't be
> for me, because I think the false positive rate must be below 0.1%
> and ought to be below 0.01%.
> Having SpamAssassin automatically mark mail that the other SpamAssassin
> rules don't like would only not significantly inflate the DCC counts.
> It could not cause false positives unless your correspondents include
> people who mark everything as spam.  Really private mail is not seen
> by anyone else's and so can't have its DCC counts inflated.  Having
> SpamAssassin mark its spam with a DCC count of "many" would only tell
> DCC clients "this mail is awfully bulky."

But having SpamAssassin mark a message as "many" when it was seen by only
one SpamAssassin install, and may merely have tripped one of its rules,
doesn't mean it's bulk.  For instance, I have friends who will mail
perhaps 10 people on a message (members of a sports team, for instance).
I would normally never have to worry about whitelisting that, since my
threshold for 'bulk' is on the order of 50 or 100... but if that message
runs afoul of some random SpamAssassin rule and gets marked 'many', now I
won't see the message.  I've chosen not to run SpamAssassin because
its false positive rate is too high, but having others use SpamAssassin
to mark a message as bulk will increase my false positive rate.  If I
have to whitelist all of my 'friends' on the off chance that they might
Cc someone running SpamAssassin, that seems to defeat the purpose of
running a program that counts the number of times a message is received.

Basically, having a 'many' means that I am subject to everyone else's
concept of 'bulk' instead of deciding that on my own.  People who
automatically upgrade a count of 20 to 'many' 'break' my limit of 100.
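
To make the arithmetic concrete, here is a rough sketch of the scenario
in Python.  It is only a toy model: the names MANY, total_count and
is_bulk are made up for illustration, "many" is modeled as infinity, and
none of this is the real DCC protocol or client API.

```python
# Toy model of the complaint above; hypothetical names, not the DCC API.
MANY = float("inf")  # assumed stand-in for a report of "many"


def total_count(reports):
    """Sum the per-recipient reports for one message checksum."""
    return sum(reports)


def is_bulk(reports, threshold):
    """True if the accumulated count reaches the client's own threshold."""
    return total_count(reports) >= threshold


# Ten recipients (say, a sports team) each report the message once.
honest = [1] * 10

# Same message, but one recipient's filter upgrades its report to "many".
upgraded = [1] * 9 + [MANY]

print(is_bulk(honest, 100))    # False: 10 is well under my limit of 100
print(is_bulk(upgraded, 100))  # True: one "many" overrides my limit
```

With a threshold of 100, the ten honest reports stay far below the
limit, but a single "many" from someone else's filter pushes the same
message over any finite threshold I could pick.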

  "... segmenting the market can be so tricky, sometimes." -- Jason Ozolins

