Fuz2 false positive

Vernon Schryver vjs@calcite.rhyolite.com
Tue Apr 22 14:41:05 UTC 2008


> From: Jeff Mincy <mincy@rcn.com>

> SpamAssassin translates body/fuz1/fuz2 values of "many" to 999999 and
> then compares the translated body/fuz1/fuz2 values to
> dcc_body_max/dcc_fuz1_max/dcc_fuz2_max which default to 999999.  So,
> by default the DCC_CHECK test hits if at least one of the
> body/fuz1/fuz2 values is "many".
>
> SpamAssassin will short-circuit if there is a X-DCC header with
> "bulk".  Otherwise, SpamAssassin uses either dccproc or dccifd to get
> the dcc response.   If there is a X-DCC header with "bulk" then
> the DCC_CHECK hits and the body/fuz1/fuz2 counts are ignored.
> When SpamAssassin explicitly calls dccproc or dccifd then the "bulk"
> string is ignored.  SpamAssassin should presumably notice the "bulk"
> string when calling dccproc or dccifd.
>
> Anyway, you seem to object to SpamAssassin doing s/many/999999/ 
> The many/999999 thing doesn't cause any problems does it?

SpamAssassin's many/999999 thing dates from before dccproc had -c to
set per-checksum thresholds, from before dccifd existed, and from before
per-checksum thresholds could be put into per-user whiteclnt files.

Setting SpamAssassin's own threshold for FUZ2 as was suggested would
not have the desired effect of ignoring FUZ2 results because the real
value of "many" is larger than 999999.  Some people run (or once ran)
spam traps that reported bad mail with large counts instead of "many".
That could result in a dccifd or dccproc header with a FUZ2 result
larger than 1000000 like "X-DCC...FUZ2=1234567..." that would not be
ignored by setting the SpamAssassin threshold to 1000000.
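
A minimal sketch of that comparison, in Python rather than SpamAssassin's
Perl, may make the problem concrete. It assumes the behavior described
in the quoted text: "many" is translated to 999999 and a checksum hits
when its count reaches the configured maximum. The function names and
the sample header values are hypothetical, not taken from the plugin.

```python
import re

MANY = 999999  # value substituted for "many", per the quoted description


def parse_counts(header):
    """Extract Body/Fuz1/Fuz2 counts from an X-DCC header value."""
    counts = {}
    pattern = r"\b(Body|Fuz1|Fuz2)=(\d+|many)"
    for name, value in re.findall(pattern, header, re.IGNORECASE):
        counts[name.lower()] = MANY if value.lower() == "many" else int(value)
    return counts


def hits(header, body_max=999999, fuz1_max=999999, fuz2_max=999999):
    """True if any reported count reaches its maximum (assumed >= test)."""
    limits = {"body": body_max, "fuz1": fuz1_max, "fuz2": fuz2_max}
    counts = parse_counts(header)
    return any(counts.get(name, 0) >= limit for name, limit in limits.items())


# Hypothetical header values for illustration only.
many_hdr = "X-DCC-example-Metrics: host 1; Body=1 Fuz1=1 Fuz2=many"
trap_hdr = "X-DCC-example-Metrics: host 1; Body=1 Fuz1=1 Fuz2=1234567"

print(hits(many_hdr))                    # default thresholds
print(hits(many_hdr, fuz2_max=1000000))  # "many" no longer reaches the limit
print(hits(trap_hdr, fuz2_max=1000000))  # a large numeric count still does
```

Raising the FUZ2 maximum to 1000000 hides "many" but not a count such
as 1234567, which is why that setting does not reliably ignore FUZ2.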

What should be done by someone with the keys to SpamAssassin's DCC
plugin is to
   - have SpamAssassin pass its thresholds as -c args to dccproc
   - use the rejection status it gets from dccifd instead of the
      X-DCC header
   - if SpamAssassin must use the X-DCC header from dccifd, then always 
      and only look for the string "bulk" in the header
   - try harder to find the dccifd socket and to use dccifd instead of dccproc
      and check that the SpamAssassin thresholds are in the dcc_conf file.
   - make SpamAssassin always and only look for the string "bulk" in
      the X-DCC header it gets from dccproc, at the tiny sites using
      dccproc
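
The "always and only look for bulk" rule could be sketched as follows,
in Python for illustration rather than in the plugin's Perl. It assumes
the thresholds have already been applied by dccproc or dccifd, so the
counts in the header are deliberately not examined. The function name
and the sample header values are hypothetical, and the assumption that
the results follow the first ";" in the header should be checked
against real headers.

```python
import re


def dcc_says_bulk(header):
    """True if the results part of an X-DCC header contains the word "bulk".

    Assumes the results follow the first ';' in the header; if there is
    no ';', the whole string is searched.
    """
    _, sep, results = header.partition(";")
    if not sep:
        results = header
    return re.search(r"\bbulk\b", results) is not None


# Hypothetical header values for illustration only.
print(dcc_says_bulk("X-DCC-example-Metrics: host 1; bulk Body=many Fuz2=many"))
print(dcc_says_bulk("X-DCC-example-Metrics: host 1; Body=1 Fuz1=1 Fuz2=1234567"))
```

With this approach a large numeric count matters only if the DCC
client itself decided the message was bulk.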

Then someone who wanted to ignore the FUZ2 result but not the BODY
or FUZ1 results could set the FUZ2 threshold to "never" and get
that very unlikely result.
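
As a rough sketch of passing thresholds through, the plugin could build
the dccproc arguments from its own settings. This assumes that -c takes
a checksum type and a threshold joined by a comma; check the dccproc
man page for the exact syntax before relying on it. The function name
and the dictionary of settings are hypothetical.

```python
def dccproc_threshold_args(thresholds):
    """Turn {"Fuz2": "never", ...} into a list of -c arguments.

    Assumes the form "-c type,threshold"; verify against the man page.
    """
    args = []
    for checksum, limit in thresholds.items():
        args += ["-c", f"{checksum},{limit}"]
    return args


# Ignore FUZ2 only, leaving BODY and FUZ1 at whatever dccproc uses.
print(dccproc_threshold_args({"Fuz2": "never"}))
```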

Instead of turning off only one of the DCC results, it would make far
more sense to adjust the score that SpamAssassin gives a DCC hit.
If you think you have FUZ2 false positives, then you surely think you
have FUZ1 and BODY false positives.

Of course, I still think the right answer is not scoring DCC hits
but per-site and per-user whitelists for solicited bulk email and 
rejections of unsolicited bulk email.


Vernon Schryver    vjs@rhyolite.com


