Using DCC To Filter Message Bodies

Vernon Schryver
Tue Jan 22 20:00:34 UTC 2002

> From:

> I'd like to use DCC to filter spam messages based on the contents of the body.
> I.e., I add the message about Britney Spears's naked playtime (which I get about
> 6 times a day, from various servers and addresses) to my database and then when
> that message body comes in, it is blocked or an X-This-Is-Spam header is added.
> Anyone have a howto on how to set this up?  I already have DCCm up and going
> via sendmail 8.12.2's milter, and I can report messages via dccproc, but I don't
> get the part about filtering based on the message body's checksum being in the
> database.  All I'm getting is a count of how many times a particular message has
> arrived, whether it's been specifically reported or not.
> I guess what I'm asking is how do I ONLY match messages that I've specifically
> set to be matched by manually reporting them with dccproc (or any other manual
> reporting tool).

I would do that by
  - using an isolated DCC server (dccd) that has no incoming flood/feeds of
     checksum reports and listens only to my own DCC clients,
     including dccproc and dccm.
  - reporting the junk by `dccproc -t many`
  - setting appropriate thresholds for dccm or dccproc
For example, if you use body thresholds of "many" or just 1,000,000,
messages that have not been manually reported won't be rejected, because
on an isolated server their counts will never climb that high.
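
To make the threshold idea concrete, here is a toy model in Python. It is
only a sketch of the counting logic, not DCC code: the `IsolatedCounts`
class and `should_reject` function are hypothetical names, SHA-256 stands in
for the DCC's own body checksums, and "many" is modelled as infinity rather
than whatever value the DCC really uses.

```python
import hashlib

# Assumption: "many" is modelled as infinity; the real DCC uses its own
# special count value, which is not reproduced here.
MANY = float("inf")


class IsolatedCounts:
    """Toy stand-in for a private dccd with no incoming floods."""

    def __init__(self):
        self.counts = {}

    def report(self, body, targets=1):
        """Add `targets` to the running total for this body and return it."""
        # Assumption: SHA-256 stands in for DCC's own body checksums.
        key = hashlib.sha256(body.encode("utf-8")).hexdigest()
        self.counts[key] = self.counts.get(key, 0) + targets
        return self.counts[key]


def should_reject(total, threshold=1_000_000):
    """Reject only when the total reaches the configured body threshold."""
    return total >= threshold


server = IsolatedCounts()

# Six ordinary deliveries: the count rises, but nowhere near the threshold.
for _ in range(6):
    seen = server.report("unreported bulk message")
print(should_reject(seen))

# One manual report with a target count of "many" crosses any threshold.
flagged = server.report("junk I reported by hand", targets=MANY)
print(should_reject(flagged))
```

With no outside feeds adding to the totals, only the manually reported
body ever reaches the threshold in this model.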

Note that 
  - this mode would radically reduce the effectiveness of the DCC.
  - the DCC body filtering is not based on regular expressions.  Two
     of the checksums are "fuzzy" and so will ignore some text in bodies.
     However, if you want to reject all mail containing the string
     "Britney Spears", procmail would probably work better.
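
The difference between checksum matching and string matching can be seen
in a few lines of Python. This is a rough illustration only: SHA-256 is an
assumed stand-in for an exact body checksum, the sample bodies are made up,
and nothing here models the DCC's fuzzy checksums or procmail's recipe
syntax.

```python
import hashlib
import re


def body_checksum(body):
    """Exact checksum of a body (SHA-256 as a stand-in, not a DCC checksum)."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


# Checksum of the one body that was reported by hand.
reported = body_checksum("Britney Spears pictures, click here")

# A later message with slightly different wording.
variant = "Britney Spears pictures, click HERE now"

# An exact checksum only matches the identical body.
checksum_hit = body_checksum(variant) == reported

# A pattern match catches any body that contains the string.
pattern_hit = re.search(r"britney spears", variant, re.IGNORECASE) is not None

print(checksum_hit, pattern_hit)
```

The DCC's fuzzy checksums tolerate some variation that an exact checksum
like this one does not, but they still are not a pattern language.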

I've hacked a web page that accepts sample spam and uses `dccproc`
to do a query at
so that people can "test drive" the DCC with the publicly available
feed of checksums.

Vernon Schryver

