dccm and dccd (greylist) - Another newbie-

Paul R. Ganci ganci@nurdog.com
Sat Jan 14 18:45:32 UTC 2006


Vernon Schryver wrote:

>Why not let sendmail+dccm reject spam?  Why complicate it by involving
>SpamAssassin?  
>
Because of the chance of a false positive (FP). SpamAssassin lets me 
weigh the added DCC header however I see fit, so I have finer 
control over what is considered spam. Letting sendmail+dccm reject 
outright makes a "black & white" decision about something that 
actually is grey.

>As I've said many times, I think that is wrong.  The right way to use the
>DCC has nothing to do with SpamAssassin.  Assuming you are using sendmail,
>it consists of:
>
>  - installing dccm as described in the INSTALL.txt or .html file
>
>  - letting sendmail+dccm reject unsolicited bulk email
>      To do that, set DCCM_REJECT_AT in /var/dcc/dcc_conf to what
>      you consider "bulk".  Common choices range from 5 to 500, with
>      small values appropriate for small sites
>  
>
Okay, one can do this, but then what does one do with scores < 
DCCM_REJECT_AT? Perhaps some percentage of that is spam too. Most 
probably some percentage of the rejected Email is legitimate. Effective 
spam detection requires a multi-layered, multi-tool approach. Set 
DCCM_REJECT_AT too low and you get FPs. Set it too high and you get too 
much spam allowed in. I don't believe a single DCCM_REJECT_AT score 
addresses this problem.

A while back I was asking about a two tier system where there were two 
controls ... namely a DCCM_REJECT_AT control which DCC uses to reject 
Email and a DCCM_BULK_AT control which DCC uses to only add a bulk 
designation to the DCC header ... i.e. no rejection. Then I could set 
the DCCM_REJECT_AT to a "large" value which would reject the most 
obvious stuff with low risk for FP and let my SpamAssassin use the 
"bulk" DCC header to determine what to do with stuff which was scored 
DCCM_BULK_AT <= Score < DCCM_REJECT_AT. Email scored < DCCM_BULK_AT would 
not be assessed any SpamAssassin penalty. With two DCC controls I can 
exert much finer influence over what is bulk and what isn't and make 
the most efficient use of DCC and SpamAssassin together with regard to 
the number of FPs found and computer resources used.
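
The two-tier decision described above can be sketched roughly as follows. 
This is only an illustration in Python, not DCC code: DCCM_BULK_AT is the 
proposed control and not an existing dcc_conf setting, and the function 
name, the Action values and the thresholds 20 and 500 are all made up for 
the example.

```python
from enum import Enum


class Action(Enum):
    ACCEPT = "accept"      # below the bulk threshold: no penalty at all
    TAG_BULK = "tag-bulk"  # mark as bulk in the header, let SpamAssassin decide
    REJECT = "reject"      # obvious bulk: reject at the MTA


def classify(count: int, bulk_at: int, reject_at: int) -> Action:
    """Map a bulk count onto the three bands of the proposed scheme.

    bulk_at stands in for the hypothetical DCCM_BULK_AT and reject_at for
    DCCM_REJECT_AT; both are plain integers here.
    """
    if bulk_at > reject_at:
        raise ValueError("bulk_at must not exceed reject_at")
    if count >= reject_at:
        return Action.REJECT
    if count >= bulk_at:
        return Action.TAG_BULK
    return Action.ACCEPT


if __name__ == "__main__":
    # Illustrative thresholds only.
    for count in (3, 20, 150, 500, 9000):
        print(count, classify(count, bulk_at=20, reject_at=500).value)
```

Only the middle band costs SpamAssassin resources; the top band is 
rejected up front and the bottom band passes without any DCC penalty.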

>   - using site-local as well as per-user logs and whitelists to
>      identify solicited bulk email.
>      To do that, follow comments in /var/dcc/dcc_conf about setting
>      DCCM_LOG_AT= your notion of bulk, leaving DCCM_REJECT_AT blank,
>      and monitoring bulk mail in /var/dcc/log.  Each time you see
>      solicited bulk mail, whitelist the sender.
>      That can be done by pointing-and-clicking if you set up the
>      CGI scripts as described in /var/dcc/cgi-bin/README.
>  
>
For a large site this still seems to me to be impractical. Some things 
come to mind:

    1.) How large can a whitelist grow before it takes a figurative 
"days" for DCC to read and process?

    2.) When can I safely turn on DCCM_REJECT_AT? Even for my small site 
(400 subs) we are growing. I can monitor the logs for a time, create my 
whitelist and then turn on DCCM_REJECT_AT. However, as soon as I add my 
next batch of new users, aren't they subject to losing legitimate bulk 
Email not found in my whitelist? I argue new users will sooner or later 
end up with a false positive. As I am a volunteer for a rural, 
mountain internet coop in the CO Rockies, I don't have time to monitor 
logs and maintain whitelists given my day job responsibilities. 
Moreover, I try to avoid FPs like the plague. I just get tired of hearing 
complaints from those 400 subs ... the job just doesn't pay enough. :)

    3.) How do the scripts work when an organization has multiple Email 
servers with multiple instances of DCC? How is all the data from the 
various logs combined to form one unique whitelist used by all flooded 
servers?

    4.) How would users maintain their own DCC whitelists from a single 
location given that there exist multiple DCC servers and the fact that 
the incoming Email servers where DCC resides are not the servers with 
user accounts?

    5.) Unfortunately, both DCC and SpamAssassin allow for both global 
and per user whitelisting. How do I reconcile with the end user that he 
has to whitelist senders in two places now? I don't believe I can make a 
single whitelist available to both tools ... hence I have 2 times the 
work and twice the user hassle.

Admittedly questions 3-5 are likely due to my lack of understanding of 
what the scripts do and my opinion that whitelists/blacklists are just 
too dynamic to effectively maintain. Our Coop had a serious user revolt 
on its hands when we attempted to reject Email at the MTA based upon 
public DNSBLs. Even our private sendmail whitelist/blacklist constructed 
from our sendmail log files was too volatile to maintain. Hence we 
purposely went to a SpamAssassin approach and leave it to the user to 
decide what to do with Spam at least to some degree. The Coop is still 
willing to reject outright "high" enough scoring stuff since a 
statistical analysis indicated it was "always" spam. Hence the reason I 
am pushing for two DCC controls and have some fear of outright Email 
rejection.

Vernon, I want to make it perfectly clear I am not picking on DCC ... it 
is clearly one of the most useful UBE killers I have found. I also 
understand why you suggest the usage you do. Clearly catching bulk 
before it ever gets to MailScanner or SpamAssassin is worthwhile ... it 
avoids the resource hogs they are. However, in the scenario I describe, 
I don't have to maintain a DCC whitelist, I can still reject some bulk 
Email upfront with less FP risk, and can process questionable stuff in 
more detail with some resource cost. This spam detection procedure would 
work the best in my situation. I am sure there are many others who will 
differ with this opinion.

-- 
Paul (ganci@nurdog.com)



