Unable to get counts for IP,env_from cksums when using ALL in DCCM_CKSUMS

Vernon Schryver vjs@calcite.rhyolite.com
Fri Mar 1 07:45:37 UTC 2002


> From: Yusuf Goolamabbas <yusufg@outblaze.com>

> Hi, I am using dcc 1.0.47 along with sendmail (8.12.2) in milter mode. At
> present, I am just logging everything which exceeds my threshold.
> I have the following set in /var/dcc/dcc_conf
>
> DCCM_LOG_AT=20
> DCCM_CKSUMS=ALL
>
> However, when I look at a file in the logdir, I only see checksums of
> Body, Fuz1 and Fuz2 reported in the X-DCC-Brand-Metrics line. Did I miss
> anything?

Unless the DCC server has something to say about a checksum, the X-DCC
line won't have a count for that checksum.  That keeps the X-DCC line
from being long and filled with boring, consistently zero counts,
or boring counts that are equal to the number of target mail 
addresses in the mail message.

Starting a few versions ago, the DCC server does not keep non-body
checksums from its own clients unless you run it with -K all (or some
subset of "all").  As a result, by default the DCC server has nothing
to say about IP address or other non-body checksums unless they are
in the server's white list.

My reason for this change was that it reduces the amount of data in
busy servers by up to 5X.  Keeping only the body checksums should by
itself reduce the database size to 3/8ths, or somewhat better than 2X.
Beyond that, a lot of spam with common bodies differs in other
checksums, so not having those other, generally useless counts makes it
possible to compress or combine the records of identical body
checksums.  This doesn't matter for servers dealing with fewer than
100,000 mail messages/day, but the database size and dbclean
performance for 500,000 or more mail messages/day shouldn't be ignored.

Each mail message generates about 84 bytes in the dcc_db file and
about 36 bytes in the dcc_db.hash file.  120 bytes times 100,000
messages/day is only 84 MByte/week.  Of that, the 25 MByte of hash
table needs to be in RAM, with two copies while the database is being
rebuilt (one for the server and one for dbclean).  That's not too much
for modern systems, but 1,000,000 mail messages/day gets to 250*2 MBytes,
and that's not so small.
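
As a rough check of the arithmetic above, here is a minimal Python sketch
that reproduces the figures.  The per-message byte counts are the ones
quoted in this message; the one-week retention period is an assumption
inferred from the "MByte/week" figure, and the function name is made up
for illustration.

```python
# Per-message sizes quoted in the post (bytes).
DB_BYTES_PER_MSG = 84      # record in dcc_db
HASH_BYTES_PER_MSG = 36    # entry in dcc_db.hash
DAYS_RETAINED = 7          # assumption: one week of records is kept


def weekly_footprint(msgs_per_day):
    """Return (total_mbytes, hash_mbytes, hash_ram_during_rebuild_mbytes)."""
    msgs = msgs_per_day * DAYS_RETAINED
    total = msgs * (DB_BYTES_PER_MSG + HASH_BYTES_PER_MSG) / 1e6
    hash_mb = msgs * HASH_BYTES_PER_MSG / 1e6
    # Two copies of the hash table are in RAM during a rebuild:
    # one for the server and one for dbclean.
    return total, hash_mb, 2 * hash_mb


for rate in (100_000, 1_000_000):
    total, hash_mb, ram = weekly_footprint(rate)
    print(f"{rate:>9,} msgs/day: {total:.0f} MByte/week total, "
          f"{hash_mb:.0f} MByte hash, {ram:.0f} MByte RAM during rebuild")
```

At 100,000 messages/day this gives 84 MByte/week with about 25 MByte of
hash table; at 1,000,000 messages/day the hash table is about 250 MByte,
or roughly 500 MByte with both copies resident.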

Again, if you want to count IP checksums, you need to add "-Kall" to
DCCD_ARGS in dcc_conf for your DCC server.  In practice the IP address
seemed of little use except for white-listing, and that seems to be
done best by the DCC client instead of the server.  Putting a non-trivial
DCC rejection threshold on an SMTP client by its IP address seems
unlikely to be useful.  Trivial rejection thresholds of 0 are best
handled by blacklisting in the DCC client, either in the DCC whitelist
or the sendmail access_db.


> Also, many times I have observed transient msg.* files in the logdir, i.e.
> for a brief moment you see it and then the file goes away. Under what
> circumstances would this occur?

Dccm can't know whether a log file is needed until it has collected
the entire body.  To avoid awkward questions of what to do with
multi-MByte SMTP bodies, dccm puts everything into a log file as it
collects it.  After the DCC server answers, dccm checks the logging
thresholds and either completes the useful log file with what the
server had to say or it unlink()s and throws away the useless, incomplete
log file.
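
The write-first, decide-later pattern described above can be sketched in
Python as follows.  This is only an illustration of the idea, not dccm's
actual code: the function name, the ask_server callable, the "server
count" line, and the use of ">=" for the threshold comparison are all
assumptions made for the example.

```python
import os
import tempfile


def log_message(chunks, logdir, ask_server, log_at):
    """Spool a message into logdir as it arrives, then keep or discard it.

    ask_server is a hypothetical callable returning the highest count the
    server reported; log_at plays the role of DCCM_LOG_AT.
    """
    fd, path = tempfile.mkstemp(prefix="msg.", dir=logdir)
    with os.fdopen(fd, "wb") as log:
        for chunk in chunks:          # body is written as it is collected
            log.write(chunk)
        count = ask_server()          # answer comes only after the full body
        if count >= log_at:           # assumed comparison for the threshold
            log.write(f"\nserver count: {count}\n".encode())
            return path               # useful log file is completed and kept
    os.unlink(path)                   # below threshold: throw the file away
    return None
```

Anyone listing the directory between the mkstemp() and unlink() calls
sees a msg.* file that then disappears, which matches the transient
files described in the question.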


Vernon Schryver    vjs@rhyolite.com


