short log with dcc

Vernon Schryver vjs@calcite.rhyolite.com
Fri Oct 16 21:07:40 UTC 2009


> From: Bokhan Artem <APTEM@ngs.ru>

> > Building dccm with `./configure --with-max-log-size=1` would
> > limit log files to 1 KByte of message body.
> >   
> The reason is the waste of resources, servers are quite busy with email 
> traffic. 

I don't think you have a local DCC server, and you have not attracted
attention by using the public DCC servers for more than 100K msgs/day.
Therefore it seems likely that your mail systems are handling
fewer than 200K messages per day.

20 years ago 200K msgs/day was a big deal.  (I'll spare you war stories
of days when computers and networks were 1000 or more times slower.)
Today 200K msgs/day is not trivial, but not worth mentioning.  I now run
spam traps that feed 30K spam/day through sendmail+dccm in about 1% of
a cheap computer.
If your mail system is quite busy with less than 200K msgs/day, it might
pay to look at your other spam filters that use lots of CPU cycles
such as DNSBLs, ClamAV, and SpamAssassin.  
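
To put those rates in perspective, here is a rough back-of-the-envelope
sketch in Python.  It assumes traffic is spread evenly over the day and
that CPU cost scales linearly with message count.  Both are
simplifications (real mail arrives in bursts), and the function names
are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def msgs_per_second(msgs_per_day):
    """Average message rate, assuming traffic is spread evenly over the day."""
    return msgs_per_day / SECONDS_PER_DAY


def scaled_cpu_fraction(known_msgs_per_day, known_cpu_fraction, target_msgs_per_day):
    """Linear extrapolation of CPU use; an assumption, not a measurement."""
    return known_cpu_fraction * target_msgs_per_day / known_msgs_per_day


if __name__ == "__main__":
    print(f"200K msgs/day is about {msgs_per_second(200_000):.1f} msgs/sec")
    print(f"30K msgs/day is about {msgs_per_second(30_000):.2f} msgs/sec")
    # Extrapolate from the figure above: 30K msgs/day in about 1% of a computer.
    share = scaled_cpu_fraction(30_000, 0.01, 200_000)
    print(f"sendmail+dccm at 200K msgs/day: roughly {share:.0%} of the same computer")
```

On average, 200K msgs/day works out to a little over 2 messages per
second, which is why the load from sendmail+dccm alone is unlikely to
be what keeps a server busy.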


> Writing files to disk is expensive (all stuff is in memory now, no any 
> disk i/o),
> writing files into memory and frequent postprocessing them with script 
> is an alternative,
> but it does not look elegant and needs more memory.

If you don't have spare resources to write a 4 KByte log file, then you
surely do not have the larger resources needed to fork(), exec(), parse,
and run a script.  Just creating the u area and the stack for the new
process for the script probably involves more than 4 KBytes of I/O (of
course generally not to the disk).

It is likely that there is no difference between writing a new log file
of 100 bytes and writing a new log file of 4 KBytes, whether you
use a memory file system or classic disk.
Both will use at most one data block and the same amount of inode and
indirect I/O in a classic filesystem.  In a journaling filesystem, you
are also unlikely to be able to measure a difference between 100 bytes
and 4 KBytes.
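
The data block arithmetic can be sketched in Python.  This is only an
illustration: the 4096-byte block size is an assumption (a common
default, not something stated above), the function name is made up, and
the sketch ignores inode and indirect-block I/O as well as filesystems
that store very small files inline or in fragments.

```python
def data_blocks_needed(file_size, block_size):
    """Number of fixed-size data blocks a file of file_size bytes occupies.

    Space is allocated in whole blocks, so any size from 1 byte up to
    block_size bytes costs exactly one block.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    # Integer ceiling division: round up to a whole number of blocks.
    return (file_size + block_size - 1) // block_size


if __name__ == "__main__":
    block_size = 4096  # assumed block size in bytes
    for size in (100, 1024, 4096, 4097):
        blocks = data_blocks_needed(size, block_size)
        print(f"{size} bytes -> {blocks} data block(s)")
```

Under that assumption a 100-byte log and a 4096-byte log each occupy a
single data block; only a log larger than one block needs a second.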

Yes, I've encountered byte copy issues, bus occupancy, cache thrashing,
and other issues.  However, they don't apply to the relatively small
amounts of data handled even by a busy mail system.


> >  If you use dccm+sendmail,
> >   
> I use postfix+dccm, I do not know yet when postfix writes message-id, 
> before or after milter.

How are you using postfix+dccm?  The last time I checked, I found
that the postfix milter interface was incompatible with the sendmail
milter interface as far as dccm is concerned.

Why not use postfix with dccifd as a before-queue filter?  That's
the recommended DCC configuration with postfix.


> Any advice about code hook place?

The best thing about open source is that you can read the source and
make needed changes.  That is also the worst thing about open source.
People with much experience try to make as few changes as if the source
were secret.  One reason is that local changes break the warranty; admit
that you've changed the code and you'll find that any and all problems
you encounter are blamed on your changes.  Another reason is that
integrating local changes into the next version, the version after that,
and the version after that, and so on is no fun at all after you've
done it a few times.

Over the decades, I've accumulated a big box of tools to make it easier
to port my improvements to successive versions of other people's programs.
However, my most powerful and most often used tool today is resisting
the urge to make changes.
I predict that if you do change dccm, then in 6 months or a year from
now you or your successor will discard those changes and probably stop
using DCC.  But of course, no one who has not been on the open source
merry-go-round for decades sees it that way.


Vernon Schryver    vjs@rhyolite.com


