dccm timeout lets some spam through

Gary Mills mills@cc.UManitoba.CA
Fri Apr 25 04:39:57 UTC 2003

On Thu, Apr 24, 2003 at 09:28:28AM -0600, Vernon Schryver wrote:
> > From: Gary Mills <mills@cc.UManitoba.CA>
> > Apr 24 02:46:48 electra dccm[26800]: [ID 702911 mail.error] skip asking DCC 4.000 seconds more after failure
> > Apr 24 02:46:49 electra dccm[26800]: [ID 109917 mail.error] DCC, mi_rd_cmd: read returned -1: Connection reset by peer
> > ...
> Note that mi_rd_cmd() is a function in the sendmail libmilter library.
> Those two messages suggest something was seriously wrong.

Now, this is during the dccd database rebuild.  Is it possible that
dccd closed the connection?  That particular error seems to show up
every morning during the rebuild.  I suspect that it's unrelated to
the other error.

> > Hmm, I'm just looking at sendmail's libmilter code.  It checks to see if
> > the file descriptor is larger than FD_SETSIZE.  If it is, it closes the
> > socket and sets the error number to ERANGE.  I also notice that sendmail
> > has a symbol _FFR_USE_POLL that tells libmilter to use poll() rather than
> > select().  `poll()' doesn't have the restriction to 1024 file descriptors
> > that select() has.  Maybe _FFR_USE_POLL is my solution?

I asked about this on comp.mail.sendmail.  Several people said that
it is safe to use and works well.  It's enabled by adding:

	APPENDDEF(`conf_libmilter_ENVDEF', `-D_FFR_USE_POLL')

to site.config.m4, and rebuilding libmilter.  I haven't tested it yet,
but I did build the library.
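
To make the select() versus poll() difference concrete, here is a minimal
sketch in C.  It is not taken from libmilter or the DCC client library; the
function names wait_readable_select() and wait_readable_poll() are
hypothetical, and the ERANGE behaviour merely mirrors the libmilter check
described above.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical helper, for illustration only.
 * Wait up to timeout_ms for an event on fd using select().
 * Returns 1 if fd is ready, 0 on timeout, -1 on error. */
int wait_readable_select(int fd, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int n;

    /* FD_SET() on a descriptor >= FD_SETSIZE would write outside the
     * fd_set, so refuse it, as the libmilter check described above does. */
    if (fd < 0 || fd >= FD_SETSIZE) {
        errno = ERANGE;
        return -1;
    }

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    n = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return n > 0 ? 1 : 0;
}

/* Same contract, but with poll().  The descriptor is passed by value in
 * a struct pollfd, so there is no FD_SETSIZE ceiling.  Note that poll()
 * also reports hangup and error conditions as "ready". */
int wait_readable_poll(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int n;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    n = poll(&pfd, 1, timeout_ms);
    if (n < 0)
        return -1;
    return n > 0 ? 1 : 0;
}
```

With `ulimit -n 2048', a descriptor numbered 1024 or higher would be
rejected by the first function on a system where FD_SETSIZE is 1024, while
the second would handle it normally.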

> The DCC client library used in dccm and elsewhere also uses select(). 
> If socket() is returning file descriptors larger than select() can handle,
> then very bad things will happen.  The least bad I can think of is that
> dccm won't be able to hear dccd.

Yes, that is a problem.

> I guess the long run fix is for me to:
>   - add yet more auto-conf and #ifdef stuff to have the DCC client library
>       use poll() on Solaris
>   - add documentation to suggest the use of _FFR_USE_POLL
> Until then, you should probably set dccm to use -j220 if you use
> per-user whitelists and -j490 if not.

The current setting is `-j 400'.  I don't use per-user whitelists.
I have `ulimit -n 2048' in /etc/init.d/rcDCC.  I'll have to lower
that until all the select()s are fixed.
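
The same cap could in principle be applied from inside a program rather
than in the init script.  The following is only a sketch under that
assumption; clamp_nofile_to_fd_setsize() is a hypothetical name, and
nothing here is part of dccm or sendmail.  It lowers the soft descriptor
limit to FD_SETSIZE so that socket() cannot return a descriptor that
select() is unable to handle.

```c
#include <sys/resource.h>
#include <sys/select.h>

/* Hypothetical helper, for illustration only.
 * If the soft RLIMIT_NOFILE exceeds FD_SETSIZE, lower it to FD_SETSIZE.
 * The hard limit is left alone.  Returns 0 on success, -1 on error. */
int clamp_nofile_to_fd_setsize(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;

    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur > (rlim_t)FD_SETSIZE) {
        /* Lowering the soft limit needs no special privilege. */
        rl.rlim_cur = FD_SETSIZE;
        return setrlimit(RLIMIT_NOFILE, &rl);
    }
    return 0;
}
```

Once the soft limit is capped this way, running out of descriptors shows
up as an ordinary EMFILE failure from socket() or accept() instead of a
descriptor that is too large for an fd_set.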

> The trigger for the problem may be that something gets too slow causing
> your system to have many simultaneous SMTP transactions.  Or perhaps
> someone hits your system with hundreds of simultaneous messages.

Yes, that happens.  I do have a connection rate limit set in

> I trust you are not running the cron-job on your two DCC servers at the
> same time, so that you always have a non-busy DCC server.

No, they are staggered.

-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-
