dccm timeout lets some spam through

Vernon Schryver vjs@calcite.rhyolite.com
Tue Apr 22 15:54:42 UTC 2003


> From: Gary Mills <mills@cc.UManitoba.CA>

> ...
> Apr 22 02:46:38 electra dccm[401]: [ID 125918 mail.error] DCC: accept() returned invalid socket (Result too large), try again
> Apr 22 02:46:39 electra dccm[401]: [ID 125918 mail.error] DCC: accept() returned invalid socket (Result too large), try again
> Apr 22 02:46:40 electra dccm[401]: [ID 702911 mail.error] no answer from naos.cc.umanitoba.ca (130.179.16.122,6277) after 0 ms
> Apr 22 02:46:40 electra dccm[401]: [ID 702911 mail.error] skip asking DCC 1.000 seconds more after failure

> The result was that `dccm' would time out attempting to contact `dccd'.
> Here's an example from a DCC log file several hours after the beginning
> of the incident:
>
>   skip asking DCC 160.704 seconds more after failure
>   ...
>   result: accept
> ...

That confounds two different symptoms.   The "skip asking" and "no answer"
messages concern dccm's failure to hear from dccd.  The "invalid socket"
complaint is from the sendmail libmilter code.  The two problems might
have a common cause, but they are superficially independent.

What ended the problem, restarting dccm?

I assume (based in part on `grep -i 'too large' /usr/include/sys/errno.h`
on a Solaris system) that "Result too large" means that libmilter.a
was told EOVERFLOW by accept().  However, I cannot find any clue in
`man -s 3xnet accept` or `man -s 3socket socket` why Solaris would 
whine about overflow when doing an accept.
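
The header grep above can be cross-checked against the message text the
C library actually returns for each errno.  A minimal sketch in Python,
assuming it is run on the host in question; the function name
errno_names_matching is made up for illustration, and the result depends
entirely on that platform's strerror() text:

```python
import errno
import os


def errno_names_matching(fragment):
    """Return the errno names whose strerror() text contains `fragment`.

    The match is case-insensitive.  Which names come back depends on the
    platform's C library, so no particular result is assumed here.
    """
    fragment = fragment.lower()
    return sorted(
        name
        for code, name in errno.errorcode.items()
        if fragment in os.strerror(code).lower()
    )


if __name__ == "__main__":
    # Same question as the grep: which errors are described as "too large"?
    for name in errno_names_matching("too large"):
        code = getattr(errno, name)
        print(name, code, os.strerror(code))
```

Whatever names this prints on the affected machine are the candidates for
the errno that accept() reported.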

My best but wild guess is that the dccm process ran out of file
descriptors or something similar.  Some hints in /usr/include/sys/select.h
on a Solaris system suggest that the number of descriptors usable with
select() may sometimes be limited to 1024.  It might be worthwhile to
limit the number of concurrent messages handled by sendmail to 512.
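
One way to test that guess is to compare the process's descriptor limit
with the select() cap.  The sketch below is only an illustration: the cap
of 1024 is the figure guessed at above and is not read from the system,
the function names are hypothetical, and the check should be run in the
same environment (same user, same ulimit settings) that starts dccm:

```python
import resource

# Assumption: the select() cap inferred above from sys/select.h.
SELECT_FD_CAP = 1024


def descriptors_exceed_select_cap(soft_limit, cap=SELECT_FD_CAP):
    """Return True if descriptors numbered >= cap could be handed out.

    A process with soft limit N can hold descriptors 0..N-1, so only a
    limit strictly greater than the cap is a problem.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit > cap


def current_soft_limit():
    """Return this process's soft limit on open file descriptors."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


if __name__ == "__main__":
    soft = current_soft_limit()
    print("soft descriptor limit:", soft)
    print("may exceed select() cap:", descriptors_exceed_select_cap(soft))
```

If the limit is above the cap, a busy daemon could be given a descriptor
that select() cannot watch, which would fit the symptoms.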


Vernon Schryver    vjs@rhyolite.com


