dccm timeout lets some spam through

Gary Mills mills@cc.UManitoba.CA
Thu Apr 24 14:05:34 UTC 2003


We had another incident this morning, evidenced by some e-mail arriving
without X-DCC headers.  When I checked, at about 8:15, dccm was using 304
threads, which is certainly higher than normal.

Errors were first logged during database cleaning on the local dccd:

Apr 24 02:46:44 electra dccm[26800]: [ID 702911 mail.error] skip asking DCC 1.000 seconds more after failure
Apr 24 02:46:45 electra dccm[26800]: [ID 702911 mail.error] skip asking DCC 2.000 seconds more after failure
Apr 24 02:46:48 electra dccm[26800]: [ID 702911 mail.error] skip asking DCC 4.000 seconds more after failure
Apr 24 02:46:49 electra dccm[26800]: [ID 109917 mail.error] DCC, mi_rd_cmd: read returned -1: Connection reset by peer
...
Apr 24 02:46:56 electra dccm[26800]: [ID 702911 mail.error] skip asking DCC 3.991 seconds more after failure
...
Apr 24 02:47:23 electra dccm[26800]: [ID 109917 mail.error] DCC, mi_rd_cmd: read returned -1: Connection reset by peer

Should dccm not have changed over to the other dccd, instead of attempting
to use the busy one?  Maybe it did, and didn't log that information.
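
For reference, the 1, 2 and 4 second delays in those "skip asking DCC"
messages look like a doubling backoff.  Here is a minimal sketch of that
pattern in Python.  It is not taken from the dccm source: the function
name, the doubling rule and the 300-second cap are all assumptions made
for illustration.

```python
from itertools import islice


def backoff_delays(initial=1.0, cap=300.0):
    """Yield retry delays that double after each failure, up to `cap` seconds.

    Hypothetical helper: the doubling rule and the cap are assumptions,
    not dccm's documented behaviour.
    """
    delay = initial
    while True:
        yield delay
        # Double the wait, but never exceed the cap.
        delay = min(delay * 2, cap)


# First few delays a client following this pattern would wait.
first_delays = list(islice(backoff_delays(), 4))
```

With a scheme like this, a client that keeps failing ends up waiting the
full cap between attempts, so mail arriving in that window goes unchecked.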

The main problem started later:

Apr 24 07:16:19 electra dccm[26800]: [ID 125918 mail.error] DCC: accept() returned invalid socket (Result too large), try again
Apr 24 07:16:21 electra dccm[26800]: [ID 125918 mail.error] DCC: accept() returned invalid socket (Result too large), try again
...
Apr 24 07:16:44 electra dccm[26800]: [ID 702911 mail.error] no answer from naos.cc.umanitoba.ca (130.179.16.122,6277) after 0 ms
Apr 24 07:16:44 electra dccm[26800]: [ID 702911 mail.error] skip asking DCC 4.000 seconds more after failure
...
Apr 24 08:14:21 electra dccm[26800]: [ID 702911 mail.error] skip asking DCC 300.000 seconds more after failure
Apr 24 08:14:39 electra dccm[24803]: [ID 943104 mail.notice] 1.1.35 listening to inet:xxxx with /usr/local/dcc

This suggests that the problem is not related to a busy dccd, but is more
likely the result of dccm itself being overloaded.  Now that this has
happened again, I'm going to see if sendmail can limit the load it imposes
on dccm.

Hmm, I'm just looking at sendmail's libmilter code.  It checks to see if
the file descriptor is larger than FD_SETSIZE.  If it is, it closes the
socket and sets the error number to ERANGE, which would account for the
"Result too large" in the accept() messages above.  I also notice that
sendmail has a symbol _FFR_USE_POLL that tells libmilter to use poll()
rather than select().  poll() doesn't have the restriction to 1024 file
descriptors that select() has.  Maybe _FFR_USE_POLL is my solution?
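
To make the difference concrete, here is a rough sketch in Python rather
than libmilter's C.  The helper names and the FD_SETSIZE value of 1024 are
assumptions for illustration, not libmilter's actual code.  The sketch
treats FD_SETSIZE itself as out of range, since an fd_set only holds
descriptors 0 through FD_SETSIZE-1.

```python
import errno
import select
import socket

# Assumed value; the real FD_SETSIZE is defined by the platform headers.
FD_SETSIZE = 1024


def check_fd_for_select(fd, fd_setsize=FD_SETSIZE):
    """Return 0 if fd fits in an fd_set, otherwise errno.ERANGE.

    Hypothetical helper that mirrors the kind of range check described
    above; it is not a copy of libmilter's code.
    """
    if fd >= fd_setsize:
        return errno.ERANGE
    return 0


def wait_readable_poll(sock, timeout_ms):
    """Wait up to timeout_ms for sock to become readable, using poll().

    poll() is given a list of descriptors rather than a fixed-size bit
    set, so the descriptor's numeric value does not matter.
    """
    poller = select.poll()
    poller.register(sock.fileno(), select.POLLIN)
    events = poller.poll(timeout_ms)
    return any(mask & select.POLLIN for _fd, mask in events)
```

With several hundred threads each holding a connection, plus whatever else
the process has open, descriptor numbers at or above 1024 are plausible,
and only the select() path would reject them.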

-- 
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-


