dccm running out of file descriptors

Vernon Schryver vjs@calcite.rhyolite.com
Mon Feb 2 16:32:49 UTC 2004


> From: Gary Mills <mills@cc.UManitoba.CA>

> ...
> 2832 IDLE

>       *.*                  *.*                0      0 24576      0 IDLE

> COMMAND   PID   USER   FD   TYPE        DEVICE SIZE/OFF    NODE NAME
> dccm    14294 daemon 3667u  IPv4 0x3002d0361f0      0t0     TCP electra.cc.umanitoba.ca:*->electra.cc.umanitoba.ca:* (IDLE)

That seems significantly different from the previous lsof lines:

dccm    9546 daemon 3910u  IPv4 0x30002868bd0      0t0     TCP electra.cc.umanitoba.ca:*->naos.cc.umanitoba.ca:* (IDLE)

The new lines that have electra.cc.umanitoba.ca as the remote host
do look like idle sockets that have been created with socket() system
calls but not used.  Perhaps the previous lsof lines were bogus
due to a confusion in lsof about what it should say for INADDR_ANY or 0.
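
For illustration only, here is a minimal sketch of what "created with
socket() but not used" means from the program's side.  It is written in
Python for brevity, and make_idle_sockets and close_all are hypothetical
helpers, not anything taken from dccm or the milter library.  Each call
consumes one file descriptor even though the socket is never bound,
listened on, or connected.

```python
import socket


def make_idle_sockets(n):
    """Create n TCP sockets and never bind, listen on, or connect them.

    Hypothetical helper: it only shows that socket() alone is enough
    to use up a file descriptor per call.
    """
    return [socket.socket(socket.AF_INET, socket.SOCK_STREAM) for _ in range(n)]


def close_all(socks):
    """Release the descriptors again."""
    for s in socks:
        s.close()


if __name__ == "__main__":
    socks = make_idle_sockets(5)
    # Five distinct descriptors are now held by this process.
    print([s.fileno() for s in socks])
    close_all(socks)
```

While such a process holds the sockets open, running lsof against its
PID should list one entry per descriptor.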

> ...
> > It would be nice to hear that there were 3000 of those messages, perhaps
> > collapsed by syslog into a few "last message repeated 1000 times" entries.
>
> Only 491, starting at 01:14:29.

That's disappointing.


> 2285 TIME_WAIT

That's odd.  Why would a socket that was in the TCP IDLE state
need to go to the TIME_WAIT state when it is closed?  TIME_WAIT
normally follows an established connection that the local side
closed first, so a socket that was never connected should not
pass through it.  That casts doubt on the idea that the IDLE states
are real, and bolsters the notion of a missing close() in some error
path in either dccm or the milter library.  Since it is the milter
library that closes the sockets, I have two reasons for suspecting
the library.
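
To make "a missing close() in some error path" concrete, here is a
generic sketch of that kind of bug and its usual fix.  It is not code
from dccm or the milter library.  All names (new_raw_socket, do_work,
handle_leaky, handle_fixed, WorkError) are hypothetical, and the
`opened` list exists only so that the leaked descriptor can be inspected
afterwards.  Python's detach() is used to get a bare descriptor that,
like one returned by socket() in C, stays open until it is closed
explicitly.

```python
import os
import socket


class WorkError(Exception):
    """Simulated failure while handling a connection."""


def new_raw_socket():
    # detach() hands back the bare descriptor, so nothing closes it for us.
    return socket.socket(socket.AF_INET, socket.SOCK_STREAM).detach()


def fd_is_open(fd):
    """Return True if fd still refers to an open descriptor."""
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False


def do_work(fail):
    if fail:
        raise WorkError("simulated failure")


def handle_leaky(fail, opened):
    fd = new_raw_socket()
    opened.append(fd)          # recorded for inspection only
    try:
        do_work(fail)
    except WorkError:
        return False           # BUG: returns without closing fd
    os.close(fd)
    return True


def handle_fixed(fail, opened):
    fd = new_raw_socket()
    opened.append(fd)          # recorded for inspection only
    try:
        do_work(fail)
        return True
    except WorkError:
        return False
    finally:
        os.close(fd)           # runs on both the success and error paths
```

Every call to handle_leaky that fails leaves one descriptor behind, so
a long-running process with such a path eventually runs out of them.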


Vernon Schryver    vjs@rhyolite.com


