802 copies of dccm running

Gary Mills mills@cc.umanitoba.ca
Mon Jan 1 16:04:02 UTC 2007


On Sun, Dec 31, 2006 at 10:14:23AM -0700, Vernon Schryver wrote:
> > From: Gary Mills 
> 
> > I stopped the test at this point because it would surely run away in
> > time, or when the system got busy.  Why are all those `dccm' processes
> > even needed?
> 
> I bet that the excess dccm threads and processes are waiting for
> DNS requests to be finished.  If I'm right about that, then stopping
> sendmail for at most 91 seconds would make all but 3 or 4 dccm child
> threads disappear.

It occurred to me that even though the RBL lookups would be extremely
quick, `dccm' would have to convert many hostnames into IP addresses
before looking them up in the RBL.  Those hostname lookups could be very
slow and might time out.  So, as a test, I set `no-body' and
`no-MX' within DNSBL_ARGS in dcc_conf.  The difference was dramatic!
The number of `dccm' processes stabilized at 5.  It does go higher
when the traffic increases, but it settles back to 5.  Now I have to
wait for a busy e-mail day to see how high it goes.  I'm only using
the XBL blocklist just now, so really only the SMTP peer address needs
to be checked.
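
For reference, a DNSBL check on an IPv4 address is an ordinary DNS
query for the reversed octets prepended to the list's zone.  That is
why a hostname taken from the message body or an MX record has to be
resolved to an address first.  A minimal sketch in Python (this is not
dccm's code; the function name and the zone `xbl.example' are
placeholders made up for illustration):

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNS name to query when checking an IPv4 address in a DNSBL.

    Hypothetical helper for illustration only; not taken from dccm.
    """
    # IPv4Address raises ValueError for anything that is not a dotted quad,
    # e.g. a hostname that still needs to be resolved.
    addr = ipaddress.IPv4Address(ip)
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone.strip('.')}"


if __name__ == "__main__":
    # "xbl.example" is a placeholder zone, not a real blocklist.
    print(dnsbl_query_name("192.0.2.1", "xbl.example"))  # 1.2.0.192.xbl.example
```

With only the SMTP peer address checked, the address is already known
from the connection, so no hostname resolution is needed before the
DNSBL query.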

> Does your /etc/resolv.conf set the resolver library timeouts?

Nope.

> Does /var/dcc/build/dcc/include/dcc_config.h say that ./configure
> found the BIND resolver hooks with lines like these?:
>     /* BIND resolver library */
>     #define HAVE_RESOLV_H 1
>     #define HAVE_ARPA_NAMESER_H 1
>     #define HAVE__RES 1
>     #define HAVE_RES_INIT 1
>     #define HAVE_RES_QUERY 1
>     #define HAVE_DN_EXPAND 1

Yes, those are all present.

> I assume you have not set any of the -B timeouts with -B:set:xxx

No, I haven't set those.

> Have you tried -Bset:debug=4 to see what is happening?  That will make
> a lot of noise on a busy system, perhaps too much to determine anything.

No, but I did run `truss' on some of the DNS helper processes.  There
were many lines like this:

poll(0xFFBFEAD8, 2, 60000)      (sleeping...)
poll(0xFFBFEAD8, 2, 60000)                      = 1
recvfrom(37, 0xFFBFEEA8, 1936, 0, 0xFFBFECF8, 0xFFBFECF4) Err#11 EAGAIN
poll(0xFFBFEAD8, 2, 60000)      (sleeping...)
poll(0xFFBFEAD8, 2, 60000)                      = 1
recvfrom(37, 0xFFBFEEA8, 1936, 0, 0xFFBFECF8, 0xFFBFECF4) Err#11 EAGAIN
poll(0xFFBFEAD8, 2, 60000)      (sleeping...)
poll(0xFFBFEAD8, 2, 60000)                      = 1

The intervals between poll() returns were quite random, varying from
a small fraction of a second to several seconds.  I don't know why
poll() is returning like that.  Eventually, recvfrom() returns a
positive number, and the helper does several DNS lookups and then a
sendto() to return the result.  Then it goes back to the curious
poll() behavior.  The poll() system call is used internally by the
select() function, so these may well be select() calls in the source.
File descriptor 37 is a UDP socket which, I assume, communicates with
the parent.
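
The loop shape in that trace can be sketched as follows.  This is a
Python illustration under stated assumptions, not the helper's actual
code: the function names are made up, and only the 60000 ms timeout
and the 1936-byte buffer are taken from the truss output.  In general,
a poll() wakeup followed by EAGAIN can happen when another process
reading the same socket takes the datagram first; whether that is what
happens among the DNS helpers is an assumption, not something the
trace shows.

```python
import select
import socket

POLL_TIMEOUT_MS = 60000  # the timeout seen in the truss output
BUF_SIZE = 1936          # the recvfrom() length seen in the truss output


def try_receive(sock):
    """Non-blocking read: return one datagram, or None on EAGAIN."""
    try:
        data, _peer = sock.recvfrom(BUF_SIZE)
        return data
    except BlockingIOError:  # EAGAIN / EWOULDBLOCK
        return None


def wait_for_request(sock, timeout_ms=POLL_TIMEOUT_MS, max_polls=10):
    """Poll until a datagram is actually read, or give up after max_polls."""
    poller = select.poll()
    poller.register(sock, select.POLLIN)
    for _ in range(max_polls):
        if not poller.poll(timeout_ms):
            continue  # timed out with nothing readable
        data = try_receive(sock)
        if data is not None:
            return data  # a request arrived
        # poll() reported readable but recvfrom() hit EAGAIN: poll again
    return None


if __name__ == "__main__":
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind(("127.0.0.1", 0))
    rx.setblocking(False)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tx.sendto(b"query", rx.getsockname())
    print(wait_for_request(rx, timeout_ms=2000))  # the datagram sent above
    print(try_receive(rx))  # None: the queue is empty again, i.e. EAGAIN
    rx.close()
    tx.close()
```

The second try_receive() call shows the EAGAIN case: once the single
datagram has been consumed, any further read on the non-blocking
socket fails immediately.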

-- 
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-


