dccm misbehaving on Solaris 9

Andy Rudoff andy@rudoff.com
Wed Mar 1 15:59:26 UTC 2006


>> 	dccm[2920]: [ID 702911 mail.error] fdopen(whiteclnt): Resource temporarily unavailable
[...]
>  I've been trying to reproduce the failure from fdopen().  That
> message from dccm only happens when fdopen() returns 0 and sets errno
> to EAGAIN, but the Solaris `man fdopen` does not mention EAGAIN.

The man page might be lacking in a detail or two here :-)

I can think of two ways that errno can be EAGAIN on return from fdopen().
First, fdopen() can return NULL without setting errno (an ugly little
fact that is documented in the Solaris man page).  So if fdopen() finds
there are no stdio streams left and errno was already set to EAGAIN from
some previous syscall, it is technically possible to get the NULL/EAGAIN
combination.
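
A minimal sketch of how a caller could rule out this first case, assuming
nothing about how dccm itself is written: clear errno just before the call,
so a NULL return can never report a value left over from an earlier syscall.
The helper name fdopen_checked() and its err_out parameter are hypothetical.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical wrapper around fdopen().  errno is cleared first, so
 * *err_out holds only what fdopen() itself (or a call it made) set.
 * On failure, *err_out == 0 means fdopen() returned NULL without
 * setting errno at all.
 */
FILE *fdopen_checked(int fd, const char *mode, int *err_out)
{
    FILE *fp;

    errno = 0;                      /* discard any stale value */
    fp = fdopen(fd, mode);
    *err_out = (fp == NULL) ? errno : 0;
    return fp;
}
```

With a wrapper like this, a logged EAGAIN would have to come from inside
fdopen() rather than from whatever ran before it.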

But that's not what I think happened above.  fdopen() calls calloc()
and if it gets NULL back, the errno set by calloc() is preserved on return
from fdopen() (this detail is sadly missing from the Solaris man page).
calloc() can indeed fail with errno set to EAGAIN.  If calloc() fails
because the process memory limit is hit, it sets errno to ENOMEM, but if
it fails because the system is out of swap space, it sets errno to EAGAIN.
The idea being, I guess, that the resource exhaustion may be temporary so
the application can try again later.
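
The "try again later" idea could look roughly like the sketch below.  This is
an illustration only, not anything dccm is known to do: the helper name
calloc_retry() and its max_tries and delay_secs parameters are hypothetical.
It retries only when errno is EAGAIN and gives up at once on ENOMEM, where
waiting is unlikely to help.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: call calloc() up to max_tries times, sleeping
 * delay_secs between attempts, but only while the failure is EAGAIN
 * (possibly temporary, e.g. swap exhaustion).  Any other failure is
 * returned immediately with errno as calloc() left it.
 */
void *calloc_retry(size_t nmemb, size_t size, int max_tries,
                   unsigned int delay_secs)
{
    int i;

    for (i = 0; i < max_tries; i++) {
        void *p;

        errno = 0;
        p = calloc(nmemb, size);
        if (p != NULL)
            return p;
        if (errno != EAGAIN)
            return NULL;            /* e.g. ENOMEM: do not retry */
        if (i + 1 < max_tries)
            sleep(delay_secs);      /* wait before the next attempt */
    }
    return NULL;
}
```

A caller might use something like calloc_retry(1, sizeof(struct foo), 3, 1)
and still has to handle a NULL return once the retries run out.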

That's my best guess at what caused the above problem: the system may
have been out of swap space.  If you catch it in the act, you can use
"swap -l" to look at the status of the swap space.

-andy


