dccifd: restart after signal 6

Vernon Schryver vjs@calcite.rhyolite.com
Fri Jun 5 21:52:05 UTC 2009


> From: MrC <lists-dcc@cappella.us>

> As expected, this is the result of a kill(2) call.
>
> #0  0xbbaf923f in kill () from /usr/lib/libc.so.12
> #1  0xbbb95a64 in abort () from /usr/lib/libc.so.12
> #2  0xbbbdc60c in __res_state () from /usr/lib/libpthread.so.0
> #3  0x0806b9ca in dcc_res_delays (budget=4) at get_port.c:476
> #4  0x080660f2 in dcc_clnt_rdy (emsg=0xb91fff50 "", ctxt=0x80c9000, clnt_fgs=8 '\b') at clnt_send.c:1740
> #5  0x080544dc in clnt_resolve_thread (arg=0x0) at clnt_threaded.c:394
> #6  0xbbbe562d in pthread_join () from /usr/lib/libpthread.so.0
> #7  0xbbb1aa2c in swapcontext () from /usr/lib/libc.so.12

Now that you mention it, I saw an instance of it a week or two ago,
but hoped it was a fluke.  I could not reproduce it then and cannot today.

That's an ugly one, because the abort() call is not in my code.
This is the relevant part of my get_port.c:

	if (!dcc_host_locked)
		dcc_logbad(EX_SOFTWARE, "dcc_get_host() not locked");

	/* get the current value */
	if (!(_res.options & RES_INIT))
		res_init();

dcc_logbad() calls abort() after syslog().
Because I assume the resolver is not thread safe and check that it's
locked, it can't be a simple, valid locking problem.
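
To make the check-then-abort pattern above concrete, here is a minimal
sketch.  It is not the DCC source: host_mutex, host_locked, host_lock(),
host_unlock(), log_bad(), resolver_touch(), and unlocked_touch_signal()
are all hypothetical names, and fprintf() stands in for syslog().  It
shows a flag that is set only while a mutex is held, a guard that logs
and calls abort() when the flag is clear, and a helper that runs the
guard unlocked in a child process to report which signal kills it.

```c
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical names throughout; none of this is the DCC source. */
static pthread_mutex_t host_mutex = PTHREAD_MUTEX_INITIALIZER;
static int host_locked;		/* changed only while host_mutex is held */

static void host_lock(void)
{
	pthread_mutex_lock(&host_mutex);
	host_locked = 1;
}

static void host_unlock(void)
{
	assert(host_locked);
	host_locked = 0;
	pthread_mutex_unlock(&host_mutex);
}

/* Stand-in for a log-then-abort helper: report, then raise SIGABRT. */
static void log_bad(const char *msg)
{
	fprintf(stderr, "%s\n", msg);
	abort();
}

/* Stand-in for code that touches non-thread-safe resolver state. */
static int resolver_touch(void)
{
	if (!host_locked)
		log_bad("resolver state touched without the lock");
	return 1;
}

/* Run resolver_touch() without the lock in a child process and return
 * the signal that killed the child, or 0 if it exited normally. */
static int unlocked_touch_signal(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		resolver_touch();
		_exit(0);
	}
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling resolver_touch() between host_lock() and host_unlock() returns
normally.  Calling it unlocked ends in abort(), which raises SIGABRT;
that is signal 6 on NetBSD and most other UNIX-like systems, the same
signal named in the subject line.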

I guess I'll have to look for NetBSD's version of the resolver library
to see what NetBSD has done to it.  There are no abort() calls in the
FreeBSD 7.1 version of res_state.c.


Have I mentioned that I'm not a fan of the clean target in
the NetBSD bsd.prog.mk because it deletes .gdbinit?


thanks,
Vernon Schryver    vjs@rhyolite.com


