dccifd: restart after signal 6

MrC lists-dcc@cappella.us
Fri Jun 5 23:16:23 UTC 2009



On 6/5/2009 2:52 PM, Vernon Schryver wrote:
>> From: MrC <lists-dcc@cappella.us>
>
>> As expected, this is the result of a kill(2) call.
>>
>> #0  0xbbaf923f in kill () from /usr/lib/libc.so.12
>> #1  0xbbb95a64 in abort () from /usr/lib/libc.so.12
>> #2  0xbbbdc60c in __res_state () from /usr/lib/libpthread.so.0
>> #3  0x0806b9ca in dcc_res_delays (budget=4) at get_port.c:476
>> #4  0x080660f2 in dcc_clnt_rdy (emsg=0xb91fff50 "", ctxt=0x80c9000,
>> clnt_fgs=8 '\b') at clnt_send.c:1740
>> #5  0x080544dc in clnt_resolve_thread (arg=0x0) at clnt_threaded.c:394
>> #6  0xbbbe562d in pthread_join () from /usr/lib/libpthread.so.0
>> #7  0xbbb1aa2c in swapcontext () from /usr/lib/libc.so.12
>
> Now that you mention it, I saw an instance of it a week or two ago,
> but hoped it was a fluke.  I've been unable to reproduce it then or today.
>
> That's an ugly one, because it's not in my code.
> This is the relevant part of my get_port.c:
>
> 	if (!dcc_host_locked)
> 		dcc_logbad(EX_SOFTWARE, "dcc_get_host() not locked");
>
> 	/* get the current value */
> 	if (!(_res.options & RES_INIT))
> 		res_init();
>
> dcc_logbad() calls abort() after syslog().
> Because I assume the resolver is not thread safe and check that it's
> locked, it can't be a simple, valid locking problem.
>
> I guess I'll have to look for NetBSD's version of the resolver library
> to see what NetBSD has done to it.  There are no abort() calls in the
> FreeBSD 7.1 version of res_state.c
>

I don't see anything that immediately jumps out in the changes between 4.0
and 4.0.1, but it has been a long time since I was familiar with libc code.


Here's the abort() in the libpthread version of res_state:

libpthread/res_state.c

     /*
      * This is aliased via a macro to _res; don't allow multi-threaded
      * programs to use it.
      */
     res_state
     __res_state(void)
     {
             static const char res[] = "_res is not supported for multi-threaded"
                 " programs.\n";
             (void)write(STDERR_FILENO, res, sizeof(res) - 1);
             abort();
             return NULL;
     }


The libc version uses weak aliases:

     #include <sys/cdefs.h>
     #if defined(LIBC_SCCS) && !defined(lint)
     __RCSID("$NetBSD: res_state.c,v 1.5.10.1 2007/05/17 21:25:19 jdc Exp $");
     #endif

     #include <sys/types.h>
     #include <arpa/inet.h>
     #include <arpa/nameser.h>
     #include <netdb.h>
     #include <resolv.h>

     struct __res_state _nres
     # if defined(__BIND_RES_TEXT)
             = { .retrans = RES_TIMEOUT, }   /*%< Motorola, et al. */
     # endif
             ;

     res_state __res_get_state_nothread(void);
     void __res_put_state_nothread(res_state);

     #ifdef __weak_alias
     __weak_alias(__res_get_state, __res_get_state_nothread)
     __weak_alias(__res_put_state, __res_put_state_nothread)
     /* Source compatibility; only for single threaded programs */
     __weak_alias(__res_state, __res_get_state_nothread)
     #endif

     res_state
     __res_get_state_nothread(void)
     {
             if ((_nres.options & RES_INIT) == 0 && res_ninit(&_nres) == -1) {
                     h_errno = NETDB_INTERNAL;
                     return NULL;
             }
             return &_nres;
     }

And dccifd is linked against libpthread:

     $ ldd /var/dcc/libexec/dccifd
     /var/dcc/libexec/dccifd:
             -lpthread.0 => /usr/lib/libpthread.so.0
             -lm.0 => /usr/lib/libm387.so.0
             -lm.0 => /usr/lib/libm.so.0
             -lc.12 => /usr/lib/libc.so.12
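
One way a threaded program could avoid the aborting __res_state() is to
keep resolver state in caller-owned storage and initialize it with
res_ninit(), much as the libc code above does for _nres. The following is
only a minimal sketch under that assumption, not DCC code:
load_resolver_settings() is a hypothetical helper name, and what
dcc_res_delays() actually needs from _res is not shown here.

```c
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>
#include <string.h>

/*
 * Hypothetical helper (not part of DCC): read the resolver's retransmit
 * interval and retry count into caller-provided variables using a private
 * struct __res_state, so the global _res is never referenced.
 * Returns 0 on success, -1 if res_ninit() fails.
 */
int
load_resolver_settings(int *retrans, int *retry)
{
        struct __res_state st;

        /* Start from zeroed state so no stale fields are interpreted. */
        memset(&st, 0, sizeof(st));
        if (res_ninit(&st) == -1)
                return -1;

        *retrans = st.retrans;
        *retry = st.retry;

        /* Release anything res_ninit() attached to the private state. */
        res_nclose(&st);
        return 0;
}
```

A caller would declare two ints, pass their addresses, and check for a 0
return before using the values. Depending on the platform, this may need
-lresolv at link time.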


Let me know if there is something I can do to help.


> Have I mentioned that I'm not a fan of the clean target in
> the NetBSD bsd.prog.mk because it deletes .gdbinit?
>

Oh, boy.  They really mean *squeaky clean*.

Thanks for your time,
Mike


