Vernon Schryver
vjs@calcite.rhyolite.com
Mon Jan 23 19:59:56 UTC 2006
> From: Gary Mills
> This happened twice today. The first instance was about 24 hours
> after upgrading to dcc-1.3.25. The second was about three hours
> later:
>
> Jan 23 07:55:30 electra dccm[15639]: [ID 702911 mail.error] fdopen(whiteclnt): Resource temporarily unavailable
> Jan 23 08:06:19 electra dccm[15639]: [ID 404106 mail.error] msync(whiteclnt.dccw): Not enough space
> Jan 23 08:07:28 electra dccm[15639]: [ID 702911 mail.error] stat(whiteclnt.dccw): Bad file number
> Jan 23 08:07:28 electra dccm[15639]: [ID 702911 mail.error] fdopen(whiteclnt): Bad file number
> Jan 23 08:08:07 electra dccm[15638]: [ID 615584 mail.error] restart after signal #11
> Jan 23 08:08:07 electra dccm[29912]: [ID 553385 mail.notice] 1.3.25 listening to inet:3331 with /usr/local/dcc
>
> In each case, the restarter did its job nicely, so there was no disruption
> in service. `dccm' is running on Solaris 9. What could be causing this
> behavior?
It looks like a serious bug. Was dccm built with -g? If so, what does
the stack trace from gdb of the core file look like?
If not, it would be good to rebuild with -g by running
/usr/local/dcc/libexec/updatedcc -eDBGFLAGS=-g
(or wherever you put the DCC libexec directory)
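Once dccm has been rebuilt with -g and has dumped core again, something
like the following sketch would pull the stack trace out of the core
file. The function name is made up for illustration, and both default
paths (the dccm binary under /usr/local/dcc/libexec and a file named
`core` in the current directory) are assumptions to adjust for your
installation. It also assumes a gdb recent enough to accept -batch and
-ex; with an older gdb, start `gdb <binary> <core>` interactively and
type `thread apply all bt` at the prompt.

```shell
# Hypothetical helper: print a backtrace of every thread from a dccm core.
# $1 = path to the dccm binary (assumed default shown below)
# $2 = path to the core file   (assumed default: ./core)
dccm_backtrace() {
    binary=${1:-/usr/local/dcc/libexec/dccm}
    core=${2:-core}
    # -batch exits after running the commands given with -ex
    gdb -batch -ex 'thread apply all bt' "$binary" "$core"
}

# Example use (paths are placeholders):
#   dccm_backtrace /usr/local/dcc/libexec/dccm /var/tmp/core > dccm-bt.txt
```

The trace of all threads is asked for because dccm is threaded and the
thread that took signal #11 is not necessarily the first one listed.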
> There was a presumably unrelated error early this morning, just after
> the database rebuild completed:
>
> Jan 23 03:55:30 electra dccd[15609]: [ID 471644 mail.notice] 1.3.25 database /usr/local/dcc/dcc_db reopened with 2037 MByte window
> Jan 23 03:55:32 electra dccm[15639]: [ID 702911 mail.error] fcntl(F_SETLKW F_WRLCK resolve lock /usr/local/dcc/map 3): Deadlock situation detected/avoided
That should also be impossible, but it was some kind of dccm locking
problem again.
Is /usr/local/dcc an NFS file system?
If so, did you build with `./configure --with-bad-locks`?
You can tell by looking at the `./configure` command near the end of
libexec/updatedcc.
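A quick way to check is to search the updatedcc script for the option.
This is only a sketch: the function names are invented here, and the
default path assumes the DCC home directory is /usr/local/dcc.

```shell
# Hypothetical helper: succeed if the recorded ./configure command in
# updatedcc mentions --with-bad-locks.
# $1 = path to updatedcc (assumed default shown below)
built_with_bad_locks() {
    script=${1:-/usr/local/dcc/libexec/updatedcc}
    # Search without the leading dashes so grep does not mistake the
    # pattern for one of its own options.
    grep 'with-bad-locks' "$script" >/dev/null
}

# Hypothetical helper: show every line of updatedcc that mentions
# configure, with line numbers, so the full command can be read.
show_configure_lines() {
    script=${1:-/usr/local/dcc/libexec/updatedcc}
    grep -n 'configure' "$script"
}

# Example use:
#   if built_with_bad_locks; then echo "built with bad locks"; fi
#   show_configure_lines
```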
Vernon Schryver vjs@rhyolite.com