dccm -x

Vernon Schryver vjs@calcite.rhyolite.com
Wed May 14 15:44:55 UTC 2003


> From: Valentin Chopov <valentin@valcho.net>

> yes everything is as you wrote.
> It was:
> localhost   RTT-1200   32768 <pass>
> backup         RTT-1000   32768 <pass>
> dcc.dcc-servers.net RTT+1200
>
> But it seems that when the localhost RTT is high (e.g. 1400) and others
> are even higher than that, dccm is passing the spam. I just removed
> RTT+1200 from the dcc.dcc-servers.net and now if localhost and backup fail
> it goes to public dcc servers.
> I'm getting very often:
> "skip asking DCC x.xxxx seconds more after failure" dccm message in the
> log file.

I don't understand why an RTT of only 1400 would cause the DCC client
code to give up, unless those 1.4 second delays happen suddenly when
the RTT was previously tiny.


> I think I asked this question once but I ask again: why, when the localhost
> or backup dcc servers are not active, is their RTT high?
>
> localhost.,-    RTT-1200 ms  .........
> #   127.0.0.1,-      .........
> #      88% of 32 requests ok 1545.81-1200 ms RTT       567 ms queue wait
>
> backup,-        RTT-1000 ms  .....
> # * 10.10.10.10,-       ...........
> #     100% of 32 requests ok    4.90-1000 ms RTT       550 ms queue wait

If something stalls the server, such as very slow disk accesses, the
server's measure of the average delay in its queue gets large.  The server
reports its average queue delay or queue wait time to clients.  When
clients test servers with NOPs instead of real DCC operations that get
queued, they include the reported queue wait when updating the average RTT.
That's all good when the server is busy.  When the server gets very busy,
its queue wait increases and clients start using other servers.
However, if the server is idle except for NOPs, it doesn't get a
chance to reduce its average queue wait because it has no operations.
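
A rough illustration of why the average gets stuck, written as a
minimal Python sketch and not taken from the dccd source.  The class
name, the method names, and the 0.25 smoothing weight are all
assumptions made for the example; the only point is that a running
average fed solely by queued operations cannot move while the server
sees nothing but NOPs.

```python
class QueueWaitAverage:
    """Hypothetical server-side running average of queue wait (ms)."""

    def __init__(self, weight=0.25):
        self.weight = weight      # assumed smoothing factor, not dccd's
        self.avg_ms = 0.0

    def record_operation(self, wait_ms):
        # Only real, queued operations contribute a sample.
        self.avg_ms += self.weight * (wait_ms - self.avg_ms)

    def answer_nop(self):
        # A NOP is not queued, so it reports the average unchanged.
        return self.avg_ms


server = QueueWaitAverage()
for _ in range(20):
    server.record_operation(5.0)      # ordinary load, short waits
server.record_operation(4000.0)       # one operation caught in a stall

after_stall = server.answer_nop()
# A day of hourly NOP probes from a lone client changes nothing.
hourly_probes = [server.answer_nop() for _ in range(24)]
print(round(after_stall, 1), round(hourly_probes[-1], 1))
```

Both printed values are the same: the last probe sees exactly the
queue wait that the stall left behind.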

Thus, if something stalls a server for several seconds such as the
bugs in mmap() files in BSD/OS 4.2 and before, the stall will cause
its queue wait to explode and clients will switch to other servers.
If there are only a few clients for the server, their NOPs every hour
to see if it's sick won't let it reduce its queue wait.
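
Seen from the client side, the effect might look like the following
sketch.  It is not the DCC client code: the function names and the
numbers are made up, and picking the server with the smallest
adjusted RTT is a simplifying assumption about how the RTT+/-
adjustments are used.

```python
def probe_sample_ms(network_rtt_ms, reported_queue_wait_ms):
    # A NOP probe is not queued, so the client adds the queue wait
    # that the server reports to the measured network round trip.
    return network_rtt_ms + reported_queue_wait_ms


def pick_server(servers):
    # servers maps name -> (average RTT in ms, RTT adjustment in ms).
    # Assumed rule: prefer the smallest adjusted RTT.
    return min(servers, key=lambda name: servers[name][0] + servers[name][1])


servers = {
    # localhost is close, but still reports the wait left by a stall
    "localhost": (probe_sample_ms(2.0, 4000.0), -1200.0),
    # backup is farther away but reports no queue wait
    "backup": (probe_sample_ms(5.0, 0.0), -1000.0),
}
chosen = pick_server(servers)
print(chosen)
```

With these made-up numbers the RTT-1200 bias is not enough to
outweigh the stale queue wait, so the client stays on the backup.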

I plan to fix that by making the queue wait decay toward a minimum
even when the server is idle.
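
One way such a decay could work is sketched below.  This is only a
guess at the shape of the fix, not the planned implementation: the
function name, the exponential form, the 300 second half-life, and
the minimum of 0 ms are all assumptions.

```python
def decayed_queue_wait(avg_ms, idle_secs, min_ms=0.0, half_life_secs=300.0):
    """Shrink the average toward min_ms according to idle time.

    All parameters are hypothetical; the half-life is arbitrary.
    """
    if avg_ms <= min_ms:
        return avg_ms                 # already at or below the floor
    factor = 0.5 ** (idle_secs / half_life_secs)
    return min_ms + (avg_ms - min_ms) * factor


# Average left behind by a stall, then re-read after idle periods.
stalled_ms = 1000.0
after_5_min = decayed_queue_wait(stalled_ms, 300.0)
after_1_hour = decayed_queue_wait(stalled_ms, 3600.0)
print(after_5_min, round(after_1_hour, 3))
```

With something like this applied whenever the server answers a NOP,
an hourly probe would see a queue wait near the minimum instead of
the value left by the stall.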


> BTW, this is not happening when the cron jobs are running.

What is stalling dccd?


Vernon Schryver    vjs@rhyolite.com


