timestamp off by xx seconds

Vernon Schryver vjs@calcite.rhyolite.com
Mon Jan 9 14:59:39 UTC 2006

> From: Rob McMahon 

> I've been getting these recently, is it anything to be worried about ?

> ADMN FLOD RESUME: rejected ADMN FLOD RESUME request; timestamp off by 36 seconds

> These servers are all ntp'ed up, and the time's looking healthy enough:

All DCC client-server requests include timestamps.  Their main purpose
is to support retransmissions by clients.

Requests also include an authenticator based on the ID and password.
Administrative requests to the server to do things like stop flooding
must use the server's server-ID and password.  To make replay attacks
harder, the server requires that timestamps not be many seconds off.
Because administrative commands almost always come from the local
system, the timestamps are almost certain to be ok even if the local
clock is chiming strange ticks.

This has nothing to do with the checks on the timestamps on reports
of bulk mail flooded from other servers.  Timestamps on flooded reports
are only required to seem reasonable to within an hour or two, because
not all systems use NTP.
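
The two timestamp checks described above can be sketched roughly as
follows.  This is only an illustration, not dccd's code: the function
names and both tolerance values are assumptions, since the thread only
says "not many seconds" for administrative requests and "an hour or
two" for flooded reports.

```python
# Hypothetical tolerances; the real limits used by dccd are not stated
# in this thread.
ADMIN_MAX_SKEW_SECS = 15            # "not many seconds off" (assumed)
FLOOD_MAX_SKEW_SECS = 2 * 60 * 60   # "within an hour or two" (assumed)


def timestamp_skew(request_ts, now):
    """Absolute difference, in seconds, between a timestamp and the clock."""
    return abs(now - request_ts)


def flooded_timestamp_ok(report_ts, now):
    """Loose check for flooded reports, which may come from hosts without NTP."""
    return timestamp_skew(report_ts, now) <= FLOOD_MAX_SKEW_SECS


def check_admin_request(request_ts, now):
    """Strict check for administrative requests, to make replays harder."""
    skew = timestamp_skew(request_ts, now)
    if skew > ADMIN_MAX_SKEW_SECS:
        return "rejected; timestamp off by %d seconds" % skew
    return "accepted"
```

With these assumed limits, an administrative request stamped at second
1000 but not examined until second 1036 is rejected, while a flooded
report with the same 36 second skew passes the looser check.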

What is probably happening in this case is that

  1. dbclean told dccd to unlock the database and flush its buffers to
     disk so that dbclean could work on it

  2. dccd complied by using munmap() to release its entire window into
     the database and then unlocking.  For the duration of the database
     cleaning, on each request dccd will lock, mmap(), do the work,
     munmap(), and unlock the database.  It will also artificially
     inflate the queue delay it reports to DCC clients so that they'll
     prefer some other server.

  3. Typical UNIX-like kernel mmap() code decided to push a GByte of
     data from the kernel buffer cache to the disk all at once.  This
     intermittently stalled several processes including dccd.

  4. dbclean got a chance to run a little during #3, noticed that dccd
     had not answered its request, and retransmitted.

  5. 36 seconds later dccd got a chance to run, found the waiting
     retransmissions, and in a fit of paranoia complained about a
     replay attack.
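
The per-request cycle in #2 (lock, mmap(), do the work, munmap(),
unlock) can be sketched as below for UNIX-like systems.  This is a
minimal illustration and not dccd's code: the use of flock() for the
lock, the helper names, and the 4-byte counter standing in for "the
work" are all assumptions.

```python
import fcntl
import mmap
import os
import tempfile


def with_mapped_db(path, work):
    """Lock the file, map it, run work(mapping), then unmap and unlock."""
    with open(path, "r+b") as f:
        fcntl.flock(f, fcntl.LOCK_EX)        # lock (flock() is an assumption)
        try:
            m = mmap.mmap(f.fileno(), 0)     # mmap() the whole file
            try:
                return work(m)               # do the work
            finally:
                m.flush()                    # push dirty pages toward the disk
                m.close()                    # munmap()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)    # unlock


def bump_counter(m):
    """Stand-in for real work: increment a 4-byte counter at offset 0."""
    value = int.from_bytes(m[0:4], "little") + 1
    m[0:4] = value.to_bytes(4, "little")
    return value


def demo():
    """Run two requests against a small scratch file and return the results."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\0" * 4096)           # mmap() cannot map an empty file
        os.close(fd)
        return [with_mapped_db(path, bump_counter) for _ in range(2)]
    finally:
        os.remove(path)
```

Each call pays the full cost of mapping and unmapping the file, which
is cheap for a 4 KByte scratch file but not for a database of a GByte
or so.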

As long as dbclean did its work, it's nothing to worry about.
By coincidence, yesterday I saw the same complaint on a system,
and found a bug in the code that generates the system log message.
I bet that "ADMN FLOD RESUME" should have been "ADMN FLOD SHUTDOWN".

I've been heard to grumble over the years that the variations of #3
in all flavors of UNIX-like kernels are bugs.  Some UNIX-like systems
such as BSD/OS claim to flush the buffer cache gradually, and may do
so, but they go crazy in other ways with large mmap() regions.

I'm testing a new version that should reduce the size of the database,
which should help many DCC servers that I suspect are suffering from
the recent increase in the size of the consensus database.  An ISP
has turned on dccm -B  to check URLs in mail message bodies.  That
has increased the size of the database, but seems to have increased the
effectiveness of the DCC checks everywhere.

Vernon Schryver    vjs@rhyolite.com
