What is the right DCC database expiration?

Vernon Schryver vjs@calcite.rhyolite.com
Wed Mar 3 22:36:01 UTC 2004


> From: Spike Ilacqua 

> Just a random thought, but what about changing the time stamp from the
> time the record was created to the time it was last updated?  Checksums
> of messages that are still going around would stay and ones that had
> stopped circulating could be expired on a shortened time frame.

That's how it already works...well, actually, new records are kept if
the accumulated total is bulky...well, actually both are "compressed"
into new records with the timestamp of the latest sighting and the
overall total count...sort of.
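
As a rough illustration only, and not the actual DCC code, the idea of
folding several sightings of a checksum into one record carrying the
latest timestamp and the overall total might be sketched like this.
The names Record, compress, and expire are made up for this sketch, and
it ignores the special handling of bulky totals mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Record:
    checksum: str
    last_seen: int  # Unix time of the latest sighting
    total: int      # accumulated report count


def compress(reports):
    """Fold (checksum, timestamp, count) reports into one record per checksum."""
    merged = {}
    for checksum, timestamp, count in reports:
        rec = merged.get(checksum)
        if rec is None:
            merged[checksum] = Record(checksum, timestamp, count)
        else:
            # Keep the newest sighting time and add up the counts.
            rec.last_seen = max(rec.last_seen, timestamp)
            rec.total += count
    return merged


def expire(records, now, max_age):
    """Keep only records whose latest sighting is within max_age seconds."""
    return {c: r for c, r in records.items() if now - r.last_seen <= max_age}
```

With this toy model, a checksum that keeps being reported keeps a fresh
timestamp and survives expiration, while one that has stopped
circulating ages out.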

Not that I expect anyone to know any of that.  I'm happy with the way
the consensus database copes with the incoming stream of ~12 GBytes/day
of checksums for perhaps 500 GBytes/day of mail.  30 days at 12
GBytes/day would be a lot of bits if handled in a straightforward
fashion.  I'm not happy about how it is documented or about the enormous
pile of kludges that make it work.
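
To put numbers on "a lot of bits", here is the arithmetic using only
the rough figures quoted in this message.  The variable names are
illustrative.

```python
CHECKSUM_GBYTES_PER_DAY = 12   # approximate incoming checksum stream
MAIL_GBYTES_PER_DAY = 500      # approximate mail volume behind it
RETENTION_DAYS = 30

# Size if every incoming checksum were simply stored for 30 days.
naive_gbytes = CHECKSUM_GBYTES_PER_DAY * RETENTION_DAYS

# Checksum traffic as a fraction of the mail it describes.
checksum_fraction = CHECKSUM_GBYTES_PER_DAY / MAIL_GBYTES_PER_DAY

print(f"naive 30-day store: {naive_gbytes} GBytes")
print(f"checksums vs. mail: {checksum_fraction:.1%}")
```

That is 360 GBytes stored naively, against the nearly 1 GByte database
discussed below.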

> A potential downside I can think of would be an expansion of the test
> message problem.  Perhaps a common enough non-bulk message could linger
> long enough in the database to cross the bulk threshold?

That does happen, and is probably why I've answered some questions
from recent DCC server operators with a pointer to the script that
fetches John Levine's list of empty message checksums.  That script is
in /var/dcc/libexec/fetch-testmsg-whitelist starting with version 1.2.8.


Perhaps I need to rephrase my question:

 Some of the DCC servers are showing signs of stress in dealing with
 the current nearly 1 GByte of data.  Some operators have chosen
 dbclean -e and -E values smaller than the defaults.  Should the
 defaults be reduced?
 I suspect so.   But is that right?  If so, how much?  Why?


Vernon Schryver    vjs@rhyolite.com


