collision reported by dblist

Vernon Schryver vjs@calcite.rhyolite.com
Thu Oct 18 16:08:42 UTC 2001


> From: Levent Serinol <lserinol@yahoo.com>

> I have a test dcc server. When I run dblist command
> with verbose option it reports collisions. How can I
> get rid of these collided entries ?

Like Ethernet collisions, collisions in most hash table schemes,
including that used in the DCC server database, are generally not
problems.  As long as the collision rate is not high, everything is fine.

The DCC server code notices and either complains about or deals
appropriately with "duplicate" database entries.  For white-list data,
the appropriate thing is to ignore redundant data.  For reported checksums
of mail, the appropriate thing is to count the additional reports.


> P.S. I tried ./dblist -R but didn't work out. 
> Do I have duplicate entries in white* files or in db ?

What is `dblist -R` though to do?  The current dblist man page says nothing
about -R.  The source knows nothing about -R and there is no sign I
ever had -R doing anything for dblist in my RCS logs.


The DCC server database uses hashing with internal chaining.  As long
as the load factor or "alpha" is less than 90%, the average hash chain
length should be 1.1 or less.  The collision rate is an easy to compute
proxy for the average chain length, which is more expensive to measure.
`dblist -vvv | tail` measures and displays the actual average chain length.

What all of the lines from `dblist -vvv` mean is probably not
interesting unless you're reading the code.


> here is what dblist reports:
>
> 8189 hash entries total  312 or 3% used  7877 free 
> 0.12% collisions

That looks like a almost empty database.
This is what `dblist -vvv` says about my server's database:

  90109 hash entries total  64519 or 71% used  25590 free  20.39% collisions
  46147 hash chains  max length 6  average length 1.4

That average chain length is longer than I like for an alpha of only
0.7, but I think it is tolerable.


For information about hash tables, see "The Art of Computer Programming,
Sorting and Searching", by D.E.Knuth.


Vernon Schryver    vjs@rhyolite.com



More information about the DCC mailing list

Contact vjs@rhyolite.com by mail or use the form.