dccd memory usage

Vernon Schryver vjs@calcite.rhyolite.com
Mon Oct 4 14:14:57 UTC 2004

> From: Richard Underwood 

> 	I'm curious about the memory usage of dccd. Three times now since
> starting to run dccd, I've had problems with the server dying. From what
> info I've been able to recover, I believe this is due to memory exhaustion
> of some sort.
> 	I'm not by any means blaming this on dccd, but it does form part of
> the bigger picture.

That sounds strange.  Memory exhaustion could reasonably lead to
thrashing, but should not kill dccd.

> 	I'm guessing that this is designed to speed up the hash file access
> by exchanging memory for cpu/IO performance. 

No, it is to speed up hash file access by exchanging RAM for disk
accesses.  There is not much you can do with a CPU to help searching
a heap of 40 million records.  Whatever you do, you have to poke through
the heap.  If you choose a database with some kind of tree or hierarchical
organization, you'll make about O(ln(n)) probes of the database to
(fail to) find a record, for n=database size.  If you use a hash table,
you'll make about O(1) or constant number of probes, assuming you don't
overload the hash table.  If the database is on disk, each of those
probes will be somewhat more than the time required to access a random
disk block--somewhat more because on average you'll need to access
more than one block per probe.  25 years ago, you could expect to spend
about 30 milliseconds fetching a random disk block on typical (e.g.
"Winchester") disk drives.  That number is lower today but it is still
a matter of milliseconds.  If the database is in RAM, then each probe
will cost somewhat more than the average time required to access a
random word of RAM.  If your hash table is small, that will be very
few microseconds.  If it is large, it will be the length of a page
fault that does not need to go to disk, or a matter of at most hundreds
of microseconds on modern CPUs.
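The arithmetic above can be sketched roughly as follows.  This is an illustration of the back-of-the-envelope model in the text, not of dccd's internals; the latency figures are assumed round numbers, not measurements:

```python
import math

def expected_lookup_us(n_records, probe_cost_us, structure):
    """Rough cost of one (failed) lookup, in microseconds.

    structure: 'tree' -> about log2(n) probes (tree/hierarchical database)
               'hash' -> about 1 probe (hash table that is not overloaded)
    probe_cost_us: assumed cost of touching one random location --
                   a few thousand us for a random disk block,
                   at most a few hundred us for RAM (soft page fault).
    """
    probes = math.log2(n_records) if structure == "tree" else 1
    return probes * probe_cost_us

n = 40_000_000  # a heap of 40 million records, as in the text

# Assumed: 5 ms per random disk probe, 100 us per in-RAM probe.
print(expected_lookup_us(n, 5000, "tree"))  # tree on disk: ~25 probes
print(expected_lookup_us(n, 5000, "hash"))  # hash on disk: ~1 probe
print(expected_lookup_us(n, 100, "hash"))   # hash in RAM: ~1 cheap probe
```

With those assumed numbers, the in-RAM hash table beats the on-disk tree by a factor of over 1000, and beats even an on-disk hash table by about 50X, which is why keeping the hash file resident in RAM matters so much.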

The easiest measure of dccd speed is the "ms delay" value from `cdcc stats`.
That value is the recent average time from reception of a request
to transmission of the corresponding answer.

>                                              However, 800M/1501M is quite a
> chunk on a machine with 2G, particularly as, during normal operation, the
> CPU usage is negligible.
> 	Is there any way to tune this figure? Or have I missed the point
> completely?

The only knob that I can see is on the size of the database.  Letting
the database get so large that it does not fit in RAM will reduce the
performance of dccd by a factor of perhaps 100X.

The only control on the size of the database is what dbclean imposes.
Dbclean tries to fit the database into about half of available RAM,
with some fudge factors for systems with more than 1GByte.
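As a rough illustration of that sizing rule, assuming only the halve-the-RAM heuristic described above (the actual fudge factors dbclean applies above 1 GByte are not reproduced here):

```python
def target_db_bytes(ram_bytes):
    # Sketch of the sizing rule: aim the database at about half
    # of physical RAM.  Real dbclean adds fudge factors for
    # systems with more than 1 GByte, omitted here.
    return ram_bytes // 2

GB = 1 << 30
budget = target_db_bytes(2 * GB)
print(budget / GB)  # a 2 GByte machine -> about a 1 GByte database budget
```

On the 2 GByte machine in question, that puts the roughly 800 MByte resident set seen for dccd in the expected range.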

Are better knobs needed?

Vernon Schryver    vjs@rhyolite.com
