dccd memory usage

Vernon Schryver vjs@calcite.rhyolite.com
Tue Oct 5 14:07:10 UTC 2004

> From: Richard Underwood 

> > The only control on the size of the database is what dbclean 
> > imposes. Dbclean tries to fit the database into about half of 
> > available RAM, with some fudge factors for systems with more 
> > than 1GByte.
> > 
> 	Ah, I see - so the database is limited to the "window" of 1501M?

Sort of.  Dbclean tries to expire enough records so that by the end
of the next 24 hours, the database will still fit in the RAM "window."
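
As a rough illustration of that sizing rule, here is a minimal sketch in C.
It is not dbclean's code.  The function names are hypothetical, the window
is taken as exactly half of RAM, and the fudge factors for systems with more
than 1 GByte are left out because they are not spelled out here.  The growth
estimate for the next 24 hours is assumed to be supplied by the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: the target window is simply half of physical RAM.
 * The real fudge factors for large-memory systems are not modeled. */
uint64_t window_bytes(uint64_t ram_bytes)
{
    return ram_bytes / 2;
}

/* Hypothetical: how many bytes of records must be expired now so that
 * the database, after growing for another 24 hours, still fits in the
 * window.  growth_per_day is an estimate supplied by the caller. */
uint64_t bytes_to_expire(uint64_t db_bytes, uint64_t growth_per_day,
                         uint64_t window)
{
    uint64_t projected;

    /* Precondition: the projected size must not overflow 64 bits. */
    assert(db_bytes <= UINT64_MAX - growth_per_day);
    projected = db_bytes + growth_per_day;

    /* Nothing needs to go if the projected size already fits. */
    return projected > window ? projected - window : 0;
}
```

With these assumptions, a database of 900 units that is expected to grow by
200 units in a day, against a window of 1000 units, needs 100 units expired.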

> 	I think it would be useful to know a bit more about it. If dbclean
> limits the db size to 1501M for my server, and 500M for a server with 1G,
> does it mean one server is more effective as a dcc server?

The idea of the DCC is to recognize copies of bulk mail.  Spam that
is reported every day to a server in the network of DCC servers needs
a database that covers only the last 24 hours.  Spam that is sent
slowly or repeated once a week needs a database that covers the last
seven days, and so forth.
So when all else is equal, more memory is better.

> 	I'm also not 100% convinced about locking the database into memory -
> modern operating systems have pretty efficient disk caches which will have
> much the same effect when there's spare memory, but give up the memory when
> required. I'm open to being persuaded, though.

On the contrary, most modern free UNIX systems are absolutely terrible
in how they handle their disk caches for large files.  They generally
do not have unified page and file caches, but prefer to pump pages among
disk files, swap, buffer cache, and VM.  They generally cannot be
convinced to swap directly to and from memory mapped files, which is
why many regularly stall and busily pump pages of the DCC database
among RAM, the DCC files, and swap space!  Linux is particularly bad
in this area.

Dccd does not lock anything into RAM.  In many UNIX flavors that would
require root permissions and would risk deadlocking the system.  Dccd
merely uses mmap() to map as much of the files as seems likely to fit
into virtual memory.  The DCC database window is merely the limit on
the mmap()'ed buffers.  If modern free UNIX systems did not have such
terrible VM systems, that would be enough.
It's not enough, so there is code in dccd that tries to second-guess
the VM system.  Dccd hints to the kernel when blocks of the database
are less likely to be needed so that the kernel should start pushing
them to the file.
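
A minimal sketch in C of that general approach, for illustration only.
It is not dccd's code.  The names map_window() and hint_unneeded() are
hypothetical, and using msync() with MS_ASYNC as the hint is an assumption;
which call dccd actually makes is not stated here, and how much a given
kernel does in response to MS_ASYNC varies.

```c
#define _POSIX_C_SOURCE 200809L  /* expose the POSIX declarations used below */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical: map at most `window` bytes from the start of the file
 * at `path`.  MAP_SHARED makes stores go to the file itself, not swap.
 * Returns NULL on failure; on success stores the mapped length. */
void *map_window(const char *path, size_t window, size_t *mapped_len)
{
    struct stat st;
    size_t len;
    void *p;
    int fd;

    assert(mapped_len != NULL);
    fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    /* The window is only an upper bound on what gets mapped. */
    len = (size_t)st.st_size < window ? (size_t)st.st_size : window;
    if (len == 0) {              /* mmap() rejects a zero length */
        close(fd);
        return NULL;
    }
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                   /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return NULL;
    *mapped_len = len;
    return p;
}

/* Hypothetical hint: ask the kernel to start writing a block back to
 * the file without waiting.  `block` must be page aligned. */
int hint_unneeded(void *block, size_t len)
{
    return msync(block, len, MS_ASYNC);
}
```

A caller would map the file once with the window size as the bound, and
call hint_unneeded() on page-aligned blocks it does not expect to touch
again soon.  Nothing here pins pages in RAM, so no root permissions are
involved.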

> 	The real issue is that this server isn't just a dcc server. If it
> was, I don't think there'd be a problem. If it was required, I could
> increase the memory to 3G, but if this triggers dccd to use 2.5G, then I get
> no overall benefit (other than dccd having more memory - which I don't know
> enough about to judge) - the other processes are still limited to 500M.

Increasing physical memory to 3 GByte should increase the window, but
dccd won't use more memory unless the database, given the default
expiration times, wants to be bigger than it is now.

I'll look into adding a knob for the upper bound of the database window.

Vernon Schryver    vjs@rhyolite.com
