Gary Mills mills@cc.UManitoba.CA
Mon Oct 14 20:15:00 UTC 2002

On Mon, Oct 14, 2002 at 10:16:00AM -0600, Vernon Schryver wrote:
> > From: Gary Mills <mills@cc.UManitoba.CA>
> > > > I'm setting up a procedure so that users can nominate bulk mail for
> > > > inclusion in a central whitelist.  They will provide the name of a
> > > > DCC log file.  A script will then extract the appropriate information
> > > > from the collected log files to build a file in whitelist format.
> Unless that "nominating" involves a person checking the submissions, I'd
> do something a little different.  I'd use something like the CGI scripts
> in the DCC source to let users modify a whitelist file (or several files),
> and then use scripts to collect the whitelist for dccd or dccm.
> If people act as gatekeepers, I'd still probably have them use something
> like the CGI scripts, since they could point-and-click to select
> among all of the possible white-listing stigmata in those DCC log files.

No, it will be done by some end users.  I need to provide an anti-spam
facility that works for everyone, without requiring action by all of
the users.  Most of them won't know the difference between the
envelope and the header, let alone different types of checksums.  They
just want spam to stop while still receiving their mailing list mail.  I
will be providing a web interface.  It will show them a daily summary
of their rejected mail, with a link to view the headers and body, and
another link to nominate the message as legitimate bulk e-mail.  That
will copy the entire log file to a place where it can be used to build
a whitelist file.
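
As a rough sketch of that last step only (not the actual script), the
"nominate" link could call something like the following.  The function
name and both directory arguments are hypothetical, and it assumes the
log files sit directly in one flat directory, which may not match a real
DCC log layout.  The one thing it tries to get right is refusing a
user-supplied name that points outside the log directory.

```python
import shutil
from pathlib import Path


def nominate_log(log_name, log_dir, nominated_dir):
    """Copy one user-nominated log file into a staging directory.

    All names here are hypothetical; the whitelist-building script
    would later read the copies collected in nominated_dir.
    """
    log_dir = Path(log_dir).resolve()
    source = (log_dir / log_name).resolve()

    # Reject names such as "../x" or "/etc/passwd" that resolve to
    # somewhere other than a regular file directly inside log_dir.
    if source.parent != log_dir or not source.is_file():
        raise ValueError(f"not a log file in {log_dir}: {log_name!r}")

    dest_dir = Path(nominated_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    # Copy the entire file, unmodified, keeping its timestamps.
    return Path(shutil.copy2(source, dest_dir / source.name))
```

The extraction of checksums from the collected files is deliberately left
out, since it depends on the log format.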

> > > Adding white list entries for all checksums of a sample
> > > message might too quickly exhaust the 80,000 limit on the size of the
> > > client white list hash tables.
> >
> > Should I be adding them to the server whitelist, then?
> Only if you will have more than a few 10,000 entries, including IP addresses.

I recently added 131072 IP addresses, for our class B networks.  How
about the `ok' entries?  Which whitelist is best for them?
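
For reference, 131072 is exactly two /16 (class B sized) networks' worth
of addresses, which is already more than the 80,000 figure quoted above
if each address counts as one entry.  A quick check of the arithmetic,
using placeholder network numbers rather than our real ones, and assuming
(which I have not verified) that every address takes its own entry:

```python
import ipaddress

# Client whitelist hash table limit quoted earlier in this thread.
CLIENT_WHITELIST_LIMIT = 80_000


def count_addresses(networks):
    """Total number of addresses covered by a list of CIDR networks."""
    return sum(ipaddress.ip_network(n).num_addresses for n in networks)


# Placeholder /16 networks, not the real ones.
example_networks = ["10.1.0.0/16", "10.2.0.0/16"]

total = count_addresses(example_networks)
print(total)                           # 131072
print(total > CLIENT_WHITELIST_LIMIT)  # True
```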

> The biggest problem with using the server whitelists is that you must
> ensure that all of your servers have the same whitelist.  That's easy
> if you control all of your servers, but also implies you cannot use
> the servers of other organizations for backup.

That shouldn't be a problem here.

> > Checking just now, on one mail server, both `dccd' and `dccm' are
> > working correctly.  `dccm' is using 460 of 472 file descriptors.  It
> > has 88 threads.
> That seems a little high given the modest loads that `cdcc stats`
> here says are seen by your two servers.  I hope you've configured
> client-IDs so that your dccm processes do not have to wait the default
> `dccd -u` delay imposed on anonymous clients.  In other words, I hope
> that when run on your servers, `cdcc info` talks about a "queue wait"
> of less than 10 milliseconds instead of more than 50.

It says: `1 ms queue wait'.  `dccm' is down to 397 file descriptors
now, and is using 67 threads.  By contrast, the Trend Micro milter has
29 of 1024 file descriptors and is using 41 threads.

-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-

