dccm failure under load

Vernon Schryver vjs@calcite.rhyolite.com
Sat Jan 4 23:34:31 UTC 2003

> From: Gary Mills <mills@cc.UManitoba.CA>

> `dccm' on our main mail server has been failing recently.  The likely
> cause was an e-mail address harvesting attack from a compromised
> workstation on a cable modem.  It was using hundreds of connections
> to check random user names.

> ...
> Jan  4 01:34:56 electra dccm[21819]: [ID 109917 mail.error] DCC, mi_rd_cmd: read returned -1: Connection reset by peer

That is not a problem with dccm but with sendmail aborting things.

> Jan  4 09:25:31 electra dccm[21819]: [ID 125918 mail.error] DCC: accept() returned invalid socket (Too many open files), try again
> Jan  4 09:25:31 electra dccm[21819]: [ID 925838 mail.error] dcc_mkstemp(/var/dcc/log/004/09/tmp.37CTm2): Too many open files

Since 1.1.13, the version your DCC servers are running, I found and fixed
an error in the count of total open files when per-user log files are in
use.  Installing 1.1.17 or using `dccm -j` to reduce the number of
simultaneous dccm jobs might stop the "too many open files" problems.
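
To get a feel for how low the job limit might need to be, the sketch below
reads the open-file limit of the current process and divides it by a guessed
number of descriptors per job.  The function name, the per-job figure, and
the reserve are all assumptions made for illustration.  They are not dccm's
own accounting, so treat the result as a starting point only.

```python
import resource


def max_jobs_for_fd_limit(soft_limit, fds_per_job, reserved=20):
    """Rough upper bound on simultaneous jobs under an open-file limit.

    fds_per_job and reserved are illustrative guesses, not values
    taken from dccm.
    """
    if fds_per_job <= 0:
        raise ValueError("fds_per_job must be positive")
    # Set aside descriptors for logs, sockets, and so on, then split
    # what is left evenly among the jobs.
    return max(0, (soft_limit - reserved) // fds_per_job)


if __name__ == "__main__":
    # Soft and hard limits on open files for this process (Unix only).
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        print("no soft limit on open files")
    else:
        # 4 descriptors per job is a hypothetical figure.
        print("soft limit:", soft)
        print("rough job ceiling:", max_jobs_for_fd_limit(soft, 4))
```

Run it as the same user and from the same kind of shell or startup script
that starts dccm, since the limit is inherited from the parent process.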

> Jan  2 00:24:28 electra sm-mta[27811]: [ID 801593 mail.error] h026MGQe027811: Milter read(dcc): timeout before data read
> Jan  2 00:24:28 electra sm-mta[27811]: [ID 801593 mail.info] h026MGQe027811: Milter (dcc): to error state
> Jan  2 00:24:28 electra sm-mta[27811]: [ID 801593 mail.info] h026MGQe027811: Milter: from=<97rok@hotmail.com>, reject=451 4.7.1 Please try again later
> Jan  2 00:24:28 electra sm-mta[27811]: [ID 801593 mail.info] h026MGQe027811: from=<97rok@hotmail.com>, size=0, class=0, nrcpts=0, proto=ESMTP, daemon=MTA, relay=h24-66-73-149.wp.shawcable.net []

> During the attacks, sendmail limits connections to 4 per second.  This would
> be sufficient protection, if `dccm' wouldn't fall over.  Is there a way to
> make `dccm' more resilient?

I don't see any evidence of "failover," which is a good thing.  Instead
there are signs of failure.

The rate of new connections does not matter.  What matters is the
total number of active connections at any one time.
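
A rough way to relate the two is that the number of connections open at once
is about the arrival rate times how long each connection stays open.  The
sketch below works that out.  The 4 per second comes from the report above,
but the hold times are guesses, since the logs do not say how long each
connection lasted.

```python
def concurrent_connections(arrivals_per_sec, hold_seconds):
    """Estimate connections open at once in steady state.

    This is the arrival rate multiplied by the average time a
    connection is held open.  hold_seconds is an assumed value.
    """
    return arrivals_per_sec * hold_seconds


if __name__ == "__main__":
    rate = 4  # connections per second, from the sendmail limit quoted above
    for hold in (1, 60, 300):  # hypothetical hold times in seconds
        print(hold, "s held ->", concurrent_connections(rate, hold), "open")
```

At the same rate of 4 per second, connections that hang for minutes pile up
into hundreds of open connections, while ones that finish in a second leave
only a handful open at a time.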

I'm not sure there is a problem in that second example that should be
fixed.  The SMTP client was told "451 4.7.1 Please try again later."
If it was not a spammer, it surely will try again later.  If it was
a spammer, it probably won't.  In either case, the result seems ok.

Vernon Schryver    vjs@rhyolite.com
