Interaction between greylisting and persistent host status

Gary Mills mills@cc.UManitoba.CA
Tue Nov 18 03:40:25 UTC 2003


On Mon, Nov 17, 2003 at 04:13:13PM -0700, Vernon Schryver wrote:
> > From: "John R Levine" <johnl@iecc.com>
> > To: "Gary Mills" <mills@cc.UManitoba.CA>
> > Cc: "dcc@rhyolite.com" <dcc@rhyolite.com>
> 
> > > The persistent host status feature is extremely useful to
> > > streamline queue processing on a large mail server.
> >
> > Yes.  qmail has a global host status table that does just what you say, it
> > keeps it from wasting time trying to contact hosts that have recently been
> > observed not to answer.
> 
> Not answering differs significantly from answering with a 4yz.  The
> differences include not only the likely causes but the likely results
> of a retry and the costs of trying again.  For example, a failure by
> an SMTP server to answer makes the SMTP client pay the costs of keeping
> a context alive and waiting for a timeout for perhaps 10 times longer
> than the typical duration of an SMTP transaction.

Yes, I agree.  From the queue runner point of view, failure to answer
is a major problem, particularly because the recommended timeouts
are so long.  Knowing that a server failed to answer last time around
is a big help.  Responses of temporary failure are less of a problem.
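
To make the idea concrete, here is a minimal sketch of a host status
table of the kind discussed above.  It is not sendmail's or qmail's
implementation.  The class name, the method names, and the 30-minute
lifetime are all assumptions chosen for illustration.

```python
import time


class HostStatusCache:
    """Remember hosts that recently failed to answer (hypothetical sketch)."""

    def __init__(self, ttl_seconds=1800.0, clock=time.monotonic):
        # ttl_seconds is an assumed lifetime for a "did not answer" entry.
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._failed_at = {}  # lower-cased host name -> time of last failure

    def record_no_answer(self, host):
        """Note that a connection attempt to host timed out just now."""
        self._failed_at[host.lower()] = self._clock()

    def record_success(self, host):
        """Forget any earlier failure once the host answers again."""
        self._failed_at.pop(host.lower(), None)

    def should_skip(self, host):
        """Return True while a recorded failure is still fresh."""
        key = host.lower()
        failed = self._failed_at.get(key)
        if failed is None:
            return False
        if self._clock() - failed >= self.ttl_seconds:
            del self._failed_at[key]  # entry expired, try the host again
            return False
        return True
```

A queue runner would call should_skip() before opening a connection
and record_no_answer() after a timeout, so only the first message for
a dead host pays for the wait.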

> > > The other question is:  Is the SMTP temporary failure response
> > > a server property or a recipient property?
> >
> > Wouldn't that depend on when the response happens?  If it's in response to
> > RCPT TO, it's the recipient.  Any other time, it's the server.
> 
> I disagree.  A 4yz in response to 
>   - a Rcpt_To is probably but not necessarily directly related to
>       the parameters in the Rcpt_To command. 
>   - a HELO/EHLO or Mail_From command cannot be tied to any Rcpt_To
>       value because there hasn't been one.
>   - a DATA command might be related to one of the Rcpt_To values or
>       something else.
>   - any other command might be related to anything.
> 
> An SMTP client that presumes to know what made the SMTP server
> say 4yz without understanding the accompanying text or at least the
> extended code is too smart by half.  This is as true of the SMTP
> clients that retry within 5 seconds after getting a 4yz for a DATA
> command as those that retry 24 hours later.  Those that retry within
> seconds (e.g. Cox and AOL) are as broken as those that retry 24 hours
> later.  Unfortunately for greylisting, there is no hope of fixing the
> many too smart by half SMTP clients.

Servers or clients are not supposed to parse the text.  It may not
even be in English.  They are supposed to know the meaning of the
codes.  I haven't checked the relevant RFCs to be able to say more.
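
As a rough illustration of acting on the codes rather than the text,
the sketch below splits a reply line into the three-digit code, the
optional enhanced status code, and the free text, and then looks only
at the numbers.  The names parse_reply, Reply and is_temporary are
hypothetical, and the pattern handles single-line replies only.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Three-digit code, optional enhanced status code (class.subject.detail),
# then free text that the caller is expected to leave uninterpreted.
_REPLY = re.compile(
    r"^(\d{3})(?:[ -]|$)(?:(\d)\.(\d{1,3})\.(\d{1,3})(?: |$))?(.*)$"
)


class Reply(NamedTuple):
    code: int
    enhanced: Optional[Tuple[int, int, int]]
    text: str


def parse_reply(line):
    """Split one SMTP reply line into numeric parts and human text."""
    m = _REPLY.match(line.strip())
    if m is None:
        raise ValueError("not an SMTP reply line: %r" % line)
    enhanced = None
    if m.group(2) is not None:
        enhanced = (int(m.group(2)), int(m.group(3)), int(m.group(4)))
    return Reply(int(m.group(1)), enhanced, m.group(5))


def is_temporary(reply):
    """A 4yz reply asks the client to try again later."""
    return 400 <= reply.code <= 499
```

The text field is carried along for logging but never inspected, so
the result is the same whatever language the server writes it in.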

> The suggestion in RFC 2821 of about 30 minutes is the best you can do.
> It makes little sense to assume that an SMTP server that was busy and
> wanted your SMTP client to come back later will be happy either 5
> seconds or 24 hours from now.  It makes no sense to assume the server's
> objection will be fixed in 5 seconds or won't have gone and returned
> over 24 hours.

Apparently, sendmail writes a persistent host status file at the end
of every delivery attempt.  I haven't found the code where sendmail
checks the host status.  However, I understand that it does ignore
some temporary failures.  For example, this is pretty common here:

	dsn=4.2.2, stat=Deferred: 452 4.2.2 Over quota

In that case, sendmail does not skip future delivery attempts.  However,
for my greylist test, sendmail logged:

	dsn=4.0.0, stat=Deferred: 451 4.7.1 mail hAGGQ9Bf002801 from 130.179.16.23 temporary greylist embargoed by DCC

and did skip delivery attempts until the host status timed out.
I wonder if DCC can just use different status codes for greylisting?
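
One possible rule that is consistent with the two log lines above,
though not confirmed to be what sendmail does, is to decide from the
dsn= field alone.  In RFC 3463 the second number is the subject, with
1 meaning addressing status and 2 meaning mailbox status.  The sketch
below treats those two subjects as recipient problems and every other
class 4 code as a host problem.  The function name and the rule itself
are assumptions.

```python
import re

# Matches the dsn=class.subject.detail field of a sendmail-style log line.
_DSN = re.compile(r"\bdsn=(\d)\.(\d{1,3})\.(\d{1,3})\b")

# Assumed rule: addressing (1) and mailbox (2) subjects concern one
# recipient only; everything else is charged to the whole host.
RECIPIENT_SUBJECTS = {1, 2}


def failure_scope(log_fragment):
    """Guess whether a deferral concerns one recipient or the whole host."""
    m = _DSN.search(log_fragment)
    if m is None:
        return "unknown"
    status_class, subject = int(m.group(1)), int(m.group(2))
    if status_class != 4:
        return "not-temporary"
    return "recipient" if subject in RECIPIENT_SUBJECTS else "host"
```

Under this assumed rule the over-quota line comes out as "recipient"
and the greylist line, logged with dsn=4.0.0, comes out as "host",
which matches the skipping behaviour seen in the test.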

> I think 3 hours is almost as much too long as 24 hours, but your
> mileage may vary.  It seems to me that if you are trying to shed load,
> then you should shed load.  That means not running the queue if your
> SMTP client is too busy.  Whether a given SMTP server blew off your
> client several hours ago is irrelevant to whether your client has time
> to try again now and implies nothing about whether the server will
> again be too busy now.

No, the issue is not load shedding.  It's being able to traverse a large
mail queue in a reasonable time.  Just now, our main mail server has
10,000 messages in the queue, which is divided across nine
directories.  The queue runners must process all of them within an
hour to maintain reasonable service.  The problem is that lots of spam
has a return address that points to a non-responding server.  The
queue runner can't afford to repeatedly attempt to connect to such a
server, or the queue run would never complete.
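
A back-of-the-envelope estimate shows the scale of the problem.  Only
the 10,000 messages, the nine directories and the one-hour target come
from the figures above.  The rest is assumed for illustration: one
queue runner per directory working in parallel, 500 messages bound for
non-answering servers, a 300-second wait on each of those (the initial
greeting timeout suggested in RFC 2821), and 2 seconds for an ordinary
attempt.

```python
def queue_run_seconds(messages, dead, runners, timeout_s, normal_s):
    """Estimate wall-clock seconds for one pass over the queue.

    messages:  total messages in the queue
    dead:      messages whose destination server does not answer
    runners:   queue runners working in parallel, assumed evenly loaded
    timeout_s: seconds lost waiting on each non-answering server
    normal_s:  seconds spent on each ordinary delivery attempt
    """
    if not 0 <= dead <= messages:
        raise ValueError("dead must be between 0 and messages")
    if runners < 1:
        raise ValueError("need at least one runner")
    live = messages - dead
    return (dead * timeout_s + live * normal_s) / runners


# Every dead message waits out the full timeout.
without_status = queue_run_seconds(10000, 500, 9, 300, 2)
# Dead hosts are already known and skipped at no cost.
with_status = queue_run_seconds(10000, 500, 9, 0, 2)

print(f"no host status:   {without_status / 3600:.1f} hours")
print(f"with host status: {with_status / 60:.0f} minutes")
```

With these assumed numbers the run takes about 5.2 hours when every
dead server is retried, and about 35 minutes when they are skipped.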

-- 
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-


