Vernon Schryver
Sat Sep 25 03:05:47 UTC 2004

> From: chris albert 

> Does anyone have statistics on retransmission times?

> In my environment, there is a certain class of dedicated, determined,
> irrational, powerful, kvetching, megalomaniacal user who thinks that
> boss^2 is their service desk. With 50K users, even if the latter class

> Thus, understanding the distribution of 'retransmission times' will
> give me the ability, under some unrealistic assumptions, to make
> predictions about the kind of crap that will arrive at 2 levels above
> in the organigramme, and thus, under some realistic assumptions about
> the exponential growth rate of the severity of crap under bureaucratic
> gravitation, estimate my choke point, in the potential rollout of
> greylisting for my user base.

Statistics don't sound useful in such cases.  In your case, as everywhere,
there will be people who consider even greylisting an "Unconstitutional
Restriction of the Freedom of Speach (sic)."  (Yes, I know the Bill
of Rights is of mere academic interest in Canada.)  Some people will
shout at your Boss^2 if their mail is delayed even the half hour that
RFC 2821 recommends for retransmissions after a 4yz rejection.

That's why it's handy, when possible, to let users control greylisting
of their own mail.  I think 50K users are not too many for the per-user
whitelists, DCC, and greylist controls supported by dccm and dccifd.

That's also why some ISPs like turning off logging of greylist
rejections.  Megalomaniacal users are less likely to complain about
things they don't know happened.

> 2. Seeding, training -- rollout considerations.
> Suppose that, despite my survival instincts, I am so sick of spam
> that I want to stop it, but not to the point of suicide.
> So I'd like to seed, train my greylisting regime, prior to its global
> implementation, so as to reduce the impact of stochastic
> retransmission times.
> Is there a way to do that?

Yes, you could set the embargo to 0 seconds.

> Can I use historical data? ( whitelist certain senders from the
> previous 60 days ...)

Not easily.

> Can I implement (dccm) greylisting, just recording triples for a
> period of time, to reduce the impact of retransmission delays.

If you mean turn on greylisting but with an embargo of 0 seconds,
then yes.
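
To see why a 0 second embargo works as a training period, here is a
minimal sketch in Python.  It is not how dccm or dccd are implemented;
the `Greylist` class, its `check` method, and the 1800 second figure
are made up for illustration.  With an embargo of 0, every triple is
recorded and its mail is accepted at once.  When the embargo is raised
later, triples recorded during training are already old enough to pass.

```python
import time


class Greylist:
    """Hypothetical in-memory greylist keyed on (IP, sender, recipient)."""

    def __init__(self, embargo_seconds):
        self.embargo = embargo_seconds
        self.first_seen = {}  # triple -> time the triple was first seen

    def check(self, ip, sender, recipient, now=None):
        """Return "accept" or "tempfail" (a 4yz rejection) for one attempt."""
        now = time.time() if now is None else now
        triple = (ip, sender, recipient)
        # Record the triple on first sight; keep the original time after that.
        first = self.first_seen.setdefault(triple, now)
        if now - first >= self.embargo:
            return "accept"
        return "tempfail"


# Training phase: embargo of 0 seconds, so nothing is delayed.
grey = Greylist(embargo_seconds=0)
training_result = grey.check("192.0.2.1", "a@example.com", "u@example.org", now=0)

# Rollout: raise the embargo to an assumed 1800 seconds a day later.
grey.embargo = 1800
known_result = grey.check("192.0.2.1", "a@example.com", "u@example.org", now=86400)
new_result = grey.check("192.0.2.9", "b@example.com", "u@example.org", now=86400)
retry_result = grey.check("192.0.2.9", "b@example.com", "u@example.org", now=86400 + 1800)
```

Only the triple first seen after the rollout is delayed, and only until
its first retransmission after the embargo has expired.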

> Can I use a forward feedback mechanism like described in
> ?
> ( for example, greylist just those emails whose checksum appears as
> MANY in a dcc server?, ...).

I don't see the profit in that.  A message that is bulk because
it has a large target count according to the DCC (e.g. "MANY") 
doesn't need more filtering.  It is either solicited bulk mail
and so not spam, or it is unsolicited bulk mail and so spam.

That HP URL suggests something like adjusting the greylist embargo
based on recently received spam.  If you run `dccd -G weak-IP`, then
something similar happens.  Spam causes greylist-triples as well as
greylist-weak-IP addresses to be deleted from the greylist database,
and that restores the embargo on subsequent mail for those triples or
IP addresses.
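
A rough Python sketch of that feedback, under the assumption that a
"weak-IP" entry means an IP address that has passed the embargo once
and is then trusted for other triples.  This is not dccd's code or
data layout; `WeakIPGreylist`, `check`, and `report_spam` are
hypothetical names used only to show the idea.

```python
class WeakIPGreylist:
    """Hypothetical greylist where spam reports restore the embargo."""

    def __init__(self, embargo_seconds):
        self.embargo = embargo_seconds
        self.first_seen = {}     # triple -> time the triple was first seen
        self.passed_ips = set()  # IP addresses that have passed the embargo

    def check(self, ip, sender, recipient, now):
        """Return "accept" or "tempfail" for one delivery attempt."""
        if ip in self.passed_ips:
            return "accept"
        triple = (ip, sender, recipient)
        first = self.first_seen.setdefault(triple, now)
        if now - first >= self.embargo:
            self.passed_ips.add(ip)
            return "accept"
        return "tempfail"

    def report_spam(self, ip, sender, recipient):
        """Forget the triple and the IP so later mail is embargoed again."""
        self.first_seen.pop((ip, sender, recipient), None)
        self.passed_ips.discard(ip)


grey = WeakIPGreylist(embargo_seconds=1800)
first_try = grey.check("192.0.2.1", "a@example.com", "u@example.org", now=0)
retry = grey.check("192.0.2.1", "a@example.com", "u@example.org", now=1800)
# A different triple from the same, now trusted, IP address passes at once.
other = grey.check("192.0.2.1", "c@example.com", "v@example.org", now=1801)

# Spam from that triple deletes both entries and restores the embargo.
grey.report_spam("192.0.2.1", "a@example.com", "u@example.org")
after_spam = grey.check("192.0.2.1", "a@example.com", "u@example.org", now=1900)
```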

Vernon Schryver

