How to whitelist well-managed mailing lists

Gary Mills mills@cc.UManitoba.CA
Tue May 28 19:44:56 UTC 2002

On Mon, May 27, 2002 at 09:44:13PM -0600, Vernon Schryver wrote:
> > From: Gary Mills <mills@cc.UManitoba.CA>
> However, those problems don't matter.  You don't need an unforgeable
> mark, because forgery is so widely viewed as unacceptable and often
> a crime.  Forgery would bypass almost all current or conceivable spam
> defenses, but essentially no spammers are doing it.

I assume that you mean that spammers have not been forging complete
messages so that they appear to have come from well-known mailing
lists.  I certainly haven't seen this.  Spammers do forge return
addresses, of course.  Klez sometimes even uses mailing list addresses
as the forged return address.

> You need only a list of legitimate mailing lists and their markers.

This is encouraging.  It would be quite valuable to have a reasonably
reliable way to pick out legitimate bulk mail from among all the spam.

> > It's not possible to identify all possible sources of spam, but it
> > may be possible to identify all possible sources of legitimate bulk
> > e-mail.  The magnitude would be much less.  Then, DCC would only need
> > to reject or mark everything not identified as legitimate.
> It is impossible to identify all possible sources of legitimate bulk
> e-mail, because there are so many and they come and go.  Anyone with
> a UNIX box can start a legitimate mailing list.

I agree, of course.  However, the magnitude of the problem is much
smaller, which makes it possible to come closer to a complete answer.

> Worse, there is no single definition of "legitimate."  For example,
> not long ago an article in mentioned some
> mailing lists that the author valued.  In my view, several were
> exceptionally bad and hopeless spammers.
> However, that problem is also not fatal.  An "80% solution" would be
> very valuable.  Those users who insist on receiving mail from spammers
> or controversial sources can be accommodated with individual white
> list entries.  Users with very unusual tastes in bulk mail may not
> have their mail tagged as bulk because it is so unusual and can have
> their lists sanctified if it is.

This, also, is encouraging.  In the anti-spam world, there are very
few solutions that are both effective and good at discriminating
between spam and legitimate e-mail.  Open relay blocking is one of
these, but the conservative database (MAPS RSS) that we use here
probably blocks only 20% of the spam.  DCC is very promising, because
it focuses on the bulkiness aspect of spam.  Teaming DCC with a
reliable database of solicited bulk mail would be a wonderful addition
to our spam defenses.
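
As a rough sketch of how the two pieces might fit together: the
decision below accepts non-bulk mail, accepts bulk mail that carries a
marker of a known list, and rejects the rest.  Everything in it is an
assumption made for illustration (the threshold, the choice of headers
as markers, the example.org list); none of it is taken from DCC itself.

```python
from email import message_from_string

# Hypothetical whitelist: header name -> values that mark a known list.
KNOWN_LIST_MARKERS = {
    "list-id": {"<dcc.example.org>"},
    "sender": {"owner-dcc@example.org"},
}

# Assumed cutoff: a report count at or above this is treated as bulk.
BULK_THRESHOLD = 20


def is_whitelisted(msg):
    """Return True if any header carries a marker of a known list."""
    for header, markers in KNOWN_LIST_MARKERS.items():
        for value in msg.get_all(header, []):
            if value.strip().lower() in markers:
                return True
    return False


def classify(raw_message, bulk_count):
    """Accept non-bulk mail and whitelisted bulk mail; reject the rest."""
    msg = message_from_string(raw_message)
    if bulk_count < BULK_THRESHOLD:
        return "accept"
    if is_whitelisted(msg):
        return "accept"
    return "reject"
```

A site that prefers marking to rejecting could return a tag in the last
branch instead.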

> > What do you think of this suggestion?
> I think you're looking at a lot of work.  However, trying can't do
> any harm other than burning yourself out.  Even something far short
> of an 80% solution could be quite valuable, and not just for the DCC
> but for other spam defenses that use white lists.

I'm not proposing to do this all myself.  I'm thinking of a shared
distributed database, modeled on the DCC checksum database, or the
DNS-based mail relay blacklists.  The fact that it would be a whitelist,
rather than a blacklist, would appeal to many people.  DCC comes with
a sample whitelist, which is really just a sample.  Maintaining this
is a major problem.  It would be so much nicer if DCC could use a
shared whitelist of major mailing lists.  There are many issues to
be considered, and obstacles to overcome, of course.  What I'm looking
for now are just suggestions on how to make this whole thing work,
and work with the efficiency that is needed for real-time spam blocking.
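
To make the DNS-based variant concrete, here is a minimal sketch of
what a lookup might look like if the whitelist were published the way
the relay blacklists are.  The zone name, the use of a SHA-1 digest of
the list's marker as the query label, and the convention that any
answer means "listed" are all assumptions of mine, not an existing
service or a DCC feature.

```python
import hashlib
import socket

# Hypothetical zone that would publish the shared whitelist.
WHITELIST_ZONE = "lists.whitelist.example"


def query_name(marker, zone=WHITELIST_ZONE):
    """Build the DNS name to query for a list marker.

    The marker (for example a list's sender address) is normalized and
    hashed so that it always fits in a single DNS label.
    """
    normalized = marker.strip().lower().encode("utf-8")
    digest = hashlib.sha1(normalized).hexdigest()
    return f"{digest}.{zone}"


def is_listed(marker, resolve=socket.gethostbyname):
    """Return True if the whitelist zone has a record for the marker.

    The resolver is passed in so that a caching or test resolver can be
    substituted; a failed lookup is treated as "not listed".
    """
    try:
        resolve(query_name(marker))
    except socket.gaierror:
        return False
    return True
```

One DNS query per message is cheap, and answers are cached by the
local resolver, which is the main reason this model suits real-time
use.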

-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-
