whitelisting mailing lists with dccproc

Vernon Schryver vjs@calcite.rhyolite.com
Tue Sep 11 18:53:04 UTC 2001

> From: "Brian J. Murrell" <dcc-list@interlinx.bc.ca>

> ...
> > I understood the comment about ignoring dccproc results to concern
> > white-listed mail.  The DCC clients do not send the checksums of
> > locally white-listed mail to the DCC server.
> Why not?  What if there is a spammer mailing legitimately opted-in
> recipients along with scraped recipients?

When there is any question about your local users, don't white-list them.

>                                            Perhaps DCC's motive should
> be to determine the "bulkiness" of mail rather than trying to target
> specifically spam. ...

That's what I think the DCC does best, determine "bulkiness" and not spam.
It is possible in theory to use only certified spam traps and detect
spam with the DCC, but I've doubts about keeping the traps sufficiently
secret to make that work in practice.

> ...
> Privacy?  Don't the DCC clients just send checksums?  Could anyone
> really determine that to be a privacy leak?

DCC clients only send checksums.  If you have even the slightest
doubt about that, and even if you have no doubts, you should
check the source to see that it is true.

The words in a message are not the only private things that one 
might want to shield.  The fact that something was said can matter.

> > since it keeps the checksums of mail that you know
> > is otherwise entirely private from getting outside your network.
> Sure, if they really are local, don't DCC them, but otherwise...

That's what I meant, provided they are trusted to never be spam.

However, what if you are running a large ISP with 100,000 customers?
Then it might make sense to run local mail past a DCC server to detect
unexpectedly bulky local senders.  You might (I think better not) use
a purely private DCC server and database.  You could white-list bulk
senders known to be legitimate, such as mailing lists run by your
customers and your own auto-responders.
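
A minimal sketch of that idea, and not how the DCC server itself works:
count how often each body checksum is seen, skip white-listed senders,
and flag a message once its count reaches a threshold.  The class name
`BulkDetector`, the `threshold` parameter, the example addresses, and the
use of plain MD5 over the raw body are all assumptions made for
illustration; the real DCC checksums are computed differently.

```python
import hashlib
from collections import Counter


class BulkDetector:
    """Toy private 'bulkiness' counter keyed by body checksum (illustrative)."""

    def __init__(self, threshold, whitelist=()):
        self.threshold = threshold        # counts at or above this are "bulk"
        self.whitelist = set(whitelist)   # senders trusted to send legitimate bulk
        self.counts = Counter()           # checksum -> number of reports seen

    def report(self, sender, body):
        """Record one message; return True if it now looks like bulk mail."""
        if sender in self.whitelist:
            return False                  # white-listed mail is never counted
        # Stand-in checksum: MD5 of the raw body (an assumption, not DCC's rule).
        digest = hashlib.md5(body.encode("utf-8")).hexdigest()
        self.counts[digest] += 1
        return self.counts[digest] >= self.threshold


detector = BulkDetector(threshold=3, whitelist={"announce@lists.example.com"})

# A customer's mailing list sends five copies; none of them are counted.
for _ in range(5):
    detector.report("announce@lists.example.com", "weekly newsletter")

# An unexpected sender repeats the same body three times.
flags = [detector.report("user@example.com", "buy now") for _ in range(3)]
print(flags)  # [False, False, True]
```

Only the checksum is kept in `counts`, so the detector can say that a
body is bulky without storing what the body said.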

> ...
> > Not only might you know that it's not spam, but you might not want to
> > let bad guys snoop on it.
> Snoop in what way?  Am I incorrect that DCC calculates checksums
> locally and does NOT send entire e-mails to DCC servers?

You are correct that the DCC server only hears about checksums, and
cryptographically secure checksums at that.  

> > Imagine asking a DCC server about checksums
> > for the From value "bgates@microsoft.com" and then about the Subject
> > line "screw netscape".
> So DCC does send actual text, not just checksums?
> ...

No, the DCC ***NEVER*** sends text.  It would be absolutely, positively
intolerable and worse than useless if it did.  Please check the source
or at least snoop on the network traffic!

However, you could feed a message that you think bgates@microsoft.com might
have sent to `dccproc -Q` or `dccproc -QC`.  If you get non-zero
answers, then you know that messages with the checksums of your test
have been seen.  You can't know from that test whether a single message
contained both "From: bgates@microsoft.com" and "Subject: screw
netscape", but only that those checksums have been seen among 1 or
more messages.  If you are running a DCC server, you can use `dblist -v`
to ask about messages with both a given From and Subject.  For example,
you could look in a database for a record like this one:

  09/11/01 11:55:02.876352    4        auth 101                  316518
   *IP           ok         e475b896 492c60fc efecb432 6e29e3c5   31644c 12c72
   *env_From     ok         908fa2ba aa84d686 418ea742 e74ab030   31644c 1066f
    From         26         43e0e286 cad55550 08e0cff2 82a3a0d0   31644c 8158
    Subject      158        ac0f825d 6c6566fd 48b3c035 da6c1ab3   31644c 12e6b
    Message-ID   7          63c39712 21a78109 2e7bcf7d 515924f9   31644c 7e99
    Received     7          548779af 4a00c062 75987db9 3c264f79   31644c 7f03
    Body         6          bae52a62 f03dc879 eea0d648 8a18a09c   31644c 2083
    Fuz1         6          f1704633 0df0aed5 b19eaff5 43c4f6a2   31644c 118d2

If you feed your message through `dccproc -QC` or `cdcc pck body`,
you'll find that its Fuz1 checksum was f1704633 0df0aed5 b19eaff5 43c4f6a2.
From the records matching that key, you can know how many people
received it.  Thus, the DCC database contains no text, but it does
allow you to confirm guesses, such as about who received your message.
That's not generally very useful, but usually neither is what is called
"traffic analysis."  The questions that the DCC data can answer are
a trivial subset of those that can be answered by the FBI's Carnivore
or any other packet snooper, but they are worth worrying about.
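
The difference between asking about one checksum at a time and asking
about whole records can be sketched as follows.  This is only an
illustration of the reasoning above: the helper names, the toy database,
the example addresses, and the use of MD5 over a lower-cased value are
assumptions, not the DCC's actual checksum rules or query interface.

```python
import hashlib


def checksum(kind, value):
    """Stand-in checksum: MD5 over a lower-cased, whitespace-collapsed value."""
    canonical = " ".join(value.lower().split())
    return hashlib.md5(f"{kind}:{canonical}".encode("utf-8")).hexdigest()


def make_record(headers):
    """What a server would keep for one message: checksums only, never text."""
    return {kind: checksum(kind, value) for kind, value in headers.items()}


# A toy database holding the records of two reported messages.
database = [
    make_record({"From": "alice@example.com", "Subject": "lunch on friday"}),
    make_record({"From": "bob@example.com", "Subject": "quarterly numbers"}),
]


def seen(kind, guess):
    """Per-checksum query: how many messages carried this value?"""
    target = checksum(kind, guess)
    return sum(1 for rec in database if rec.get(kind) == target)


def seen_together(guesses):
    """Per-record query: how many single messages carried all guessed values?"""
    targets = {kind: checksum(kind, value) for kind, value in guesses.items()}
    return sum(
        1 for rec in database
        if all(rec.get(kind) == digest for kind, digest in targets.items())
    )


# Both guesses are confirmed separately ...
print(seen("From", "alice@example.com"), seen("Subject", "quarterly numbers"))
# ... but no single message carried both of them.
print(seen_together({"From": "alice@example.com",
                     "Subject": "quarterly numbers"}))
```

Nothing in `database` can be turned back into text, yet anyone who can
query it can test a guess, and anyone who can read whole records can
test guesses about combinations.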

The To value is not in the DCC database because I think it would be
an intolerable privacy risk.  It would not only let someone ask "did
bgates@microsoft.com send a message to smcnealy@sun.com?" but even
"did bgates@microsoft.com send a message to smcnealy@sun.com with the
subject 'screw HP and Compaq'?"

Vernon Schryver    vjs@rhyolite.com
