Vernon Schryver vjs@calcite.rhyolite.com
Mon Apr 16 15:56:12 UTC 2007

> From: Gary Mills 

> It's partly an issue of `invisible content'.  Users never see all of the
> MIME and HTML goo in an e-mail message.  Some form of checksumming
> could ignore this as well, and perhaps it already does. 

The truly invisible MIME and HTML goo is ignored by the main FUZ2 checksum.

>                                                          That way,
> completely empty messages could be given special treatment.

The trouble is that the messages are not completely empty but contain
a little text that the recipient presumably wants to see, or that free
mail providers and other advertisers force the recipient to see.  After
you ignore random-looking strings of digits and letters and other
cybercrud in a return receipt, each receipt is identical to all others
and so has the same checksum.  Nominally empty messages from free mail
providers carry advertising crud that is the same as in zillions of
other copies.
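
As a rough illustration of why such receipts collide, here is a toy
sketch in Python.  It is not the FUZ2 algorithm; the function names,
the token rule (8 or more characters mixing letters and digits), and
the sample receipts are all assumptions made up for this example.

```python
import hashlib
import re

# Runs of letters and digits; everything else is treated as a separator.
TOKEN = re.compile(r"[A-Za-z0-9]+")


def looks_random(token):
    """Hypothetical heuristic: a long token mixing letters and digits."""
    has_digit = any(c.isdigit() for c in token)
    has_alpha = any(c.isalpha() for c in token)
    return len(token) >= 8 and has_digit and has_alpha


def toy_fuzzy_checksum(body):
    """Hash the body after dropping random-looking tokens and case."""
    words = [t.lower() for t in TOKEN.findall(body) if not looks_random(t)]
    return hashlib.md5(" ".join(words).encode("utf-8")).hexdigest()


# Two made-up return receipts that differ only in a tracking string.
receipt_a = "Your message was read. Tracking id: a9f3k2x81q"
receipt_b = "Your message was read. Tracking id: 7bz0p4m2w6"
other = "Please see the attached quarterly report."

same = toy_fuzzy_checksum(receipt_a) == toy_fuzzy_checksum(receipt_b)
different = toy_fuzzy_checksum(receipt_a) != toy_fuzzy_checksum(other)
```

Once the tracking strings are dropped, both receipts reduce to the
same words and so to the same checksum, while an unrelated message
does not.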

> Alternatively, is there something in the invisible portion of these
> troublesome messages that could be used to identify them and exclude
> them from DCC rejection?

You can whitelist on any SMTP header that is constant for a class of messages.
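
The idea of keying on a constant header can be sketched in Python as
below.  This is not DCC's whitelist file syntax or its code; the
header name `X-Example-Receipt`, its value, and the function name are
hypothetical, chosen only to show the matching step.

```python
from email import message_from_string

# Hypothetical (header name, value) pairs that mark a class of messages.
# Both parts are stored in lower case for case-insensitive matching.
WHITELISTED_HEADERS = {
    ("x-example-receipt", "read-notification"),
}


def is_whitelisted(raw_message):
    """Return True if any header matches a whitelisted (name, value) pair."""
    msg = message_from_string(raw_message)
    for name, value in msg.items():
        if (name.lower(), value.strip().lower()) in WHITELISTED_HEADERS:
            return True
    return False


receipt = (
    "From: someone@example.com\n"
    "Subject: Read: hello\n"
    "X-Example-Receipt: read-notification\n"
    "\n"
    "Your message was read.\n"
)
bulk = (
    "From: bulk@example.com\n"
    "Subject: Buy now\n"
    "\n"
    "Your message was read.\n"
)
```

This only works when the header value really is constant for the
whole class of messages; a header carrying a per-message identifier
would never match.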

> Otherwise, we would need some automated procedure to detect these
> messages and whitelist them by their conventional checksums.

There is John Levine's semi-automatic list of checksums of empty
messages.  The script /var/dcc/libexec/fetch-testmsg-whitelist
can be run from cron to fetch 
and/or other lists.

Vernon Schryver    vjs@rhyolite.com

