How to improve DCC handling of attachments?

Vernon Schryver vjs@calcite.rhyolite.com
Sat Jul 8 23:14:32 UTC 2006


> From: Gary Mills 

> > > fuz2 checksum, and rejected it as bulk mail.  This behavior is quite
> > > confusing to users, and difficult for me to explain to them.

> > What if you ask such users to look at the text and ask themselves
> > whether it is substantially identical to a zillion other messages?
>
> The sender may not have control over the format.  I don't know, but
> in this case it may have been entirely generated by some Exchange
> facility for sending files.

That sounds plausible or even likely.

>                              The recipient can only go by what is
> captured in the DCC logs.  MIME-encoded binary files, and HTML for
> that matter, are gibberish to most people.

I was talking about how to explain the behavior to local recipients
instead of a solution to the main problem.




> >      Don't many sites block such mail because it is so often a Microsoft
> >      worm/virus?
>
> That's a policy issue.  Virus e-mail often contains a ZIP file, but
> many ZIP files in e-mail contain legitimate files.

My question (for a change) was not rhetorical.  Do many sites block
all mail with ZIP attachments?


> >  2. tell users to whitelist their correspondents that do that.
>
> That's what we do now.  However, inter-personal e-mail messages
> containing unique binary attachments are clearly not bulk mail.
> By that definition, DCC should not be identifying them as such.
> Is this technically possible?

I don't see how DCC can distinguish such mail messages.


There is a 4th tactic I forgot to mention.  
I have the impression that some versions of Windows execute ZIP
files received via mail even without games played with file names
and extensions.  If that is true, then reasonable senders of such
messages will include unique text with each message to convince
recipients to open the ZIP attachments.  Such unique text should
generate unique FUZ2 checksums.
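
To illustrate the point about unique text, here is a minimal sketch in
Python.  It is not the real FUZ2 algorithm; `toy_fuzzy_checksum`, the
canned sentence, and the sample messages are all hypothetical.  It only
assumes that a fuzzy body checksum is computed from normalized text and
ignores the binary attachment, so messages carrying nothing but the same
canned sentence collide while messages with personal text do not.

```python
import hashlib
import re
from collections import Counter


def toy_fuzzy_checksum(body_text: str) -> str:
    """Toy stand-in for a fuzzy body checksum (NOT the real FUZ2).

    Keeps only letters, folds case, collapses everything else to single
    spaces, then hashes the result.
    """
    normalized = re.sub(r"[^a-z]+", " ", body_text.lower()).strip()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def count_reports(bodies):
    """Count how many messages share each toy checksum."""
    return Counter(toy_fuzzy_checksum(body) for body in bodies)


# Hypothetical canned text produced by a "send file" facility.
BOILERPLATE = "Please find the attached file."

# Three messages with different attachments but only the canned text.
canned = [BOILERPLATE] * 3

# Two messages whose senders added their own words.
personal = [
    "Here are the July budget figures we discussed. " + BOILERPLATE,
    "Attached are the revised floor plans for the lab move. " + BOILERPLATE,
]

counts = count_reports(canned + personal)
canned_count = counts[toy_fuzzy_checksum(BOILERPLATE)]
personal_counts = [counts[toy_fuzzy_checksum(p)] for p in personal]

# The canned messages pile up on one checksum; the personal ones do not.
print(canned_count, personal_counts)
```

In this toy model the canned messages accumulate a shared count no
matter how different their attachments are, which is the behavior that
confuses users, while a sentence of unique text is enough to separate
the others.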


> >      That checksum might already be in John Levine's list of checksums of
> >      empty and test messages at http://www.iecc.com/dcc-testmsg-whitelist.txt

>
> That's a good idea.  However, we shouldn't be relying on one person to
> identify messages that DCC treats as empty, extract the relevant checksums
> and update that file.  That's too much to ask of anyone.

Perhaps so, but you also don't want to let just anyone add to such a list.
John Levine somehow maintains the very popular list of abuse addresses
at abuse.net, with I have no idea how much or how little help.


> For an example of the latter, I get many copies of a new form of spam
> sent to our `abuse' address.  It consists of a single line of text,
> often just one word, along with a small image file that advertizes a
> performance-enhancing drug.  Clearly, this is intended to get past
> spam filters that attempt to identify spam by scanning the header and
> body text.  It's going to take image analysis to handle this spam.

I think image analysis is hopeless there.  CAPTCHA is not a reliable
defense for many of the places where it is sold, such as preventing
abuse of free mail provider services, but that is because a drop-box
is worth perhaps a dollar to a spammer.  The cost of doing enough
image analysis to detect all possible computerized variations of drug
advertisements and stock pump-and-dump images is at the other end of
the spectrum.  There's also the difference that a bad guy can look at
Yahoo's CAPTCHA before attacking it, while a good guy gets no warning
about new versions of spam images until the first 30 million spam have
been sent.


Vernon Schryver    vjs@rhyolite.com


