HTML vs. bulk

Vernon Schryver
Fri Jan 3 23:22:49 UTC 2003

All of that is interesting.  Thanks.

I'm thinking about defining a FUZ3 checksum that would ignore text
bounded by <html>...</html> and otherwise be similar to the FUZ2
checksum.  When too little of the message remains to generate a
checksum, then, as with the other fuzzy checksums, no checksum would
be reported to the DCC server.  However, as with some of the SMTP
header and envelope checksums, a constant checksum for the null string
would still be generated for local blacklisting (or even whitelisting).
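
A minimal sketch of that idea in Python, to make the two outcomes
concrete.  Everything named here is an assumption for illustration
only: the function fuz3_sketch, the MIN_WORDS threshold, the word
normalization, and the use of MD5 as the digest.  It does not
reproduce the real FUZ2 algorithm; it only shows stripping the
<html>...</html> span and falling back to a constant null-string
checksum that is kept local.

```python
import hashlib
import re

# Hypothetical threshold: fewer remaining words than this counts as
# "too little to generate a checksum". The real limit is not stated.
MIN_WORDS = 10

# Text bounded by <html>...</html>, case-insensitive, spanning lines.
HTML_SPAN = re.compile(r"<html\b[^>]*>.*?</html\s*>",
                       re.IGNORECASE | re.DOTALL)

# Constant checksum of the null (empty) string, for local lists only.
NULL_CHECKSUM = hashlib.md5(b"").hexdigest()


def fuz3_sketch(body):
    """Return (checksum, report_to_server) for a message body.

    Stand-in for the proposed FUZ3: drop HTML-bounded text, then
    hash a crude normalization of whatever plaintext remains.
    """
    remaining = HTML_SPAN.sub(" ", body)
    # Crude normalization (an assumption, not FUZ2): lowercase words.
    words = re.findall(r"[a-z]+", remaining.lower())
    if len(words) < MIN_WORDS:
        # Too little plaintext: nothing goes to the server, but the
        # constant checksum is available for local black/whitelists.
        return NULL_CHECKSUM, False
    digest = hashlib.md5(" ".join(words).encode("ascii")).hexdigest()
    return digest, True
```

With this sketch, a message consisting only of an <html>...</html>
part yields (NULL_CHECKSUM, False), while two messages with the same
plaintext but different HTML parts yield the same reportable checksum.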

This would allow DCC clients (e.g. entire enterprises), or individual
users at enterprises with per-user whitelists (e.g. those of dccproc
or dccm), to blacklist all messages without enough plaintext to
generate a FUZ3 checksum.

What do you think?

