Distributed Checksum Clearinghouses Reputations

The current version of the DCC source is version 2.3.168, April 24, 2021.

Introduction

The Distributed Checksum Clearinghouses or DCC is an anti-spam content filter based on a distributed database that collects real-time reports about mail from a global network of servers. Each report consists of several checksums. The most important checksums are of message bodies, the IP address of the SMTP client or mail sender, parts of the envelope, and so forth. Each time a DCC client sends a report to a DCC server, the server records the report (modulo data reduction/compression) and answers with how many times it has heard of each checksum in the report. A DCC client can take that answer and decide "this message is bulk because the DCC network has heard of it more than X times."

DCC Reputations

A DCC server also computes reputations by counting the total number of mail messages sent from particularly active IP addresses, and the number of bulk messages each sends. The percentage of bulk mail seen from an IP address is its DCC reputation. For example, a DCC Reputation of 50% means that 50% of the mail that the global DCC Reputation network has seen from an IP address has been bulk. Because the DCC network is not omniscient, the DCC Reputation of an IP address tends to understate the probability that the next message from an IP address will be bulk.
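The percentage described above can be illustrated with a short calculation. This is only a sketch of the arithmetic, not code from the DCC source; the function name and its parameters are hypothetical.

```python
def reputation_percent(total_messages: int, bulk_messages: int) -> float:
    """Return the share of bulk mail seen from one IP address, as a percentage.

    Hypothetical helper for illustration; the DCC server's own bookkeeping
    is not shown in this document.
    """
    if total_messages <= 0:
        raise ValueError("no mail has been seen from this address")
    if not 0 <= bulk_messages <= total_messages:
        raise ValueError("bulk count must be between 0 and the total")
    return 100.0 * bulk_messages / total_messages


# 100 bulk messages out of 200 seen gives a reputation of 50%.
print(reputation_percent(200, 100))
```

Because the network sees only part of an address's mail, a figure computed this way is a lower bound in practice rather than an exact probability.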

Reputations are flooded among DCC reputation servers along with DCC checksums. Thus DCC Reputations, like DCC counts, become more reliable as more systems participate.

DCC reputation servers detect bulk mail to compute reputations using counts of DCC body checksums reported to all DCC servers in the global network of DCC servers. Mail messages rejected because of a DCC reputation are reported a second time to the global network with counts of MANY. This ensures that the messages will be rejected if sent from some other IP address to any mail system using a DCC client.

Like any reputation system, DCC reputations can have false positives. They can react more quickly than manual DNS blacklists to the appearance of new "trojan proxies" and "cracked" PHP-Nuke sites. DCC Reputations are less effective than greylisting, but many sites are unable to use greylisting. It is profitable to use both greylisting and DCC Reputations.

Mechanisms

DCC clients add the string bulk rep to the X-DCC headers of mail messages that are not themselves bulk mail but that come from IP addresses with reputations worse than a locally configured threshold. There are two thresholds for DCC reputations. Unless they are set, DCC clients do not add bulk rep and mail is not rejected, although DCC servers still accumulate data. Rep-total is the minimum number of mail messages, good as well as bulk, that must have been seen from an IP address before a DCC reputation is computed for it. This threshold avoids computing a reputation from only a few mail messages. Rep is the percentage of bulk mail at which an IP address is given a DCC Reputation for sending bulk mail.
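The interaction of the two thresholds can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the logic of dccm or dccifd: the function and parameter names are hypothetical, both thresholds are assumed to be required, and a reputation equal to the Rep threshold is assumed to count as bad.

```python
from typing import Optional


def should_mark_bulk_rep(
    total_messages: int,
    bulk_messages: int,
    message_is_bulk: bool,
    rep_total: Optional[int],
    rep_percent: Optional[float],
) -> bool:
    """Decide whether a client would tag a message with "bulk rep".

    Hypothetical sketch of the thresholds described above.
    """
    # Assumption: with either threshold unset, nothing is tagged.
    if rep_total is None or rep_percent is None:
        return False
    # The tag applies only to messages that are not already bulk.
    if message_is_bulk:
        return False
    # Too few messages seen: no reputation is computed.
    if total_messages <= 0 or total_messages < rep_total:
        return False
    # Assumption: a reputation at or above the Rep percentage is bad.
    return 100.0 * bulk_messages / total_messages >= rep_percent


# An address with 80 bulk messages out of 100, checked against
# Rep-total = 20 and Rep = 50%.
print(should_mark_bulk_rep(100, 80, False, 20, 50.0))
```

With the same thresholds, an address that has sent only 10 messages would not be tagged however many were bulk, because the count is below Rep-total.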

Because reputations can involve more false positives, dccm and dccifd do not reject mail unless allowed by DCC-reps-on in the client whitelist file. That setting is normally in per-user whiteclnt files, but can be in the system's global /var/dcc/whiteclnt file. The proof of concept CGI scripts in the source include support for turning DCC reputations on and off and setting the Rep threshold by individual end users for their own mail.

DCC Reputations can be turned off for clients using a given client-ID by adding "no-reps" to the line for that client-ID in /var/dcc/ids on all DCC servers used by clients with that ID.

MX Servers

DCC Reputations are about IP addresses, so it is important for the system to recognize an installation's MX servers, that is, the mail systems that receive mail from the Internet and forward it internally, and not blame them for spam. Their IP addresses must be listed in the global whiteclnt file with MX or MXDCC entries.

Configuring DCC Reputations

Because any sort of a bad reputation is not a guarantee of bad behavior, rejecting, discarding, or segregating mail from IP addresses with bad reputations (not just DCC Reputations) in "junk folders" can result in false positives. So DCC Reputations must be explicitly turned on for all mailboxes that should have DCC Reputation filtering on DCC clients, or mail systems using dccproc, dccifd, or dccm. To enable DCC Reputations:

Query the DCC Reputation Database


Contact Vernon Schryver of Rhyolite Software, LLC at vjs@rhyolite.com or use the form

   script $Date: 2019/09/27 13:17:20 $