Vernon Schryver
vjs@calcite.rhyolite.com
Sun, 28 Oct 2001 18:48:17 -0700 (MST)
> From: "Sam Leffler" <sam@errno.com>

> I'm running dccm+dccd 1.0.34. I'm just starting up and want to monitor the
> software carefully before applying reject actions and the like. I've got
> dccm running with -a IGNORE so no messages are rejected and then using
> procmail to segregate "bad mail".
>
> 1. I want dccm to log a summary message for every piece of mail it
> processes. This summary should tell me the X-DCC header or similar. Is
> there a -d value that's appropriate or some other way to get this?

You could set dccm to log all mail in the dccm log directory, often
/var/dcc/log, by setting DCCM_LOG_AT=1 in dcc_conf. That logs the entire
message, which is either a characteristic or a bug depending on whether you
want to see the body to decide whether to adjust your whitelist (or whether
things are working right). Besides the message, the SMTP envelope, the
checksums, what sendmail had to say about the message (usually from
access_db), and any messages dccm sent to the system log, the dccm log files
include the X-DCC header line that is (or would be with `dccm -N`) added to
the message.

If that's not sufficient, I could make some level of `dccm -d` send the
X-DCC header to the system log. If so, at what level of -d?


> 2. I want to verify my whitelists have proper coverage. Incoming mail from
> some mailing lists that should be covered by my whitelists are tagged with
> X-DCC headers by dccm. Does this mean their checksums are also being fed
> back into the database that's flooded to my neighbors? How do I log enough
> info to tell whether I'm not polluting the checksum database with mail
> that's ok?

You can see what you are sending your neighbors by looking in the output
of `dblist -v` for reports that carry your server-ID and have counts above
your `dccd -t` threshold. (I really ought to document what `dblist` has to
say, right after I document `cdcc stats`.)

There are two ways to run the DCC. In one, you feed the DCC the checksums
of only certified unsolicited bulk mail.
In this mode the database contains checksums of spam, and you must worry
about unauthorized people reporting checksums, but you might be able to
avoid needing whitelists. I'm not enthused about this mode, because I don't
trust people to never err. In this mode you lose the problem of whitelists,
but gain the problem of delegating the decision of what is spam.

The other mode has the DCC detecting mere bulk mail instead of unsolicited
bulk mail. It consists of sending the checksums of every message that might
be spam, even if you are 99% certain it is not. In this mode, you don't
worry about polluting the checksum database, because it is supposed to be
able to handle 100,000,000's of mail messages per day (albeit not on a tiny
computer). You also don't worry about polluting the database with mail
that's not bulk, because if it really is not bulk, then no one else will
see the message to be able to ask the DCC about it.

Finally, if you are running your own server, consider the `dccd -t`
thresholds. By default your server won't flood checksums to your neighbors
unless those checksums are interesting, that is, unless their counts exceed
the threshold. Note that the start-dccd script starts dccd with a `dccd -t`
threshold that is min(default, dccm-rejection-threshold).


> And also I'm wondering:
>
> 3. If a user explicitly marks a message as "bulk" with dccproc it seems this
> can cause other sites to reject mail that might be ok. The whitelists
> presumably help avoid this but this always work? What's to keep one
> individual from maliciously or accidentially bouncing mail to other users.
> I see only trust in this system; no authentication or control mechanisms
> (e.g. I see no way to invalidate a checksum once it's been entered).

If you run the DCC network in the first mode mentioned above, to detect
unsolicited bulk mail, then you must trust people. That's why I'm not
enthused about that mode. However, it is mistaken to say there are no
authentication, control, or invalidation mechanisms.
Non-anonymous client-IDs have passwords, and dccd has -Q precisely to limit
checksum reports to people you trust. In addition, you can delete checksum
reports with the `cdcc delck` commands. Those commands are honored by all
servers, subject to "no-del" in the /var/dcc/flod file. (See the dccd man
page.) (See also the rpt-ok keyword for /var/dcc/flod entries.)

If you run the DCC network in the second mode, to detect only bulk mail,
then you need not worry about trusting people. You can assume that people
do dumb things, and not care. Whether someone reports having seen a message
once or 1,000,000 times is not significant. If you receive the same message
that someone else has marked, there are two cases. Since someone else saw
it to mark it, the message is either unambiguously bulky, or it is at least
not entirely private. In the first case, if the bulk mail is solicited, you
must DCC whitelist the source. In the second case, if one of your
correspondents is marking your mutual mail as extremely bulky, you need to
consider shrinking your circle of friends or adjusting your DCC whitelist.


> 4. How do people manually mark mail as bulk? Does everyone currently use
> dccproc from the command line or are there easy ways to add a button to the
> most popular GUI MUAs? Has anyone looked at adding to IMAP or POP protocols
> since this would have to happen on a server machine?

My GUI MUA is the ancient BSD mail program in a shell. To mark spam, I tell
mail to edit the message with vi and then type "!Gdccproc -t many". I
sometimes first type "!Gdccproc -t Q" to ask without reporting the message.
That's probably not GUI enough for most people.

There is no law that says dccproc could not be run on an IMAP or POP client
to mark mail as spam. If I knew anything about browser or other WIN32 mail
clients, I would write a "plug-in" equivalent to dccproc.

If you participate in a DCC network with enough participants, you do not
need people to manually mark mail as spam.
All except the first handful of targets will be told by the DCC that bulk
mail is bulk.


Vernon Schryver    vjs@rhyolite.com