From f7761560d3e83b85cd1240857eb48da4@interlinx.bc.ca Sun Jul 15 19:45:52 2001
Received: from linux.interlinx.bc.ca (adsl-63-206-118-108.dsl.snfc21.pacbell.net [63.206.118.108]) by calcite.rhyolite.com (8.12.0.Beta11/8.12.0.Beta11) with ESMTP id f6G1jmt3017379 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for env-from ; Sun, 15 Jul 2001 19:45:50 -0600 (MDT)
Received: (from nobody@localhost) by linux.interlinx.bc.ca (8.11.3/8.11.3) id f6G1jlC24411 for ; Sun, 15 Jul 2001 18:45:47 -0700
Received: from pc.ilinx(10.75.2.1), claiming to be "pc.interlinx.bc.ca" via SMTP by linux.ilinx, id smtpdlGjs4a; Sun Jul 15 21:45:42 2001
Received: by pc.interlinx.bc.ca (Postfix, from userid 1001) id BF5A1231BF; Sun, 15 Jul 2001 18:45:42 -0700 (PDT)
Date: Sun, 15 Jul 2001 18:45:42 -0700
To: dcc@rhyolite.com
Subject: Any DCC servers available?
Message-ID: <20010715184542.B21463@pc.ilinx>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.3.19i
From: "Brian J. Murrell"
X-DCC-RHYOLITE-Metrics: calcite.rhyolite.com 101; IP=6 env_From=1 From=3 Subject=1 Message-ID=1 Received=1 Body=1 Fuz1=1
Sender: dcc-admin@rhyolite.com
Errors-To: dcc-admin@rhyolite.com
X-BeenThere: dcc@rhyolite.com
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help:
List-Post:
List-Subscribe: ,
List-Id: Distributed Checksum Clearinghouse source
List-Unsubscribe: ,
List-Archive:

Hello all,

I want to test DCC on a whole mailbox of spam I have here but no
server to test against. I could set one up here but for my little
single domain, heck single recipient domain I am not going to see
much. :-)

Does anyone have a DCC server with a decent number of submitters yet?

b.

From vjs@calcite.rhyolite.com Sun Jul 15 21:19:23 2001
Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Beta11/8.12.0.Beta11) id f6G3JNS7022085 for dcc@rhyolite.com env-from ; Sun, 15 Jul 2001 21:19:23 -0600 (MDT)
Date: Sun, 15 Jul 2001 21:19:23 -0600 (MDT)
From: Vernon Schryver
Message-Id: <200107160319.f6G3JNS7022085@calcite.rhyolite.com>
References: <20010715184542.B21463@pc.ilinx>
To: dcc@rhyolite.com
Subject: Re: Any DCC servers available?

> To: dcc@rhyolite.com
> Subject: Any DCC servers available?
> From: "Brian J. Murrell"

> I want to test DCC on a whole mailbox of spam I have here but no
> server to test against. I could set one up here but for my little
> single domain, heck single recipient domain I am not going to see
> much. :-)
>
> Does anyone have a DCC server with a decent number of submitters yet?

You may point a client at the server at dcc.rhyolite.com, but for
more reliability and less network traffic, it would be better to
run your own server.

One of the features of the servers is that they flood reports of
checksums of bulk mail among each other. When a flooding connection
is created from one server to another, all bulk reports in the
first server's database are sent to the second server.

To point a client at a server, use "add host" in the `cdcc` program.
An anonymous client does not need a password and should use the
anonymous client-ID 1.

Vernon Schryver vjs@rhyolite.com

From vjs@calcite.rhyolite.com Sun Jul 15 22:03:34 2001
Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Beta11/8.12.0.Beta11) id f6G43YBu025259 for dcc@rhyolite.com env-from ; Sun, 15 Jul 2001 22:03:34 -0600 (MDT)
Date: Sun, 15 Jul 2001 22:03:34 -0600 (MDT)
From: Vernon Schryver
Message-Id: <200107160403.f6G43YBu025259@calcite.rhyolite.com>
To: dcc@rhyolite.com
Subject: Re: Any DCC servers available?
References: <20010715184542.B21463@pc.ilinx>

I forgot to welcome anyone who wants to contact me to exchange server
IDs and passwords. It's easiest if servers have unique IDs, but there
is an ID translating mechanism in the inter-server flooding that is
intended to cope with duplicates by translating.

I'm also perfectly willing to set up a real client ID and password for
anyone who really wants to use my server directly instead of indirectly
through flooding to their own server. If you have a shell account it
might be hard to get a server going compared to using a .forward file
with dccproc and procmail.

Testing is easiest done with the anonymous ID, but note the default
2 second value of -u for dccd. That feature is to help an outfit like
MAPS provide free DCC service to individuals but strongly encourage
large outfits to sign up for a real ID. An individual user is unlikely
to notice even a 10 second delay, but a big ISP could not tolerate
slowing down a 1,000,000 messages/day mail server by that much.

Vernon Schryver vjs@rhyolite.com

From d48d4f904793f2eb08874cd379d98ca2@interlinx.bc.ca Sun Jul 15 22:36:55 2001
Received: from linux.interlinx.bc.ca (adsl-63-206-118-108.dsl.snfc21.pacbell.net [63.206.118.108]) by calcite.rhyolite.com (8.12.0.Beta11/8.12.0.Beta11) with ESMTP id f6G4aoDJ027572 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for env-from ; Sun, 15 Jul 2001 22:36:54 -0600 (MDT)
Received: (from nobody@localhost) by linux.interlinx.bc.ca (8.11.3/8.11.3) id f6G4an124873 for ; Sun, 15 Jul 2001 21:36:49 -0700
Received: from pc.ilinx(10.75.2.1), claiming to be "pc.interlinx.bc.ca" via SMTP by linux.ilinx, id smtpdpUJJAG; Mon Jul 16 00:36:48 2001
Received: by pc.interlinx.bc.ca (Postfix, from userid 1001) id 4D208231BF; Sun, 15 Jul 2001 21:36:48 -0700 (PDT)
Date: Sun, 15 Jul 2001 21:36:48 -0700
To: dcc@rhyolite.com
Subject: Re: Any DCC servers available?
Message-ID: <20010715213648.E21463@pc.ilinx>
References: <200107160319.f6G3JNS7022085@calcite.rhyolite.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <200107160319.f6G3JNS7022085@calcite.rhyolite.com>
User-Agent: Mutt/1.3.19i
From: "Brian J. Murrell"
X-DCC-RHYOLITE-Metrics: calcite.rhyolite.com 101; IP=7 env_From=1 From=4 Subject=1 Message-ID=1 Received=1 Body=1 Fuz1=1

On Sun, Jul 15, 2001 at 09:19:23PM -0600, Vernon Schryver wrote:
>
> You may point a client at the server at dcc.rhyolite.com, but for
> more reliability and less network traffic, it would be better to
> run your own server.

OK. Do you mind if I point my client at it for a few weeks (too
long?) until I see how well it does?
Does your server see a lot of spam to checksum or are you like me in
that you are only processing e-mail for a small sample of users at
your site?

Of course it would help if I whitelist all of my real bulk traffic
like mailing lists etc. right? dccproc[1] does not even make a
connection to the server for whitelisted traffic, correct?

> One of the features of the servers is that they flood reports of
> checksums of bulk mail among each other. When a flooding connection
> is created from one server to another, all bulk reports in the
> first server's database are sent to the second server.

The idea is to create a network of servers exchanging checksum
databases then, not unlike the way usenet servers work with articles?

Is there provision in the protocol for knowing when duplicate
information is being received in the case of one server being
connected to several other servers?

> To point a client at a server, use "add host" in the `cdcc` program.
> An anonymous client does not need a password and should use the
> anonymous client-ID 1.

Got 'er working to your server!

Thanx,
b.

[1] I am only interested in seeing the metrics while I am evaluating --
if I like the results I will set up the Sendmail hooks.

From 22d7a1d94073df30032e8110f63f9af7@interlinx.bc.ca Sun Jul 15 22:52:14 2001
Received: from linux.interlinx.bc.ca (adsl-63-206-118-108.dsl.snfc21.pacbell.net [63.206.118.108]) by calcite.rhyolite.com (8.12.0.Beta11/8.12.0.Beta11) with ESMTP id f6G4qADJ028655 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for env-from <22d7a1d94073df30032e8110f63f9af7@interlinx.bc.ca>; Sun, 15 Jul 2001 22:52:13 -0600 (MDT)
Received: (from nobody@localhost) by linux.interlinx.bc.ca (8.11.3/8.11.3) id f6G4qAV24969 for ; Sun, 15 Jul 2001 21:52:10 -0700
Received: from pc.ilinx(10.75.2.1), claiming to be "pc.interlinx.bc.ca" via SMTP by linux.ilinx, id smtpdDVk4hj; Mon Jul 16 00:52:00 2001
Received: by pc.interlinx.bc.ca (Postfix, from userid 1001) id B8404231BF; Sun, 15 Jul 2001 21:52:00 -0700 (PDT)
Date: Sun, 15 Jul 2001 21:52:00 -0700
To: dcc@rhyolite.com
Subject: dccproc -t many?
Message-ID: <20010715215200.F21463@pc.ilinx>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.3.19i
From: "Brian J. Murrell"
X-DCC-RHYOLITE-Metrics: calcite.rhyolite.com 101; IP=8 env_From=1 From=6 Subject=1 Message-ID=1 Received=1 Body=1 Fuz1=1

If I have a piece of spam in my mailbox and I know it was not
targeted at me, should I do a "dccproc -t many" on it to tell the DCC
server(s) that this is definitely spam? i.e. as a community service
to other users of the DCC server.

Even if I don't know for sure that there were any other recipients
(as a single user, I can't know for absolutely sure that there were
"many" recipients) should I still?

Does the fact that the spam was already queried/submitted to the DCC
server change that in any way? i.e.
does a query for an e-mail using "dccproc" followed by an "assertion"
(using dccproc -t many) that it was spam skew/screw the database in
any way?

b.

From vjs@calcite.rhyolite.com Sun Jul 15 23:50:49 2001
Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Beta11/8.12.0.Beta11) id f6G5onOo001939 for dcc@rhyolite.com env-from ; Sun, 15 Jul 2001 23:50:49 -0600 (MDT)
Date: Sun, 15 Jul 2001 23:50:49 -0600 (MDT)
From: Vernon Schryver
References: <200107160319.f6G3JNS7022085@calcite.rhyolite.com> <20010715213648.E21463@pc.ilinx>
Message-Id: <200107160550.f6G5onOo001939@calcite.rhyolite.com>
To: dcc@rhyolite.com
Subject: Re: Any DCC servers available?

> From: "Brian J. Murrell"

> ...
> OK. Do you mind if I point my client at it for a few weeks (too
> long?) until I see how well it does?

Feel free.

> Does your server see a lot of
> spam to checksum or are you like me in that you are only processing
> e-mail for a small sample of users at your site?

The command `cdcc "host dcc.rhyolite.com; stats"` displays a bunch of
stuff about my server.
That recently said:

    dcc.rhyolite.com calcite.rhyolite.com 192.188.61.3,6277 server-ID 101
      /var/dcc/map 23:04:43
      version 1.0.19 DB locked tracing ANON CLNT
      73725 hash entries 57139 used 2634096 DB bytes
      5 ms delay 224 NOPs 12014 ADMN 8 query 5 clients
      240 reports 0>10 0>100 0>1000 40 many
          answers 25>10 0>100 0>1000 54 many
      101 whitelisted
      0 bad IDs 0 passwds 0 error responses 4 retransmitted
      0 answers rate-limited 0 anonymous
      0 rejected reports
      flood on 6 streams 1 out active 5 in
      32658 total flooded in 9461 accepted 5 stale 23101 dup
          91 white 0 delete 0 bad id
      189 mmap 378227 hashed 189688 records mapped 9711 added
      since Jul 12 19:05:15.215415 MDT

From that you can see
  - I last restarted dccd July 12
  - my database has 57,139 checksums (and you can deduce that's mostly
    bulk mail)
  - since July 12, it has had 240 direct reports from DCC clients and
    32,658 reports of more or less bulky mail from at least 6 other
    DCC servers, of which 9,461 were unique. The flooding algorithm
    does not have an equivalent of the netnews Path: line and so a
    star-connected network of servers sees duplicates.

> Of course it would help if I whitelist all of my real bulk traffic
> like mailing lists etc. right?

yes, unless you want to reject absolutely all bulk traffic, including
mailing lists including this one.
(which reminds me to add this list to my white lists)

> dccproc[1] does not even make a
> connection to the server for whitelisted traffic, correct?

yes, provided you mean a client-side whitelist instead of the
server-side whitelist.

> > One of the features of the servers is that they flood reports of
> > checksums of bulk mail among each other. ...

> The idea is to create a network of servers exchanging checksum
> databases then, not unlike the way usenet servers work with articles?

exactly.

> Is there provision in the protocol for knowing when duplicate
> information is being received in the case of one server being
> connected to several other servers?

Each report ought to be uniquely identified by a server ID and a
timestamp applied by that server. "Ought to" because there is
a server-ID translating mechanism to deal with duplicate IDs and
other issues.

> ...
> Got 'er working to your server!

If your reverse DNS name starts with "adsl" and you've sent 27 queries,
yes, you have. (See `cdcc clients`)

> ...
> [1] I am only interested in seeing the metrics while I am evaluating --
> if I like the results I will set up the Sendmail hooks.

The promise of the DCC is in lots of checksum reports. I suspect, but
can't really tell, that the volume I currently have access to is
20,000-50,000 messages/day, which is a good start but only a start.

......

] From: "Brian J. Murrell"

] If I have a piece of spam in my mailbox and I know it was not
] targeted at me, should I do a "dccproc -t many" on it to tell the DCC
] server(s) that this is definitely spam? i.e. as a community service
] to other users of the DCC server.

Yes, that's what I do. I start the traceroutes, begin the queries to
whois.abuse.net, and open web page URLs that may be innocent. While
those drag along, I use `dccproc -t many` to get the spam into the
database. I sometimes first use `dccproc -Q` to see if someone has
beaten me to it or to check various things, such as the operation of
the de-quoted-printable machinery in the checksumming.

For now, most of that 20-50K messages does not involve spam traps,
but people report very good luck with thresholds of 50-200.

] Even if I don't know for sure that there were any other recipients
] (as a single user, I can't know for absolutely sure that there were
] "many" recipients) should I still?

Yes. If you were the only recipient, the worst that does is to add a
record to the databases of cooperating (flooding) servers. If no one
else gets a copy of that message, then no one else can have that
message rejected with the help of that record.
If you send a message to two friends and one of them reports it as
spam with `dccproc -t many` so the other doesn't receive it, then
maybe you need to change friends.

] Does the fact that the spam was already queried/submitted to the DCC
] server change that in any way? i.e. does a query for an e-mail using
] "dccproc" followed by an "assertion" (using dccproc -t many) that it
] was spam skew/screw the database in any way?

`dccproc -Q` does nothing to the database. `dccproc` (or equivalently
`dccproc -t 1`) adds a record with a target count of 1.
`dccproc -t many` is simply a shorthand for doing `dccproc -t 1` so
many times that the fixed width total count overflows. It's about
24 bits wide.

Again, the DCC has a 0% false positive rate for detecting bulk mail
if you define bulk as >1 recipient and assume no misconfiguration
such as not knowing you are reporting the same message twice.
How much of that bulk mail is spam depends on white lists.

Vernon Schryver vjs@rhyolite.com

From ab19127a3e229c58cfd2355455a7527e@interlinx.bc.ca Mon Jul 16 01:25:16 2001
Received: from linux.interlinx.bc.ca (adsl-63-206-118-108.dsl.snfc21.pacbell.net [63.206.118.108]) by calcite.rhyolite.com (8.12.0.Beta11/8.12.0.Beta11) with ESMTP id f6G7PBDJ004959 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for env-from ; Mon, 16 Jul 2001 01:25:15 -0600 (MDT)
Received: (from nobody@localhost) by linux.interlinx.bc.ca (8.11.3/8.11.3) id f6G7PAO26162 for ; Mon, 16 Jul 2001 00:25:10 -0700
Received: from pc.ilinx(10.75.2.1), claiming to be "pc.interlinx.bc.ca" via SMTP by linux.ilinx, id smtpde84gsR; Mon Jul 16 03:25:00 2001
Received: by pc.interlinx.bc.ca (Postfix, from userid 1001) id DF4FF231BF; Mon, 16 Jul 2001 00:24:59 -0700 (PDT)
Date: Mon, 16 Jul 2001 00:24:59 -0700
To: dcc@rhyolite.com
Subject: More headers for whitelisting?
Message-ID: <20010716002459.K21463@pc.ilinx>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.3.19i
From: "Brian J. Murrell"
X-DCC-RHYOLITE-Metrics: calcite.rhyolite.com 101; IP=9 env_From=1 From=8 Subject=1 Message-ID=1 Received=1 Body=1 Fuz1=1

I am doing dccproc for the time being. Of course by the time an
e-mail makes it to a point where dccproc can use it (i.e. procmail or
other local delivery means) one loses all the envelope information,
save for "Return-Path:" and/or "Delivered-To:" headers if one's MTA is
so gracious.

I am wondering if it would not be useful in whitelisting (at least) to
at least recognize locally more headers. Personally, I would ideally
like to see Return-Path: and Sender: matching for whitelisting
purposes.

I am a hacker. If somebody wants to point me to the right place to
hack, I will hack it in.

b.

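The Return-Path: and Sender: matching proposed above can be prototyped outside dccproc, for example in a script run from procmail before dccproc is called. The following is only a minimal sketch in Python under stated assumptions: the `WHITELIST` set and the `whitelisted` function are hypothetical names invented for illustration and are not part of DCC, and both headers can be forged, so a match is a hint rather than proof.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical local whitelist of list-manager addresses (not a DCC file).
WHITELIST = {"dcc-admin@rhyolite.com"}


def whitelisted(raw_message, whitelist=WHITELIST):
    """Return True if Return-Path: or Sender: names a whitelisted address."""
    msg = message_from_string(raw_message)
    for header in ("Return-Path", "Sender"):
        value = msg.get(header)
        if value is None:
            continue
        # parseaddr() drops any display name and the angle brackets.
        _, addr = parseaddr(value)
        if addr.lower() in whitelist:
            return True
    return False


sample = (
    "Return-Path: <someone@example.com>\n"
    "Sender: dcc-admin@rhyolite.com\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
print(whitelisted(sample))
```

A message for which this returns True would simply be delivered without asking dccproc about it; anything else would be passed on to dccproc as before.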
From gustavf@initio.no Mon Jul 16 01:35:36 2001
Received: from mail.initio.no (nalle.initio.no [62.92.112.203]) by calcite.rhyolite.com (8.12.0.Beta11/8.12.0.Beta11) with SMTP id f6G7ZXDI005193 for env-from ; Mon, 16 Jul 2001 01:35:35 -0600 (MDT)
Received: (qmail 29628 invoked by uid 28673); 16 Jul 2001 07:35:33 -0000
Date: Mon, 16 Jul 2001 09:35:33 +0200
From: Gustav Foseid
To: dcc@rhyolite.com
Subject: Problem creating database
Message-ID: <20010716093533.B29520@initio.no>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-Mailer: Mutt 1.0.1i
X-DCC-RHYOLITE-Metrics: calcite.rhyolite.com 101; IP=3 env_From=3 From=3 Subject=1 Message-ID=1 Received=1 Body=1 Fuz1=1

Hi!

I have a problem creating a dcc database for a server running on Linux
(Debian); I get the following error:

# /var/dcc/libexec/dbclean -N -S
1.0.18 clearing database /var/dcc/dcc_db
invalid database address 0
could not start database /var/dcc/dcc_db-new

Am I doing something wrong, or is this a Debian problem?

-- 
Gustav Foseid, Initio IT-løsninger AS
gustavf@initio.no

From vjs@calcite.rhyolite.com Mon Jul 16 01:49:58 2001
Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Beta11/8.12.0.Beta11) id f6G7nwSf005410 for dcc@rhyolite.com env-from ; Mon, 16 Jul 2001 01:49:58 -0600 (MDT)
Date: Mon, 16 Jul 2001 01:49:58 -0600 (MDT)
From: Vernon Schryver
Message-Id: <200107160749.f6G7nwSf005410@calcite.rhyolite.com>
References: <20010716093533.B29520@initio.no>
To: dcc@rhyolite.com
Subject: Re: Problem creating database

> From: Gustav Foseid

> I have a problem creating a dcc database for a server running on Linux
> (Debian); I get the following error:
>
> # /var/dcc/libexec/dbclean -N -S
> 1.0.18 clearing database /var/dcc/dcc_db
> invalid database address 0
> could not start database /var/dcc/dcc_db-new
>
> Am I doing something wrong, or is this a Debian problem?

I fiddled around a little trying to reproduce the problem and found
I could if I created an empty /var/dcc/whitelist with `touch`.

Is there any chance that your whitelist is an empty file?

If so, it is unlikely that you really want an empty whitelist.
Most people probably want at least their own IP addresses in their
whitelists. Most probably want the contents of the sample
files homedir/whitelist and the included homedir/whitecommon.

Of course, it is a bug that an empty whitelist file doesn't work.

If I've guessed wrong about the trigger for the bug, please let me know.

Vernon Schryver vjs@rhyolite.com

From gustavf@initio.no Mon Jul 16 03:04:43 2001
Received: from mail.initio.no (nalle.initio.no [62.92.112.203]) by calcite.rhyolite.com (8.12.0.Beta11/8.12.0.Beta11) with SMTP id f6G94eDI006730 for env-from ; Mon, 16 Jul 2001 03:04:42 -0600 (MDT)
Received: (qmail 1880 invoked by uid 28673); 16 Jul 2001 09:04:38 -0000
Date: Mon, 16 Jul 2001 11:04:38 +0200
From: Gustav Foseid
To: dcc@rhyolite.com
Subject: Re: Problem creating database
Message-ID: <20010716110438.E29520@initio.no>
References: <200107160749.f6G7nwSf005410@calcite.rhyolite.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-Mailer: Mutt 1.0.1i
In-Reply-To: <200107160749.f6G7nwSf005410@calcite.rhyolite.com>; from vjs@calcite.rhyolite.com on Mon, Jul 16, 2001 at 01:49:58AM -0600
X-DCC-RHYOLITE-Metrics: calcite.rhyolite.com 101; IP=4 env_From=4 From=5 Subject=2 Message-ID=1 Received=1 Body=1 Fuz1=1

Vernon Schryver:

> I fiddled around a little trying to reproduce the problem and found
> I could if I created an empty /var/dcc/whitelist with `touch`.
>
> Is there any chance that your whitelist is an empty file?

Yes, it was. When I used whitecommon it worked.

> If so, it is unlikely that you really want an empty whitelist.
> Most people probably want at least their own IP addresses in their
> whitelists. Most probably want the contents of the sample
> files homedir/whitelist and the included homedir/whitecommon.

For testing purposes I wanted a blank whitelist.

> Of course, it is a bug that an empty whitelist file doesn't work.
>
> If I've guessed wrong about the trigger for the bug, please let me know.

You were right :-)

-- 
Gustav Foseid, Initio IT-løsninger AS
gustavf@initio.no

From vjs@calcite.rhyolite.com Mon Jul 16 08:58:13 2001
Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Beta11/8.12.0.Beta11) id f6GEwD5s013002 for dcc env-from ; Mon, 16 Jul 2001 08:58:13 -0600 (MDT)
Date: Mon, 16 Jul 2001 08:58:13 -0600 (MDT)
From: Vernon Schryver
Message-Id: <200107161458.f6GEwD5s013002@calcite.rhyolite.com>
References: <20010716002459.K21463@pc.ilinx>
To: dcc@calcite.rhyolite.com
Subject: Re: More headers for whitelisting?

> From: "Brian J. Murrell"

> I am doing dccproc for the time being. Of course by the time an
> e-mail makes it to a point where dccproc can use it (i.e. procmail or
> other local delivery means) one loses all the envelope information,
> save for "Return-Path:" and/or "Delivered-To:" headers if one's MTA is
> so gracious.
>
> I am wondering if it would not be useful in whitelisting (at least) to
> at least recognize locally more headers. Personally, I would ideally
> like to see Return-Path: and Sender: matching for whitelisting
> purposes.
> ...

If the SMTP client's IP address is reliably represented in a
header added by the last MTA, then it could be picked
out and given to dccproc as the value of -a.

RFC 2821 says that Return-Path should contain the value of the envelope
Mail_From command. I will make the next version of dccproc optionally
(or maybe by default?) use the value of a Return-Path header instead
of -f (or maybe when -f is absent?).

I can't see a compelling use for the value of sender header, because
according to section 3.6.2 of RFC 2822, it is approximately the same
as the header From value.
Personally, I'd not whitelist except on values that are unlikely to
be forged, including the envelope Rcpt_To value and the IP address
of the SMTP client.

The checksum types used by dccproc for whitelisting use the same
very precious namespace as checksum types in the on-the-wire protocol.
That space is precious because it is tiny (I think 4 bits) to keep
the database used by the DCC server small. That matters if you want
to allow a single database to have checksums for a noticeable fraction
of the mail messages in the Internet.

Given the environment in which dccproc is used, this should not be a
problem. It should be possible to use familiar tools to avoid asking
dccproc about messages with stigmata that dccproc doesn't notice.

Vernon Schryver vjs@rhyolite.com

From 28a9dbb6826bb5eadc65849aa349b56f@interlinx.bc.ca Mon Jul 16 10:33:43 2001
Received: from linux.interlinx.bc.ca (adsl-63-206-118-108.dsl.snfc21.pacbell.net [63.206.118.108]) by calcite.rhyolite.com (8.12.0.Beta11/8.12.0.Beta11) with ESMTP id f6GGXcDJ015243 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for env-from <28a9dbb6826bb5eadc65849aa349b56f@interlinx.bc.ca>; Mon, 16 Jul 2001 10:33:41 -0600 (MDT)
Received: (from nobody@localhost) by linux.interlinx.bc.ca (8.11.3/8.11.3) id f6GGXbf30174 for ; Mon, 16 Jul 2001 09:33:37 -0700
Received: from pc.ilinx(10.75.2.1), claiming to be "pc.interlinx.bc.ca" via SMTP by linux.ilinx, id smtpd5zegMG; Mon Jul 16 12:33:31 2001
Received: by pc.interlinx.bc.ca (Postfix, from userid 1001) id C2E99231BF; Mon, 16 Jul 2001 09:33:30 -0700 (PDT)
Date: Mon, 16 Jul 2001 09:33:30 -0700
To: dcc@calcite.rhyolite.com
Subject: Re: More headers for whitelisting?
Message-ID: <20010716093330.T21463@pc.ilinx>
References: <200107161458.f6GEwD5s013002@calcite.rhyolite.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <200107161458.f6GEwD5s013002@calcite.rhyolite.com>
User-Agent: Mutt/1.3.19i
From: "Brian J. Murrell"
X-DCC-RHYOLITE-Metrics: calcite.rhyolite.com 101; IP=10 env_From=1 From=10 Subject=2 Message-ID=1 Received=1 Body=1 Fuz1=1

On Mon, Jul 16, 2001 at 08:58:13AM -0600, Vernon Schryver wrote:
>
> If the SMTP client's IP address is reliably represented in a
> header added by the last MTA, then it could be picked
> out and given to dccproc as the value of -a.

Hmmmm. Maybe I will cobble up some procmail to yank it out of the
Received: header that my MTA adds.

> RFC 2821 says that Return-Path should contain the value of the envelope
> Mail_From command.

Indeed.

> I will make the next version of dccproc optionally
> (or maybe by default?) use the value of a Return-Path header instead
> of -f (or maybe when -f is absent?).

In absence of -f sounds good.

> I can't see a compelling use for the value of sender header, because
> according to section 3.6.2 of RFC 2822, it is approximately the same
> as the header From value. Personally, I'd not whitelist except on
> values that are unlikely to be forged, including the envelope Rcpt_To
> value and the IP address of the SMTP client.

Indeed, and I agree. But in dccproc (which is less than optimal
itself) those are not available. The Sender is forgeable, yes, but it
is also pretty reliable for whitelisting mailing lists.

> The checksum types used by dccproc for whitelisting use the same
> very precious namespace as checksum types in the on-the-wire protocol.
> That space is precious because it is tiny (I think 4 bits) to keep
> the database used by the DCC server small. That matters if you want
> to allow a single database to have checksums for a noticeable fraction
> of the mail messages in the Internet.

I don't think I was thinking about checksumming them, just using them
to tell dccproc not to checksum/database file/lookup the e-mail.

> Given the environment in which dccproc is used, this should not be a
> problem. It should be possible to use familiar tools to avoid asking
> dccproc about messages with stigmata that dccproc doesn't notice.

I suppose I could whitelist mailing lists in procmail itself. I was
just hoping to do it with DCC itself so that porting to the SMTP
initiated DCC would be painless.

b.

From vjs@calcite.rhyolite.com Mon Jul 16 10:56:00 2001
Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Beta11/8.12.0.Beta11) id f6GGu0Rl015692 for dcc@calcite.rhyolite.com env-from ; Mon, 16 Jul 2001 10:56:00 -0600 (MDT)
Date: Mon, 16 Jul 2001 10:56:00 -0600 (MDT)
From: Vernon Schryver
Message-Id: <200107161656.f6GGu0Rl015692@calcite.rhyolite.com>
To: dcc@calcite.rhyolite.com
Subject: Re: More headers for whitelisting?
references: <20010716093330.T21463@pc.ilinx>

> From: "Brian J. Murrell"

> ...
> > as the header From value. Personally, I'd not whitelist except on
> > values that are unlikely to be forged, including the envelope Rcpt_To
> > value and the IP address of the SMTP client.
>
> Indeed, and I agree. But in dccproc (which is less than optimal
> itself) those are not available. ...

Generally, where dccproc is used, the Rcpt_to can be available, such
as in the value of the $USER environment variable or the -d value
given to procmail.

> ...
> I don't think I was thinking about checksumming them, just using them
> to tell dccproc not to checksum/database file/lookup the e-mail. ...

For speed, the client-side whitelisting uses much the same
checksumming mechanism.
Dccproc generates checksums and then looks them up in a local or
private hash table. Based on those results it passes the message
(when white-listed) or talks to the DCC server (when not listed or
locally blacklisted).

Vernon Schryver vjs@rhyolite.com

From vjs@calcite.rhyolite.com Wed Jul 18 11:08:31 2001
Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Beta11/8.12.0.Beta11) id f6IH8VMX012070 for dcc env-from ; Wed, 18 Jul 2001 11:08:31 -0600 (MDT)
Date: Wed, 18 Jul 2001 11:08:31 -0600 (MDT)
From: Vernon Schryver
Message-Id: <200107181708.f6IH8VMX012070@calcite.rhyolite.com>
To: dcc@calcite.rhyolite.com
Subject: version 1.0.19 of the DCC source

I've put the version 1.0.19 DCC source in
http://www.rhyolite.com/dcc/dcc-src/

Unless you are trying to make it work on NetBSD, the changes are
probably not earth shaking.

1.0.19
    improve `cdcc stats` flood formatting
    fix `cdcc "host domain.com; stats all"`
    change dccproc to use the value of the Return-Path: header for the
        envelope-From checksum if the header is present and -f is not used.
    fix `dbclean -S -N` when the whitelist is empty
    add rough support for NetBSD.
    mention dccd in the INSTALL file.
fix for parsing "-L error,LOCAL1.ERR" from Vincent Schonau Vernon Schryver vjs@rhyolite.com From cshenton@Outbounder.com Wed Jul 18 11:29:26 2001 Received: from Samizdat.outbounder.com (samizdat.outbounder.com [198.202.217.54]) by calcite.rhyolite.com (8.12.0.Beta11/8.12.0.Beta11) with ESMTP id f6IHTN04012597 for env-from ; Wed, 18 Jul 2001 11:29:25 -0600 (MDT) Received: (from cshenton@localhost) by Samizdat.outbounder.com (8.9.3/8.9.3) id NAA14983; Wed, 18 Jul 2001 13:29:20 -0400 (EDT) To: dcc@calcite.rhyolite.com Subject: Re: version 1.0.19 of the DCC source References: <200107181708.f6IH8VMX012070@calcite.rhyolite.com> From: Chris Shenton Date: 18 Jul 2001 13:29:20 -0400 In-Reply-To: Vernon Schryver's message of "Wed, 18 Jul 2001 11:08:31 -0600 (MDT)" Message-ID: Lines: 29 User-Agent: Gnus/5.0807 (Gnus v5.8.7) Emacs/20.7 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-DCC-RHYOLITE-Metrics: calcite.rhyolite.com 101; IP=3 env_From=3 From=3 Subject=1 Message-ID=1 Received=1 Body=1 Fuz1=1 Sender: dcc-admin@rhyolite.com Errors-To: dcc-admin@rhyolite.com X-BeenThere: dcc@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Help: List-Post: List-Subscribe: , List-Id: Distributed Checksum Clearinghouse source List-Unsubscribe: , List-Archive: Any plans to port to Slowaris? I am just starting to play with DCC and it bombed spectacularly when trying to build. :-) Actually, the first problem was with "./configure" where Slowaris "tr" hangs forever on the dash; fix: *** configure Tue Jul 17 11:39:31 2001 --- configure~ Tue Jul 10 16:11:52 2001 *************** *** 544,550 **** # sendmail thinks all AIX systems are "PPC" PLATFORM=`uname -vs | tr ' /' '.-'`.`uname -r`.PPC else ! PLATFORM=`uname -rsm | tr ' /' '.\-'` fi SENDMAIL_OBJ=$SENDMAIL/obj.$PLATFORM fi --- 544,550 ---- # sendmail thinks all AIX systems are "PPC" PLATFORM=`uname -vs | tr ' /' '.-'`.`uname -r`.PPC else ! 
PLATFORM=`uname -rsm | tr ' /' '.-'` fi SENDMAIL_OBJ=$SENDMAIL/obj.$PLATFORM fi I've got Slowaris with sendmail at work but FreeBSD with qmail at home so I'll try that next. Thanks. From vjs@calcite.rhyolite.com Wed Jul 18 12:37:22 2001 Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Beta11/8.12.0.Beta11) id f6IIbMLG020882 for dcc@calcite.rhyolite.com env-from ; Wed, 18 Jul 2001 12:37:22 -0600 (MDT) Date: Wed, 18 Jul 2001 12:37:22 -0600 (MDT) From: Vernon Schryver Message-Id: <200107181837.f6IIbMLG020882@calcite.rhyolite.com> To: dcc@calcite.rhyolite.com Subject: Re: version 1.0.19 of the DCC source references: <200107181708.f6IH8VMX012070@calcite.rhyolite.com> Sender: dcc-admin@rhyolite.com Errors-To: dcc-admin@rhyolite.com X-BeenThere: dcc@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Help: List-Post: List-Subscribe: , List-Id: Distributed Checksum Clearinghouse source List-Unsubscribe: , List-Archive: > From: Chris Shenton > Any plans to port to Slowaris? ... If I had access to a system running Solaris, I'd try to port it. Note that dccm requires not just sendmail but sendmail with the milter interface. That appeared somewhere in 8.10.*, but is better in 8.11 and best in 8.12. 
Vernon Schryver vjs@rhyolite.com

From gustavf@initio.no Thu Jul 19 01:43:19 2001
Received: from mail.initio.no (nalle.initio.no [62.92.112.203]) by calcite.rhyolite.com (8.12.0.Beta11/8.12.0.Beta11) with SMTP id f6J7hF04021910 for env-from ; Thu, 19 Jul 2001 01:43:18 -0600 (MDT)
Received: (qmail 1649 invoked by uid 28673); 19 Jul 2001 07:43:14 -0000
Date: Thu, 19 Jul 2001 09:43:14 +0200
From: Gustav Foseid
To: dcc@rhyolite.com
Subject: Maintaining a whitelist
Message-ID: <20010719094314.B30284@initio.no>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-Mailer: Mutt 1.0.1i
X-DCC-RHYOLITE-Metrics: calcite.rhyolite.com 101; IP=6 env_From=6 From=8 Subject=1 Message-ID=1 Received=1 Body=1 Fuz1=1
Sender: dcc-admin@rhyolite.com
Errors-To: dcc-admin@rhyolite.com
X-BeenThere: dcc@rhyolite.com
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Id: Distributed Checksum Clearinghouse source

I have a problem maintaining a whitelist...

My DCC server is collecting checksums from a server acting as a UUCP relay (no, UUCP is not dead). A few hundred domains have their MX pointing to the UUCP server and download their mail with UUCP. This adds up to many thousand users.

Maintaining a list of mailing lists these people are subscribing to is an impossible task. Most mailing lists are very unlikely to have many recipients here, but I am more concerned with auto responders of all kinds.

I am interested in other people's experiences with building whitelists.
-- Gustav Foseid, Initio IT-løsninger AS gustavf@initio.no

From vjs@calcite.rhyolite.com Thu Jul 19 09:46:10 2001
Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Beta11/8.12.0.Beta11) id f6JFkABc007714 for dcc@rhyolite.com env-from ; Thu, 19 Jul 2001 09:46:10 -0600 (MDT)
Date: Thu, 19 Jul 2001 09:46:10 -0600 (MDT)
From: Vernon Schryver
Message-Id: <200107191546.f6JFkABc007714@calcite.rhyolite.com>
To: dcc@rhyolite.com
Subject: Re: Maintaining a whitelist
references: <20010719094314.B30284@initio.no>
Sender: dcc-admin@rhyolite.com
Errors-To: dcc-admin@rhyolite.com
X-BeenThere: dcc@rhyolite.com
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Id: Distributed Checksum Clearinghouse source

> From: Gustav Foseid
> I have a problem maintaining a whitelist...
>
> My DCC server is collecting checksums from a server acting as a UUCP relay
> (no, UUCP is not dead). A few hundred domains have their MX pointing to the
> UUCP server and download their mail with UUCP. This adds up to many
> thousand users.
>
> Maintaining a list of mailing lists these people are subscribing to is an
> impossible task. Most mailing lists are very unlikely to have many
> recipients here, but I am more concerned with auto responders of all
> kinds.
>
> I am interested in other people's experiences with building whitelists.

I think avoiding false positives is more important than preventing false negatives. It is better to pass spam than to reject legitimate mail. There are at least two ways to do that with the DCC:

1. only insert the headers and let end users do any rejecting using procmail or other tools.

2. use the To: whitelist mechanism in dccm. Mail sent to white-listed targets is passed regardless of the DCC counts. Perhaps use `dccm -W` and "OK2" entries to cause only addresses explicitly listed to be affected. (sheesh!--who wrote and proofread the description of -W?)
Perhaps I need to change how dccm treats mail addressed simultaneously to white-listed and unlisted addressees. Currently, the message is rejected, discarded, or delivered to all addressees. I've resisted doing this, because it requires that dccm record the addressees at the start of the SMTP transaction and at the end remove those who should not receive the message. It is also not clear what SMTP reply status I should generate for a message that is partly delivered. What do you think?

Vernon Schryver vjs@rhyolite.com

From gustavf@initio.no Fri Jul 20 05:56:08 2001
Received: from mail.initio.no (nalle.initio.no [62.92.112.203]) by calcite.rhyolite.com (8.12.0.Beta11/8.12.0.Beta11) with SMTP id f6KBu204001161 for env-from ; Fri, 20 Jul 2001 05:56:07 -0600 (MDT)
Received: (qmail 20035 invoked by uid 28673); 20 Jul 2001 11:56:02 -0000
Date: Fri, 20 Jul 2001 13:56:02 +0200
From: Gustav Foseid
To: dcc@rhyolite.com
Subject: Re: Maintaining a whitelist
Message-ID: <20010720135602.L7757@initio.no>
References: <20010719094314.B30284@initio.no> <200107191546.f6JFkABc007714@calcite.rhyolite.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-Mailer: Mutt 1.0.1i
In-Reply-To: <200107191546.f6JFkABc007714@calcite.rhyolite.com>; from vjs@calcite.rhyolite.com on Thu, Jul 19, 2001 at 09:46:10AM -0600
X-DCC-RHYOLITE-Metrics: calcite.rhyolite.com 101; IP=9 env_From=9 From=12 Subject=2 Message-ID=1 Received=1 Body=1 Fuz1=1
Sender: dcc-admin@rhyolite.com
Errors-To: dcc-admin@rhyolite.com
X-BeenThere: dcc@rhyolite.com
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Id: Distributed Checksum Clearinghouse source

Vernon Schryver:
> > I am interested in other people's experiences with building whitelists.
>
> I think avoiding false positives is more important than preventing
> false negatives. It is better to pass spam than to reject legitimate
> mail.
There are at least two ways to do that with the DCC:
>
> 1. only insert the headers and let end users do any rejecting using
> procmail or other tools.

This might work with "users" that know what a header is; with "lusers" the story is completely different.

> 2. use the To: whitelist mechanism in dccm. Mail sent to white-listed
> targets is passed regardless of the DCC counts. Perhaps use `dccm -W`
> and "OK2" entries to cause only addresses explicitly listed to be
> affected. (sheesh!--who wrote and proofread the description of -W?)

This could be a solution. I will see how much time I have to write a client that will work for me (I am using mostly Postfix, but also Qmail at some sites).

> Perhaps I need to change how dccm treats mail addressed simultaneously
> to white-listed and unlisted addressees. Currently, the message
> is rejected, discarded, or delivered to all addressees. I've resisted
> doing this, because it requires that dccm record the addressees at
> the start of the SMTP transaction and at the end remove those who should
> not receive the message. It is also not clear what SMTP reply status
> I should generate for a message that is partly delivered.
> What do you think?

I don't know very much about milter, but there is only one way to handle this that I can see. The message must be accepted with "250 OK" and a bounce message must be sent reporting a delivery failure to each of the recipients that did not receive the e-mail.
-- Gustav Foseid, Initio IT-løsninger AS gustavf@initio.no From vjs@calcite.rhyolite.com Fri Jul 20 09:14:05 2001 Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Beta11/8.12.0.Beta11) id f6KFE5hm007736 for dcc@rhyolite.com env-from ; Fri, 20 Jul 2001 09:14:05 -0600 (MDT) Date: Fri, 20 Jul 2001 09:14:05 -0600 (MDT) From: Vernon Schryver Message-Id: <200107201514.f6KFE5hm007736@calcite.rhyolite.com> To: dcc@rhyolite.com Subject: Re: Maintaining a whitelist references: <20010720135602.L7757@initio.no> Sender: dcc-admin@rhyolite.com Errors-To: dcc-admin@rhyolite.com X-BeenThere: dcc@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Help: List-Post: List-Subscribe: , List-Id: Distributed Checksum Clearinghouse source List-Unsubscribe: , List-Archive: > From: Gustav Foseid > ... > > Perhaps I need to change how dccm treates mail addressed simultaneously > > to white-listed and unlisted addressees. Currently, the message > > is rejected, discarded, or delivered to all addressees. I've resisted > > doing this, because it requires that dccm record the addressees at > > the start of the SMTP transaction and at the end remove those who should > > not receive the message. It is also not clear what SMTP reply status > > I should generate for a message that is partly delivered. > > What do you think? > > I don't know very much about milter, but there is only one way to handle > this, taht I can see. The message must be accepted with "250 OK" and a > bounce message must be sent reporting a delivery failure to each of the > recipients that did not receive the e-mail. Bounce messages for DCC rejections are sent by the remote SMTP client, not the system using the DCC. Why would those recipients who didn't get the message because they consider it spam by virtue of not being in the white list want a delivery failure message? 
It is possible to reject individual SMTP envelope Rcpt_To values, but the DCC cannot know it should tell sendmail to reject the message until after the body has been received as part of the DATA command. Sendmail can only say to the SMTP client "yes, your DATA command was ok" or "no, it was bogus." That single result code applies to the entire SMTP transaction and all recipients corresponding to Rcpt_to commands not previously rejected (in this case all of them). After thinking about it for a day, my inclination is to say that when there is a mixture of white-listed and not white-listed envelope To values, the message should be treated as "DISCARD" for the not white-listed values. Vernon Schryver vjs@rhyolite.com From vjs@calcite.rhyolite.com Fri Jul 20 09:24:46 2001 Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Beta11/8.12.0.Beta11) id f6KFOk0S008058 for dcc@rhyolite.com env-from ; Fri, 20 Jul 2001 09:24:46 -0600 (MDT) Date: Fri, 20 Jul 2001 09:24:46 -0600 (MDT) From: Vernon Schryver Message-Id: <200107201524.f6KFOk0S008058@calcite.rhyolite.com> To: dcc@rhyolite.com Subject: Re: Maintaining a whitelist references: <20010720135602.L7757@initio.no> Sender: dcc-admin@rhyolite.com Errors-To: dcc-admin@rhyolite.com X-BeenThere: dcc@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Help: List-Post: List-Subscribe: , List-Id: Distributed Checksum Clearinghouse source List-Unsubscribe: , List-Archive: I forgot to mention an important but perhaps obvious answer to the original question. I think a good way to build a white list is to run for a while and see what shows up in the logs with thresholds higher than what you contemplate. If you are using dccproc, that probably requires some mechanism to capture the X-DCC headers. If you are using dccm, you can set the dccm logging threshold somewhat below your anticipated rejection threshold and set the current rejection threshold at "never." 
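
One way to do that kind of log review is to tally the senders whose DCC counts reach a trial threshold. The following is a minimal Python sketch, assuming the header layout seen in this archive's own X-DCC-RHYOLITE-Metrics: lines (a server name and ID, a semicolon, then name=count pairs). The function names and the choice of the Body and Fuz1 counts are illustrative assumptions, not part of the DCC.

```python
import re
from collections import Counter

# Matches "name=count" pairs such as "Body=1" or "env_From=3".
PAIR = re.compile(r"([A-Za-z][\w-]*)=(\d+)")

def parse_dcc_metrics(value):
    """Return {checksum_name: count} from an X-DCC-*-Metrics header value."""
    # Text before the first ';' names the server, e.g. "calcite.rhyolite.com 101".
    _, _, counts = value.partition(";")
    return {name: int(n) for name, n in PAIR.findall(counts)}

def bulk_senders(messages, threshold):
    """Tally senders of messages whose Body or Fuz1 count reaches threshold.

    `messages` is an iterable of (from_address, header_value) pairs
    (a hypothetical input shape chosen for this sketch).
    """
    tally = Counter()
    for sender, header in messages:
        counts = parse_dcc_metrics(header)
        if max(counts.get("Body", 0), counts.get("Fuz1", 0)) >= threshold:
            tally[sender] += 1
    return tally
```

Senders that appear in the tally and are known to be legitimate bulk sources, such as mailing lists, are the candidates for white-list entries.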
This should be done with an incoming flood of checksum reports to get the best idea of the sources of bulk mail. For example, if you have a single subscriber to the IETF's I-D announcement list and you don't receive reports from others, you might never know that the IETF is a sender of bulk mail and not know to white-list it.

Once again, I'd be happy to flood reports from my DCC server.

Vernon Schryver vjs@rhyolite.com

From gustavf@initio.no Mon Jul 23 01:24:37 2001
Received: from mail.initio.no (nalle.initio.no [62.92.112.203]) by calcite.rhyolite.com (8.12.0.Beta15/8.12.0.Beta11) with SMTP id f6N7OY6u003300 for env-from ; Mon, 23 Jul 2001 01:24:36 -0600 (MDT)
Received: (qmail 14664 invoked by uid 28673); 23 Jul 2001 07:24:28 -0000
Date: Mon, 23 Jul 2001 09:24:27 +0200
From: Gustav Foseid
To: dcc@rhyolite.com
Subject: Re: Maintaining a whitelist
Message-ID: <20010723092427.A26802@initio.no>
References: <20010720135602.L7757@initio.no> <200107201514.f6KFE5hm007736@calcite.rhyolite.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-Mailer: Mutt 1.0.1i
In-Reply-To: <200107201514.f6KFE5hm007736@calcite.rhyolite.com>; from vjs@calcite.rhyolite.com on Fri, Jul 20, 2001 at 09:14:05AM -0600
X-DCC-RHYOLITE-Metrics: calcite.rhyolite.com 101; IP=10 env_From=10 From=14 Subject=6 Message-ID=1 Received=1 Body=1 Fuz1=1
Sender: dcc-admin@rhyolite.com
Errors-To: dcc-admin@rhyolite.com
X-BeenThere: dcc@rhyolite.com
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Id: Distributed Checksum Clearinghouse source

Vernon Schryver:
> Why would those recipients who didn't get the message because they
> consider it spam by virtue of not being in the white list want a
> delivery failure message?

You did not understand what I meant.
> It is possible to reject individual SMTP envelope Rcpt_To values, but
> the DCC cannot know it should tell sendmail to reject the message
> until after the body has been received as part of the DATA command.
> Sendmail can only say to the SMTP client "yes, your DATA command was
> ok" or "no, it was bogus." That single result code applies to the
> entire SMTP transaction and all recipients corresponding to Rcpt_to
> commands not previously rejected (in this case all of them).

Think of the way a secondary MX or Qmail does this. If you deliver mail to a secondary MX it will be accepted for delivery if the domain used in the RCPT command is correct. If it later turns out that the message could not be delivered to one or more of the recipients, it will send a bounce message.

Qmail works the same way even for local delivery. If you send a message to "DoesNotExist@QmailControlled.domain" you will receive a bounce message from Qmail at the remote end. The message will be accepted at SMTP level and a bounce message will be delivered by Qmail.

You could do the same with dccm. Since it is too late to reject one recipient after 250 is given as response to the RCPT command, you can, as you said, reject or accept the entire message for all recipients. Accepting the message, however, is not the same as guaranteeing delivery to all recipients. You can still bounce the message, but it is your responsibility, not the responsibility of the remote system.

You can build a list of the recipients as you receive the RCPT TO commands. After the DATA command you can either reject the message, if none of the recipients should receive the message, or accept it, if all or some of the recipients should get the message. After the message is accepted, a bounce message for the recipients that did not receive the message should be sent (this would have to be done by dccm).

I do not know if this is possible with milter.
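
The bookkeeping described above (collect recipients per RCPT command, decide once the body has arrived) can be sketched as follows. This is an illustrative Python model, not dccm or milter code: the class and method names, the lower-casing of addresses, and the "550" reply for a fully rejected message are assumptions. What happens to the withheld recipients (a bounce or a silent discard) is left to the caller.

```python
from dataclasses import dataclass, field

@dataclass
class Transaction:
    """Recipients collected while one SMTP transaction is in progress."""
    whitelist: frozenset  # lower-case addresses that always get their mail
    recipients: list = field(default_factory=list)

    def rcpt(self, address):
        # Called once for each accepted RCPT command.
        self.recipients.append(address.lower())

    def end_of_data(self, is_bulk):
        """Return (smtp_reply, deliver_to, withheld) once the body is known."""
        if not is_bulk:
            return "250", list(self.recipients), []
        keep = [r for r in self.recipients if r in self.whitelist]
        drop = [r for r in self.recipients if r not in self.whitelist]
        if not keep:
            # Nobody is white-listed: reject the whole transaction.
            return "550", [], drop
        # Mixed case: one reply covers every recipient, so accept the
        # message and withhold it from the unlisted addresses.
        return "250", keep, drop
```

The single reply code per transaction is the constraint that forces the mixed case to be accepted as a whole.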
-- Gustav Foseid, Initio IT-løsninger AS gustavf@initio.no

From vjs@calcite.rhyolite.com Mon Jul 23 08:33:03 2001
Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Beta15/8.12.0.Beta11) id f6NEX3ND018513 for dcc@rhyolite.com env-from ; Mon, 23 Jul 2001 08:33:03 -0600 (MDT)
Date: Mon, 23 Jul 2001 08:33:03 -0600 (MDT)
From: Vernon Schryver
Message-Id: <200107231433.f6NEX3ND018513@calcite.rhyolite.com>
To: dcc@rhyolite.com
Subject: Re: Maintaining a whitelist
references: <20010723092427.A26802@initio.no>
Sender: dcc-admin@rhyolite.com
Errors-To: dcc-admin@rhyolite.com
X-BeenThere: dcc@rhyolite.com
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Id: Distributed Checksum Clearinghouse source

> From: Gustav Foseid
> ...
> Qmail works the same way even for local delivery. If you send a message to
> "DoesNotExist@QmailControlled.domain" you will receive a bounce message
> from Qmail at the remote end. The message will be accepted at SMTP level
> and a bounce message will be delivered by Qmail.

I did not realize that Qmail is so badly broken and non-compliant with RFC 821 and RFC 2821.

> ...
> You can build a list of the recipients as you receive the RCPT TO commands.
> After the DATA command you can either reject the message, if none of the
> recipients should receive the message, or accept it, if all or some of the
> recipients should get the message. After the message is accepted, a bounce
> message for the recipients that did not receive the message should be sent
> (this would have to be done by dccm).
>
> I do not know if this is possible with milter.

Milter simply allows you to add code to the middle of sendmail without hacking on sendmail. There is still code running on a corporate gateway I controlled until a few years ago that does body filtering somewhat similar to the way dccm works. I modified sendmail itself.
It would be a bad idea for dccm to try to originate non-delivery messages or bounces. It's not just the hassle of fork()'ing a process to talk to sendmail to send the bounce. It is that in practice many, and usually most, of the bounces would not be deliverable and would themselves generate double-bounces which would land in the postmaster mailbox.

Vernon Schryver vjs@rhyolite.com

From gustavf@initio.no Mon Jul 23 08:46:49 2001
Received: from mail.initio.no (nalle.initio.no [62.92.112.203]) by calcite.rhyolite.com (8.12.0.Beta15/8.12.0.Beta11) with SMTP id f6NEkl6u019708 for env-from ; Mon, 23 Jul 2001 08:46:48 -0600 (MDT)
Received: (qmail 19048 invoked by uid 28673); 23 Jul 2001 14:46:46 -0000
Date: Mon, 23 Jul 2001 16:46:46 +0200
From: Gustav Foseid
To: dcc@rhyolite.com
Subject: Re: Maintaining a whitelist
Message-ID: <20010723164645.B18444@initio.no>
References: <20010720135602.L7757@initio.no> <200107201524.f6KFOk0S008058@calcite.rhyolite.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-Mailer: Mutt 1.0.1i
In-Reply-To: <200107201524.f6KFOk0S008058@calcite.rhyolite.com>; from vjs@calcite.rhyolite.com on Fri, Jul 20, 2001 at 09:24:46AM -0600
X-DCC-RHYOLITE-Metrics: calcite.rhyolite.com 101; IP=8 env_From=8 From=12 Subject=9 Message-ID=1 Received=1 Body=1 Fuz1=1
Sender: dcc-admin@rhyolite.com
Errors-To: dcc-admin@rhyolite.com
X-BeenThere: dcc@rhyolite.com
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Id: Distributed Checksum Clearinghouse source

Vernon Schryver:
> Once again, I'd be happy to flood reports from my DCC server.

I am interested as soon as I get my system out of testing.
But before that is going to happen, I am going to have a life and have a couple of weeks of vacation :-)

-- Gustav Foseid, Initio IT-løsninger AS gustavf@initio.no

From vjs@calcite.rhyolite.com Mon Jul 23 09:16:35 2001
Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Beta15/8.12.0.Beta11) id f6NFGZ5o020738 for dcc@rhyolite.com env-from ; Mon, 23 Jul 2001 09:16:35 -0600 (MDT)
Date: Mon, 23 Jul 2001 09:16:35 -0600 (MDT)
From: Vernon Schryver
Message-Id: <200107231516.f6NFGZ5o020738@calcite.rhyolite.com>
To: dcc@rhyolite.com
Subject: Re: Maintaining a whitelist
references: <20010723164645.B18444@initio.no>
Sender: dcc-admin@rhyolite.com
Errors-To: dcc-admin@rhyolite.com
X-BeenThere: dcc@rhyolite.com
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Id: Distributed Checksum Clearinghouse source

> From: Gustav Foseid
> > Once again, I'd be happy to flood reports from my DCC server.
>
> I am interested as soon as I get my system out of testing. But before that
> is going to happen, I am going to have a life and have a couple of weeks of
> vacation :-)

It wouldn't be sane to delay a vacation or not live a life just to deal with spam, but in my book, "testing" is as close to real life as possible. If you don't have an incoming feed of checksums, and if that's how you intended to run it, then I don't see how whatever is happening now is testing. Maybe "preliminary smoke testing," but not "testing."

Note that there are good, although less common, reasons to run an isolated DCC server that neither sends nor receives flooded reports, as well as cases where only output or only input flooding makes sense.
Vernon Schryver vjs@rhyolite.com From gustavf@initio.no Mon Jul 23 09:25:31 2001 Received: from mail.initio.no (nalle.initio.no [62.92.112.203]) by calcite.rhyolite.com (8.12.0.Beta15/8.12.0.Beta11) with SMTP id f6NFPS6u021070 for env-from ; Mon, 23 Jul 2001 09:25:30 -0600 (MDT) Received: (qmail 19365 invoked by uid 28673); 23 Jul 2001 15:25:28 -0000 Date: Mon, 23 Jul 2001 17:25:28 +0200 From: Gustav Foseid To: dcc@rhyolite.com Subject: Re: Maintaining a whitelist Message-ID: <20010723172528.B19076@initio.no> References: <20010723164645.B18444@initio.no> <200107231516.f6NFGZ5o020738@calcite.rhyolite.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Mailer: Mutt 1.0.1i In-Reply-To: <200107231516.f6NFGZ5o020738@calcite.rhyolite.com>; from vjs@calcite.rhyolite.com on Mon, Jul 23, 2001 at 09:16:35AM -0600 X-DCC-RHYOLITE-Metrics: calcite.rhyolite.com 101; IP=9 env_From=9 From=14 Subject=12 Message-ID=1 Received=1 Body=1 Fuz1=1 Sender: dcc-admin@rhyolite.com Errors-To: dcc-admin@rhyolite.com X-BeenThere: dcc@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Help: List-Post: List-Subscribe: , List-Id: Distributed Checksum Clearinghouse source List-Unsubscribe: , List-Archive: Vernon Schryver: > > I am interested as soon as I get my system out of testing. But before that > > is going to happen, I am going to have a life and have couple og weeks of > > vacation :-) > > It wouldn't be sane to delay a vacation or not live a life just to deal > with spam, but in my book, "testing" is as close to real life as possible. > If you don't have an incoming feed of checksums, and if that's how you > intended to run it, then I don't see how whatever is happening now is > testing. Maybe "preliminary smoke testing," but not "testing." First of all, my clients do still not use DCC for actual filtering of e-mail. In my book that goes as testing. And I have not found a way to build a whitelist. > Note that there are good. 
although less common reasons to run an
> isolated DCC server that neither sends nor receives flooded reports,
> as well as cases where only output or only input flooding makes sense.

I have a spamtrap that I could flood to others now.

-- Gustav Foseid, Initio IT-løsninger AS gustavf@initio.no

From vjs@calcite.rhyolite.com Thu Aug 2 16:12:30 2001
Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Beta16/8.12.0.Beta16) id f72MCUNG012742 for dcc@rhyolite.com env-from ; Thu, 2 Aug 2001 16:12:30 -0600 (MDT)
Date: Thu, 2 Aug 2001 16:12:30 -0600 (MDT)
From: Vernon Schryver
Message-Id: <200108022212.f72MCUNG012742@calcite.rhyolite.com>
To: dcc@rhyolite.com
Subject: version 1.0.20 of the DCC source
Sender: dcc-admin@rhyolite.com
Errors-To: dcc-admin@rhyolite.com
X-BeenThere: dcc@rhyolite.com
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Id: Distributed Checksum Clearinghouse source

I've put version 1.0.20 of the DCC source in the usual places.

The most important change is support for Solaris. However, that involved so many changes that other things might be broken. It did involve discovering a byte-order mistake of mine that required a protocol-turn for the DCC server-to-server checksum flooding protocol. The new version of the DCC server understands the old version of the protocol on incoming connections, but must be told to use "version4" on outgoing connections to old DCC servers of version 1.0.19 and before.

The CHANGES file starts with:

1.0.20
support for Solaris
describe ways to connect spam traps to the DCC in INSTALL.html
move parameters from start-dccd, start-dccm, and cron-dccd to a common file
add misc/rcDCC start-up script for Solaris and Linux
fix byte-order bug in flood header server ID which requires changing the flood protocol. To flood to version 1.0.19 or older versions of dccd, specify version 4 in the flod file line.
removed locking file /var/dcc/map.lock
change handling of spam sent simultaneously to white-listed and unlisted targets. See the discussion of the new "REJECT_ONLY" action in the dccm man page.

Vernon Schryver vjs@rhyolite.com

From vjs@calcite.rhyolite.com Thu Aug 2 21:44:57 2001
Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Beta16/8.12.0.Beta16) id f733ivRP002981 for dcc@rhyolite.com env-from ; Thu, 2 Aug 2001 21:44:57 -0600 (MDT)
Date: Thu, 2 Aug 2001 21:44:57 -0600 (MDT)
From: Vernon Schryver
Message-Id: <200108030344.f733ivRP002981@calcite.rhyolite.com>
To: dcc@rhyolite.com
Subject: version 1.0.21
Sender: dcc-admin@rhyolite.com
Errors-To: dcc-admin@rhyolite.com
X-BeenThere: dcc@rhyolite.com
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Id: Distributed Checksum Clearinghouse source

So it goes. After releasing 1.0.20, I discovered that I'd broken white-listing spam for dccm by the DCC server when I added the changes for white-listed and unlisted targets of spam.

I've put the 1.0.21 source in the usual places. It might be wiser to test more, but I think the new bug is important to fix.

Vernon Schryver vjs@rhyolite.com

From vjs@calcite.rhyolite.com Wed Aug 8 21:51:13 2001
Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Beta17/8.12.0.Beta17) id f793pDYf021109 for dcc@rhyolite.com env-from ; Wed, 8 Aug 2001 21:51:13 -0600 (MDT)
Date: Wed, 8 Aug 2001 21:51:13 -0600 (MDT)
From: Vernon Schryver
Message-Id: <200108090351.f793pDYf021109@calcite.rhyolite.com>
To: dcc@rhyolite.com
Subject: version 1.0.22
Sender: dcc-admin@rhyolite.com
Errors-To: dcc-admin@rhyolite.com
X-BeenThere: dcc@rhyolite.com
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Id: Distributed Checksum Clearinghouse source

I've put version 1.0.22 in the usual places.
The main changes are a possible fix for a report of infinite packet spewing and teaching dccproc to find the IP address in headers. The CHANGES file starts with: 1.0.22 fix infinite loop and packet spew from dccproc when the clock jumps backward or jumps forward more than 1000 seconds. fix syslog process name on Solaris and AIX `dccproc -R` picks IP address out of standard Received: lines fix bugs in decoding quoted printable with broken soft ends of lines Vernon Schryver vjs@rhyolite.com From vjs@calcite.rhyolite.com Fri Aug 17 21:17:11 2001 Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Beta17/8.12.0.Beta17) id f7I3HBLb003682 for dcc@rhyolite.com env-from ; Fri, 17 Aug 2001 21:17:11 -0600 (MDT) Date: Fri, 17 Aug 2001 21:17:11 -0600 (MDT) From: Vernon Schryver Message-Id: <200108180317.f7I3HBLb003682@calcite.rhyolite.com> To: dcc@rhyolite.com Subject: version 1.0.25 Sender: dcc-admin@rhyolite.com Errors-To: dcc-admin@rhyolite.com X-BeenThere: dcc@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Help: List-Post: List-Subscribe: , List-Id: Distributed Checksum Clearinghouse source List-Unsubscribe: , List-Archive: I've released version 1.0.25 of the DCC source. The main changes involve fixes to quoted-printable decoding for fuzzy checksums. 
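
As an illustration of why quoted-printable decoding matters for body checksums: the same text can be wrapped with soft line breaks (a trailing "=") at different points, and only the decoded bytes compare equal. This is a minimal sketch using Python's standard quopri module. SHA-256 stands in for a checksum here and is not the DCC's fuzzy checksum algorithm; the function name and sample bodies are made up for the example.

```python
import hashlib
import quopri

def body_digest(raw_qp: bytes) -> str:
    """Decode a quoted-printable body, then hash the decoded bytes."""
    # decodestring() removes "=\n" soft line breaks and expands =XX escapes.
    decoded = quopri.decodestring(raw_qp)
    return hashlib.sha256(decoded).hexdigest()

# The same sentence, encoded and wrapped differently by two mailers.
copy_a = b"Buy now=2C limited =\noffer"
copy_b = b"Buy now, lim=\nited offer"

same_raw = copy_a == copy_b                                # raw bytes differ
same_decoded = body_digest(copy_a) == body_digest(copy_b)  # decoded text matches
```

Hashing the raw bytes would treat the two copies as unrelated messages; hashing after decoding treats them as the same body.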
Vernon Schryver vjs@rhyolite.com

From vjs@calcite.rhyolite.com Thu Aug 23 22:30:57 2001
Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Beta17/8.12.0.Beta17) id f7O4UvlN007180 for dcc@rhyolite.com env-from ; Thu, 23 Aug 2001 22:30:57 -0600 (MDT)
Date: Thu, 23 Aug 2001 22:30:57 -0600 (MDT)
From: Vernon Schryver
Message-Id: <200108240430.f7O4UvlN007180@calcite.rhyolite.com>
To: dcc@rhyolite.com
Subject: version 1.0.27
Sender: dcc-admin@rhyolite.com
Errors-To: dcc-admin@rhyolite.com
X-BeenThere: dcc@rhyolite.com
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Id: Distributed Checksum Clearinghouse source

The most important changes in the just-released version 1.0.27 are fixes for client IDs larger than 65K.

Vernon Schryver vjs@rhyolite.com

From bharat@fusionone.com Sun Sep 2 17:38:50 2001
Received: from f1exch.fusionone.com ([63.204.23.32]) by calcite.rhyolite.com (8.12.0.Gamma0/8.12.0.Gamma0) with ESMTP id f82NcnKb013130 for env-from ; Sun, 2 Sep 2001 17:38:49 -0600 (MDT)
Received: by f1exch.fusionone.com with Internet Mail Service (5.5.2653.19) id ; Sun, 2 Sep 2001 16:38:48 -0700
From: "Mediratta, Bharat"
To: "'dcc@rhyolite.com'"
Subject: DCC -- how do I effectively use it?
Date: Sun, 2 Sep 2001 16:38:47 -0700
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C13408.6962CD30"
X-DCC-rhyolite-Metrics: calcite.rhyolite.com 101; IP=3 env_From=3 From=3 Subject=1 Message-ID=1 Received=1 Body=1 Fuz1=1
Sender: dcc-admin@rhyolite.com
Errors-To: dcc-admin@rhyolite.com
X-BeenThere: dcc@rhyolite.com
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Id: Distributed Checksum Clearinghouse source

This message is in MIME format. Since your mail reader does not understand this format, some or all of this message may not be legible.
------_=_NextPart_001_01C13408.6962CD30 Content-Type: text/plain; charset="iso-8859-1" Howdy. I'm working on a personal anti-spam project that I'd like to eventually distribute freely (probably under GPL). It is a thin IMAP client that can monitor a mailbox, detect any new spam and move it to a separate spam mailbox. It currently uses DCC as the arbiter for spamminess. The code is operational and probably worthy of being shipped as a beta. However, it's not as effective as I would have hoped. My problem is that I'm not getting very many positive hits from DCC. I know that I'm connected to DCC properly because it does identify certain spam messages correctly, but unfortunately it misses a large percentage of them. I ran it against a folder containing spam detected with spambouncer and other tools and in some (admittedly) small trials it had about a 25% hit rate. Perhaps I'm using DCC incorrectly? Since I'm in development, I've been using dcc.rhyolite.com in anonymous mode. I hope that I'm not imposing too much of a load there. My script calls dccproc, passes in the message and parses the results. Most of my results indicate that DCC has never seen the message before (ie, I get counts of 1 for all of the metrics). Any ideas? I've been using DCC for about 3 hours now so any/all suggestions are welcome. If you're interested in my script, I can make it available. -Bharat ------_=_NextPart_001_01C13408.6962CD30 Content-Type: text/html; charset="iso-8859-1" DCC -- how do I effectively use it?


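The "counts of 1 for all of the metrics" check described in the message above can be sketched in a few lines of Python. This is only an illustration: it assumes the result line looks like the X-DCC-rhyolite-Metrics headers visible in this archive (server name, server-ID, a semicolon, then name=count pairs), and the function names are made up for the example rather than taken from DCC.

```python
import re

def parse_dcc_metrics(header_value):
    """Split 'server id; name=count ...' into (server, server_id, counts).

    Assumption: the value has the same shape as the X-DCC-*-Metrics
    headers seen in this archive.  Counts are integers or the word "many".
    """
    server_part, _, counts_part = header_value.partition(";")
    fields = server_part.split()
    server = fields[0] if fields else ""
    server_id = fields[1] if len(fields) > 1 else ""
    counts = {}
    for name, value in re.findall(r"([\w-]+)=(\w+)", counts_part):
        # Keep non-numeric values such as "many" as strings.
        counts[name] = int(value) if value.isdigit() else value
    return server, server_id, counts

def all_counts_are_one(counts):
    """True when every checksum was reported as seen exactly once."""
    return all(value == 1 for value in counts.values())

if __name__ == "__main__":
    line = ("calcite.rhyolite.com 101; IP=1 env_From=1 From=1 Subject=1 "
            "Message-ID=1 Received=1 Body=1 Fuz1=1")
    server, server_id, counts = parse_dcc_metrics(line)
    print(server, server_id, all_counts_are_one(counts))
```

A message for which all_counts_are_one() is true is one the queried server has not seen reported before, which is the common case described above.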
------_=_NextPart_001_01C13408.6962CD30-- From cc99adf8d49363bdcbb3fb5a4fb2e5b3@interlinx.bc.ca Sun Sep 2 21:10:53 2001 Received: from linux.interlinx.bc.ca (d150-8-228.home.cgocable.net [24.150.8.228]) by calcite.rhyolite.com (8.12.0.Gamma0/8.12.0.Gamma0) with ESMTP id f833ApKc017742 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for env-from ; Sun, 2 Sep 2001 21:10:52 -0600 (MDT) Received: (from nobody@localhost) by linux.interlinx.bc.ca (8.11.3/8.11.3) id f833AnV22058 for ; Sun, 2 Sep 2001 23:10:49 -0400 Received: from pc.ilinx(10.75.22.1), claiming to be "pc.interlinx.bc.ca" via SMTP by linux.ilinx, id smtpdqkM9Rs; Sun Sep 2 23:10:44 2001 Received: by pc.interlinx.bc.ca (Postfix, from userid 1001) id 86403231C3; Sun, 2 Sep 2001 23:10:43 -0400 (EDT) Date: Sun, 2 Sep 2001 23:10:42 -0400 To: "'dcc@rhyolite.com'" Subject: Re: DCC -- how do I effectively use it? Message-ID: <20010902231042.Q3385@pc.ilinx> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.3.19i From: "Brian J. Murrell" X-DCC-rhyolite-Metrics: calcite.rhyolite.com 101; IP=1 env_From=1 From=1 Subject=1 Message-ID=1 Received=1 Body=1 Fuz1=1 Sender: dcc-admin@rhyolite.com Errors-To: dcc-admin@rhyolite.com X-BeenThere: dcc@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Help: List-Post: List-Subscribe: , List-Id: Distributed Checksum Clearinghouse source List-Unsubscribe: , List-Archive: On Sun, Sep 02, 2001 at 04:38:47PM -0700, Mediratta, Bharat wrote: > > Howdy. I'm working on a personal anti-spam project that I'd like > to eventually distribute freely (probably under GPL). It is a > thin IMAP client that can monitor a mailbox, detect any new > spam and move it to a separate spam mailbox. Interesting. > It currently uses > DCC as the arbiter for spamminess. The code is operational and > probably worthy of being shipped as a beta. However, it's not as > effective as I would have hoped. 
Probably is as effective as it is going to be for the time being. > My problem is that I'm not getting very many positive hits from > DCC. I know that I'm connected to DCC properly because it does > identify certain spam messages correctly, but unfortunately it > misses a large percentage of them. The database/userbase is just not large enough yet. > I ran it against a folder > containing spam detected with spambouncer and other tools and > in some (admittedly) small trials it had about a 25% hit rate. That would seem about right right now. > Perhaps I'm using DCC incorrectly? Well, if you are getting some >1 counts then you are most likely using it correctly. > Since I'm in development, I've > been using dcc.rhyolite.com in anonymous mode. I hope that I'm not > imposing too much of a load there. My script calls dccproc, passes > in the message and parses the results. Will you also support a mode of operation where the MTA has already "dcc"ed the message and put it's (DCC's) header in the message? i.e. simply parse the IMAP INBOX for messages with existing DCC headers with values of n>1 where n is some configurable values (rather than using dccproc on the messages)? > Most of my results indicate > that DCC has never seen the message before (ie, I get counts of > 1 for all of the metrics). Critical mass is not there yet. Be patient. Spread the word. The more users DCC has the more effective it's going to be. b. -- Brian J. Murrell From vjs@calcite.rhyolite.com Sun Sep 2 22:31:43 2001 Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Gamma0/8.12.0.Gamma0) id f834VheS022611 for dcc@rhyolite.com env-from ; Sun, 2 Sep 2001 22:31:43 -0600 (MDT) Date: Sun, 2 Sep 2001 22:31:43 -0600 (MDT) From: Vernon Schryver Message-Id: <200109030431.f834VheS022611@calcite.rhyolite.com> To: dcc@rhyolite.com Subject: Re: DCC -- how do I effectively use it? 
References: <20010902231042.Q3385@pc.ilinx> Sender: dcc-admin@rhyolite.com Errors-To: dcc-admin@rhyolite.com X-BeenThere: dcc@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Help: List-Post: List-Subscribe: , List-Id: Distributed Checksum Clearinghouse source List-Unsubscribe: , List-Archive:

> From: "Mediratta, Bharat"

> ...
> My problem is that I'm not getting very many positive hits from
> DCC. I know that I'm connected to DCC properly because it does
> identify certain spam messages correctly, but unfortunately it
> misses a large percentage of them. I ran it against a folder
> containing spam detected with spambouncer and other tools and
> in some (admittedly) small trials it had about a 25% hit rate.
>
> Perhaps I'm using DCC incorrectly? Since I'm in development, I've
> been using dcc.rhyolite.com in anonymous mode. I hope that I'm not
> imposing too much of a load there.

The load is fine. The DCC client chitchat is about the same as a DNS
lookup. However, you'd be better served with your own DCC server
exchanging "floods" of checksums with other DCC servers. Besides
being more robust, faster, and using even less bandwidth, with your own
server you could look at your copy of the database of checksums with
dblist.

..................................................................

] From: "Brian J. Murrell"

] ...
] > I ran it against a folder
] > containing spam detected with spambouncer and other tools and
] > in some (admittedly) small trials it had about a 25% hit rate.
]
] That would seem about right right now.

Other people with access to the same checksums seem to have had
better luck. However, I think 25% is nothing to sneeze at.

] > Perhaps I'm using DCC incorrectly?
]
] Well, if you are getting some >1 counts then you are most likely using
] it correctly.

There are various possibilities:

  - bugs in the IMAP client code might be changing the messages so
    that their checksums don't match.

  - I'm still fighting hassles with quoted-printable and making
    dccproc get the same checksums as dccm. One often sees messages
    converted from quoted-printable and with CRLF converted to CR
    while the other doesn't.

  - as part of those hassles, I've changed the fuz1 checksum in
    version 1.0.28 to not ignore the last line. Until everyone starts
    using that code, the effectiveness of the fuz1 checksum will be
    reduced.

  - the spammers who like you differ from those who like DCC users

  - your name is early in the typical spammer's somewhat alphabetical
    lists

  - you are rejecting only on "many" instead of a threshold appropriate
    for the number of your local users. (Yes, that wouldn't apply to
    checksums with counts of 1.)

> ...

] Will you also support a mode of operation where the MTA has already
] "dcc"ed the message and put it's (DCC's) header in the message? i.e.
] simply parse the IMAP INBOX for messages with existing DCC headers
] with values of n>1 where n is some configurable values (rather than
] using dccproc on the messages)?

It makes sense to have more than one X-DCC header on a message, with
each header reflecting the counts seen by a different network of DCC
servers. For example, one network of DCC servers might count only mail
sent to secret spam traps and so not need much or any whitelisting,
while another might accept reports from anyone (e.g. bad guys unhappy
about CERT advisories) and so need a good whitelist.

If all expected X-DCC headers are for a single DCC server network, it's
probably best to ignore the existing header. You would not want to be
fooled by a spammer adding an X-DCC header. Asking the DCC servers
again costs little and can give higher checksum counts. The only real
problem with asking multiple times is that each query increases the
counts for a message (unless you use -Q).

] > Most of my results indicate
] > that DCC has never seen the message before (ie, I get counts of
] > 1 for all of the metrics).
] ] Critical mass is not there yet. Be patient. Spread the word. The ] more users DCC has the more effective it's going to be. Yes. Vernon Schryver vjs@rhyolite.com From bharat@fusionone.com Mon Sep 3 03:00:07 2001 Received: from f1exch.fusionone.com ([63.204.23.32]) by calcite.rhyolite.com (8.12.0.Gamma0/8.12.0.Gamma0) with ESMTP id f83906Kb028814 for env-from ; Mon, 3 Sep 2001 03:00:07 -0600 (MDT) Received: by f1exch.fusionone.com with Internet Mail Service (5.5.2653.19) id ; Mon, 3 Sep 2001 02:00:01 -0700 Message-ID: From: "Mediratta, Bharat" To: "'Brian J. Murrell'" , "'dcc@rhyolite.com'" Subject: RE: DCC -- how do I effectively use it? Date: Mon, 3 Sep 2001 01:59:52 -0700 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2653.19) Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C13456.CB4A4AC0" X-DCC-rhyolite-Metrics: calcite.rhyolite.com 101; IP=6 env_From=6 From=7 Subject=1 Message-ID=1 Received=1 Body=1 Fuz1=1 Sender: dcc-admin@rhyolite.com Errors-To: dcc-admin@rhyolite.com X-BeenThere: dcc@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Help: List-Post: List-Subscribe: , List-Id: Distributed Checksum Clearinghouse source List-Unsubscribe: , List-Archive: This message is in MIME format. Since your mail reader does not understand this format, some or all of this message may not be legible. ------_=_NextPart_001_01C13456.CB4A4AC0 Content-Type: text/plain; charset="iso-8859-1" > From: Brian J. Murrell [mailto:dcc-list@interlinx.bc.ca] > > On Sun, Sep 02, 2001 at 04:38:47PM -0700, Mediratta, Bharat wrote: > > > > Howdy. I'm working on a personal anti-spam project that I'd like > > to eventually distribute freely (probably under GPL). It is a > > thin IMAP client that can monitor a mailbox, detect any new > > spam and move it to a separate spam mailbox. > > Interesting. I know what you're thinking -- why would I implement this as an IMAP client instead of interposing DCC on the server? 
My problem is that the server is a corporate Exchange server that has very little in the way of spam blocking and is not under my control. My guess is that this is a pretty common situation in the corporate environment. [snip] > Critical mass is not there yet. Be patient. Spread the word. The > more users DCC has the more effective it's going to be. That makes sense. From a total newcomer's perspective I think that DCC is totally cool, but in order to get critical mass it really needs to be more accessible. The learning curve is steep. The documentation is copious and available, but you have to wade through a lot of it to set up a very simple client. A binary distribution with a configuration wizard would greatly ease the pain. I further think that we'd start to see some serious penetration if we made a plug-in for the common mail readers that people could just drop into their system. Is this where the MAPS DCC pilot is heading? -Bharat ------_=_NextPart_001_01C13456.CB4A4AC0 Content-Type: text/html; charset="iso-8859-1" RE: DCC -- how do I effectively use it?


------_=_NextPart_001_01C13456.CB4A4AC0-- From bharat@fusionone.com Mon Sep 3 03:15:13 2001 Received: from f1exch.fusionone.com ([63.204.23.32]) by calcite.rhyolite.com (8.12.0.Gamma0/8.12.0.Gamma0) with ESMTP id f839FCKb029158 for env-from ; Mon, 3 Sep 2001 03:15:12 -0600 (MDT) Received: by f1exch.fusionone.com with Internet Mail Service (5.5.2653.19) id ; Mon, 3 Sep 2001 02:15:12 -0700 Message-ID: From: "Mediratta, Bharat" To: dcc@rhyolite.com Subject: RE: DCC -- how do I effectively use it? Date: Mon, 3 Sep 2001 02:15:09 -0700 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2653.19) Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C13458.EE0F6FC0" X-DCC-rhyolite-Metrics: calcite.rhyolite.com 101; IP=7 env_From=7 From=10 Subject=4 Message-ID=1 Received=1 Body=1 Fuz1=1 Sender: dcc-admin@rhyolite.com Errors-To: dcc-admin@rhyolite.com X-BeenThere: dcc@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Help: List-Post: List-Subscribe: , List-Id: Distributed Checksum Clearinghouse source List-Unsubscribe: , List-Archive: This message is in MIME format. Since your mail reader does not understand this format, some or all of this message may not be legible. ------_=_NextPart_001_01C13458.EE0F6FC0 Content-Type: text/plain; charset="iso-8859-1" > From: Vernon Schryver [mailto:vjs@calcite.rhyolite.com] > > However, you'd be better served with your own DCC server exchanging > "floods" of checksums with other DCC server servers. Besides being > more robust, faster, and using even less bandwidth, with your > own server you could look at your copy of the database of checksums > with dblist. I'm definitely heading in that direction. In fact I'll contact you privately to get an id and password. > Other people with access to the same checksums have seem to have > had better luck. However, I think 25% is nothing to sneeze at. Absolutely. And it will only get better with time. 
I just wanted to make sure that I wasn't pointing at the wrong database. > - bugs in the IMAP client code might be changing the messages so > that their checksums don't match. Entirely possible. I'm using Net::IMAP on top of cclient-0106191041 on FreeBSD 4.3. My code assembles the message by combining the raw rfc822.header and rfc822.text values and passes it to dccproc. > - I'm still fighting hassles with quoted-printable and making > dccproc get the same checksums as dccm. One often sees messages > converted from convereted from quoted-printable and with CRLF > converted to CR while the other doesn't. If I can help track this down, let me know. > - as part of those hassles, I've changed the fuz1 checksum in > version 1.0.28 to not ignore the last line. Until everyone starts > using that code, the effectiveness of the fuz1 checksum > will be reduced. Where can I get 1.0.28? > - the spammers who like you differ from those who like DCC users > > - your name is early in the typical spammer's somewhat alphabetical > lists > > - you are rejecting only on "many" instead of a threshold approprate > for the number of your local users. (Yes, that wouldn't apply to > checksums with counts of 1.) Right now my simplistic algorithm says that it's maybe spam if any of Message-ID, Received, Body or Fuz1 are greater than 10. Definitely spam if it's greater than 50 (or "many"). But yeah, mostly the problem is that the messages haven't been seen before. > ] Will you also support a mode of operation where the MTA has already > ] "dcc"ed the message and put it's (DCC's) header in the > message? i.e. > ] simply parse the IMAP INBOX for messages with existing DCC headers > ] with values of n>1 where n is some configurable values (rather than > ] using dccproc on the messages)? I figure that if the MTA has dcc'd the message (or spambounced it or used some other spam detection code), the mail client/server can do filtering as appropriate. 
My script is purely to glue DCC together with a system that has no inherent spam detection. By the way, y'all rock. It's nice to work with professionals. -Bharat ------_=_NextPart_001_01C13458.EE0F6FC0 Content-Type: text/html; charset="iso-8859-1" RE: DCC -- how do I effectively use it?


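The "simplistic algorithm" described in the message above (maybe spam if any of Message-ID, Received, Body or Fuz1 is greater than 10, definitely spam above 50 or at "many") can be written out as a small Python sketch. It is only an illustration of that rule, not the actual script, which is not shown in this thread. The function name, the dictionary input and the returned labels are assumptions made for the example.

```python
MANY = "many"
# The four checksums named in the message above.
CHECKED = ("Message-ID", "Received", "Body", "Fuz1")

def classify(counts, maybe_over=10, definite_over=50):
    """Apply the two-threshold rule to a dict of checksum counts.

    counts maps checksum names to an integer or the string "many".
    Returns "spam", "maybe spam" or "not spam".
    """
    verdict = "not spam"
    for name in CHECKED:
        count = counts.get(name, 0)
        if count == MANY or count > definite_over:
            return "spam"          # one definite hit is enough
        if count > maybe_over:
            verdict = "maybe spam"  # keep looking for a definite hit
    return verdict

if __name__ == "__main__":
    print(classify({"Message-ID": 1, "Received": 1, "Body": 1, "Fuz1": 1}))
    print(classify({"Body": 12, "Fuz1": 3}))
    print(classify({"Body": 12, "Fuz1": "many"}))
```

As noted earlier in the thread, thresholds like 10 and 50 should be tuned to the number of local users, so they are left as parameters here.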
------_=_NextPart_001_01C13458.EE0F6FC0-- From vjs@calcite.rhyolite.com Mon Sep 3 08:17:35 2001 Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Gamma0/8.12.0.Gamma0) id f83EHZFT004325 for dcc@rhyolite.com env-from ; Mon, 3 Sep 2001 08:17:35 -0600 (MDT) Date: Mon, 3 Sep 2001 08:17:35 -0600 (MDT) From: Vernon Schryver Message-Id: <200109031417.f83EHZFT004325@calcite.rhyolite.com> To: dcc@rhyolite.com Subject: RE: DCC -- how do I effectively use it? references: Sender: dcc-admin@rhyolite.com Errors-To: dcc-admin@rhyolite.com X-BeenThere: dcc@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Help: List-Post: List-Subscribe: , List-Id: Distributed Checksum Clearinghouse source List-Unsubscribe: , List-Archive: > From: "Mediratta, Bharat" > ... A binary distribution > with a configuration wizard would greatly ease the pain. Binary for how many versions of how many platforms? In other words, if you need a binary for one of the more than half-dozen targets I currently build for, then I can help, but probably not otherwise. > I further think that we'd start to see some serious penetration if > we made a plug-in for the common mail readers that people could just > drop into their system. Is this where the MAPS DCC pilot is heading? The long term plan is to support more platforms. I'm not sure about "plug-ins", what with Microsoft's newfound opposition to Netscape style plug-ins and my ignorance of Exchange. > ... > ------_=_NextPart_001_01C13456.CB4A4AC0 > Content-Type: text/html; > charset="iso-8859-1" > > > > > > > RE: DCC -- how do I effectively use it? > > > >

> From: Brian J. Murrell [mailto:dcc-list@interlinx.bc.ca] > ... Some people think that sending both plaintext and HTML encoded versions of a single message is not good. Vernon Schryver vjs@rhyolite.com From vjs@calcite.rhyolite.com Mon Sep 3 12:10:38 2001 Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Gamma0/8.12.0.Gamma0) id f83IAcMV015437 for dcc@rhyolite.com env-from ; Mon, 3 Sep 2001 12:10:38 -0600 (MDT) Date: Mon, 3 Sep 2001 12:10:38 -0600 (MDT) From: Vernon Schryver Message-Id: <200109031810.f83IAcMV015437@calcite.rhyolite.com> To: dcc@rhyolite.com Subject: Re: DCC -- how do I effectively use it? references: <20010902231042.Q3385@pc.ilinx> <200109030431.f834VheS022611@calcite.rhyolite.com> Sender: dcc-admin@rhyolite.com Errors-To: dcc-admin@rhyolite.com X-BeenThere: dcc@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Help: List-Post: List-Subscribe: , List-Id: Distributed Checksum Clearinghouse source List-Unsubscribe: , List-Archive: There is yet another possible cause for checksum counts of 1. I'm running my database with the default expiration parameters including deleting reports of checksums that are more than a week old and have counts of less than "many." Thus, queries of messages older than a week are likely to be answered with "1" Vernon Schryver vjs@rhyolite.com From bharat@fusionone.com Mon Sep 3 13:58:51 2001 Received: from f1exch.fusionone.com ([63.204.23.32]) by calcite.rhyolite.com (8.12.0.Gamma0/8.12.0.Gamma0) with ESMTP id f83JwoKb017282 for env-from ; Mon, 3 Sep 2001 13:58:51 -0600 (MDT) Received: by f1exch.fusionone.com with Internet Mail Service (5.5.2653.19) id ; Mon, 3 Sep 2001 12:58:50 -0700 Message-ID: From: "Mediratta, Bharat" To: "'Vernon Schryver'" , dcc@rhyolite.com Subject: RE: DCC -- how do I effectively use it? 
Date: Mon, 3 Sep 2001 12:58:45 -0700 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2653.19) Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C134B2.D6D91AE0" X-DCC-rhyolite-Metrics: calcite.rhyolite.com 101; IP=8 env_From=8 From=12 Subject=7 Message-ID=1 Received=1 Body=1 Fuz1=1 Sender: dcc-admin@rhyolite.com Errors-To: dcc-admin@rhyolite.com X-BeenThere: dcc@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Help: List-Post: List-Subscribe: , List-Id: Distributed Checksum Clearinghouse source List-Unsubscribe: , List-Archive: This message is in MIME format. Since your mail reader does not understand this format, some or all of this message may not be legible. ------_=_NextPart_001_01C134B2.D6D91AE0 Content-Type: text/plain; charset="iso-8859-1" > > ... A binary distribution > > with a configuration wizard would greatly ease the pain. > > Binary for how many versions of how many platforms? > In other words, if you need a binary for one of more than half-dozen > targets I current build for, then I can help, but probably > not otherwise. What I'm envisioning is an installer for Win32 that puts the minimum set of Win32 binaries on your machine as well as a really simple plugin for Outlook that can be configured to run whenever new mail arrives and move spam into a specified folder. I suggest Outlook because it's popular and I use it. I've used a couple of plugins like this in the past. I don't know how difficult it is to write them, but I do know that you can write them in Visual Basic (so how hard can it be? <grin>) Here are some references: http://www.microsoft.com/exchange/techinfo/development/55/automating.asp http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnout2k/html/extendingol2k.asp I'm not saying that you should run out and write this, I'm just brainstorming ways to easily build the user base. > Some people think that sending both plaintext and HTML > encoded versions of a single message is not good. 
I apologize. Our corporate Exchange server has a bug that ignores my plain text preferences. I've sent them instructions on how to fix it, but they have to schedule downtime etc so it'll take a little while. :-( -Bharat ------_=_NextPart_001_01C134B2.D6D91AE0 Content-Type: text/html; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable RE: DCC -- how do I effectively use it?


------_=_NextPart_001_01C134B2.D6D91AE0-- From vjs@calcite.rhyolite.com Mon Sep 3 20:00:17 2001 Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Gamma0/8.12.0.Gamma0) id f8420HEj024217 for dcc env-from ; Mon, 3 Sep 2001 20:00:17 -0600 (MDT) Date: Mon, 3 Sep 2001 20:00:17 -0600 (MDT) From: Vernon Schryver Message-Id: <200109040200.f8420HEj024217@calcite.rhyolite.com> To: dcc@calcite.rhyolite.com Subject: 25% spam hit rates Sender: dcc-admin@rhyolite.com Errors-To: dcc-admin@rhyolite.com X-BeenThere: dcc@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Help: List-Post: List-Subscribe: , List-Id: Distributed Checksum Clearinghouse source List-Unsubscribe: , List-Archive: `cdcc "host dcc.rhyolite.com; stats"` produces: dcc.rhyolite.com calcite.rhyolite.com 192.188.61.3,6277 server-ID 101 19:53:06 version 1.0.28 DB locked tracing ANON CLNT 90109 hash entries 66617 used 3460080 DB bytes 1 ms delay 236 NOPs 45 ADMN 108 query 12 clients 1056 reports 0>10 0>100 0>1000 39 many answers 385>10 107>100 0>1000 102 many 34 whitelisted 0 bad IDs 11 passwds 0 error responses 30 retransmitted 0 answers rate-limited 0 anonymous 0 rejected reports flood on 9 streams 5 out active 5 in 8766 total flooded in 2337 accepted 23 stale 6406 dup 0 white 0 delete 0 bad id 100 mmap 146453 hashed 68138 records mapped 3368 added since Sep 02 09:28:57.381686 MDT That says that in the 10.5 hours starting from 9:30 am until about 8 pm, my server has received 1056 reports of the checksums of messages. Of those, 385 or 36% look like spam because they have addressee counts above 10. (10 is a reasonable threshold for my vanity domain, and for the other clients of my DCC server, all of which seem similar.) Assuming that a substantial fraction of the remaining 64% were not spam implies a much better than 25% effectiveness against the spam seen around here. 
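The percentages in the paragraph above can be checked with a short Python sketch. The two input numbers are the ones quoted from the stats output (1056 reports, 385 with counts above 10). The function name is invented for the example, and the sketch simply repeats the arithmetic in the message without interpreting the stats fields any further.

```python
def bulk_fraction(over_threshold, total_reports):
    """Fraction of reports whose counts exceeded the threshold."""
    if total_reports <= 0:
        raise ValueError("total_reports must be positive")
    return over_threshold / total_reports

if __name__ == "__main__":
    reports = 1056   # "1056 reports" in the stats output above
    over_10 = 385    # "385>10" in the stats output above
    spam_like = bulk_fraction(over_10, reports)
    print(f"look like spam: {spam_like:.0%}")      # rounds to 36%
    print(f"remainder:      {1 - spam_like:.0%}")  # rounds to 64%
```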
Vernon Schryver vjs@rhyolite.com From bharat@fusionone.com Tue Sep 4 10:52:38 2001 Received: from f1exch.fusionone.com ([63.204.23.32]) by calcite.rhyolite.com (8.12.0.Gamma0/8.12.0.Gamma0) with ESMTP id f84GqbKb017130 for env-from ; Tue, 4 Sep 2001 10:52:37 -0600 (MDT) Received: by f1exch.fusionone.com with Internet Mail Service (5.5.2653.19) id ; Tue, 4 Sep 2001 09:52:36 -0700 Message-ID: From: "Mediratta, Bharat" To: dcc@rhyolite.com Subject: RE: Exchanging dccd ids & passwords Date: Tue, 4 Sep 2001 09:52:30 -0700 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2653.19) Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C13561.FC370A90" X-DCC-rhyolite-Metrics: calcite.rhyolite.com 101; IP=12 env_From=12 From=17 Subject=2 Message-ID=1 Received=1 Body=1 Fuz1=1 Sender: dcc-admin@rhyolite.com Errors-To: dcc-admin@rhyolite.com X-BeenThere: dcc@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Help: List-Post: List-Subscribe: , List-Id: Distributed Checksum Clearinghouse List-Unsubscribe: , List-Archive: This message is in MIME format. Since your mail reader does not understand this format, some or all of this message may not be legible. ------_=_NextPart_001_01C13561.FC370A90 Content-Type: text/plain; charset="iso-8859-1" Thanks to some help from Vernon, I've got my dccd up and running. I couldn't figure out why it wasn't flooding until I checked my security log and realized that I hadn't made a hole in my firewall (oops). 
So now, here are the results from a "stats all":

dcc.rhyolite.com calcite.rhyolite.com 192.188.61.3,6277
        server-ID 101  /usr/local/dcc/map  09:47:06
    version 1.0.28  DB locked  tracing ANON CLNT
  90109 hash entries  64187 used   3307512 DB bytes
    3 ms delay 343 NOPs     89 ADMN   111 query   11 clients
 1420 reports    0>10        0>100      0>1000    50 many
     answers   518>10      183>100      0>1000   115 many   59 whitelisted
    0 bad IDs   11 passwds   0 error responses    35 retransmitted
    0 answers rate-limited   0 anonymous           0 rejected reports
    flood on   9 streams   6 out active 6 in     21302 total flooded in
 3281 accepted  48 stale 17973 dup      0 white    0 delete  0 bad id
  149 mmap   255230 hashed 181454 records mapped   4672 added
  since Sep 02 09:28:57.381686 MDT

www.menalto.com gw-menalto 198.144.206.35,6277
        server-ID 1003  /usr/local/dcc/map  09:47:06
    version 1.0.27  DB locked  tracing ANON CLNT
  40957 hash entries  29815 used   1935960 DB bytes
    0 ms delay  18 NOPs     22 ADMN     0 query    7 clients
   17 reports    0>10        0>100      0>1000     0 many
     answers     2>10        0>100      0>1000     2 many    0 whitelisted
    0 bad IDs    3 passwds   0 error responses     0 retransmitted
    0 answers rate-limited   0 anonymous           0 rejected reports
    flood on   1 streams   1 out active 1 in       477 total flooded in
  450 accepted   0 stale    27 dup      0 white    0 delete  0 bad id
   10 mmap   16886 hashed 6955 records mapped    468 added
  since Sep 04 01:05:10.618836 PDT

From the "hashed entries" lines it seems like I have a lot less data in my map than is on calcite. Why is that? Will my server be less effective than calcite because of it? Thanks.

-Bharat

------_=_NextPart_001_01C13561.FC370A90 Content-Type: text/html; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable RE: Exchanging dccd ids & passwords


------_=_NextPart_001_01C13561.FC370A90-- From vjs@calcite.rhyolite.com Tue Sep 4 11:18:35 2001 Received: (from vjs@localhost) by calcite.rhyolite.com (8.12.0.Gamma0/8.12.0.Gamma0) id f84HIZHS017759 for dcc@rhyolite.com env-from ; Tue, 4 Sep 2001 11:18:35 -0600 (MDT) Date: Tue, 4 Sep 2001 11:18:35 -0600 (MDT) From: Vernon Schryver Message-Id: <200109041718.f84HIZHS017759@calcite.rhyolite.com> To: dcc@rhyolite.com Subject: RE: Exchanging dccd ids & passwords references: Sender: dcc-admin@rhyolite.com Errors-To: dcc-admin@rhyolite.com X-BeenThere: dcc@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Help: List-Post: List-Subscribe: , List-Id: Distributed Checksum Clearinghouse List-Unsubscribe: , List-Archive: > From: "Mediratta, Bharat" > ... > 90109 hash entries 64187 used 3307512 DB bytes > www.menalto.com gw-menalto 198.144.206.35,6277 > server-ID 1003 /usr/local/dcc/map 09:47:06 > version 1.0.27 DB locked tracing ANON CLNT > 40957 hash entries 29815 used 1935960 DB bytes > >From the "hashed entries" lines it seems like I have a lot > less data in my map than i