From vjs@calcite.rhyolite.com Fri Jul 15 10:19:31 2005 Received: from calcite.rhyolite.com (localhost [127.0.0.1]) by calcite.rhyolite.com (8.13.4/8.13.4) with ESMTP id j6FGJVNB004826 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for env-from ; Fri, 15 Jul 2005 10:19:31 -0600 (MDT) Received: (from vjs@localhost) by calcite.rhyolite.com (8.13.4/8.13.4/Submit) id j6FGJVct004825 for dcc-reputations; Fri, 15 Jul 2005 10:19:31 -0600 (MDT) Date: Fri, 15 Jul 2005 10:19:31 -0600 (MDT) From: Vernon Schryver Message-Id: <200507151619.j6FGJVct004825@calcite.rhyolite.com> To: dcc-reputations@calcite.rhyolite.com Subject: test Sender: dcc-reputations-admin@rhyolite.com Errors-To: dcc-reputations-admin@rhyolite.com X-BeenThere: dcc-reputations@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Unsubscribe: , List-Id: Commercial Distributed Checksum Clearinghouse List-Post: List-Help: List-Subscribe: , List-Archive: initial test message From vjs@calcite.rhyolite.com Wed Aug 10 21:49:24 2005 Received: from calcite.rhyolite.com (localhost [127.0.0.1]) by calcite.rhyolite.com (8.13.4/8.13.4) with ESMTP id j7B3nNbm062549 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for env-from ; Wed, 10 Aug 2005 21:49:23 -0600 (MDT) Received: (from vjs@localhost) by calcite.rhyolite.com (8.13.4/8.13.4/Submit) id j7B3nN0p062548 for DCC-reputations; Wed, 10 Aug 2005 21:49:23 -0600 (MDT) Date: Wed, 10 Aug 2005 21:49:23 -0600 (MDT) From: Vernon Schryver Message-Id: <200508110349.j7B3nN0p062548@calcite.rhyolite.com> To: DCC-reputations@calcite.rhyolite.com Subject: DCC version 2.3.15 released Sender: dcc-reputations-admin@rhyolite.com Errors-To: dcc-reputations-admin@rhyolite.com X-BeenThere: dcc-reputations@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Unsubscribe: , List-Id: Commercial Distributed Checksum Clearinghouse List-Post: List-Help: List-Subscribe: , List-Archive: Version 1.3.15 of the DCC source is in 
http://www.dcc-servers.net/dcc/source/dcc.tar.Z and http://www.rhyolite.com/anti-spam/dcc/source/dcc.tar.Z

http://www.dcc-servers.net/dcc/CHANGES starts with:

When "option MTA-first" is in a dccm or dccifd whiteclnt file, determinations of (not) spam by the MTA are consulted first and so can be overridden by the whiteclnt files. This allows individual users to override a sendmail access.db file.

Correct the SMTP rejection message in per-user log files for dccm and dccifd, especially when dccifd is acting as a proxy.

Fix bug reported by James Carlson that kept ./configure from turning on SOCKS.

Make "option dcc-reps-on/off" and "option dcc-on/off" independent. A mailbox can now use neither, either, or both of DCC Reputations and standard DCC filtering.

Use distinct timestamps on reputation reports so they won't be seen as duplicates by flooding peers.

/var/dcc/libexec/updatedcc should automagically fetch, build, and install this version.

Vernon Schryver vjs@rhyolite.com

From vjs@calcite.rhyolite.com Thu Sep 8 20:03:53 2005 Received: from calcite.rhyolite.com (localhost [127.0.0.1]) by calcite.rhyolite.com (8.13.4/8.13.4) with ESMTP id j8923rY6066267 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for env-from ; Thu, 8 Sep 2005 20:03:53 -0600 (MDT) Received: (from vjs@localhost) by calcite.rhyolite.com (8.13.4/8.13.4/Submit) id j8923rXH066266 for dcc-reputations; Thu, 8 Sep 2005 20:03:53 -0600 (MDT) Date: Thu, 8 Sep 2005 20:03:53 -0600 (MDT) From: Vernon Schryver Message-Id: <200509090203.j8923rXH066266@calcite.rhyolite.com> To: dcc-reputations@calcite.rhyolite.com Subject: -tREP,20 Sender: dcc-reputations-admin@rhyolite.com Errors-To: dcc-reputations-admin@rhyolite.com X-BeenThere: dcc-reputations@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Unsubscribe: , List-Id: Commercial Distributed Checksum Clearinghouse List-Post: List-Help: List-Subscribe: , List-Archive:

Since some of Earthlink's mail systems have DCC reputations between 5% and
10% and in one case 10%, it might be a good idea to change REP_ARGS=-tREP,20 from -tREP,10 in /var/dcc/dcc_conf and restart dccifd or dccm. 10% bulk mail seems like a lot to me. I don't know if that is really what Earthlink spews or if it is a measurement artifact. I've been watching a few other ISP's reputations. Those that I don't know to have spam problems have 0 or tiny DCC reputations. Those that are otherwise have big reputations. I don't know what's going on with Earthlink. Vernon Schryver vjs@rhyolite.com From georg.graf@wu-wien.ac.at Mon Sep 19 02:23:51 2005 Received: from schurli.wu-wien.ac.at (schurli.wu-wien.ac.at [137.208.16.32]) by calcite.rhyolite.com (8.13.4/8.13.4) with ESMTP id j8J8Nm57094315 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for env-from ; Mon, 19 Sep 2005 02:23:50 -0600 (MDT) Received: from schurli.wu-wien.ac.at (localhost [127.0.0.1]) by schurli.wu-wien.ac.at (8.13.3/8.13.3) with ESMTP id j8J8NiZ0096895; Mon, 19 Sep 2005 10:23:45 +0200 (CEST) (envelope-from georg.graf@wu-wien.ac.at) Received: (from graf@localhost) by schurli.wu-wien.ac.at (8.13.3/8.13.3/Submit) id j8J8Nina096894; Mon, 19 Sep 2005 10:23:44 +0200 (CEST) (envelope-from georg.graf@wu-wien.ac.at) Date: Mon, 19 Sep 2005 10:23:44 +0200 From: Georg Graf To: dcc-reputations@rhyolite.com Cc: oskar.schoepf@wu-wien.ac.at Subject: Tweaking Reputation Parameters Message-ID: <20050919082344.GF82111@wu-wien.ac.at> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.1i X-WU-wumi-status: clean v4.4.00/v4583 schurli wu 0ccc2c34e885f92eeee649080aba2dc2 X-DCC-Rhyolite-Metrics: calcite.rhyolite.com 101; Body=1 Fuz1=1 Fuz2=1 Sender: dcc-reputations-admin@rhyolite.com Errors-To: dcc-reputations-admin@rhyolite.com X-BeenThere: dcc-reputations@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Unsubscribe: , List-Id: Commercial Distributed Checksum Clearinghouse Mailing List List-Post: List-Help: 
List-Subscribe: , List-Archive:

Hi there,

A Report of my reputation experiences:

At first I ran dcc with -t rep,10. This yielded lots of false reputation positives. Quite a few of them had reputations of 50% and above.

Then I switched to -t rep,80. This went quite well for 2 weeks or so. Today I again got a false positive. One of them that hurt ;(

Here we go:

|| ### end of message body ########################
|| DCC Reputation-->spam
||
|| X-DCC-wuwien-Metrics: samantha.wu-wien.ac.at 1290; bulk rep Body=many
|| Fuz1=many Fuz2=many rep=84%
|| reported: 11 checksum server
|| IP: 9f5d8e4e ff6f2dd0 0340200b 4112e97b
|| env_From: b6bc4569 7f6ddb23 b4a47d04 fac3ca9d
|| From: 2922251f 95a0d50f 9c6db2b9 d578cf63
|| Message-ID: eecdbde7 95ad4423 7ea086f8 243e776d
|| Received: 76286203 9d782434 a0e824ef 57a454b0
|| Body: 501894b5 cd8a5f8e 7306f4f9 c63e3693 0
|| Fuz1: fc304a1c d50b8239 00162693 c59e687c 0
|| Fuz2: 24ad7f8e e1afd694 62a7062d af7821cc 0
|| rep-total: 9f5d8e4e ff6f2dd0 0340200b 4112e97b 2
|| rep: 9f5d8e4e ff6f2dd0 0340200b 4112e97b 0
||
|| result: accept

Ok. Since rep-total defaults to the reject_at value when it is not set manually, I think I'll set it to a higher value. But this is not very logical, because when I raise the rep-total value, I can be even more sure about the correctness of the reputation value. Hmm. I'm kind of clueless.
I'll give this a try:

|| $ egrep -i '^(dccm|rep)' dcc_conf
|| REP_ARGS="-t rep,90 -t rep-total,1000"
|| DCCM_ENABLE=on
|| DCCM_ARGS="-p inet:25524@samantha -A -W -a REJECT -S Content-class -j 1500"
|| DCCM_LOGDIR=log
|| DCCM_WHITECLNT=whiteclnt
|| DCCM_USERDIRS=userdirs
|| DCCM_LOG_AT=25
|| DCCM_REJECT_AT=50
|| DCCM_CKSUMS=
|| DCCM_XTRA_CKSUMS=

regards, George
--
Vienna University of Economics and Business Administration Central and Internet Services Section Center for Computer Services UNIX Server Administration PGP/GPG Key ID: 0xa5232ad5

From vjs@calcite.rhyolite.com Mon Sep 19 08:32:17 2005 Received: from calcite.rhyolite.com (localhost [127.0.0.1]) by calcite.rhyolite.com (8.13.4/8.13.4) with ESMTP id j8JEWHjL072657 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) env-from ; Mon, 19 Sep 2005 08:32:17 -0600 (MDT) Received: (from vjs@localhost) by calcite.rhyolite.com (8.13.4/8.13.4/Submit) id j8JEWHBQ072656; Mon, 19 Sep 2005 08:32:17 -0600 (MDT) Date: Mon, 19 Sep 2005 08:32:17 -0600 (MDT) From: Vernon Schryver Message-Id: <200509191432.j8JEWHBQ072656@calcite.rhyolite.com> To: dcc-reputations@rhyolite.com, georg.graf@wu-wien.ac.at Subject: Re: Tweaking Reputation Parameters Cc: oskar.schoepf@wu-wien.ac.at In-Reply-To: <20050919082344.GF82111@wu-wien.ac.at> Sender: dcc-reputations-admin@rhyolite.com Errors-To: dcc-reputations-admin@rhyolite.com X-BeenThere: dcc-reputations@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Unsubscribe: , List-Id: Commercial Distributed Checksum Clearinghouse Mailing List List-Post: List-Help: List-Subscribe: , List-Archive:

> From: Georg Graf > To: dcc-reputations@rhyolite.com > Cc: oskar.schoepf@wu-wien.ac.at > A Report of my reputation experiences: > > At first I ran dcc with -t rep,10. This yielded lots of false > reputation positives. Quite a few of them had reputations of 50% and > above. > > Then I switched to -t rep,80. This went quite well for 2 weeks or so. > Today I again got a false positive.
One of them that hurt ;( Were the false positives bulk mail? If so, the sender or the messages should be whitelisted or those messages will be detected as bulk and rejected by the classic DCC mechanism. > || X-DCC-wuwien-Metrics: samantha.wu-wien.ac.at 1290; bulk rep Body=many > || Fuz1=many Fuz2=many rep=84% > || reported: 11 checksum server > || rep-total: 9f5d8e4e ff6f2dd0 0340200b 4112e97b 2 > || rep: 9f5d8e4e ff6f2dd0 0340200b 4112e97b 0 That message must have been sent to at least 11 mailboxes and so was somewhat bulk. > Ok. Since without setting rep-total manually, it takes as default > the reject_at value, I think I'll set it to a higher value. But > this is not very logical. Because when I raise the rep-total > value, then I can be even more sure about the correctness of the > reputation value. Hmm. I'm kind of clueless. I'll give this a > try: > || REP_ARGS="-t rep,90 -t rep-total,1000" 90% and 1000 seem rather high. There is another parameter that is hard-coded inside dccd. That is the number of substantially identical copies of a message that must be seen to make it "bulk" and so increase the "rep" count for an IP address. It is currently 10. Would your false positives have happened if it were 20? What threshold do you use for bulk mail? 
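The interaction of the two thresholds discussed in this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not dccm or dccd code: it assumes the reputation percentage is the "rep" count divided by the "rep-total" count for an IP address, that "-t rep-total,N" acts as a minimum number of messages seen before the percentage is trusted, and that both comparisons are inclusive. The function name `reputation_verdict` and its parameters are hypothetical.

```python
def reputation_verdict(rep_count, rep_total, rep_pct_threshold, rep_total_threshold):
    """Return True when reputation alone would flag mail from this IP address.

    Assumptions (not taken from the DCC source):
      * reputation % = 100 * rep_count / rep_total
      * rep_total_threshold is the minimum number of messages that must
        have been seen before the percentage is trusted at all
      * both comparisons are inclusive (>=)
    """
    if rep_total <= 0 or rep_total < rep_total_threshold:
        # Too little is known about this sender to trust its percentage.
        return False
    rep_pct = 100.0 * rep_count / rep_total
    return rep_pct >= rep_pct_threshold


# Hypothetical sender: 84 bulk messages out of 100 seen.
print(reputation_verdict(84, 100, rep_pct_threshold=80, rep_total_threshold=50))
print(reputation_verdict(84, 100, rep_pct_threshold=90, rep_total_threshold=50))
# A sender with only 2 messages seen is not judged under this reading.
print(reputation_verdict(2, 2, rep_pct_threshold=80, rep_total_threshold=50))
```

Under this reading, raising rep-total does not make the filter stricter; it only delays judgment until more mail from the address has been counted.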
Vernon Schryver vjs@rhyolite.com From georg.graf@wu-wien.ac.at Tue Sep 20 01:57:55 2005 Received: from schurli.wu-wien.ac.at (schurli.wu-wien.ac.at [137.208.16.32]) by calcite.rhyolite.com (8.13.4/8.13.4) with ESMTP id j8K7vqBW082664 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for env-from ; Tue, 20 Sep 2005 01:57:54 -0600 (MDT) Received: from schurli.wu-wien.ac.at (localhost [127.0.0.1]) by schurli.wu-wien.ac.at (8.13.3/8.13.3) with ESMTP id j8K7vmnM017571 for ; Tue, 20 Sep 2005 09:57:49 +0200 (CEST) (envelope-from georg.graf@wu-wien.ac.at) Received: (from graf@localhost) by schurli.wu-wien.ac.at (8.13.3/8.13.3/Submit) id j8K7vmQd017570 for dcc-reputations@rhyolite.com; Tue, 20 Sep 2005 09:57:48 +0200 (CEST) (envelope-from georg.graf@wu-wien.ac.at) Date: Tue, 20 Sep 2005 09:57:48 +0200 From: Georg Graf To: dcc-reputations@rhyolite.com Subject: Re: Tweaking Reputation Parameters Message-ID: <20050920075748.GA17310@wu-wien.ac.at> Mail-Followup-To: Georg Graf , dcc-reputations@rhyolite.com References: <20050919082344.GF82111@wu-wien.ac.at> <200509191432.j8JEWHBQ072656@calcite.rhyolite.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200509191432.j8JEWHBQ072656@calcite.rhyolite.com> User-Agent: Mutt/1.4.2.1i X-WU-wumi-status: clean v4.4.00/v4585 schurli wu abce9f514915d12485254979fa7814ca X-DCC-Rhyolite-Metrics: calcite.rhyolite.com 101; Body=1 Fuz1=1 Fuz2=1 Sender: dcc-reputations-admin@rhyolite.com Errors-To: dcc-reputations-admin@rhyolite.com X-BeenThere: dcc-reputations@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Unsubscribe: , List-Id: Commercial Distributed Checksum Clearinghouse Mailing List List-Post: List-Help: List-Subscribe: , List-Archive: On Mon, Sep 19, 2005 at 08:32:17AM -0600, Vernon Schryver wrote: > > From: Georg Graf [...] > Were the false positives bulk mail? 
If so, the sender or the messages > should be whitelisted or those messages will be detected as bulk and > rejected by the classic DCC mechanism.

No, not a bulk mail. Just a mail that was sent from outside to 11 people in our domain. I have set the rej-thold to 50. I consider it impossible to whitelist something like that.

[...] > That message must have been sent to at least 11 mailboxes and so was > somewhat bulk.

well, yes.

[...] > > || REP_ARGS="-t rep,90 -t rep-total,1000" > > 90% and 1000 seem rather high.

You saw that in this case (only "-t rep,80") it did not work for me. What would you suggest next? My idea was

Hmm. This comes from my effort to set the reputation parameters in a way that they do not yield "false positives" where "false positives" means mails that people want to get and that are not commercial. I am aware there is no way for the DCC to know that ;)

I think I have a fundamental problem with reputations. The higher I set the rep-total value, the more I can be sure that (100-rep)% of mail from a host are not bulk messages. If I lower the rep-total value, then I trust the reputation values even if I don't know much about a host.

What do you think about these arguments?

> There is another parameter that is hard-coded inside dccd. That > is the number of substantially identical copies of a message that > must be seen to make it "bulk" and so increase the "rep" count for > an IP address. It is currently 10. Would your false positives have > happened if it were 20? What threshold do you use for bulk mail?

I use the "common choice": "-t CMN,25,50". Since the mail really had only 11 recipients, this would have done the job, I think.
thank you, george
--
Vienna University of Economics and Business Administration Central and Internet Services Section Center for Computer Services UNIX Server Administration PGP/GPG Key ID: 0xa5232ad5

From vjs@calcite.rhyolite.com Tue Sep 20 07:17:52 2005 Received: from calcite.rhyolite.com (localhost [127.0.0.1]) by calcite.rhyolite.com (8.13.4/8.13.4) with ESMTP id j8KDHpiT007430 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for env-from ; Tue, 20 Sep 2005 07:17:51 -0600 (MDT) Received: (from vjs@localhost) by calcite.rhyolite.com (8.13.4/8.13.4/Submit) id j8KDHp0F007429 for dcc-reputations@rhyolite.com; Tue, 20 Sep 2005 07:17:51 -0600 (MDT) Date: Tue, 20 Sep 2005 07:17:51 -0600 (MDT) From: Vernon Schryver Message-Id: <200509201317.j8KDHp0F007429@calcite.rhyolite.com> To: dcc-reputations@rhyolite.com Subject: Re: Tweaking Reputation Parameters In-Reply-To: <20050920075748.GA17310@wu-wien.ac.at> Sender: dcc-reputations-admin@rhyolite.com Errors-To: dcc-reputations-admin@rhyolite.com X-BeenThere: dcc-reputations@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Unsubscribe: , List-Id: Commercial Distributed Checksum Clearinghouse Mailing List List-Post: List-Help: List-Subscribe: , List-Archive:

> From: Georg Graf > > > || REP_ARGS="-t rep,90 -t rep-total,1000" > > > > 90% and 1000 seem rather high. > > You saw that in this case (only "-t rep,80") it did not work for > me. What would you suggest next? My idea was

Based on your choice of "-t CMN,25,50" I would try "-t rep,30 -t rep-total,50"

I would also turn off DCC Reputations for the mailboxes that cannot tolerate false positives.

> Hmm. This comes from my effort to set the reputation parameters > in a way that they do not yield "false positives" where "false > positives" means mails that people want to get and that are not > commercial. I am aware there is no way for the DCC to know that > ;)

I define spam as "unsolicited bulk mail," not "unsolicited commercial mail."
When used with per-user whitelists, the DCC can detect spam using that definition. The target counts detect "bulk" and each user's individual whiteclnt file defines "(un)solicited."

> I think I have a fundamental problem with reputations. The higher > I set the rep-total value, the more I can be sure that (100-rep)% > of mail from a host are not bulk messages. If I lower the > rep-total value, then I trust the reputation values even if I > don't know much about a host. > > What do you think about these arguments?

First, the (100-rep)% from an IP address are only not detected as bulk by the DCC when delivered. They might have been detected as bulk if they had been delivered later. Or they might have had better "hash busting."

Second, that argument misses the idea of reputations. If you refuse to believe someone who tells lies 90% of the time, you know you will not be crediting the liar's 10% true statements. If you do not hire convicted embezzlers as accountants, it is not because you think that all embezzlers always steal all of the time, but that you think the chances of a new crime are high. An X% DCC Reputation does not mean "the next message from 10.2.3.4 is spam" but (I hope) "the next message from 10.2.3.4 is spam with probability at least X%".

A mailbox that cannot tolerate false positives should not use reputations. It should also not use SpamAssassin or Bayesian filters because those also are merely probabilistic detectors of spam. It should also not use the DCC without a real per-mailbox whitelist, because what is legitimate, solicited bulk mail for one mailbox is spam for another.

> I use the "common choice": "-t CMN,25,50". Since the mail really > had only 11 recipients, this would have done the job, I think.
For that particular message, yes, but so would "-t rep-total,50" Vernon Schryver vjs@rhyolite.com From vjs@calcite.rhyolite.com Mon Oct 3 14:59:06 2005 Received: from calcite.rhyolite.com (localhost [127.0.0.1]) by calcite.rhyolite.com (8.13.4/8.13.4) with ESMTP id j93Kx5p5002952 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for env-from ; Mon, 3 Oct 2005 14:59:06 -0600 (MDT) Received: (from vjs@localhost) by calcite.rhyolite.com (8.13.4/8.13.4/Submit) id j93Kx5jF002951 for DCC-reputations; Mon, 3 Oct 2005 14:59:05 -0600 (MDT) Date: Mon, 3 Oct 2005 14:59:05 -0600 (MDT) From: Vernon Schryver Message-Id: <200510032059.j93Kx5jF002951@calcite.rhyolite.com> To: DCC-reputations@calcite.rhyolite.com Subject: DCC Reputations 2.3.20 released Sender: dcc-reputations-admin@rhyolite.com Errors-To: dcc-reputations-admin@rhyolite.com X-BeenThere: dcc-reputations@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Unsubscribe: , List-Id: Commercial Distributed Checksum Clearinghouse Mailing List List-Post: List-Help: List-Subscribe: , List-Archive: Version 2.3.20 of the DCC source is available via FTP, generally with /var/dcc/libexec/updatedcc I just discovered that versions 1/2.3.17, 18, and 19 had a serious bug that turned off expiration of reputation checksums. 
Vernon Schryver vjs@rhyolite.com From sven@dmv.com Tue Nov 15 08:52:02 2005 Received: from smtp-gw-cl-d.dmv.com (smtp-gw-cl-d.dmv.com [216.240.97.42]) by calcite.rhyolite.com (8.13.4/8.13.4) with ESMTP id jAFFq0Lj080145 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for env-from ; Tue, 15 Nov 2005 08:52:01 -0700 (MST) Received: from mail-gw-cl-a.dmv.com (mail-gw-cl-a.dmv.com [216.240.97.38]) by smtp-gw-cl-d.dmv.com (8.12.10/8.12.10) with ESMTP id jAFFpxft058763 for ; Tue, 15 Nov 2005 10:51:59 -0500 (EST) (envelope-from sven@dmv.com) Received: from lanshark.dmv.com (lanshark.dmv.com [216.240.97.46]) by mail-gw-cl-a.dmv.com (8.12.9/8.12.9) with ESMTP id jAFFpwQU061992 for ; Tue, 15 Nov 2005 10:51:59 -0500 (EST) (envelope-from sven@dmv.com) Subject: Adding IP to Metrics Header From: Sven Willenberger To: DCC-reputations@calcite.rhyolite.com Content-Type: text/plain Date: Tue, 15 Nov 2005 10:53:21 -0500 Message-Id: <1132070001.10715.9.camel@lanshark.dmv.com> Mime-Version: 1.0 X-Mailer: Evolution 2.4.1 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.48 on 216.240.97.42 X-Scanned-By: MIMEDefang 2.48 on 216.240.97.38 X-DCC-Rhyolite-Metrics: calcite.rhyolite.com 101; Body=1 Fuz1=1 Fuz2=1 rep=3% Sender: dcc-reputations-admin@rhyolite.com Errors-To: dcc-reputations-admin@rhyolite.com X-BeenThere: dcc-reputations@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Unsubscribe: , List-Id: Commercial Distributed Checksum Clearinghouse Mailing List List-Post: List-Help: List-Subscribe: , List-Archive: In searching the logfiles for those messages that have hits on rep, I would like to be able to get an idea of what servers (IPs) are sending these. Since the dcc checking mailservers are internal (i.e. MX IP entries in whiteclnt) I cannot use the relay information in the maillog files. 
As such, what would be involved with [optionally] adding IP to the Metrics header that is added so that it would resemble X-DCC-brand-Metrics: chost server-ID; bulk chknm1=count ... IP=[relay IP that is checked by rep] Just trying to get some useful statistics gathering in one grep/awk pass of the maillog :-) Sven From vjs@calcite.rhyolite.com Tue Nov 15 09:05:25 2005 Received: from calcite.rhyolite.com (localhost [127.0.0.1]) by calcite.rhyolite.com (8.13.4/8.13.4) with ESMTP id jAFG5P9s098640 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for env-from ; Tue, 15 Nov 2005 09:05:25 -0700 (MST) Received: (from vjs@localhost) by calcite.rhyolite.com (8.13.4/8.13.4/Submit) id jAFG5PI1098639 for DCC-reputations@calcite.rhyolite.com; Tue, 15 Nov 2005 09:05:25 -0700 (MST) Date: Tue, 15 Nov 2005 09:05:25 -0700 (MST) From: Vernon Schryver Message-Id: <200511151605.jAFG5PI1098639@calcite.rhyolite.com> To: DCC-reputations@calcite.rhyolite.com Subject: Re: Adding IP to Metrics Header In-Reply-To: <1132070001.10715.9.camel@lanshark.dmv.com> Sender: dcc-reputations-admin@rhyolite.com Errors-To: dcc-reputations-admin@rhyolite.com X-BeenThere: dcc-reputations@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Unsubscribe: , List-Id: Commercial Distributed Checksum Clearinghouse Mailing List List-Post: List-Help: List-Subscribe: , List-Archive: > From: Sven Willenberger > In searching the logfiles for those messages that have hits on rep, I > would like to be able to get an idea of what servers (IPs) are sending > these. Since the dcc checking mailservers are internal (i.e. MX IP > entries in whiteclnt) I cannot use the relay information in the maillog > files. > > As such, what would be involved with [optionally] adding IP to the > Metrics header that is added so that it would resemble > > X-DCC-brand-Metrics: chost server-ID; bulk chknm1=count ... 
IP=[relay IP > that is checked by rep] > > Just trying to get some useful statistics gathering in one grep/awk pass > of the maillog :-)

I'm afraid to change the X-DCC header lest I break filters that depend on it to detect bulk mail. Adding "bulk rep" may have been too much.

I suppose another X- header could be added.

What about looking in the DCC log files? The third line contains the IP address that gets the blame. I tend to use
/var/dcc/libexec/dblist -C 'rep 12345678 12345678 12345678 12345678'
or
/var/dcc/libexec/dblist -C 'rep-total 12345678 12345678 12345678 12345678'
to see whether a given IP address has a reputation. If the reports containing the reputation checksums don't include the body checksums, I use `dblist -T` with timestamps in the same second to look for the reports of spam sent by the IP address. (The next version of dblist lets the microseconds be omitted. The current version takes -1 to mean 'ignore microseconds'.) (Because of various constraints on flooding and database compression, the body checksums are sometimes put into reports separate from the reputation checksums in the database.)
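The one-pass tally of reputation hits asked about in this thread can be approximated from the Metrics header as it stands today. The sketch below is only an illustration: the header layout is inferred from the X-DCC-*-Metrics examples quoted on this list, the names `parse_metrics` and `count_rep_hits` are hypothetical, and it cannot recover the relay IP address, because that value is not in the header.

```python
import re
from collections import Counter

# Layout inferred from headers quoted on this list, for example:
#   X-DCC-wuwien-Metrics: samantha.wu-wien.ac.at 1290; bulk rep Body=many Fuz1=many Fuz2=many rep=84%
METRICS_RE = re.compile(
    r"X-DCC-(?P<brand>[^:\s]+)-Metrics:\s*(?P<chost>\S+)\s+(?P<server_id>\d+);\s*(?P<rest>.*)"
)


def parse_metrics(line):
    """Split one Metrics header into its parts, or return None if absent."""
    match = METRICS_RE.search(line)
    if match is None:
        return None
    flags = []   # bare words such as "bulk" and "rep"
    counts = {}  # name=value pairs such as Body=many or rep=84%
    for token in match.group("rest").split():
        if "=" in token:
            name, value = token.split("=", 1)
            counts[name] = value
        else:
            flags.append(token)
    rep_text = counts.get("rep", "").rstrip("%")
    return {
        "brand": match.group("brand"),
        "chost": match.group("chost"),
        "server_id": int(match.group("server_id")),
        "flags": flags,
        "counts": counts,
        "rep_pct": int(rep_text) if rep_text.isdigit() else None,
    }


def count_rep_hits(lines, min_pct):
    """Count, per checking host, headers whose rep value is at least min_pct."""
    hits = Counter()
    for line in lines:
        parsed = parse_metrics(line)
        if parsed and parsed["rep_pct"] is not None and parsed["rep_pct"] >= min_pct:
            hits[parsed["chost"]] += 1
    return hits
```

Feeding `count_rep_hits` the lines of a log with a chosen `min_pct` gives a per-host count of messages at or above that reputation. Tying those counts to sending IP addresses would still need the DCC log files or dblist, as described above.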
Vernon Schryver vjs@rhyolite.com From sven@dmv.com Tue Nov 15 09:36:54 2005 Received: from smtp-gw-cl-c.dmv.com (smtp-gw-cl-c.dmv.com [216.240.97.41]) by calcite.rhyolite.com (8.13.4/8.13.4) with ESMTP id jAFGaqfj025644 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) env-from ; Tue, 15 Nov 2005 09:36:53 -0700 (MST) Received: from mail-gw-cl-a.dmv.com (mail-gw-cl-a.dmv.com [216.240.97.38]) by smtp-gw-cl-c.dmv.com (8.12.10/8.12.10) with ESMTP id jAFGapYZ060456; Tue, 15 Nov 2005 11:36:51 -0500 (EST) (envelope-from sven@dmv.com) Received: from lanshark.dmv.com (lanshark.dmv.com [216.240.97.46]) by mail-gw-cl-a.dmv.com (8.12.9/8.12.9) with ESMTP id jAFGapQU064096; Tue, 15 Nov 2005 11:36:51 -0500 (EST) (envelope-from sven@dmv.com) Subject: Re: Adding IP to Metrics Header From: Sven Willenberger To: Vernon Schryver Cc: DCC-reputations@calcite.rhyolite.com In-Reply-To: <200511151605.jAFG5PI1098639@calcite.rhyolite.com> References: <200511151605.jAFG5PI1098639@calcite.rhyolite.com> Content-Type: text/plain Date: Tue, 15 Nov 2005 11:38:14 -0500 Message-Id: <1132072694.10715.17.camel@lanshark.dmv.com> Mime-Version: 1.0 X-Mailer: Evolution 2.4.1 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.39 X-Scanned-By: MIMEDefang 2.48 on 216.240.97.38 X-DCC-Rhyolite-Metrics: calcite.rhyolite.com 101; Body=2 Fuz1=2 Fuz2=2 rep=2% Sender: dcc-reputations-admin@rhyolite.com Errors-To: dcc-reputations-admin@rhyolite.com X-BeenThere: dcc-reputations@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Unsubscribe: , List-Id: Commercial Distributed Checksum Clearinghouse Mailing List List-Post: List-Help: List-Subscribe: , List-Archive: On Tue, 2005-11-15 at 09:05 -0700, Vernon Schryver wrote: > > From: Sven Willenberger > > > In searching the logfiles for those messages that have hits on rep, I > > would like to be able to get an idea of what servers (IPs) are sending > > these. Since the dcc checking mailservers are internal (i.e. 
MX IP > > entries in whiteclnt) I cannot use the relay information in the maillog > > files. > > > > As such, what would be involved with [optionally] adding IP to the > > Metrics header that is added so that it would resemble > > > > X-DCC-brand-Metrics: chost server-ID; bulk chknm1=count ... IP=[relay IP > > that is checked by rep] > > > > Just trying to get some useful statistics gathering in one grep/awk pass > > of the maillog :-) > > I'm afraid to change the X-DCC header lest I break filters that depend on > it to detect bulk mail. Adding "bulk rep" may have been too much. > > I suppose another X- header could be added.

That may be an idea. Reading the manpage for DCC I saw IP listed in the subsection on Metrics on the types of checksums, which raised my hopes a little that this information could be included in the Metrics. Perhaps simply a new Rep header: X-DCC-Reps-Metrics that would include bulk rep reps-total=count, rep=%, IP=[relay], which would then not break clients depending on the pre-Reps X-DCC header.

> What about looking in the DCC log files? The third line contains the > IP address that gets the blame. I tend to use > /var/dcc/libexec/dblist -C 'rep 12345678 12345678 12345678 12345678' > or > /var/dcc/libexec/dblist -C 'rep-total 12345678 12345678 12345678 12345678' > to see whether a given IP address has a reputation. If the reports containing > the reputation checksums don't include the body checksums, > I use `dblist -T` with timestamps in the same second to look for > the reports of spam sent by the IP address. (The next version of > dblist lets the microseconds be omitted. The current version takes -1 > to mean 'ignore microseconds'.) (Because of various constraints on flooding > and database compression, the body checksums are sometimes put into > reports separate from the reputation checksums in the database.)

Alas, I stopped keeping the logged messages a long time ago. I enabled logging briefly to check out the messages.
I found the IP hash line and tried running: /var/dcc/libexec/dblist -C 'rep 67708712 3cef1eb2 218ec748 11c283ae' but got an error (both on the client as well as the reporting dcc server) of: unrecognized checksum values "rep 67708712 3cef1eb2 218ec748 11c283ae"; fatal error Sven From vjs@calcite.rhyolite.com Tue Nov 15 10:09:15 2005 Received: from calcite.rhyolite.com (localhost [127.0.0.1]) by calcite.rhyolite.com (8.13.4/8.13.4) with ESMTP id jAFH9EEP048801 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for env-from ; Tue, 15 Nov 2005 10:09:14 -0700 (MST) Received: (from vjs@localhost) by calcite.rhyolite.com (8.13.4/8.13.4/Submit) id jAFH9E3L048800 for DCC-reputations; Tue, 15 Nov 2005 10:09:14 -0700 (MST) Date: Tue, 15 Nov 2005 10:09:14 -0700 (MST) From: Vernon Schryver Message-Id: <200511151709.jAFH9E3L048800@calcite.rhyolite.com> To: DCC-reputations@calcite.rhyolite.com Subject: Re: Adding IP to Metrics Header In-Reply-To: <1132072694.10715.17.camel@lanshark.dmv.com> Sender: dcc-reputations-admin@rhyolite.com Errors-To: dcc-reputations-admin@rhyolite.com X-BeenThere: dcc-reputations@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Unsubscribe: , List-Id: Commercial Distributed Checksum Clearinghouse Mailing List List-Post: List-Help: List-Subscribe: , List-Archive: > From: Sven Willenberger > Alas, I stopped keeping the logged messages a long time ago. I enabled > logging briefly to check out the messages. I found the IP hash line and > tried running: > /var/dcc/libexec/dblist -C 'rep 67708712 3cef1eb2 218ec748 11c283ae' > > but got an error (both on the client as well as the reporting dcc > server) of: > > unrecognized checksum values "rep 67708712 3cef1eb2 218ec748 11c283ae"; > fatal error That suggests that dblist was old or not the commercial version. Does `dblist -V` say 2.3.20? 
I get this from the search:

05/11/08 02:34:14.141920 7 compressed 23d6f888
 rep-total 7 67708712 3cef1eb2 218ec748 11c283ae 1a63888
 rep 7 67708712 3cef1eb2 218ec748 11c283ae a0f4bc
05/11/09 04:42:18.030616 5 1003 trimmed 256a27dc
 rep-total 12 67708712 3cef1eb2 218ec748 11c283ae 23d6f888 1a63888
 rep 12 67708712 3cef1eb2 218ec748 11c283ae 23d6f888 a0f4bc 4edfd6a0

Since dblist reads the database, it only works on the server. To speed things up by not chugging through the entire database, I often use `dblist -P 3` or some other modest number of recent "pages" of the database, but of course only when I know the report is recent.

Vernon Schryver vjs@rhyolite.com

From sven@dmv.com Tue Nov 15 10:29:17 2005 Received: from smtp-gw-cl-d.dmv.com (smtp-gw-cl-d.dmv.com [216.240.97.42]) by calcite.rhyolite.com (8.13.4/8.13.4) with ESMTP id jAFHTD4g066281 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) env-from ; Tue, 15 Nov 2005 10:29:15 -0700 (MST) Received: from mail-gw-cl-a.dmv.com (mail-gw-cl-a.dmv.com [216.240.97.38]) by smtp-gw-cl-d.dmv.com (8.12.10/8.12.10) with ESMTP id jAFHTDft063051; Tue, 15 Nov 2005 12:29:13 -0500 (EST) (envelope-from sven@dmv.com) Received: from lanshark.dmv.com (lanshark.dmv.com [216.240.97.46]) by mail-gw-cl-a.dmv.com (8.12.9/8.12.9) with ESMTP id jAFHTCQU066275; Tue, 15 Nov 2005 12:29:12 -0500 (EST) (envelope-from sven@dmv.com) Subject: Re: Adding IP to Metrics Header From: Sven Willenberger To: Vernon Schryver Cc: DCC-reputations@calcite.rhyolite.com In-Reply-To: <200511151709.jAFH9E3L048800@calcite.rhyolite.com> References: <200511151709.jAFH9E3L048800@calcite.rhyolite.com> Content-Type: text/plain Date: Tue, 15 Nov 2005 12:30:35 -0500 Message-Id: <1132075835.10715.22.camel@lanshark.dmv.com> Mime-Version: 1.0 X-Mailer: Evolution 2.4.1 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.48 on 216.240.97.42 X-Scanned-By: MIMEDefang 2.48 on 216.240.97.38 X-DCC-Rhyolite-Metrics: calcite.rhyolite.com 101; Body=2 Fuz1=2 Fuz2=2
rep=3% Sender: dcc-reputations-admin@rhyolite.com Errors-To: dcc-reputations-admin@rhyolite.com X-BeenThere: dcc-reputations@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Unsubscribe: , List-Id: Commercial Distributed Checksum Clearinghouse Mailing List List-Post: List-Help: List-Subscribe: , List-Archive: On Tue, 2005-11-15 at 10:09 -0700, Vernon Schryver wrote: > > From: Sven Willenberger > > > Alas, I stopped keeping the logged messages a long time ago. I enabled > > logging briefly to check out the messages. I found the IP hash line and > > tried running: > > /var/dcc/libexec/dblist -C 'rep 67708712 3cef1eb2 218ec748 11c283ae' > > > > but got an error (both on the client as well as the reporting dcc > > server) of: > > > > unrecognized checksum values "rep 67708712 3cef1eb2 218ec748 11c283ae"; > > fatal error > > That suggests that dblist was old or not the commercial version. > Does `dblist -V` say 2.3.20? > (on the server in question that originally reported the checksum) #dblist -V 2.3.20 Sven From vjs@calcite.rhyolite.com Tue Nov 15 10:49:12 2005 Received: from calcite.rhyolite.com (localhost [127.0.0.1]) by calcite.rhyolite.com (8.13.4/8.13.4) with ESMTP id jAFHnCot076194 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for env-from ; Tue, 15 Nov 2005 10:49:12 -0700 (MST) Received: (from vjs@localhost) by calcite.rhyolite.com (8.13.4/8.13.4/Submit) id jAFHnCPZ076193 for DCC-reputations; Tue, 15 Nov 2005 10:49:12 -0700 (MST) Date: Tue, 15 Nov 2005 10:49:12 -0700 (MST) From: Vernon Schryver Message-Id: <200511151749.jAFHnCPZ076193@calcite.rhyolite.com> To: DCC-reputations@calcite.rhyolite.com Subject: Re: Adding IP to Metrics Header In-Reply-To: <1132075835.10715.22.camel@lanshark.dmv.com> Sender: dcc-reputations-admin@rhyolite.com Errors-To: dcc-reputations-admin@rhyolite.com X-BeenThere: dcc-reputations@rhyolite.com X-Mailman-Version: 2.0.5 Precedence: bulk List-Unsubscribe: , List-Id: Commercial Distributed Checksum 
Clearinghouse Mailing List List-Post: List-Help: List-Subscribe: , List-Archive: > From: Sven Willenberger > > Does `dblist -V` say 2.3.20? > > (on the server in question that originally reported the checksum) > > #dblist -V > 2.3.20 oh, now I see. Version 2.3.20 had a bug in code I added that made the checksum type optional. It's fixed in the unreleased 2.3.21 version I'm using. Try dblist -C '67708712 3cef1eb2 218ec748 11c283ae' Vernon Schryver vjs@rhyolite.com