spam rates

Daniel V Klein
Wed Apr 23 19:40:15 UTC 2003

Joe (and others who might find spam statistics interesting; see Joe's note
at the end of this email):

You make an interesting observation, but I don't think it is a workable
discriminant.  DCC on my system logs all mail with a checksum count of 10 or
greater, and rejects as spam those with a checksum count of 30 or greater.
Of the 31366 suspicious/junk messages logged since April 9:

	1 or Highest	 3060
	2 or High	   11
	3 or Normal	12117	Unmarked 16066
	4 or Low	  109
	5 or Lowest	    3

Of the logged (suspicious/junk) messages, 9.7% are marked with an elevated
priority.
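
For anyone who wants to produce a similar tally from their own mail, here is
a minimal Python sketch. It assumes the priority is carried in an X-Priority
header whose value starts with a digit from 1 to 5 (this note does not say
which header was counted). The function name and the sample messages are
made up for illustration.

```python
from collections import Counter
from email import message_from_string

def tally_priorities(raw_messages):
    """Count messages by the leading digit of X-Priority (an assumed header)."""
    counts = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        value = (msg.get("X-Priority") or "").strip()
        if value[:1] in ("1", "2", "3", "4", "5"):
            counts[int(value[0])] += 1
        else:
            # No header, or a value that does not start with 1-5.
            counts["unmarked"] += 1
    return counts

# Hypothetical sample messages, not taken from the logs discussed here.
samples = [
    "Subject: buy now\nX-Priority: 1 (Highest)\n\nbody\n",
    "Subject: hello\n\nbody\n",
    "Subject: report\nX-Priority: 3\n\nbody\n",
]
counts = tally_priorities(samples)
```

Levels 1 and 2 together are what this note calls "elevated priority".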

Other than that, I have no idea what priority headers exist in the
messages that are rejected by my first 3 levels of filters.  I can only
comment on that which makes it onto my desktop machine that I haven't
subsequently deleted:

    In my garbage folder (mail that my last line of filters rejects):

	1 or Highest	 272
	2 or High    	  80
	3 or Normal  	2063	Unmarked 7807
	4 or Low     	   4
	5 or Lowest  	  12
					3.4% elevated priority

    In my junk folder (mail that gets through my last line of filters that I
    manually reject):

	1 or Highest	 682
	2 or High	  20
	3 or Normal	3562	Unmarked 8178
	4 or Low	  10
	5 or Lowest	   0
					5.6% elevated priority

    In my devnull folder (mail that was sent to spam trap addresses):

	1 or Highest	1078
	2 or High	  59
	3 or Normal	7259	Unmarked   17
	4 or Low	   1
	5 or Lowest	   0
					13.5% elevated priority

In the spam categories overall, 7.0% of the mail is marked with an elevated
priority.  So it would seem that your observation also holds in my spam mail.
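
The overall spam figure can be rechecked by pooling the three spam tables
rather than averaging their percentages. The sketch below uses the counts
from those tables; the dictionary keys and the function name are only
illustrative labels.

```python
# Counts copied from the three spam tables, in the order
# (highest, high, normal, unmarked, low, lowest).
spam_folders = {
    "garbage": (272, 80, 2063, 7807, 4, 12),
    "junk": (682, 20, 3562, 8178, 10, 0),
    "spam trap": (1078, 59, 7259, 17, 1, 0),
}

def elevated_percent(counts):
    """Share of messages marked 1 (Highest) or 2 (High), as a percentage."""
    highest, high = counts[0], counts[1]
    return 100.0 * (highest + high) / sum(counts)

per_folder = {name: elevated_percent(c) for name, c in spam_folders.items()}

# Pool the folders column by column, then compute one percentage.
pooled = tuple(sum(column) for column in zip(*spam_folders.values()))
overall = elevated_percent(pooled)
print(f"{overall:.1f}%")  # prints 7.0%
```

Pooling weights each folder by its size, which is why the result sits closer
to the two large folders than to the spam trap figure.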

    USENIX mail (all of which I want/need, much of which is last-minute):

	1 or Highest	 367
	2 or High	  44
	3 or Normal	4322	Unmarked   45
	4 or Low 	   0
	5 or Lowest	   0
					8.6% elevated priority

    Some classified mail (not all folders, but the biggies):

	1 or Highest	 502
	2 or High	 127
	3 or Normal	4311	Unmarked   62
	4 or Low	   0
	5 or Lowest	   0
					12.6% elevated priority

    In my inbox (unclassified mail that hangs around, much of which is *old*,
    before classification was widely used):

	1 or Highest	   9
	2 or High	   3
	3 or Normal	 279	Unmarked 3614
	4 or Low	   0
	5 or Lowest	   0
					0.3% elevated priority

In most of the mail I want, 7.6% is marked with an elevated priority.  I
suspect that if I look only in recently received mail, the overall number
will be higher.

So while I can see your point about priorities pointing to spam, it won't
work for me because it also points to mail that people think is important
(although it might be of interest to the Spam Assassin folks, who use a
genetic algorithm with many discriminants).  Interestingly, some spam is
actually marked as low priority (as it should be!), while none of my real
mail is!
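
One way to put a number on "not a workable discriminant" is to compare how
often the elevated marking shows up in each class. The sketch below pools
the counts from the tables above and treats those folders as representative
samples of spam and wanted mail, which is an assumption; the variable names
are illustrative.

```python
# (elevated, total) pooled from the tables above.
spam_elevated, spam_total = 272 + 80 + 682 + 20 + 1078 + 59, 10238 + 12452 + 8414
wanted_elevated, wanted_total = 367 + 44 + 502 + 127 + 9 + 3, 4778 + 5002 + 3905

spam_rate = spam_elevated / spam_total        # about 0.070
wanted_rate = wanted_elevated / wanted_total  # about 0.077

# Likelihood ratio: how much seeing an elevated priority shifts the
# odds toward spam. A value near 1 means it barely shifts them at all.
likelihood_ratio = spam_rate / wanted_rate
print(f"{likelihood_ratio:.2f}")
```

With these counts the ratio comes out slightly below 1, so on this mailbox
an elevated priority is, if anything, marginally more common in wanted mail
than in spam.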


> One of my policies is that "high priority" email is always dumped to a
> special folder, which I examine more-or-less once a month. I receive a
> high-priority message (average this year) every 5 hours and 27 minutes.
> Last year (average, 9 hours and 35 minutes) I received three that were
> actually important. One was from a client, who called a couple days
> later to ask why I hadn't responded. I said "I never received a
> message". He said "But I sent it high-priority!" I said "That guarantees
> I won't see it because all high-priority mail is automatically
> discarded". He was incredulous. I then pointed out that this was because
> it was all spam. "What's spam?" he asked "Unsolicited advertising" I
> explained. "I've never gotten any." he said. So I sent him a sampling of
> ten message headers, +/-5 from his. "When you send high priority email,
> this is the company you keep" I explained. In the second case, I spotted
> a high-priority question that had come in to me (unsolicited technical
> question) within the last day (I was doing my sort-of-monthly perusal).
> I waited two weeks to answer it. I forget what the third was, but I
> never saw it. About six months later, the sender sent me a regular email
> asking if I was angry with him. "No" I replied. The answer came back
> "Didn't you get my email?"  "No, when did you send it?" "I sent it
> high-priority last summer" "Ah, that is the problem. Had you sent it
> regular email, I would have seen it immediately." I then pointed out
> that our exchange, which included the topic of the original message, was
> now up to about ten messages, had taken place over a 2-hour period
> because he used non-high-priority email which I saw immediately. I went
> back and found the message from him in the high-priority dumping ground.
> While it was important to him, it really wasn't all that important to
> me. I pointed out that "high priority" is an attempt of the sender to
> get my attention. The fact of the demand means that the sender thinks
> he/she is the most important thing in my life. However, it is my life,
> and what is important is for me to decide, not the sender.
> Joseph M. Newcomer
> FlounderCraft Ltd
> 610 Kirtland St.
> Pittsburgh PA 15208
