benchmarking body checksums

Leandro Santi
Mon Feb 17 18:00:27 UTC 2003

Vernon Schryver wrote:

> > From: Leandro Santi
> > ...
> > 0) "standard" dccproc (i.e. built using CC=gcc)
> > 1) CC=gcc, CFLAGS=-O
> > 2) CC=gcc, CFLAGS=-O2
> > 3) CC=icc
> > 4) CC=icc CFLAGS=-march=pentiumii (i.e. using hardware vectorization)
> > 5) CC=icc, built using ICC's profile-guided optimization capability
> > ...
> > In order to minimize startup time influence and to actually measure
> > body checksumming speed, every one of the 95 messages is 200k or bigger.
> >
> > 0) scored ~2.3 MB/s on average
> > 1) ran in 65% of 0's time on average, at ~3.6 MB/s
> > 2) 69%, at ~3.3 MB/s
> > 3) 53%, at ~4.3 MB/s
> > 4) 54%, at ~4.3 MB/s
> > 5) 50%, at ~4.6 MB/s (twice as fast! :-)
>
> All of the numbers seem to be based on blocks of text that are at
> least 275 KBytes.  How does that size compare with size of the typical
> mail message?  If most mail messages are 1%-10% of that size, then the
> start-up costs of the checksumming are more important than the bulk costs.

Yes, I am aware of this. Typical message size is ~60-70K here, see

(That statistic is a little old; I guess the actual message size is somewhat bigger now.)

Anyway, I just wanted to measure the effect of -O and -O2 on the
body checksumming routines. The Intel icc stuff is there for the sake of

For a while I thought about writing some custom code that called the
checksumming routines under the dcclib tree, but I realised I could measure
the same thing by feeding larger messages to dccproc, so that the startup
costs are amortized.
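
To illustrate why larger messages push the measurement toward the bulk
checksumming speed, here is a minimal sketch of the arithmetic in Python.
It assumes each run pays a fixed start-up cost; the 10 ms value and the
function name effective_rate are made up for illustration, and only the
~2.3 MB/s figure comes from the numbers quoted above.

```python
def effective_rate(size_bytes, startup_s, bulk_rate_bps):
    """Observed bytes/second when every run pays a fixed start-up cost."""
    return size_bytes / (startup_s + size_bytes / bulk_rate_bps)


BULK = 2.3e6     # ~2.3 MB/s, the default-build figure quoted above
STARTUP = 0.010  # hypothetical 10 ms start-up cost per run, NOT measured

for size in (4_000, 65_000, 200_000, 1_000_000):
    rate = effective_rate(size, STARTUP, BULK)
    print(f"{size:>9} bytes: {rate / 1e6:.2f} MB/s ({rate / BULK:.0%} of bulk)")
```

Under this model the observed rate approaches the bulk rate as messages
grow, which is the reason for using only large messages in the test set.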

> 2.3 MByte/sec is significantly faster than most installations receive
> mail, which suggests that it is fast enough.  How fast are SpamAssassin
> and other mechanisms?

I guess SpamAssassin is slower, but I haven't tried it for high-volume filtering.

> On the other hand, faster is better.  Should I whack on the configure
> script to set CFLAGS=-O2 when it is not otherwise set and gcc is in use?

With my old gcc 2.91.66, -O2 seems slightly slower than -O (~3.3 MB/s
versus ~3.6 MB/s), but both are noticeably faster than the default setting.
I should probably try a newer gcc version first, though.

> > WRT the intel compiler. Compiling the DCC with icc is not supported,
> > I think (I only checked the MD5 outputs of each message and compared
> > them against the standard dccproc, and it worked just fine). It needs
> > some work in order to fix many warnings and of course to check that
> > everything else is working as expected.
> How much does the Intel compiler cost?  What sort of warnings are produced?

About $400, I think, but there's an evaluation version on the Intel site.

> > ...
> > ps: Separately, it's interesting to see that 1.1.27 is sometimes faster,
> > sometimes slower than 1.1.11:
> > ...
> What varied those trials?  Was it different input text or measurement
> noise from cache effects?

What I noticed is that 1.1.27 tends to be faster for some messages and
slower for others. Each message is tested multiple times in order to
reduce the effect of measurement noise. I could investigate further if
it's worth the effort.
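
The repeat-and-summarise approach can be sketched roughly as follows, in
Python. This is not the harness used for the numbers above: time_repeated
is a hypothetical helper, and the MD5 call over a 200 KB buffer is only a
stand-in workload, not the DCC body checksums. Taking the minimum and the
median of several runs is one common way to damp noise.

```python
import hashlib
import statistics
import time


def time_repeated(func, repeats=5):
    """Run func() `repeats` times; return (fastest, median) wall-clock seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return min(samples), statistics.median(samples)


# Stand-in workload: MD5 over a 200 KB buffer (NOT the DCC body checksums).
payload = b"x" * 200_000
best, median = time_repeated(lambda: hashlib.md5(payload).digest(), repeats=9)
print(f"best {best * 1e3:.3f} ms, median {median * 1e3:.3f} ms")
```

If the fastest and median times for a message stay close together across
runs, a difference between two versions on that message is less likely to
be noise.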
