dccd peer flooding - possible rejection limit overrun?

Vernon Schryver vjs@calcite.rhyolite.com
Tue Aug 23 16:17:47 UTC 2005


> From: Martin Pála <Martin.Pala@oskar.cz>

> 4.) when the multiple of threshold exceeds 4, the flooding stops (?).
> Even if the counter increases further, the peer is not updated
> anymore.

No, the flooding should not be stopped.  Instead it should be delayed
by up to 12 hours (depending on the dbclean expiration durations).
Then a summary of all of the accumulated, delayed counts will be flooded.

It is possible that I have broken something, but I hope not.
Consider this report I found with `/var/dcc/libexec/dblist -HvP1`:

 05/08/23 02:24:21.900024    1        1072               4ad051cc
      path: 1049<-1010<-1058<-
   Body         582        86375a47 fce4acbd 6793fb5d edfbd198 4a058ea0 2327829
   Fuz1         593        3df8036f 13918e6f 553f311c be8688cf 4a058ea0 7d28ba
   Fuz2         742        a35a153c ba2986de 0b7eb80f 51ea55e9 4a058ea0 17d0849

If flooding stopped at 80, those total counts of 582, 593, and 742 would
be unlikely on that particular server.

Are you sure that flooding stops permanently after 80?


> - let's say that the rejection limit is 5000 - one dccd reports
> the real checksum count to the clients (so the clients can reject
> messages as soon as 5000 is reached). However, the passive dccd
> will know about only 80 checksums.

Why would you use a rejection limit of 5000?  Don't you think
that 1000, 100, or even 50 copies of a message make it "bulk"?


> I think the solution could be to continue the flooding of dccd
> peers each time the dccd threshold is reached (not stop after 4
> updates were sent). This can decrease the possible rejection limit
> overrun to just approx. 1*threshold (e.g. 5020 mails instead of 5000).

The DCC network will hear reports of more than 230 million mail messages
today.  Assume 70% or 160 million of them are spam recognized by DCC
clients as bulk.  If all of them were flooded without any of the delaying
and summarizing, and if they were kept for the default 14 days, then
each DCC server might have a database of 224 GBytes and need
500 GBytes of RAM.
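
The arithmetic behind the database estimate can be sketched as follows.
The 100 bytes per stored report is not stated above; it is an assumption,
chosen because it is what the 224 GByte figure implies.  The RAM figure
is not derived here.

```python
# Rough size of a flooding database with no delaying or summarizing,
# using the figures quoted above.
REPORTS_PER_DAY = 230_000_000   # reports heard by the DCC network per day
BULK_PERCENT = 70               # share recognized by clients as bulk
RETENTION_DAYS = 14             # default expiration
BYTES_PER_REPORT = 100          # ASSUMPTION: implied by 224 GBytes, not stated

exact_bulk_per_day = REPORTS_PER_DAY * BULK_PERCENT // 100  # 161 million
bulk_per_day = 160_000_000      # the rounded figure used in the text

stored_reports = bulk_per_day * RETENTION_DAYS
database_gbytes = stored_reports * BYTES_PER_REPORT // 10**9

print(f"bulk reports per day (exact): {exact_bulk_per_day:,}")
print(f"stored reports over {RETENTION_DAYS} days:  {stored_reports:,}")
print(f"database size:                {database_gbytes} GBytes")
```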

The idea of the 4*(-t value) thresholding is to quickly flood reports
of new bulk mail to other DCC servers.
The delayed flooding should tell other DCC servers that old and
already known bulk mail is continuing.
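
A minimal sketch of that two-stage behavior, as described in this thread
rather than taken from the dccd source.  It assumes an immediate flood at
each multiple of the -t value up to 4 times that value, and only a delayed
summary after that.  The class and method names are hypothetical, the
-t value of 20 is just an example, and the delay is modeled by an explicit
call instead of a timer.

```python
class FloodModel:
    """Toy model of thresholded flooding for one checksum (hypothetical)."""

    def __init__(self, threshold, fast_multiples=4):
        self.threshold = threshold                    # the -t value
        self.fast_limit = fast_multiples * threshold  # e.g. 4 * 20 = 80
        self.total = 0      # reports seen by this server
        self.flooded = 0    # count already sent to peers

    def report(self):
        """Record one report; return the count to flood now, or None."""
        self.total += 1
        if self.total <= self.fast_limit and self.total % self.threshold == 0:
            return self._send()
        return None         # held back for the delayed summary

    def flush_delayed(self):
        """Called when the delay expires; return the summary count, or None."""
        if self.total > self.flooded:
            return self._send()
        return None

    def _send(self):
        delta = self.total - self.flooded
        self.flooded = self.total
        return delta


model = FloodModel(threshold=20)
immediate = [n for n in (model.report() for _ in range(5000)) if n is not None]
summary = model.flush_delayed()
print(immediate)   # [20, 20, 20, 20]
print(summary)     # 4920
```

In this model a peer sees 80 quickly and the remaining 4920 only in the
summary, which is the gap discussed above.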

Perhaps the 4*(-t value) is too low.
Or perhaps it should be non-linear.


Vernon Schryver    vjs@rhyolite.com


