[SOLVED on dcc 1.3.31] Problem on dcc 1.3.30 - Continue Not Asking DCC...

Breno Moiana breno@haxent.com.br
Mon Apr 10 13:15:20 UTC 2006


I would like to thank you all for the help provided with the "dying DCC"
problem reported below. All the comments and guidance were very
helpful.

The update of March 19th solved the problem. Not only is the DCC
countdown now being reset properly, but the service also no longer
drops as often. I have spent the last three weeks following the logs
to make sure it was fixed, and it certainly looks that way. I still
have some short periods of unavailability, but they last no more than
two minutes (as opposed to hours before the fix) and they are fairly
rare (about twice a week), which is perfectly acceptable.

Thank you all for the help.

Best Regards,

Breno Moiana.
=================
Haxent Consulting


On Thu, 2006-03-09 at 12:00 -0300, Breno Moiana wrote:
> Greetings.
> 
> I am new to this list, even though I have been digging through the 
> archives for quite some time now.
> 
> I am having a problem that is very similar to the one reported by Gary 
> Mills[1] on January 26th 2006.
> 
> [1] http://www.rhyolite.com/pipermail/dcc/2006/003019.html
> 
> According to the CHANGES file in the source tree, two fixes were made in 
> version 1.3.28 that could have solved Gary's problem. However, I am 
> still running into it. Here is my report:
> 
> We have a DCC server set up at an email provider, handling around 3 
> million email messages a day.
> The server is a dual 64-bit Xeon 3.2 with 3GB RAM, and it runs only 
> the DCC and a small MySQL server, which we use only for graylisting 
> (a graylisting solution developed in-house).
> 
> The operating system is the 64-bit build of CentOS 4.1.
> 
> 
> For no apparent reason, something happens to DCC that makes it stop 
> responding. Here is the log from the beginning of the problem:
> 
> : Mar  9 09:29:38 dcc dccifd[4782]: no DCC answer from 127.0.0.1,6277 
> after 18264 ms
> : Mar  9 09:29:38 dcc dccifd[4782]: continue not asking DCC 64 seconds 
> after failure
> : Mar  9 09:29:38 dcc last message repeated 4 times
> : Mar  9 09:29:39 dcc dccifd[4782]: continue not asking DCC 63 seconds 
> after failure
> : Mar  9 09:29:39 dcc last message repeated 8 times
> : Mar  9 09:29:40 dcc dccifd[4782]: continue not asking DCC 62 seconds 
> after failure
> : Mar  9 09:29:40 dcc last message repeated 8 times
> : Mar  9 09:29:41 dcc dccifd[4782]: continue not asking DCC 61 seconds 
> after failure
> : Mar  9 09:29:41 dcc last message repeated 4 times
> : Mar  9 09:29:42 dcc dccifd[4782]: continue not asking DCC 60 seconds 
> after failure
> : Mar  9 09:29:42 dcc last message repeated 2 times
> : Mar  9 09:29:42 dcc dccifd[4782]: no DCC answer from 127.0.0.1,6277 
> after 22221 ms
> : Mar  9 09:29:42 dcc dccifd[4782]: continue not asking DCC 128 seconds 
> after failure
> : Mar  9 09:29:42 dcc last message repeated 2 times
> : Mar  9 09:29:43 dcc dccifd[4782]: continue not asking DCC 127 seconds 
> after failure
> : Mar  9 09:29:43 dcc last message repeated 2 times
> : Mar  9 09:29:44 dcc dccifd[4782]: continue not asking DCC 126 seconds 
> after failure
> : Mar  9 09:29:44 dcc last message repeated 9 times
> : Mar  9 09:29:45 dcc dccifd[4782]: continue not asking DCC 125 seconds 
> after failure
> : Mar  9 09:29:45 dcc last message repeated 8 times
> : Mar  9 09:29:45 dcc dccifd[4782]: continue not asking DCC 124 seconds 
> after failure
> : Mar  9 09:29:46 dcc last message repeated 5 times
> : Mar  9 09:29:46 dcc dccifd[4782]: no DCC answer from 127.0.0.1,6277 
> after 26375 ms
> : Mar  9 09:29:46 dcc dccifd[4782]: continue not asking DCC 256 seconds 
> after failure
> : Mar  9 09:29:46 dcc last message repeated 3 times
> : Mar  9 09:29:47 dcc dccifd[4782]: continue not asking DCC 255 seconds 
> after failure
> (...)
> 
> As you can see, it doubles the waiting time until it reaches the maximum 
> of 2048 seconds, which is then repeatedly counted down.
> 
> Please note that the RTT to the server remains low all the time, at 
> around 50ms.
> 
> 
> Sometimes, but not always, the errors stop when I manually run the 
> cron-dccd script:
> 
> (...)
> : Mar  8 17:54:20 dcc dccifd[4782]: continue not asking DCC 1994 seconds 
> after failure
> : Mar  8 17:54:20 dcc last message repeated 4 times
> : Mar  8 17:54:21 dcc dccifd[4782]: continue not asking DCC 1993 seconds 
> after failure
> : Mar  8 17:54:22 dcc dccd[4748]: 1.3.30 database /var/dcc/dcc_db 
> reopened with 2016 MByte window
> 
> After that, no more entries related to this error appear in the log, 
> but this doesn't happen every time I run the cron script.
> Also, on 1.3.30 it sometimes just stops giving errors and gets back to 
> work, correctly calculating and counting the checksums. On 1.3.27 and 
> 1.3.29 we never saw the server recover without human intervention. 
> Even when we did get it back to work, we couldn't say for sure what 
> made it happen.
> 
> We were originally running 1.3.27 when this error first showed up, then 
> upgraded to 1.3.29, and now to 1.3.30.
> 
> Any help will be greatly appreciated, as we are getting listed on RBLs 
> every other day because of the occasional lack of DCC service (we let 
> email through when the DCC doesn't respond).
> 
> Thanks for your attention!
> 
> Best Regards,
> 
> Breno Moiana
> ================
> Haxent Consulting
> 
> 
> 
> 
> _______________________________________________
> DCC mailing list      DCC@rhyolite.com
> http://www.rhyolite.com/mailman/listinfo/dcc
> 
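
The backoff pattern visible in the quoted log (64, then 128, then 256
seconds, doubling up to the 2048-second maximum mentioned in the report)
can be sketched as follows. This is only an illustration of the behaviour
seen in the log, not the actual dccifd code; the function names and the
starting value of 64 (taken from the first logged countdown) are
assumptions.

```python
from itertools import islice


def backoff_delays(initial=64, maximum=2048):
    """Yield the wait (in seconds) applied after each consecutive failure.

    Hypothetical model of the log above: the wait doubles after every
    failure and is capped at `maximum`.
    """
    delay = initial
    while True:
        yield delay
        delay = min(delay * 2, maximum)


def first_delays(count, initial=64, maximum=2048):
    """Return the first `count` waits as a list, for easy inspection."""
    return list(islice(backoff_delays(initial, maximum), count))


if __name__ == "__main__":
    # Waits after the first seven consecutive failures.
    print(first_delays(7))
```

Under these assumptions the cap is reached on the sixth consecutive
failure, after which every further failure restarts a 2048-second
countdown.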
