Timeout problems

Eric Deis nospamthankyou at anemone.cx
Wed Mar 30 06:25:53 CEST 2011


It's been a few months, but the timeout errors have started appearing again.

eg.
warning: timeout on 127.0.0.1:12525 while reading input attribute name
warning: problem talking to server 127.0.0.1:12525: Connection timed out

I am now running:

http://rz.ek-muc.de/pub/policyd-weight-debug

As per your recommendation.

I've tried to extract the relevant sections of the logs here:

Mar 29 14:37:43  postfix/smtpd[11492]: connect from
smtp.mdanderson.es[85.62.36.178]

Mar 29 14:37:49  postfix/policyd-weight[3434]: child: loop: awaked from
select
Mar 29 14:37:49  postfix/policyd-weight[3434]: child: new client, trying
accept
Mar 29 14:37:49  postfix/policyd-weight[1779]: main: got
child-status-data: 3434 0
Mar 29 14:37:49  postfix/policyd-weight[1779]: main: select
Mar 29 14:37:49  postfix/policyd-weight[3434]: child: new client, got a
sock and notified parent that I am busy
Mar 29 14:37:49  postfix/policyd-weight[1779]: main: got
child-status-data: 3434 0
Mar 29 14:37:49  postfix/policyd-weight[1779]: main: select
Mar 29 14:37:49  postfix/policyd-weight[3434]: child: loop: next
Mar 29 14:37:49  postfix/policyd-weight[3434]: child: loop: select
Mar 29 14:37:49  postfix/policyd-weight[3434]: child: loop: awaked from
select
Mar 29 14:37:49  postfix/policyd-weight[1779]: main: got
child-status-data: 3434 0
Mar 29 14:37:49  postfix/policyd-weight[1779]: main: select
Mar 29 14:37:49  postfix/policyd-weight[3434]: child: reading from client
Mar 29 14:37:49  postfix/policyd-weight[3434]: got a complete request set

Mar 29 14:37:52  postfix/policyd-weight[3434]: child: loop: awaked from
select
Mar 29 14:37:52  postfix/policyd-weight[3434]: child: new client, trying
accept
Mar 29 14:37:52  postfix/policyd-weight[1779]: main: got
child-status-data: 3434 0
Mar 29 14:37:52  postfix/policyd-weight[1779]: main: select
Mar 29 14:37:52  postfix/policyd-weight[3434]: child: new client, got a
sock and notified parent that I am busy
Mar 29 14:37:52  postfix/policyd-weight[1779]: main: got
child-status-data: 3434 0
Mar 29 14:37:52  postfix/policyd-weight[1779]: main: select
Mar 29 14:37:52  postfix/policyd-weight[3434]: child: loop: next
Mar 29 14:37:52  postfix/policyd-weight[3434]: child: loop: select
Mar 29 14:37:52  postfix/policyd-weight[3434]: child: loop: awaked from
select
Mar 29 14:37:52  postfix/policyd-weight[1779]: main: got
child-status-data: 3434 0
Mar 29 14:37:52  postfix/policyd-weight[1779]: main: select
Mar 29 14:37:52  postfix/policyd-weight[3434]: child: reading from client
Mar 29 14:37:52  postfix/policyd-weight[3434]: got a complete request set


Mar 29 14:39:21  postfix/policyd-weight[1810]: child: loop: awaked from
select
Mar 29 14:39:21  postfix/policyd-weight[1779]: main: got
child-status-data: 1810 0
Mar 29 14:39:21  postfix/policyd-weight[1779]: main: select
Mar 29 14:39:21  postfix/policyd-weight[1779]: main: got
child-status-data: 1810 d
Mar 29 14:39:21  postfix/policyd-weight[1779]: main: select
Mar 29 14:39:21  postfix/policyd-weight[1779]: main: got
child-status-data: 1810 1
Mar 29 14:39:21  postfix/policyd-weight[1779]: main: select
Mar 29 14:39:21  postfix/policyd-weight[1810]: child: loop: next
Mar 29 14:39:21  postfix/policyd-weight[1810]: child: loop: select

Mar 29 14:39:32  postfix/smtpd[11492]: warning: timeout on
127.0.0.1:12525 while reading input attribute name
Mar 29 14:39:32  postfix/smtpd[11492]: warning: problem talking to
server 127.0.0.1:12525: Connection timed out
Mar 29 14:39:33  postfix/policyd-weight[1810]: child: loop: awaked from
select
Mar 29 14:39:33  postfix/policyd-weight[1810]: child: new client, trying
accept
Mar 29 14:39:33  postfix/policyd-weight[1779]: main: got
child-status-data: 1810 0
Mar 29 14:39:33  postfix/policyd-weight[1779]: main: select
Mar 29 14:39:33  postfix/policyd-weight[1810]: child: new client, got a
sock and notified parent that I am busy
Mar 29 14:39:33  postfix/policyd-weight[1779]: main: got
child-status-data: 1810 0
Mar 29 14:39:33  postfix/policyd-weight[1779]: main: select
Mar 29 14:39:33  postfix/policyd-weight[1810]: child: loop: next
Mar 29 14:39:33  postfix/policyd-weight[1810]: child: loop: select
Mar 29 14:39:33  postfix/policyd-weight[1810]: child: loop: awaked from
select
Mar 29 14:39:33  postfix/policyd-weight[1779]: main: got
child-status-data: 1810 0
Mar 29 14:39:33  postfix/policyd-weight[1779]: main: select
Mar 29 14:39:33  postfix/policyd-weight[1810]: child: reading from client
Mar 29 14:39:33  postfix/policyd-weight[1810]: got a complete request set

Mar 29 14:39:35  postfix/policyd-weight[1779]: main: got
child-status-data: 3434 1
Mar 29 14:39:35  postfix/policyd-weight[1779]: main: select
Mar 29 14:39:35  postfix/policyd-weight[3434]: child: loop: next
Mar 29 14:39:35  postfix/policyd-weight[3434]: child: loop: select

Mar 29 14:41:13 postfix/smtpd[11492]: warning: timeout on
127.0.0.1:12525 while reading input attribute name
Mar 29 14:41:13 postfix/smtpd[11492]: warning: problem talking to server
127.0.0.1:12525: Connection timed out
Mar 29 14:41:13 postfix/smtpd[11492]: NOQUEUE: reject: RCPT from
smtp.mdanderson.es[85.62.36.178]: 451 4.3.5 Server configuration
problem; from=<ulx at gxuoqijkgst6c6.comandbuy2year.cc>
to=<rebecca at CENSOREDURL.COM> proto=SMTP helo=<smtp.mdanderson.es>


r.felber at ek-muc.de wrote:
> On Wed, Oct 20, 2010 at 11:39:01PM +0200, r.felber at ek-muc.de wrote:
>   
>> On Wed, Oct 20, 2010 at 01:45:15PM -0700, Eric Deis wrote:
>>     
>>> There is nothing relevant in /var/log/messages, definitely no errors. 
>>> Don't have the other logs mentioned.
>>>
>>> I included an excerpt previously of what was happening in maillog:
>>> Oct 19 21:14:37 MYSERVER postfix/smtpd[5696]: connect from
>>> smtp14.icpbounce.com[216.27.93.112]
>>> Oct 19 21:16:23 MYSERVER postfix/smtpd[5696]: warning: timeout on
>>> 127.0.0.1:12525 while reading input attribute name
>>> Oct 19 21:16:23 MYSERVER postfix/smtpd[5696]: warning: problem talking to
>>> server 127.0.0.1:12525: Connection timed out
>>> Oct 19 21:18:04 MYSERVER postfix/smtpd[5696]: warning: timeout on
>>> 127.0.0.1:12525 while reading input attribute name
>>> Oct 19 21:18:04 MYSERVER postfix/smtpd[5696]: warning: problem talking to
>>> server 127.0.0.1:12525: Connection timed out
>>> Oct 19 21:18:04 MYSERVER postfix/smtpd[5696]: NOQUEUE: reject: RCPT from
>>> smtp14.icpbounce.com[216.27.93.112]: 451 4.3.5 Server configuration
>>> problem; from=<bounces+454705.29871415.356246 at icpbounce.com>
>>> to=<gabriela at MYCLIENTDOMAIN.COM> proto=ESMTP helo=<smtp14.icpbounce.com>
>>> Oct 19 21:18:04 MYSERVER postfix/smtpd[5696]: disconnect from
>>> smtp14.icpbounce.com[216.27.93.112]
>>>
>>>
>>> Doing some more research, perhaps my local DNS cache was not setup properly.
>>>       
>> I don't think that was the issue.
>>
>>     
>>> /etc/resolve.conf was missing the following line:
>>> nameserver 127.0.0.1
>>>
>>> Only had references to my actual name servers. BIND is running. Did some 
>>> dig queries, and looks like the cache is working. First query will take 
>>> 100-200ms or less. Second query 1ms or 2ms, indicating cache.
>>>       
>>     
>>> I will just wait and see if the errors show up again. Quite random. Once 
>>> in three months, then once per week, and then daily. Is there anyway to 
>>> test and force the error?
>>>       
>> Once that error occurs frequently one could setup a strace which follows
>> forks on the master and its children.
>>
>> I see that not as a realistic option.
>>
>> I will insert some debug symbols for the listen-stuff in the next
>> days. 
>>     
>
> Can you please try the version from
>
> http://rz.ek-muc.de/pub/policyd-weight-debug
>
> Warning: it produces much more logging output.
>
>
>  
>   



More information about the Policyd-weight-users mailing list