[sr-dev] [Kamailio-Users] kamailio / deadlock3

Daniel-Constantin Mierla miconda at gmail.com
Thu Jan 28 14:56:36 CET 2010


I am cc-ing sr-dev, since the tcp code comes from ser and Andrei may have
more insights...


On 1/28/10 2:41 PM, Aymeric Moizard wrote:
>
>
> On Thu, 28 Jan 2010, Henning Westerholt wrote:
>
>> On Thursday 28 January 2010, Aymeric Moizard wrote:
>>> Here is the backtrace I have, unfortunately without debug symbols!
>>> I found the same for many of the kamailio processes: "sched_yield"
>>> is pending forever. My system is Debian etch.
>>>
>>> #0  0xffffe424 in __kernel_vsyscall ()
>>> #1  0xb7cef4ac in sched_yield () from /lib/tls/i686/cmov/libc.so.6
>>> #2  0x080a93fd in tcp_send ()
>>> #3  0xb7975679 in send_pr_buffer () from /usr/lib/kamailio/modules/tm.so
>>> #4  0xb79789ac in t_forward_nonack () from /usr/lib/kamailio/modules/tm.so
>>> #5  0xb7974784 in t_relay_to () from /usr/lib/kamailio/modules/tm.so
>>> #6  0xb7983a11 in load_tm () from /usr/lib/kamailio/modules/tm.so
>>> #7  0x081cf810 in mem_pool ()
>>> #8  0x00000000 in ?? ()
>>>
>>> I guess most t_relay operations towards my "mobipouce.com" domain,
>>> which has one IP down, break the kamailio processes one after the
>>> other... I'm not sure that every such t_relay operation breaks
>>> exactly one process each time.
>>>
>>> I went through the lock/unlock calls in tcp_main.c, but it seems every
>>> lock has a matching unlock at least...
>>
>> Hi Aymeric,
>>
>> I remember that we observed these "sched_yield" problems on one old 0.9
>> system after some time (weeks or months). We did not find a solution in
>> that case; after a restart it was gone again.
>>
>> You mentioned in an earlier mail that you see this related to UDP traffic,
>> but in the log file and also in your investigations you think it's related
>> to TCP?
>
> This is the exact case:
> 1-> SUBSCRIBE sent to and received by kamailio over UDP.
> 2-> kamailio does a SRV record lookup for "mobipouce.com"
> 3-> kamailio tries sip2.mobipouce.com (91.199.234.47) over TCP first
> 4-> the connection fails with these logs:
> Jan 27 12:56:38 ns26829 /usr/sbin/kamailio[9763]: ERROR:core:tcp_blocking_connect: poll error: flags 18
> Jan 27 12:56:38 ns26829 /usr/sbin/kamailio[9763]: ERROR:core:tcp_blocking_connect: failed to retrieve SO_ERROR (111) Connection refused
> Jan 27 12:56:38 ns26829 /usr/sbin/kamailio[9763]: ERROR:core:tcpconn_connect: tcp_blocking_connect failed
> Jan 27 12:56:38 ns26829 /usr/sbin/kamailio[9763]: ERROR:core:tcp_send: connect failed
> Jan 27 12:56:38 ns26829 /usr/sbin/kamailio[9763]: ERROR:tm:msg_send: tcp_send failed
> Jan 27 12:56:38 ns26829 /usr/sbin/kamailio[9763]: ERROR:tm:t_forward_nonack: sending request failed
> 5-> I guess kamailio is supposed to try the other SRV record value,
>     sip2.mobipouce.com (91.199.234.46), but it doesn't
>
> Thus, I'm guessing the issue is related to SRV record failover OR just
> to the TCP failure. It is not related to UDP at all.

So the TCP connect failed and the tcp worker returned, since it printed the
messages. And, to be sure I got it right, the UDP worker (the one that
received the request) got blocked?
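
For reference, the log lines above follow the usual non-blocking connect
sequence: connect(), poll(), then read SO_ERROR. Below is a minimal sketch in
Python of that sequence with failover to the next candidate address, which is
what step 5 expects to happen. This is not the Kamailio code; try_connect,
connect_with_failover and the timeout value are hypothetical names chosen for
illustration, and it assumes a Linux-like system where select.poll is
available.

```python
import errno
import select
import socket


def try_connect(addr, timeout_ms=2000):
    """Non-blocking TCP connect to addr = (host, port).

    Returns (sock, 0) on success or (None, errno_value) on failure.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setblocking(False)
    rc = s.connect_ex(addr)
    if rc not in (0, errno.EINPROGRESS):
        # Immediate failure, e.g. ECONNREFUSED (111 on Linux).
        s.close()
        return None, rc
    if rc == errno.EINPROGRESS:
        poller = select.poll()
        poller.register(s, select.POLLOUT)
        if not poller.poll(timeout_ms):
            s.close()
            return None, errno.ETIMEDOUT
        # Whatever poll reported, SO_ERROR carries the connect result.
        err = s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        if err != 0:
            s.close()
            return None, err
    return s, 0


def connect_with_failover(candidates, timeout_ms=2000):
    """Try each candidate in order and stop at the first that connects.

    Returns (sock, addr, failures); failures lists (addr, errno) pairs for
    the candidates that were tried and failed before the successful one.
    """
    failures = []
    for addr in candidates:
        sock, err = try_connect(addr, timeout_ms)
        if sock is not None:
            return sock, addr, failures
        failures.append((addr, err))
    return None, None, failures
```

With the two addresses from the report this would be called as
connect_with_failover([("91.199.234.47", 5060), ("91.199.234.46", 5060)]).
Port 5060 is assumed here; the real port comes from the SRV record. The
important property is that a refused connect returns to the caller with
nothing left held, so the next candidate can be tried.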

>
> It's definitely possible to reproduce the issue now!
>
> I guess anyone can try with their version of kamailio: t_relay a message
> to "mobipouce.com" and you'll hit this case! Sending plenty of
> those messages will eventually lock all kamailio processes.

All? tcp and udp?
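
If every process ends up in that state, one pattern that would match the
backtrace is a spin lock taken on the send path and never released on an
error return: each later caller then loops in sched_yield() forever. Below is
a minimal sketch of that behaviour in Python. It is an illustration only, not
Kamailio's lock code and not a confirmed diagnosis; spin_acquire and max_spins
are hypothetical names, and os.sched_yield needs a Unix-like system.

```python
import os
import threading


def spin_acquire(lock, max_spins=None):
    """Spin on a lock, yielding the CPU between failed attempts.

    With max_spins=None this loops until the lock is free, which is forever
    if the holder never releases it. With a bound, it gives up and returns
    False so the caller can log the problem instead of hanging.
    """
    spins = 0
    while not lock.acquire(blocking=False):
        if max_spins is not None and spins >= max_spins:
            return False
        spins += 1
        os.sched_yield()
    return True


def send_with_leak(lock, connect_ok):
    """Hypothetical send path that forgets to unlock when connect fails."""
    if not spin_acquire(lock, max_spins=1000):
        return "stuck"
    if not connect_ok:
        return "connect failed"  # bug: returns with the lock still held
    lock.release()
    return "sent"
```

One failed send leaves the lock held, and every later call then reports
"stuck" (or, without the bound, never returns), even when its own connect
would have succeeded.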

Cheers,
Daniel

>
> Regards,
> Aymeric MOIZARD / ANTISIP
> amsip - http://www.antisip.com
> osip2 - http://www.osip.org
> eXosip2 - http://savannah.nongnu.org/projects/exosip/
>
>
>> Regards,
>>
>> Henning
>>
>

-- 
Daniel-Constantin Mierla
* http://www.asipto.com/
