[sr-dev] [Kamailio-Users] kamailio / deadlock3

Aymeric Moizard jack at atosc.org
Thu Jan 28 15:34:54 CET 2010


Some more answers below:

On Thu, 28 Jan 2010, Daniel-Constantin Mierla wrote:

> I am cc-ing sr-dev, since tcp code is from ser and Andrei may have more 
> insights...
>
>
> On 1/28/10 2:41 PM, Aymeric Moizard wrote:
>> 
>> 
>> On Thu, 28 Jan 2010, Henning Westerholt wrote:
>> 
>>> On Thursday 28 January 2010, Aymeric Moizard wrote:
>>>> Here is the backtrace I have. Unfortunately, without debug symbols!
>>>> I found the same for many of the kamailio processes: "sched_yield"
>>>> is pending forever. My system is a debian/etch.
>>>> 
>>>> #0  0xffffe424 in __kernel_vsyscall ()
>>>> #1  0xb7cef4ac in sched_yield () from /lib/tls/i686/cmov/libc.so.6
>>>> #2  0x080a93fd in tcp_send ()
>>>> #3  0xb7975679 in send_pr_buffer () from /usr/lib/kamailio/modules/tm.so
>>>> #4  0xb79789ac in t_forward_nonack () from /usr/lib/kamailio/modules/tm.so
>>>> #5  0xb7974784 in t_relay_to () from /usr/lib/kamailio/modules/tm.so
>>>> #6  0xb7983a11 in load_tm () from /usr/lib/kamailio/modules/tm.so
>>>> #7  0x081cf810 in mem_pool ()
>>>> #8  0x00000000 in ?? ()
>>>> 
>>>> I guess most t_relay operations towards my "mobipouce.com" domain,
>>>> which has one IP down, break the kamailio processes one after the
>>>> other... I'm not sure that every such t_relay operation breaks
>>>> exactly one thread each time.
>>>> 
>>>> I went through the lock/unlock calls in tcp_main.c, but it seems that
>>>> every lock at least has a matching unlock...
>>> 
>>> Hi Aymeric,
>>> 
>>> I remember that we observed these "sched_yield" problems on one old 0.9
>>> system after some time (like weeks or months). We did not find the
>>> solution in this case; after a restart it was gone again.
>>> 
>>> You mentioned in an earlier mail that you see this related to UDP traffic,
>>> but in the log file and also in your investigations you think it's related
>>> to TCP?
>> 
>> This is the exact case:
>> 1-> SUBSCRIBE sent over UDP and received by kamailio.
>> 2-> kamailio does a SRV record lookup for "mobipouce.com"
>> 3-> kamailio tries sip2.mobipouce.com (91.199.234.47) over TCP first
>> 4-> connection failed with logs:
>> Jan 27 12:56:38 ns26829 /usr/sbin/kamailio[9763]: ERROR:core:tcp_blocking_connect: poll error: flags 18
>> Jan 27 12:56:38 ns26829 /usr/sbin/kamailio[9763]: ERROR:core:tcp_blocking_connect: failed to retrieve SO_ERROR (111) Connection refused
>> Jan 27 12:56:38 ns26829 /usr/sbin/kamailio[9763]: ERROR:core:tcpconn_connect: tcp_blocking_connect failed
>> Jan 27 12:56:38 ns26829 /usr/sbin/kamailio[9763]: ERROR:core:tcp_send: connect failed
>> Jan 27 12:56:38 ns26829 /usr/sbin/kamailio[9763]: ERROR:tm:msg_send: tcp_send failed
>> Jan 27 12:56:38 ns26829 /usr/sbin/kamailio[9763]: ERROR:tm:t_forward_nonack: sending request failed
>> 5-> I guess kamailio is supposed to try the other SRV record value,
>>     sip2.mobipouce.com (91.199.234.46), but it doesn't
>> 
>> Thus, I'm guessing the issue is related to SRV records with failover OR just
>> to the TCP failure. It is not related to UDP at all.
>
> so TCP connect failed, the tcp worker returned as it prints the message and, 
> to be sure I got it right, the UDP worker (the one that received) got 
> blocked?

1-> TCP connect failed
2-> second SRV is used: TCP connect succeeds, but the process locks up in tcp_send

That's what I understand.
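
To illustrate what I mean by a lock in tcp_send, here is a minimal C sketch
of how an error path that returns without unlocking could leave later callers
spinning in sched_yield(). This is only an assumption about the failure mode,
not the actual tcp_main.c code: demo_lock_t, demo_lock_get, demo_lock_release
and demo_send are hypothetical names, and the spin limit exists only so that
the example terminates.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical spin lock that yields the CPU while waiting.
 * This is NOT Kamailio's lock implementation. */
typedef struct {
    atomic_flag held;
} demo_lock_t;

#define DEMO_LOCK_INIT { ATOMIC_FLAG_INIT }

/* Try to take the lock, calling sched_yield() between attempts.
 * Gives up after max_spins attempts so the demo terminates; a lock
 * without such a limit would keep yielding forever. */
static bool demo_lock_get(demo_lock_t *l, int max_spins)
{
    assert(l != NULL);
    for (int i = 0; i < max_spins; i++) {
        if (!atomic_flag_test_and_set(&l->held))
            return true;        /* flag was clear: we now own the lock */
        sched_yield();          /* lock is held: let someone else run */
    }
    return false;
}

static void demo_lock_release(demo_lock_t *l)
{
    assert(l != NULL);
    atomic_flag_clear(&l->held);
}

/* Hypothetical send path.
 * Returns 0 on success, -1 if the (simulated) connect failed, and -2 if
 * the lock could not be taken. With leak_on_error set, the failure path
 * returns without releasing the lock, which is the suspected bug. */
static int demo_send(demo_lock_t *l, bool connect_ok, bool leak_on_error)
{
    if (!demo_lock_get(l, 1000))
        return -2;              /* endless yield loop without the limit */

    if (!connect_ok) {
        if (!leak_on_error)
            demo_lock_release(l);
        return -1;
    }

    /* ... the data would be written to the connection here ... */
    demo_lock_release(l);
    return 0;
}
```

With a lock initialized from DEMO_LOCK_INIT, demo_send(&l, false, true)
returns -1 and leaves the lock held, so the next demo_send(&l, true, false)
returns -2 after using up its spins. Without the spin limit, that second call
would sit in sched_yield() forever, which would match the backtrace.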

I have tested a TCP connection to my server: it still seems to be
working.

>> It's definitely possible to reproduce the issue now!
>> 
>> I guess anyone can try their version of kamailio and t_relay a message
>> to "mobipouce.com", and they'll fall into that case! Sending plenty of
>> those messages will eventually lock all kamailio processes.
>
> All? tcp and udp?

Only udp!
Aymeric

> Cheers,
> Daniel
>
>> 
>> Regards,
>> Aymeric MOIZARD / ANTISIP
>> amsip - http://www.antisip.com
>> osip2 - http://www.osip.org
>> eXosip2 - http://savannah.nongnu.org/projects/exosip/
>> 
>> 
>>> Regards,
>>> 
>>> Henning
>>> 
>> 
>
> -- 
> Daniel-Constantin Mierla
> * http://www.asipto.com/
>
>
