[Kamailio-Users] kamailio / deadlock3

Aymeric Moizard jack at atosc.org
Thu Jan 28 13:59:45 CET 2010


Hi again,

Here is the backtrace I have, unfortunately without debug symbols!
I found the same backtrace in many of the kamailio processes: "sched_yield"
is pending forever. My system is Debian etch.

#0  0xffffe424 in __kernel_vsyscall ()
#1  0xb7cef4ac in sched_yield () from /lib/tls/i686/cmov/libc.so.6
#2  0x080a93fd in tcp_send ()
#3  0xb7975679 in send_pr_buffer () from /usr/lib/kamailio/modules/tm.so
#4  0xb79789ac in t_forward_nonack () from /usr/lib/kamailio/modules/tm.so
#5  0xb7974784 in t_relay_to () from /usr/lib/kamailio/modules/tm.so
#6  0xb7983a11 in load_tm () from /usr/lib/kamailio/modules/tm.so
#7  0x081cf810 in mem_pool ()
#8  0x00000000 in ?? ()

I guess most t_relay operations towards my "mobipouce.com" domain,
which has one IP down, break the kamailio processes one after the
other... I'm not sure that every such t_relay operation breaks
exactly one process each time.

I went through the lock/unlock calls in tcp_main.c, but it seems every
lock has a matching unlock, at least...
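
For reference, here is a minimal sketch in C of why a process can sit in
sched_yield() forever, as in frame #1 of the backtrace above. It is not
Kamailio's code: demo_lock_t, demo_lock_bounded, demo_unlock and demo_send
are hypothetical names, and the assumption is that the lock is a spin lock
that yields the CPU between attempts. The error branch returns while still
holding the lock, which is the kind of path that is easy to miss when
pairing lock and unlock calls by eye. The spin is bounded here only so that
the demo terminates.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical spin lock for illustration; not Kamailio's implementation. */
typedef struct {
    atomic_int held;   /* 0 = free, 1 = held */
} demo_lock_t;

/* Try to take the lock, yielding the CPU between attempts.
 * Gives up after max_spins attempts so that the demo terminates;
 * an unbounded spin lock would loop here forever.
 * Returns 1 if the lock was acquired, 0 if it gave up. */
static int demo_lock_bounded(demo_lock_t *l, int max_spins)
{
    for (int i = 0; i < max_spins; i++) {
        if (atomic_exchange(&l->held, 1) == 0)
            return 1;
        sched_yield();   /* a stuck process would be seen here in gdb */
    }
    return 0;
}

static void demo_unlock(demo_lock_t *l)
{
    assert(atomic_load(&l->held) == 1);   /* only unlock a held lock */
    atomic_store(&l->held, 0);
}

/* Hypothetical send path. The error branch deliberately returns
 * without unlocking, to show the effect on later callers. */
static int demo_send(demo_lock_t *l, int connect_ok)
{
    if (!demo_lock_bounded(l, 1000))
        return -2;       /* with an unbounded spin this would hang */
    if (!connect_ok)
        return -1;       /* deliberate bug: the lock is still held */
    demo_unlock(l);
    return 0;
}
```

With a lock initialised as demo_lock_t l = { 0 }, one call to
demo_send(&l, 0) returns -1 and leaves the lock held; the next
demo_send(&l, 1) then spins through every sched_yield() and returns -2.
With an unbounded loop and the lock in shared memory, each process that
reaches the same lock would stay in sched_yield(), one after the other.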

Tks,
Aymeric MOIZARD / ANTISIP
amsip - http://www.antisip.com
osip2 - http://www.osip.org
eXosip2 - http://savannah.nongnu.org/projects/exosip/


On Thu, 28 Jan 2010, Aymeric Moizard wrote:

>
>
>
> On Thu, 28 Jan 2010, Daniel-Constantin Mierla wrote:
>
>> Hello,
>> 
>> On 1/28/10 11:18 AM, Aymeric Moizard wrote:
>>> 
>>> Got some more info:
>>> 
>>> The UDP deadlock always seems to happen after a SUBSCRIBE
>>> is sent (in UDP) to mobipouce.com:
>>> 
>>> Jan 28 11:00:40 ns26829 /usr/sbin/kamailio[13363]: ERROR:core:tcp_blocking_connect: poll error: flags 18
>>> Jan 28 11:00:40 ns26829 /usr/sbin/kamailio[13363]: ERROR:core:tcp_blocking_connect: failed to retrieve SO_ERROR (111) Connection refused
>>> Jan 28 11:00:40 ns26829 /usr/sbin/kamailio[13363]: ERROR:core:tcpconn_connect: tcp_blocking_connect failed
>>> Jan 28 11:00:40 ns26829 /usr/sbin/kamailio[13363]: ERROR:core:tcp_send: connect failed
>>> Jan 28 11:00:40 ns26829 /usr/sbin/kamailio[13363]: ERROR:tm:msg_send: tcp_send failed
>>> Jan 28 11:00:40 ns26829 /usr/sbin/kamailio[13363]: ERROR:tm:t_forward_nonack: sending request failed
>>> 
>>> These log lines appear each time a SUBSCRIBE is relayed to another
>>> server, mobipouce.com, but the deadlock doesn't occur each time.
>>> 
>>> mobipouce.com is an existing, running server that I can connect to over
>>> UDP and TCP. However, the SRV record returns 2 hosts, one of which is down.
>>> (And I never get a reply to the SUBSCRIBE, whether or not it falls into
>>> the deadlock case.)
>>> 
>>> If I can reproduce it, what steps could I take to get more information
>>> about the issue? Any kamctl command?
>> 
>> Does it recover by itself, or do you have to restart? How much CPU usage do you see?
>
> I haven't noticed any CPU issue; I'll check exactly next time (but traffic
> is growing as kamailio doesn't answer any more).
>
>> If one or more processes are eating a lot of CPU, then use gdb to attach to
>> the pid of a process using a lot of CPU and get the backtrace:
>> 
>> gdb /path/to/kamailio pid
>
> I think I can reproduce it now, so I'll give it a try.
>
> It's definitely after the SRV check: the server chooses the
> sip2.mobipouce.com host, where no SIP server is running,
> and fails to connect. Then the network capture shows that
> kamailio is still sending a few SIP packets (like NOTIFY)
> but no SIP answers are coming out of kamailio.
>
> I will do more testing, but I guess one can reproduce it
> by relaying to mobipouce.com!
>
> Aymeric
>
>> Cheers,
>> Daniel
>> 
>>> 
>>> Regards,
>>> Aymeric MOIZARD / ANTISIP
>>> 
>>> 
>>> On Thu, 28 Jan 2010, Aymeric Moizard wrote:
>>> 
>>>> 
>>>> Hi again people!
>>>> 
>>>> I'm currently having some trouble with my sip.antisip.com server.
>>>> 
>>>> Over the past 2 or 3 days, kamailio has sometimes fallen into
>>>> some kind of deadlock.
>>>> 
>>>> I've been checking my logs while the deadlock happens, and it
>>>> seems (although I'm not sure with only the logs) that only UDP
>>>> support is broken: I can see some TLS and TCP registrations but
>>>> do not see the usual UDP traffic (keep-alives, for example).
>>>> 
>>>> Any idea?
>>>> 
>>>> Aymeric MOIZARD / ANTISIP
>>>> 
>>>> 
>>>> _______________________________________________
>>>> Kamailio (OpenSER) - Users mailing list
>>>> Users at lists.kamailio.org
>>>> http://lists.kamailio.org/cgi-bin/mailman/listinfo/users
>>>> http://lists.openser-project.org/cgi-bin/mailman/listinfo/users
>>>> 
>>> 
>> 
>> -- 
>> Daniel-Constantin Mierla
>> * http://www.asipto.com/
>> 
>> 
>
>


