[SR-Users] Kamailio 1.5 stops processing messages

Santiago Gimeno santiago.gimeno at gmail.com
Fri May 14 12:18:45 CEST 2010


Hi,

I have experienced a new lock. I have seen that 4 of the UDP worker
processes, but not all, have gone 100% cpu. Kamailio was still able to
process messages through the rest of UDP worker processes. By checking the
logs I can see that the 4 blocked processes have stopped handling messages
at the same time.

I have attached to the blocked processes and the backtrace looks similar to
the previous one except that it seems that the lock is not exactly related
to t_replicate.

#0  0xb7f24410 in ?? ()
#1  0xbf9cdeb8 in ?? ()
#2  0x00000001 in ?? ()
#3  0xa72eb318 in ?? ()
#4  0xb7e804ac in sched_yield () from /lib/tls/i686/cmov/libc.so.6
#5  0xb7aee463 in lock_hash (i=13692) at ../../mem/../fastlock.h:182
#6  0xb7b09587 in t_lookup_request (p_msg=0x81d9708, leave_new_locked=1) at
t_lookup.c:468
#7  0xb7b0a4ae in t_newtran (p_msg=0x81d9708) at t_lookup.c:1124
#8  0xb7afc40c in t_relay_to (p_msg=0x81d9708, proxy=0x0, flags=8) at
t_funcs.c:212
#9  0xb7b0fac7 in w_t_relay (p_msg=0x81d9708, proxy=0x0, flags=0x8 <Address
0x8 out of bounds>) at tm.c:1002
#10 0x0805301c in do_action (a=0x818c370, msg=0x81d9708) at action.c:874
#11 0x080557aa in run_action_list (a=0x818c370, msg=0x81d9708) at
action.c:145
#12 0x0809c304 in eval_expr (e=0x818c3d8, msg=0x81d9708, val=0x0) at
route.c:1171
#13 0x0809bd80 in eval_expr (e=0x818c400, msg=0x81d9708, val=0x0) at
route.c:1488
#14 0x0809bd16 in eval_expr (e=0x818c428, msg=0x81d9708, val=0x0) at
route.c:1493
#15 0x080527ed in do_action (a=0x818c740, msg=0x81d9708) at action.c:729
#16 0x080557aa in run_action_list (a=0x818be08, msg=0x81d9708) at
action.c:145
#17 0x08053efb in do_action (a=0x81a6a48, msg=0x81d9708) at action.c:120
#18 0x080557aa in run_action_list (a=0x81a14e0, msg=0x81d9708) at
action.c:145
#19 0x08053efb in do_action (a=0x81c29f0, msg=0x81d9708) at action.c:120
#20 0x080557aa in run_action_list (a=0x81c2698, msg=0x81d9708) at
action.c:145
#21 0x08054491 in do_action (a=0x81c2b48, msg=0x81d9708) at action.c:746
#22 0x080557aa in run_action_list (a=0x81c2030, msg=0x81d9708) at
action.c:145
#23 0x08053efb in do_action (a=0x8193278, msg=0x81d9708) at action.c:120
#24 0x080557aa in run_action_list (a=0x8193278, msg=0x81d9708) at
action.c:145
#25 0x08054491 in do_action (a=0x8193418, msg=0x81d9708) at action.c:746
#26 0x080557aa in run_action_list (a=0x8193418, msg=0x81d9708) at
action.c:145
#27 0x08054f2d in do_action (a=0x8193480, msg=0x81d9708) at action.c:752
#28 0x080557aa in run_action_list (a=0x81922f0, msg=0x81d9708) at
action.c:145
#29 0x08054491 in do_action (a=0x81934e8, msg=0x81d9708) at action.c:746
#30 0x080557aa in run_action_list (a=0x818ee78, msg=0x81d9708) at
action.c:145
#31 0x08053efb in do_action (a=0x81b3448, msg=0x81d9708) at action.c:120
#32 0x080557aa in run_action_list (a=0x81b0ed0, msg=0x81d9708) at
action.c:145
#33 0x08054491 in do_action (a=0x81b5f68, msg=0x81d9708) at action.c:746
#34 0x080557aa in run_action_list (a=0x81b5f68, msg=0x81d9708) at
action.c:145
#35 0x08054f2d in do_action (a=0x81b5fd0, msg=0x81d9708) at action.c:752
#36 0x080557aa in run_action_list (a=0x81aefd0, msg=0x81d9708) at
action.c:145
#37 0x08053efb in do_action (a=0x818bc08, msg=0x81d9708) at action.c:120
#38 0x080557aa in run_action_list (a=0x8187910, msg=0x81d9708) at
action.c:145
#39 0x08055b43 in run_top_route (a=0x8187910, msg=0x81d9708) at action.c:120
#40 0x0808c659 in receive_msg (
    buf=0x8158040 "INVITE
sip:xxxxxxxx at xxxxxxx.com<sip%3Axxxxxxxx at xxxxxxx.com>SIP/2.0\r\nRecord-Route:
<sip:10.100.2.252;lr=on;ftag=as6897d90b>\r\nVia:
SIP/2.0/UDP 10.100.2.252;branch=z9hG4bKc753.2c8885d4.0\r\nVia: SIP/2.0/UDP
10.100.2.253:5060;bra"..., len=896, rcv_info=0xbf9d0aa4) at receive.c:175
#41 0x080c3ea3 in udp_rcv_loop () at udp_server.c:449
#42 0x0806e394 in main (argc=9, argv=0xbf9d0c84) at main.c:774

I hope this provides some new hints on what might be happening.

Thanks,

Santi


2010/5/13 Santiago Gimeno <santiago.gimeno at gmail.com>

> Hi,
>
> We don't use any flags in the t_replicate function, just the uri of the
> backup proxy.
> I think we are not creating any transaction before t_replicate.
>
>
> Thanks,
>
> Santi
>
> 2010/5/13 Daniel-Constantin Mierla <miconda at gmail.com>
>
>>  Hello,
>>
>> what are the parameters for t_replicate(), any flags? Do you create a
>> transaction for register before t_replicate (e.g., by calling t_newtran in
>> config)?
>>
>> Cheers,
>> Daniel
>>
>>
>>
>> On 5/13/10 11:53 AM, Santiago Gimeno wrote:
>>
>> Hi,
>>
>> It happens randomly, after some hours or some days, no particular pattern.
>> There is no error message in the syslog. The debug level is set at 0.
>> Would a higher debug level help to track the issue?
>> We only use t_replicate to send the REGISTER to a backup proxy.
>>
>> Thanks,
>>
>> Santi
>>
>> 2010/5/13 Daniel-Constantin Mierla <miconda at gmail.com>
>>
>>> Hello,
>>>
>>> seems to be something related to t_replicate(). Can you reproduce it
>>> easily? If not, how often does it happen? Do you see any particular error
>>> messages in syslog, before frozen?
>>>
>>> Cheers,
>>> Daniel
>>>
>>>
>>> On 5/13/10 10:39 AM, Santiago Gimeno wrote:
>>>
>>>  Hi,
>>>
>>> I'm working with Kamailio 1.5 (rev 5968) and from time to time kamailio
>>> freezes and stops processing messages. Kamailio is working only in UDP. I
>>> really don't know in which conditions this problem is happening: it happens
>>> both in high and low load.
>>> Here is a backtrace of one of the UDP working processes when kamailio is
>>> frozen (all the other UDP process backtraces look similar) :
>>>
>>> #0  0xb7fdf410 in ?? ()
>>> #1  0xbfbfec28 in ?? ()
>>> #2  0x00000001 in ?? ()
>>> #3  0xa7392ad0 in ?? ()
>>> #4  0xb7f3b4ac in sched_yield () from /lib/tls/i686/cmov/libc.so.6
>>> #5  0xb7ba9463 in lock_hash (i=10361) at ../../mem/../fastlock.h:182
>>> #6  0xb7bc4587 in t_lookup_request (p_msg=0x823fc70, leave_new_locked=1)
>>> at t_lookup.c:468
>>> #7  0xb7bc54ae in t_newtran (p_msg=0x823fc70) at t_lookup.c:1124
>>> #8  0xb7bb740c in t_relay_to (p_msg=0x823fc70, proxy=0x0, flags=1) at
>>> t_funcs.c:212
>>> #9  0xb7bb8f6c in t_replicate (p_msg=0x823fc70, dst=0x817cdf8, flags=1)
>>> at t_fwd.c:763
>>> #10 0xb7bc7fce in w_t_replicate (p_msg=0x823fc70, dst=0x817cdf8
>>> "\230ê\030\b\016", flags=0x0) at tm.c:957
>>> #11 0x0805301c in do_action (a=0x818eb20, msg=0x823fc70) at action.c:874
>>> #12 0x080557aa in run_action_list (a=0x818eb20, msg=0x823fc70) at
>>> action.c:145
>>> #13 0x08054491 in do_action (a=0x818ec78, msg=0x823fc70) at action.c:746
>>> #14 0x080557aa in run_action_list (a=0x818c940, msg=0x823fc70) at
>>> action.c:145
>>> #15 0x08053efb in do_action (a=0x81b38e0, msg=0x823fc70) at action.c:120
>>> #16 0x080557aa in run_action_list (a=0x81b3878, msg=0x823fc70) at
>>> action.c:145
>>> #17 0x08054491 in do_action (a=0x81b5f00, msg=0x823fc70) at action.c:746
>>> #18 0x080557aa in run_action_list (a=0x81b5f00, msg=0x823fc70) at
>>> action.c:145
>>> #19 0x08054f2d in do_action (a=0x81b5f68, msg=0x823fc70) at action.c:752
>>> #20 0x080557aa in run_action_list (a=0x81b5f68, msg=0x823fc70) at
>>> action.c:145
>>> #21 0x08054f2d in do_action (a=0x81b5fd0, msg=0x823fc70) at action.c:752
>>> #22 0x080557aa in run_action_list (a=0x81aefd0, msg=0x823fc70) at
>>> action.c:145
>>> #23 0x08053efb in do_action (a=0x818bc08, msg=0x823fc70) at action.c:120
>>> #24 0x080557aa in run_action_list (a=0x8187910, msg=0x823fc70) at
>>> action.c:145
>>> #25 0x08055b43 in run_top_route (a=0x8187910, msg=0x823fc70) at
>>> action.c:120
>>> #26 0x0808c659 in receive_msg (
>>>     buf=0x8158040 "REGISTER sip:xxxxxxxxxxx.com SIP/2.0\r\nFrom: <
>>> sip:xxxxxx at xxxxxxxxxxxx.com <sip%3Axxxxxx at xxxxxxxxxxxx.com>>;tag=as516fcce2\r\nTo:
>>> <sip:xxxxxxx at xxxxxxxxxxxx.com <sip%3Axxxxxxx at xxxxxxxxxxxx.com>>\r\nCall-ID:
>>> 01e4ea8e185c85332ea68874543c810e at xxxxxxxxxxxxxx.com\r\nCSeq: 383
>>> REGISTER\r\nVia: SIP"..., len=794, rcv_info=0xbfc004d4) at receive.c:175
>>> #27 0x080c3ea3 in udp_rcv_loop () at udp_server.c:449
>>> #28 0x0806e394 in main (argc=9, argv=0xbfc006b4) at main.c:774
>>>
>>> Any idea of what might be happening?
>>>
>>> Best regards,
>>>
>>> Santi
>>>
>>>
>>> _______________________________________________
>>> SIP Express Router (SER) and Kamailio (OpenSER) - sr-users mailing list
>>> sr-users at lists.sip-router.orghttp://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-users
>>>
>>>
>>> --
>>> Daniel-Constantin Mierla
>>> * http://www.asipto.com/
>>> * http://twitter.com/miconda
>>> * http://www.linkedin.com/in/danielconstantinmierla
>>>
>>>
>>
>> --
>> Daniel-Constantin Mierla
>> Kamailio Advanced Training for Carriers
>> June 21-23, 2010, Miami, Fl, USA
>> * http://www.asipto.com/
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.sip-router.org/pipermail/sr-users/attachments/20100514/848a27f8/attachment.htm>


More information about the sr-users mailing list