Hi,

 

I have been load testing one Kamailio v5.1.6 instance with one rtpengine instance, using sipp to play media files at 40 cps (-r 40) with up to 1600 concurrent calls. If rtpengine is killed (pkill) and restarted a few times during a load test, Kamailio crashes. The crash is easy to reproduce, and every time the gdb backtrace points to the same place, as shown below.
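For anyone who wants to reproduce this, the kill/restart cycle can be scripted roughly as follows. This is only a sketch: the target address `kamailio.example:5060`, the scenario file `uac_media.xml` and the `systemctl start rtpengine` restart command are placeholders, not the exact values from my setup. By default it only prints the commands (dry run).

```shell
#!/bin/sh
# Reproduction sketch. TARGET, SCENARIO and the restart command are
# placeholders (assumptions); adapt them to the local setup.
TARGET="${TARGET:-kamailio.example:5060}"   # hypothetical Kamailio address
SCENARIO="${SCENARIO:-uac_media.xml}"       # hypothetical sipp scenario that plays media
RUN="${RUN-echo}"                           # dry run unless RUN is set to an empty string

start_load() {
    # -r 40: 40 calls per second; -l 1600: at most 1600 simultaneous calls;
    # -bg: run sipp in the background
    $RUN sipp -sf "$SCENARIO" -r 40 -l 1600 -bg "$TARGET"
}

bounce_rtpengine() {
    # kill rtpengine, leave it down for a couple of seconds, then start it again
    $RUN pkill rtpengine
    $RUN sleep 2
    $RUN systemctl start rtpengine   # assumed service name
}

start_load
for i in 1 2 3; do
    bounce_rtpengine
done
```

Running it with `RUN=` (empty) in the environment executes the commands instead of printing them.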


However, the same tests on Kamailio v5.0.7, with the same cfg files and the same rtpengine instance, did not cause any crash.


Here are the gdb backtraces from the two core dump files produced by a debug build of v5.1.6:


1. UDP receiver process 14483

{{{

[New LWP 14483]
Core was generated by `/usr/sbin/kamailio -P /var/run/kamailio/kamailio.pid -f /etc/kamailio/kamailio.'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fadfa824d8e in t_should_relay_response (Trans=0x7fadf4207730, new_code=200, branch=0, should_store=0x7ffd5038fce4, should_relay=0x7ffd5038fce0, cancel_data=0x7ffd5038fed0,
    reply=0x7fadfb545210) at t_reply.c:1282
1282    t_reply.c: No such file or directory.
(gdb) bt
#0  0x00007fadfa824d8e in t_should_relay_response (Trans=0x7fadf4207730, new_code=200, branch=0, should_store=0x7ffd5038fce4, should_relay=0x7ffd5038fce0, cancel_data=0x7ffd5038fed0,
    reply=0x7fadfb545210) at t_reply.c:1282
#1  0x00007fadfa829577 in relay_reply (t=0x7fadf4207730, p_msg=0x7fadfb545210, branch=0, msg_status=200, cancel_data=0x7ffd5038fed0, do_put_on_wait=1) at t_reply.c:1786
#2  0x00007fadfa82f54c in reply_received (p_msg=0x7fadfb545210) at t_reply.c:2537
#3  0x000000000054624b in do_forward_reply (msg=0x7fadfb545210, mode=0) at core/forward.c:747
#4  0x0000000000547e4c in forward_reply (msg=0x7fadfb545210) at core/forward.c:852
#5  0x000000000058e186 in receive_msg (
    buf=0xa595a0 <buf> "SIP/2.0 200 OK\r\nVia: SIP/2.0/UDP 192.168.70.102;branch=z9hG4bKa042.afac8eb973f1dfad7a549af0ab1a8ccc.0, SIP/2.0/UDP 192.168.60.80:5060;branch=z9hG4bK-3750-978-0\r\nFrom: sipp <sip:Customer68@192.168.60.8"..., len=888, rcv_info=0x7ffd50390480) at core/receive.c:364
#6  0x00000000004af6b1 in udp_rcv_loop () at core/udp_server.c:554
#7  0x00000000004246ac in main_loop () at main.c:1619
#8  0x000000000042bd5c in main (argc=13, argv=0x7ffd50390b38) at main.c:2638

}}}



2. Main process 14468
{{{
[New LWP 14468]
Core was generated by `/usr/sbin/kamailio -P /var/run/kamailio/kamailio.pid -f /etc/kamailio/kamailio.'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fadfbc77428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
54    ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x00007fadfbc77428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007fadfbc7902a in __GI_abort () at abort.c:89
#2  0x000000000041a029 in sig_alarm_abort (signo=14) at main.c:646
#3  <signal handler called>
#4  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:37
#5  0x00007fadf354e67d in futex_get (lock=0x7fadf3e94e50) at ../../core/parser/../mem/../futexlock.h:121
#6  0x00007fadf3561113 in mod_destroy () at rtpengine.c:1810
#7  0x000000000055132b in destroy_modules () at core/sr_module.c:832
#8  0x0000000000418c9f in cleanup (show_status=1) at main.c:521
#9  0x000000000041a313 in shutdown_children (sig=15, show_status=1) at main.c:663
#10 0x000000000041cfa5 in handle_sigs () at main.c:768
#11 0x0000000000425fb5 in main_loop () at main.c:1752
#12 0x000000000042bd5c in main (argc=13, argv=0x7ffd50390b38) at main.c:2638
}}}
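The backtraces above come from a plain `bt`. If more detail would help, something like the following could be run against each core file to also dump the local variables of every frame. It is a sketch under assumptions: the core file path passed in is hypothetical, and the binary path is the one from the "Core was generated by" line.

```shell
#!/bin/sh
# Sketch: print a full backtrace (with locals) from a Kamailio core file.
KAMAILIO_BIN="${KAMAILIO_BIN:-/usr/sbin/kamailio}"   # from "Core was generated by"

inspect_core() {
    core="$1"   # path to the core file (hypothetical; depends on the system's core pattern)
    if [ ! -r "$core" ]; then
        echo "core file not readable: $core" >&2
        return 1
    fi
    # -batch: exit after running the commands; -ex: execute one gdb command
    gdb -batch -ex "bt full" -ex "frame 0" -ex "info locals" "$KAMAILIO_BIN" "$core"
}

# Example (hypothetical file name):
#   inspect_core ./core.14483 > bt_full_14483.txt
```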

The parameters for rtpengine:

{{{

loadmodule "rtpengine.so"
modparam("rtpengine", "db_url", "text:///usr/share/kamailio/dbtext/kamailio")
modparam("rtpengine", "hash_table_size", 4)
modparam("rtpengine", "setid_default", 1)
modparam("rtpengine", "rtpengine_disable_tout", 20)
modparam("rtpengine", "rtpengine_retr", 1)
modparam("rtpengine", "setid_avp", "$avp(setid)")
modparam("rtpengine", "rtp_inst_pvar", "$avp(rtpInstance)")
modparam("rtpengine", "rtpengine_tout_ms", 1000)
modparam("rtpengine", "read_sdp_pv", "$var(sdpToRtpengine)")
modparam("rtpengine", "write_sdp_pv", "$var(sdpFromRtpengine)")

}}}


I'm using a simplified version of the kamailio.cfg that comes with the installation; here are the calls to rtpengine:

{{{

...

route[INVITE]
{
        $var(sdpToRtpengine) = $rb;
        $var(ret) = rtpengine_manage("direction=dirty direction=clean ICE=remove");
        xlog("L_INFO", "$ci INVITE: rtpengine chosen: $avp(rtpInstance)");
        remove_body();
        replace_body(".*", $var(sdpFromRtpengine));
        t_on_reply("RESPONSE");
 
        route(RELAY);
}

onreply_route[RESPONSE]
{
        $var(sdpToRtpengine) = $rb;
        $var(ret) = rtpengine_manage("direction=clean direction=dirty ICE=remove");
        remove_body();
        replace_body(".*", $var(sdpFromRtpengine));
        xlog("L_INFO", "$ci RESPONSE: $rm - $rs $rr, cseq=$cs, by [$hdr(Server)], from $si:$sp");
}
...

}}}


When rtpengine was down for a couple of seconds, there were a lot of SIP retransmissions and timeouts. Running netstat showed that Kamailio’s receive buffer was filling up.
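The receive buffer check can be narrowed down with a small filter like the one below. It is a sketch that assumes Kamailio listens on UDP port 5060 and that `netstat -anu` prints the usual `Proto Recv-Q Send-Q Local-Address ...` columns.

```shell
#!/bin/sh
# Print UDP sockets on the given port whose receive queue (Recv-Q) is non-empty.
# Reads `netstat -anu` style output on stdin.
udp_backlog() {
    port="${1:-5060}"   # assumed SIP port
    awk -v port="$port" \
        '$1 ~ /^udp/ && $4 ~ (":" port "$") && $2 > 0 { print $4, "Recv-Q=" $2 }'
}

# Typical use while rtpengine is down:
#   netstat -anu | udp_backlog 5060
```

A Recv-Q value that keeps growing while rtpengine is down would be consistent with the receiver processes being blocked waiting for rtpengine.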

 

Please let me know if more information is needed. Thank you!

 

Cheers,

Yufei