Hello,
thanks for spending more time on it! I am currently out of the office,
but I will try to reproduce it on my side in the next few days and then
see what I can find.
Cheers,
Daniel
On 22.01.19 15:36, Yufei Tao wrote:
Hi Daniel,
I tested the latest v5.2.1 Debian package and reproduced the crash as
well. There are again two core dump files, similar to those from 5.1.6:
1. 1687 - udp receiver process
{{{
[New LWP 1687]
Core was generated by `/usr/sbin/kamailio -P
/var/run/kamailio/kamailio.pid -f /etc/kamailio/kamailio.'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f015b3795cc in t_should_relay_response
(Trans=0x7f015cce9e98, new_code=200, branch=0,
should_store=0x7ffd5b4c5e24, should_relay=0x7ffd5b4c5e20,
cancel_data=0x7ffd5b4c6010,
reply=0x7f0161407ab8) at t_reply.c:1279
1279 t_reply.c: No such file or directory.
(gdb) bt
#0 0x00007f015b3795cc in t_should_relay_response
(Trans=0x7f015cce9e98, new_code=200, branch=0,
should_store=0x7ffd5b4c5e24, should_relay=0x7ffd5b4c5e20,
cancel_data=0x7ffd5b4c6010,
reply=0x7f0161407ab8) at t_reply.c:1279
#1 0x00007f015b37dec7 in relay_reply (t=0x7f015cce9e98,
p_msg=0x7f0161407ab8, branch=0, msg_status=200,
cancel_data=0x7ffd5b4c6010, do_put_on_wait=1) at t_reply.c:1804
#2 0x00007f015b383eaa in reply_received (p_msg=0x7f0161407ab8) at
t_reply.c:2539
#3 0x000000000054e7f0 in do_forward_reply (msg=0x7f0161407ab8,
mode=0) at core/forward.c:747
#4 0x0000000000550415 in forward_reply (msg=0x7f0161407ab8) at
core/forward.c:852
#5 0x0000000000599159 in receive_msg (
buf=0xa6ec80 <buf> "SIP/2.0 200 OK\r\nVia: SIP/2.0/UDP
192.168.70.101;branch=z9hG4bK155f.f4284a7086985c9b088dc7c0dd32c63e.0,
SIP/2.0/UDP 192.168.60.80:5060;branch=z9hG4bK-5164-4615-0\r\nFrom:
sipp <sip:Customer69@192.168.60."..., len=886,
rcv_info=0x7ffd5b4c65d0) at core/receive.c:433
#6 0x00000000004b22e8 in udp_rcv_loop () at core/udp_server.c:541
#7 0x0000000000425205 in main_loop () at main.c:1645
#8 0x000000000042c9a5 in main (argc=13, argv=0x7ffd5b4c6c98) at
main.c:2675
}}}
2. 1673 - main process
{{{
[New LWP 1673]
Core was generated by `/usr/sbin/kamailio -P
/var/run/kamailio/kamailio.pid -f /etc/kamailio/kamailio.'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f0162344428 in __GI_raise (sig=sig@entry=6) at
../sysdeps/unix/sysv/linux/raise.c:54
54 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007f0162344428 in __GI_raise (sig=sig@entry=6) at
../sysdeps/unix/sysv/linux/raise.c:54
#1 0x00007f016234602a in __GI_abort () at abort.c:89
#2 0x000000000041a836 in sig_alarm_abort (signo=14) at main.c:663
#3 <signal handler called>
#4 syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:37
#5 0x00007f0160a259ed in futex_get (lock=0x7f015c2739c8) at
../../core/parser/../mem/../futexlock.h:121
#6 0x00007f0160a395bc in mod_destroy () at rtpengine.c:1941
#7 0x00000000005589e2 in destroy_modules () at core/sr_module.c:732
#8 0x000000000041940b in cleanup (show_status=1) at main.c:537
#9 0x000000000041ab21 in shutdown_children (sig=15, show_status=1) at
main.c:680
#10 0x000000000041d7c3 in handle_sigs () at main.c:785
#11 0x0000000000426b23 in main_loop () at main.c:1780
#12 0x000000000042c9a5 in main (argc=13, argv=0x7ffd5b4c6c98) at
main.c:2675
}}}
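To check whether two gdb backtraces follow the same call path (as with the 5.1.6 and 5.2.1 dumps in this thread), the frame function names can be extracted and compared. Below is a minimal Python sketch of that idea. `frame_functions` and its regular expression are illustrative helpers, not part of gdb or Kamailio, and the sample text is just the first three frames copied from each SIGSEGV trace in this thread.

```python
import re
from typing import List

# Matches the start of a gdb frame line, e.g.
#   "#1 0x00007f015b37dec7 in relay_reply (t=..."
#   "#4 syscall () at ..."
# Lines such as "#3 <signal handler called>" are skipped on purpose.
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?([A-Za-z_]\w*)")


def frame_functions(backtrace: str) -> List[str]:
    """Return the function name of each frame, in frame order."""
    funcs = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            funcs.append(match.group(2))
    return funcs


# First three frames of the v5.2.1 SIGSEGV trace (wrapped as in the mail).
BT_521 = """\
#0 0x00007f015b3795cc in t_should_relay_response
(Trans=0x7f015cce9e98, new_code=200, branch=0,
#1 0x00007f015b37dec7 in relay_reply (t=0x7f015cce9e98,
#2 0x00007f015b383eaa in reply_received (p_msg=0x7f0161407ab8) at
t_reply.c:2539
"""

# First three frames of the v5.1.6 SIGSEGV trace.
BT_516 = """\
#0 0x00007fadfa824d8e in t_should_relay_response
(Trans=0x7fadf4207730, new_code=200, branch=0,
#1 0x00007fadfa829577 in relay_reply (t=0x7fadf4207730,
#2 0x00007fadfa82f54c in reply_received (p_msg=0x7fadfb545210)
at t_reply.c:2537
"""

# True when both crashes go through the same sequence of functions.
same_path = frame_functions(BT_521) == frame_functions(BT_516)
print(frame_functions(BT_521), same_path)
```

Matching only on the frame prefix keeps the comparison independent of addresses and source line numbers, which differ between the two builds.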
I used the example kamailio-minimal-proxy.cfg from the
/misc/examples/mixed/ directory of the 5.2.1 source and added the
rtpengine parameters and calls to rtpengine functions, as the cfg file
for 5.1.6 didn't work for 5.2.1. Attached is the kamailio.cfg for
v5.2.1 that I used in the tests.
Cheers,
Yufei
On Tue, 22 Jan 2019 at 07:33, Daniel-Constantin Mierla
<miconda(a)gmail.com> wrote:
Hello,
can you share with me the full config, along with the sipp scenario
files and commands you used for testing? I would like to reproduce
it in my test environment to be able to troubleshoot.
Also, can you try with the latest version from the 5.2 branch? I pushed
some fixes recently to rtpengine, as well as a rework of reply
handling inside the tm module -- these were done because there were
some similar reports before, but none of them came with a way to
reproduce. Since you can reproduce it, if I can test it here I can be
sure whether the proper fix was done or the issue is somewhere else.
Cheers,
Daniel
On 21.01.19 18:48, Yufei Tao wrote:
Hi,
I have been testing one Kamailio v5.1.6 instance with one
rtpengine instance, using sipp playing media files at 40 cps (-r
40) with up to 1600 concurrent calls. During the load tests, if
rtpengine is pkill'ed/restarted a few times, Kamailio crashes.
It is quite repeatable, and every time the backtrace from gdb
points to the same place, as shown below.
However the same tests on Kamailio v5.0.7 with the same cfg files
and the same rtpengine instance did not cause any crash.
Here’s what I got from the gdb backtrace for v5.1.6 using a dbg
build, from 2 core dump files:
1. UDP receiver process 14483
{{{
[New LWP 14483]
Core was generated by `/usr/sbin/kamailio -P
/var/run/kamailio/kamailio.pid -f /etc/kamailio/kamailio.'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007fadfa824d8e in t_should_relay_response
(Trans=0x7fadf4207730, new_code=200, branch=0,
should_store=0x7ffd5038fce4, should_relay=0x7ffd5038fce0,
cancel_data=0x7ffd5038fed0,
reply=0x7fadfb545210) at t_reply.c:1282
1282 t_reply.c: No such file or directory.
(gdb) bt
#0 0x00007fadfa824d8e in t_should_relay_response
(Trans=0x7fadf4207730, new_code=200, branch=0,
should_store=0x7ffd5038fce4, should_relay=0x7ffd5038fce0,
cancel_data=0x7ffd5038fed0,
reply=0x7fadfb545210) at t_reply.c:1282
#1 0x00007fadfa829577 in relay_reply (t=0x7fadf4207730,
p_msg=0x7fadfb545210, branch=0, msg_status=200,
cancel_data=0x7ffd5038fed0, do_put_on_wait=1) at t_reply.c:1786
#2 0x00007fadfa82f54c in reply_received (p_msg=0x7fadfb545210)
at t_reply.c:2537
#3 0x000000000054624b in do_forward_reply (msg=0x7fadfb545210,
mode=0) at core/forward.c:747
#4 0x0000000000547e4c in forward_reply (msg=0x7fadfb545210) at
core/forward.c:852
#5 0x000000000058e186 in receive_msg (
buf=0xa595a0 <buf> "SIP/2.0 200 OK\r\nVia: SIP/2.0/UDP
192.168.70.102;branch=z9hG4bKa042.afac8eb973f1dfad7a549af0ab1a8ccc.0,
SIP/2.0/UDP 192.168.60.80:5060;branch=z9hG4bK-3750-978-0\r\nFrom:
sipp <sip:Customer68@192.168.60.8"..., len=888,
rcv_info=0x7ffd50390480) at core/receive.c:364
#6 0x00000000004af6b1 in udp_rcv_loop () at core/udp_server.c:554
#7 0x00000000004246ac in main_loop () at main.c:1619
#8 0x000000000042bd5c in main (argc=13, argv=0x7ffd50390b38) at
main.c:2638
}}}
2. Main process 14468
{{{
[New LWP 14468]
Core was generated by `/usr/sbin/kamailio -P
/var/run/kamailio/kamailio.pid -f /etc/kamailio/kamailio.'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007fadfbc77428 in __GI_raise (sig=sig@entry=6) at
../sysdeps/unix/sysv/linux/raise.c:54
54 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007fadfbc77428 in __GI_raise (sig=sig@entry=6) at
../sysdeps/unix/sysv/linux/raise.c:54
#1 0x00007fadfbc7902a in __GI_abort () at abort.c:89
#2 0x000000000041a029 in sig_alarm_abort (signo=14) at main.c:646
#3 <signal handler called>
#4 syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:37
#5 0x00007fadf354e67d in futex_get (lock=0x7fadf3e94e50) at
../../core/parser/../mem/../futexlock.h:121
#6 0x00007fadf3561113 in mod_destroy () at rtpengine.c:1810
#7 0x000000000055132b in destroy_modules () at core/sr_module.c:832
#8 0x0000000000418c9f in cleanup (show_status=1) at main.c:521
#9 0x000000000041a313 in shutdown_children (sig=15,
show_status=1) at main.c:663
#10 0x000000000041cfa5 in handle_sigs () at main.c:768
#11 0x0000000000425fb5 in main_loop () at main.c:1752
#12 0x000000000042bd5c in main (argc=13, argv=0x7ffd50390b38) at
main.c:2638
}}}
The parameters for rtpengine:
{{{
loadmodule "rtpengine.so"
modparam("rtpengine", "db_url",
"text:///usr/share/kamailio/dbtext/kamailio")
modparam("rtpengine", "hash_table_size", 4)
modparam("rtpengine", "setid_default", 1)
modparam("rtpengine", "rtpengine_disable_tout", 20)
modparam("rtpengine", "rtpengine_retr", 1)
modparam("rtpengine", "setid_avp", "$avp(setid)")
modparam("rtpengine", "rtp_inst_pvar",
"$avp(rtpInstance)")
modparam("rtpengine", "rtpengine_tout_ms", 1000)
modparam("rtpengine", "read_sdp_pv",
"$var(sdpToRtpengine)")
modparam("rtpengine", "write_sdp_pv",
"$var(sdpFromRtpengine)")
}}}
I'm using a simplified version of the kamailio.cfg from the
installation; here are the calls to rtpengine:
{{{
...
route[INVITE]
{
$var(sdpToRtpengine) = $rb;
$var(ret) = rtpengine_manage("direction=dirty
direction=clean ICE=remove");
xlog("L_INFO", "$ci INVITE: rtpengine chosen:
$avp(rtpInstance)");
remove_body();
replace_body(".*", $var(sdpFromRtpengine));
t_on_reply("RESPONSE");
route(RELAY);
}
onreply_route[RESPONSE]
{
$var(sdpToRtpengine) = $rb;
$var(ret) = rtpengine_manage("direction=clean
direction=dirty ICE=remove");
remove_body();
replace_body(".*", $var(sdpFromRtpengine));
xlog("L_INFO", "$ci RESPONSE: $rm - $rs $rr, cseq=$cs,
by [$hdr(Server)], from $si:$sp");
}
...
}}}
When rtpengine is down for a couple of seconds, there are a lot
of SIP retransmissions and timeouts. Running netstat, I can
see that Kamailio’s receive buffer is quite full.
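For anyone wanting to watch the same receive-queue build-up without netstat, the per-socket queue sizes can also be read from /proc/net/udp on Linux. The following is a minimal Python sketch, not the method used in the tests above. `udp_rx_queues` is a hypothetical helper, port 5060 is an assumed listen port, and the sample line contains made-up values in the kernel's column layout.

```python
from typing import List, Tuple


def udp_rx_queues(proc_net_udp: str, port: int) -> List[Tuple[int, int, int]]:
    """Return (local_port, rx_queue, drops) for UDP sockets bound to `port`.

    `proc_net_udp` is the text content of /proc/net/udp. Ports and queue
    sizes are hexadecimal in that file; the drops column is decimal.
    """
    results = []
    for line in proc_net_udp.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 13:
            continue  # not a complete socket row
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if local_port != port:
            continue
        # Column 5 is "tx_queue:rx_queue"; take the receive side.
        rx_queue = int(fields[4].split(":")[1], 16)
        drops = int(fields[12])
        results.append((local_port, rx_queue, drops))
    return results


# Made-up sample in the /proc/net/udp layout (0x13C4 == 5060).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode ref pointer drops
  486: 0100007F:13C4 00000000:0000 07 00000000:00034000 00:00000000 00000000     0        0 12345 2 0000000000000000 17
"""

print(udp_rx_queues(SAMPLE, 5060))  # [(5060, 212992, 17)]
```

On a live system the input would be the content of /proc/net/udp, for example `open("/proc/net/udp").read()`. A steadily growing rx_queue or drops value during the rtpengine outage would match the netstat observation.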
Please let me know if more information is needed. Thank you!
Cheers,
Yufei
_______________________________________________
Kamailio (SER) - Users Mailing List
sr-users(a)lists.kamailio.org
https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
--
Daniel-Constantin Mierla -- www.asipto.com
www.twitter.com/miconda -- www.linkedin.com/in/miconda
Kamailio World Conference - May 6-8, 2019 -- www.kamailioworld.com
Kamailio Advanced Training - Mar 4-6, 2019 in Berlin; Mar 25-27, 2019,
in Washington, DC, USA -- www.asipto.com