[sr-dev] [kamailio/kamailio] TM crash when relaying reply, in context of unreachable rtpengine (Issue #3297)
Alex Balashov
notifications at github.com
Thu Dec 8 22:41:40 CET 2022
When relaying this `200 OK` repeatedly with calls to `rtpengine_answer()` failing because all RTPEngines in a set are unreachable...
```
Dec 8 06:24:38 gw kamailio[4823]: ERROR: rtpengine [rtpengine.c:2960]: select_rtpp_node(): rtpengine failed to select new for calllen=36 callid=8279a4ea-151e-48de-a23c-5840104126de
```
```
SIP/2.0 200 OK
v:SIP/2.0/UDP 10.150.20.20;branch=z9hG4bK03e.e4da444e627bb575ef4f46151729a1fa.0,SIP/2.0/UDP 10.151.20.108;received=10.151.20.108;rport=5060;branch=z9hG4bKKQyrBKpp04X0K
Record-Route:<sip:10.150.20.20;lr;ftag=emHm2apy95jFe;rtp_relay=1;rtp_group=0>
f:"14045551212"<sip:14045551212 at 10.151.20.108>;tag=emHm2apy95jFe
t:<sip:16782001111 at internal.evaristesys.com>;tag=tHQt0rDcF7UHK
i:8279a4ea-151e-48de-a23c-5840104126de
CSeq:60686471 INVITE
m:<sip:16782001111 at 10.160.10.55:5060;transport=udp>
User-Agent:VoiceAppServer
Allow:INVITE,ACK,BYE,CANCEL,OPTIONS,MESSAGE,INFO,UPDATE,REFER,NOTIFY
k:path,replaces
u:talk,hold,conference,refer
c:application/sdp
Content-Disposition:session
l:257
v=0
o=DM 1670428117 1670428118 IN IP4 10.160.10.55
s=DM
c=IN IP4 10.160.10.55
t=0 0
m=audio 52418 RTP/AVP 0 101
a=rtpmap:0 PCMU/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16
a=ptime:20
a=rtcp:52419 IN IP4 10.160.10.55
```
Repeat INVITEs eventually result in a crash. This backtrace is taken from a different server but identical scenario:
```
#0 t_should_relay_response (Trans=0x7f4aee56ba80, new_code=200, branch=0, should_store=0x7ffd8b8bbba0, should_relay=0x7ffd8b8bbba4, cancel_data=0x7ffd8b8bbe40,
reply=0x7f4afe40df70) at t_reply.c:1285
1285 && !(inv_through && Trans->uac[branch].last_received<300)) {
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.192.el6.x86_64 jansson-2.11-1.el6.x86_64 keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-57.el6.x86_64 libcom_err-1.41.12-22.el6.x86_64 libev-4.03-3.el6.x86_64 libselinux-2.0.94-7.el6.x86_64 libunistring-0.9.3-5.el6.x86_64 openssl-1.0.1e-48.el6_8.3.x86_64 sqlite-3.6.20-1.el6_7.2.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) where
#0 t_should_relay_response (Trans=0x7f4aee56ba80, new_code=200, branch=0, should_store=0x7ffd8b8bbba0, should_relay=0x7ffd8b8bbba4, cancel_data=0x7ffd8b8bbe40,
reply=0x7f4afe40df70) at t_reply.c:1285
#1 0x00007f4afd73ab7d in relay_reply (t=0x7f4aee56ba80, p_msg=0x7f4afe40df70, branch=0, msg_status=200, cancel_data=0x7ffd8b8bbe40, do_put_on_wait=1)
at t_reply.c:1821
#2 0x00007f4afd740c38 in reply_received (p_msg=0x7f4afe40df70) at t_reply.c:2558
#3 0x00000000004b5d6a in do_forward_reply (msg=0x7f4afe40df70, mode=0) at core/forward.c:747
#4 0x00000000004b78c9 in forward_reply (msg=0x7f4afe40df70) at core/forward.c:852
#5 0x000000000054982b in receive_msg (
buf=0xb27700 "SIP/2.0 200 OK\r\nv:SIP/2.0/UDP xxx;branch=z9hG4bK9915.dcfe6d03cb21e6c94a6705adb095c566.0,SIP/2.0/UDP xxx;received=xxx;rport=5060;branch=z9hG4bKZmZp0NZv3FvQB\r\nRecord-Rout"..., len=984, rcv_info=0x7ffd8b8bc6c0) at core/receive.c:434
#6 0x00000000006658ca in udp_rcv_loop () at core/udp_server.c:541
#7 0x000000000042500f in main_loop () at main.c:1655
#8 0x000000000042c5bb in main (argc=11, argv=0x7ffd8b8bcc88) at main.c:2696
```
Looks like in the crash frame, `Trans->uac` is `NULL`:
```
(gdb) frame 0
#0 t_should_relay_response (Trans=0x7f4aee56ba80, new_code=200, branch=0, should_store=0x7ffd8b8bbba0, should_relay=0x7ffd8b8bbba4, cancel_data=0x7ffd8b8bbe40,
reply=0x7f4afe40df70) at t_reply.c:1285
1285 && !(inv_through && Trans->uac[branch].last_received<300)) {
(gdb) print Trans->uac
$1 = (struct ua_client *) 0x0
```
I have seen this crash from time to time, though it is infrequent. It always seems to occur when there are problems reaching RTPEngine in a timely fashion.
This is on Kamailio 5.2.4:de9a03, which I know has long fallen off the back of the maintenance train. However, line 1386 of `t_reply.c` in `master:HEAD` shows that the code is no different now:
https://github.com/kamailio/kamailio/blob/master/src/modules/tm/t_reply.c#L1386
Of course, it's possible that something upstream has changed so as to prevent this state. Otherwise, I suppose a null pointer safety check for `Trans->uac` is recommended. I'd be happy to submit a patch, just wasn't sure if the feeling was more that this should be resolved higher up the call stack.
--
Reply to this email directly or view it on GitHub:
https://github.com/kamailio/kamailio/issues/3297
You are receiving this because you are subscribed to this thread.
Message ID: <kamailio/kamailio/issues/3297 at github.com>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.kamailio.org/pipermail/sr-dev/attachments/20221208/d3b123f9/attachment-0001.htm>
More information about the sr-dev
mailing list