When relaying this `200 OK` repeatedly with calls to `rtpengine_answer()` failing because all RTPEngines in a set are unreachable...
```
Dec 8 06:24:38 gw kamailio[4823]: ERROR: rtpengine [rtpengine.c:2960]: select_rtpp_node(): rtpengine failed to select new for calllen=36 callid=8279a4ea-151e-48de-a23c-5840104126de
```
```
SIP/2.0 200 OK
v:SIP/2.0/UDP 10.150.20.20;branch=z9hG4bK03e.e4da444e627bb575ef4f46151729a1fa.0,SIP/2.0/UDP 10.151.20.108;received=10.151.20.108;rport=5060;branch=z9hG4bKKQyrBKpp04X0K
Record-Route:sip:10.150.20.20;lr;ftag=emHm2apy95jFe;rtp_relay=1;rtp_group=0
f:"14045551212"sip:14045551212@10.151.20.108;tag=emHm2apy95jFe
t:sip:16782001111@internal.evaristesys.com;tag=tHQt0rDcF7UHK
i:8279a4ea-151e-48de-a23c-5840104126de
CSeq:60686471 INVITE
m:sip:16782001111@10.160.10.55:5060;transport=udp
User-Agent:VoiceAppServer
Allow:INVITE,ACK,BYE,CANCEL,OPTIONS,MESSAGE,INFO,UPDATE,REFER,NOTIFY
k:path,replaces
u:talk,hold,conference,refer
c:application/sdp
Content-Disposition:session
l:257

v=0
o=DM 1670428117 1670428118 IN IP4 10.160.10.55
s=DM
c=IN IP4 10.160.10.55
t=0 0
m=audio 52418 RTP/AVP 0 101
a=rtpmap:0 PCMU/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16
a=ptime:20
a=rtcp:52419 IN IP4 10.160.10.55
```
Repeated INVITEs eventually result in a crash. This backtrace is taken from a different server, but from an identical scenario:
```
#0  t_should_relay_response (Trans=0x7f4aee56ba80, new_code=200, branch=0, should_store=0x7ffd8b8bbba0, should_relay=0x7ffd8b8bbba4, cancel_data=0x7ffd8b8bbe40, reply=0x7f4afe40df70) at t_reply.c:1285
1285            && !(inv_through && Trans->uac[branch].last_received<300)) {
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.192.el6.x86_64 jansson-2.11-1.el6.x86_64 keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-57.el6.x86_64 libcom_err-1.41.12-22.el6.x86_64 libev-4.03-3.el6.x86_64 libselinux-2.0.94-7.el6.x86_64 libunistring-0.9.3-5.el6.x86_64 openssl-1.0.1e-48.el6_8.3.x86_64 sqlite-3.6.20-1.el6_7.2.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) where
#0  t_should_relay_response (Trans=0x7f4aee56ba80, new_code=200, branch=0, should_store=0x7ffd8b8bbba0, should_relay=0x7ffd8b8bbba4, cancel_data=0x7ffd8b8bbe40, reply=0x7f4afe40df70) at t_reply.c:1285
#1  0x00007f4afd73ab7d in relay_reply (t=0x7f4aee56ba80, p_msg=0x7f4afe40df70, branch=0, msg_status=200, cancel_data=0x7ffd8b8bbe40, do_put_on_wait=1) at t_reply.c:1821
#2  0x00007f4afd740c38 in reply_received (p_msg=0x7f4afe40df70) at t_reply.c:2558
#3  0x00000000004b5d6a in do_forward_reply (msg=0x7f4afe40df70, mode=0) at core/forward.c:747
#4  0x00000000004b78c9 in forward_reply (msg=0x7f4afe40df70) at core/forward.c:852
#5  0x000000000054982b in receive_msg (
    buf=0xb27700 "SIP/2.0 200 OK\r\nv:SIP/2.0/UDP xxx;branch=z9hG4bK9915.dcfe6d03cb21e6c94a6705adb095c566.0,SIP/2.0/UDP xxx;received=xxx;rport=5060;branch=z9hG4bKZmZp0NZv3FvQB\r\nRecord-Rout"..., len=984, rcv_info=0x7ffd8b8bc6c0) at core/receive.c:434
#6  0x00000000006658ca in udp_rcv_loop () at core/udp_server.c:541
#7  0x000000000042500f in main_loop () at main.c:1655
#8  0x000000000042c5bb in main (argc=11, argv=0x7ffd8b8bcc88) at main.c:2696
```
Looks like in the crash frame, `Trans->uac` is `NULL`:
```
(gdb) frame 0
#0  t_should_relay_response (Trans=0x7f4aee56ba80, new_code=200, branch=0, should_store=0x7ffd8b8bbba0, should_relay=0x7ffd8b8bbba4, cancel_data=0x7ffd8b8bbe40, reply=0x7f4afe40df70) at t_reply.c:1285
1285            && !(inv_through && Trans->uac[branch].last_received<300)) {
(gdb) print Trans->uac
$1 = (struct ua_client *) 0x0
```
I have seen this crash from time to time, though it is infrequent. It always seems to occur when there are problems reaching RTPEngine in a timely fashion.
This is on Kamailio 5.2.4:de9a03, which I know has long fallen off the back of the maintenance train. However, line 1386 of `t_reply.c` in `master:HEAD` shows that the code is no different now:
https://github.com/kamailio/kamailio/blob/master/src/modules/tm/t_reply.c#L1...
Of course, it's possible that something upstream has changed so as to prevent this state. Otherwise, I suppose a null pointer safety check for `Trans->uac` is recommended. I'd be happy to submit a patch, just wasn't sure if the feeling was more that this should be resolved higher up the call stack.
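To make the suggestion concrete, here is a minimal sketch of the kind of guard I have in mind. It is not a patch against `t_reply.c`: the struct and function names (`cell_sketch`, `ua_client_sketch`, `check_branch()`, `last_received_or_error()`) are hypothetical, heavily simplified stand-ins for the real tm structures, and the error codes are my own assumption. It only illustrates checking the `uac` array and the branch index before dereferencing `uac[branch].last_received`.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a per-branch client record. */
struct ua_client_sketch {
    int last_received;            /* last reply code seen on this branch */
};

/* Hypothetical, simplified stand-in for a transaction. */
struct cell_sketch {
    int nr_of_outgoings;          /* number of valid entries in uac[] */
    struct ua_client_sketch *uac; /* NULL in the crash described above */
};

/* Illustrative result codes; not taken from Kamailio. */
enum branch_check {
    BRANCH_OK = 0,
    BRANCH_NO_TRANS = -1,
    BRANCH_NO_UAC = -2,
    BRANCH_OUT_OF_RANGE = -3
};

/* Verify that t->uac[branch] can be read safely. */
static enum branch_check check_branch(const struct cell_sketch *t, int branch)
{
    if (t == NULL)
        return BRANCH_NO_TRANS;
    if (t->uac == NULL)
        return BRANCH_NO_UAC;
    if (branch < 0 || branch >= t->nr_of_outgoings)
        return BRANCH_OUT_OF_RANGE;
    return BRANCH_OK;
}

/* Return the last reply code for the branch, or -1 if it cannot be read. */
static int last_received_or_error(const struct cell_sketch *t, int branch)
{
    if (check_branch(t, branch) != BRANCH_OK)
        return -1;
    return t->uac[branch].last_received;
}
```

With a guard like this, the caller could log the condition and drop the reply instead of dereferencing a null pointer. Whether that is the right place for it, or whether the transaction should never reach this state at all, is exactly the open question.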
Any recent news? I would say this can be closed, no?
Closed #3297 as completed.
Hi, I haven't found anything more on this and agree it can be closed.