Hi Federico,

Thank you to keep looking at this.

I have sent you all my changes to default config. So there is nothing done after t_is_canceled (no voice mail, ...)

I have just enabled WITH_NAT & WITH_NATSIPPING.

I am sure if it is related but do you do special tricks for DECLINE (when enabling tsilo) ?
(

The trace is always ending up with:

Program terminated with signal SIGSEGV, Segmentation fault.

#0  0x00007f5741b4efd7 in free_faked_req (faked_req=0x7f5741e00100 <faked_req>, t=0x7f573adb3c60) at t_reply.c:931

931 faked_req->body->free(&faked_req->body);

(gdb) p* faked_req->body->free

Cannot access memory at address 0x312e3838312e3832



It seems that the address of the "free" pointer is corrupted somewhere.
I think I need to trace when it is modified (if it is possible).

Regards

thibault






2015-09-18 14:15 GMT+02:00 Federico Cabiddu <federico.cabiddu@gmail.com>:
Hi Thibault,
I tried again to reproduce the issue but with no luck. 
Are you doing something "special" in the failure route after calling t_is_canceled?

Regards,

Federico

On Wed, Sep 16, 2015 at 7:33 PM, Thibault Gueslin <thibault.gueslin@gmail.com> wrote:
Hello Federico,

You are right. The first line comes from the other core dump (which is related to normal sig handling on crash).

In my scenario, I have only 2 SIP client: one caller and one callee
I am starting the callee after call is initiated then kill the callee, then relaunch the callee.
Each time I expect the call to be presented again.
It seems that the issue arrives when I wait for timeout after this scenario.

Its confirmed by the log where the issue arrives each time in MANAGE_FAILURE

failure_route[MANAGE_FAILURE] {
route(NATMANAGE);

if (t_is_canceled()) {
exit;
}
....
}

 9(30189) DEBUG: tm [t_reply.c:1230]: t_should_relay_response(): ->>>>>>>>> T_code=180, new_code=408
 9(30189) exec: *** cfgtrace:failure_route=[MANAGE_FAILURE] c=[/etc/kamailio/kamailio.cfg] l=1089 a=5 n=route
 9(30189) exec: *** cfgtrace:failure_route=[NATMANAGE] c=[/etc/kamailio/kamailio.cfg] l=950 a=16 n=if
 9(30189) exec: *** cfgtrace:failure_route=[NATMANAGE] c=[/etc/kamailio/kamailio.cfg] l=943 a=24 n=is_request
 9(30189) exec: *** cfgtrace:failure_route=[NATMANAGE] c=[/etc/kamailio/kamailio.cfg] l=949 a=16 n=if
 9(30189) exec: *** cfgtrace:failure_route=[NATMANAGE] c=[/etc/kamailio/kamailio.cfg] l=944 a=24 n=has_totag
 9(30189) DEBUG: siputils [checks.c:97]: has_totag(): no totag
 9(30189) exec: *** cfgtrace:failure_route=[NATMANAGE] c=[/etc/kamailio/kamailio.cfg] l=953 a=16 n=if
 9(30189) exec: *** cfgtrace:failure_route=[NATMANAGE] c=[/etc/kamailio/kamailio.cfg] l=950 a=41 n=isflagset
 9(30189) exec: *** cfgtrace:failure_route=[NATMANAGE] c=[/etc/kamailio/kamailio.cfg] l=953 a=25 n=rtpproxy_manage
 9(30189) exec: *** cfgtrace:failure_route=[NATMANAGE] c=[/etc/kamailio/kamailio.cfg] l=962 a=16 n=if
 9(30189) exec: *** cfgtrace:failure_route=[NATMANAGE] c=[/etc/kamailio/kamailio.cfg] l=955 a=24 n=is_request
 9(30189) exec: *** cfgtrace:failure_route=[NATMANAGE] c=[/etc/kamailio/kamailio.cfg] l=961 a=16 n=if
 9(30189) exec: *** cfgtrace:failure_route=[NATMANAGE] c=[/etc/kamailio/kamailio.cfg] l=956 a=24 n=has_totag
 9(30189) DEBUG: siputils [checks.c:97]: has_totag(): no totag
 9(30189) exec: *** cfgtrace:failure_route=[NATMANAGE] c=[/etc/kamailio/kamailio.cfg] l=960 a=16 n=if
 9(30189) exec: *** cfgtrace:failure_route=[NATMANAGE] c=[/etc/kamailio/kamailio.cfg] l=957 a=24 n=t_is_branch_route
 9(30189) exec: *** cfgtrace:failure_route=[NATMANAGE] c=[/etc/kamailio/kamailio.cfg] l=969 a=16 n=if
 9(30189) exec: *** cfgtrace:failure_route=[NATMANAGE] c=[/etc/kamailio/kamailio.cfg] l=962 a=24 n=is_reply
 9(30189) exec: *** cfgtrace:failure_route=[NATMANAGE] c=[/etc/kamailio/kamailio.cfg] l=969 a=2 n=return
 9(30189) exec: *** cfgtrace:failure_route=[MANAGE_FAILURE] c=[/etc/kamailio/kamailio.cfg] l=1112 a=16 n=if
 9(30189) exec: *** cfgtrace:failure_route=[MANAGE_FAILURE] c=[/etc/kamailio/kamailio.cfg] l=1091 a=24 n=t_is_canceled
 9(30189) DEBUG: tm [t_lookup.c:1011]: t_check_msg(): DEBUG: t_check_msg: msg id=9 global id=9 T start=0x7f088e376a78
 9(30189) DEBUG: tm [t_lookup.c:1083]: t_check_msg(): DEBUG: t_check_msg: T already found!

19(30199) CRITICAL: <core> [pass_fd.c:275]: receive_fd(): EOF on 20
19(30199) DEBUG: <core> [tcp_main.c:3448]: handle_ser_child(): dead child 9, pid 30189 (shutting down?)
19(30199) DEBUG: <core> [io_wait.h:598]: io_watch_del(): DBG: io_watch_del (0x9ddc40, 20, -1, 0x0) fd_no=25 called
 0(30180) ALERT: <core> [main.c:728]: handle_sigs(): child process 30189 exited by a signal 11
 0(30180) ALERT: <core> [main.c:731]: handle_sigs(): core was generated
 0(30180) INFO: <core> [main.c:743]: handle_sigs(): terminating due to SIGCHLD

The conf comes from your presentation !

Small changes:

request_route {
..
# account only INVITEs
if (is_method("INVITE")) {
setflag(FLT_ACC); # do accounting
route(RELAY);
route(INVITE);
exit;
}

# Wrapper for relaying requests
route[RELAY] {

# enable additional event routes for forwarded requests
# - serial forking, RTP relaying handling, a.s.o.
if (is_method("INVITE|BYE|SUBSCRIBE|UPDATE")) {
if(!t_is_set("branch_route")) t_on_branch("MANAGE_BRANCH");
}
if (is_method("INVITE|SUBSCRIBE|UPDATE")) {
if(!t_is_set("onreply_route")) t_on_reply("MANAGE_REPLY");
}
if (is_method("INVITE")) {
if(!t_is_set("failure_route")) t_on_failure("MANAGE_FAILURE");
} else {

if (!t_relay()) {
sl_reply_error();
}
exit;
}

}

the other part is like you:
# manage incoming REGISTERs 
route[INVITE] {
if (!lookup("location")) 
{
send_reply("100", "Trying"); 
route(SUSPEND); 
else 
{
t_relay();
ts_store();
$sht(vtp=>stored::$rU) = 1;
xdbg("stored transaction [$T(id_index):$T(id_label)] $fU => $rU\n"); 
}
route(SENDPUSH);
}
 
#suspend route
route[SUSPEND] 
{
if(!t_suspend()) {
xlog("failed suspending trasaction [$T(id_index): $T(id_label)]n");
send_reply("501", "Unknown destination");
exit;
}
xdbg("suspended transaction [$T(id_index):$T(id_label)] $fU => $rU\n");
$sht(vtp=>join::$rU) = "" + $T(id_index) + ":" + $T(id_label);
xdbg("htable key value [$sht(vtp=>join::$rU)]n");
}
 
route[REGISTER] 
{
if(isflagset(FLT_NATS)) {
setbflag(FLB_NATB);
#!ifdef WITH_NATSIPPING
# do SIP NAT pinging
setbflag(FLB_NATSIPPING);
#!endif
}
if (!save("location"))
sl_reply_error();
route(PUSHJOIN);
exit;
}
 
route[PUSHJOIN]
{
$var(hjoin) = 0;
lock("$tU");
$var(hjoin) = $sht(vtp=>join::$tU); $var(hstored) = $sht(vtp=>stored::$tU);
$sht(vtp=>join::$tU) = $null; unlock("$tU");
if ($var(hjoin)==0) {
if ($var(hstored))
ts_append("location", "$tU");
return;
}
$var(id_index) = $(var(hjoin){s.select,0,:}{s.int});
$var(id_label) = $(var(hjoin){s.select,1,:}{s.int});
xdbg("resuming transaction [$var(id_index):$var(id_label)] $tU ($var(hjoin))n");
t_continue("$var(id_index)", "$var(id_label)", "INVRESUME");
}

route[INVRESUME]
{
lookup("location");
t_relay();
ts_store();
$sht(vtp=>stored::$rU) = 1;
xdbg("stored transaction [$T(id_index):$T(id_label)] $fU => $rU\n");
}

regards 

2015-09-16 8:23 GMT+02:00 Federico Cabiddu <federico.cabiddu@gmail.com>:
Hi Thibault,
I'm not sure I understand the scenario of your crash. Is the branch rejecting the call a branch added with ts_append? What are you doing upon receiving the 603 (supposing that's how the application is rejecting the call)? Are you appending other branches? 
In the bt it looks like the transaction timed out but then the the log line

"#0  0x00007fc279f64855 in lock_entry (entry=0x7fc2761d6068) at ts_hash.c:156"

and the core seem unrelated. Maybe you can share the relevant parts of your routing script so that I can get better what's going on.
Also it would be very useful if you could provide the logs of your test with debug level 3.
Thanks for your collaboration.

Regards,

Federico

On Tue, Sep 15, 2015 at 5:53 PM, Thibault Gueslin <thibault.gueslin@gmail.com> wrote:
Hello Federico,

I have built from 4.3 branch.

I got a crash again... However it seems different than previous one:

Issue seems located in tm module.


It appears if the remote denied the incoming call, then quit application .


thibault


Core was generated by `sbin/kamailio -f /etc/kamailio/kamailio.cfg -L ./lib64/kamailio/modules/'.

Program terminated with signal SIGSEGV, Segmentation fault.

#0  0x00007fc279f64855 in lock_entry (entry=0x7fc2761d6068) at ts_hash.c:156

156 ts_lock(t_table, entry);


(gdb) bt full

#0  0x0000000000000001 in ?? ()

No symbol table info available.

#1  0x00007fc27cd77fd9 in free_faked_req (faked_req=0x7fc27d029100 <faked_req>, t=0x7fc2761f2fc0) at t_reply.c:931

        hdr = 0x0

        __FUNCTION__ = "free_faked_req"

#2  0x00007fc27cd78df4 in run_failure_handlers (t=0x7fc2761f2fc0, rpl=0xffffffffffffffff, code=408, extra_flags=96) at t_reply.c:997

        faked_req = {id = 3, pid = 15569, tval = {tv_sec = 1442326316, tv_usec = 475922}, fwd_send_flags = {f = 4 '\004', blst_imask = 0 '\000'}, rpl_send_flags = {f = 0 '\000', blst_imask = 0 '\000'}, 

          first_line = {type = 1, flags = 1, len = 68, u = {request = {method = {

                  s = 0x7fc2761efc08 "INVITE sip:toto4.toto.com.Thibault@sip-staging.serveur.com SIP/2.0\r\nVia: SIP/2.0/TCP 172.16.224.222:54924;branch=z9hG4bK44b87ead2c84b31a;rport\r\nContact: <sip:toto4.toto.com.Ipod_tgu@172.16.224.222:549"..., len = 6}, uri = {

                  s = 0x7fc2761efc0f "sip:toto4.toto.com.Thibault@sip-staging.serveur.com SIP/2.0\r\nVia: SIP/2.0/TCP 172.16.224.222:54924;branch=z9hG4bK44b87ead2c84b31a;rport\r\nContact: <sip:toto4.toto.com.Ipod_tgu@172.16.224.222:54924;tran"..., len = 51}, version = {

                  s = 0x7fc2761efc43 "SIP/2.0\r\nVia: SIP/2.0/TCP 172.16.224.222:54924;branch=z9hG4bK44b87ead2c84b31a;rport\r\nContact: <sip:toto4.toto.com.Ipod_tgu@172.16.224.222:54924;transport=tcp>\r\nMax-Forwards: 69\r\nTo: <sip:toto4.toto.co"..., len = 7}, method_value = 1}, reply = {version = {

                  s = 0x7fc2761efc08 "INVITE sip:toto4.toto.com.Thibault@sip-staging.serveur.com SIP/2.0\r\nVia: SIP/2.0/TCP 172.16.224.222:54924;branch=z9hG4bK44b87ead2c84b31a;rport\r\nContact: <sip:toto4.toto.com.Ipod_tgu@172.16.224.222:549"..., len = 6}, status = {

                  s = 0x7fc2761efc0f "sip:toto4.toto.com.Thibault@sip-staging.serveur.com SIP/2.0\r\nVia: SIP/2.0/TCP 172.16.224.222:54924;branch=z9hG4bK44b87ead2c84b31a;rport\r\nContact: <sip:toto4.toto.com.Ipod_tgu@172.16.224.222:54924;tran"..., len = 51}, reason = {

                  s = 0x7fc2761efc43 "SIP/2.0\r\nVia: SIP/2.0/TCP 172.16.224.222:54924;branch=z9hG4bK44b87ead2c84b31a;rport\r\nContact: <sip:toto4.toto.com.Ipod_tgu@172.16.224.222:54924;transport=tcp>\r\nMax-Forwards: 69\r\nTo: <sip:toto4.toto.co"..., len = 7}, statuscode = 1}}}, via1 = 0x7fc2761f0008, via2 = 0x0, headers = 0x7fc2761effc8, last_header = 0x7fc2761f07c0, parsed_flag = 18446744073709551615, 

          h_via1 = 0x7fc2761effc8, h_via2 = 0x0, callid = 0x7fc2761f0650, to = 0x7fc2761f01f0, cseq = 0x7fc2761f0690, from = 0x7fc2761f0408, contact = 0x7fc2761f0170, maxforwards = 0x7fc2761f01b0, route = 0x0, 

          record_route = 0x0, content_type = 0x7fc2761f0780, content_length = 0x7fc2761f07c0, authorization = 0x0, expires = 0x0, proxy_auth = 0x0, supported = 0x0, require = 0x0, proxy_require = 0x0, 

          unsupported = 0x0, allow = 0x7fc2761f0740, event = 0x0, accept = 0x0, accept_language = 0x0, organization = 0x0, priority = 0x0, subject = 0x0, user_agent = 0x7fc2761f0700, server = 0x0, 

          content_disposition = 0x0, diversion = 0x0, rpid = 0x0, refer_to = 0x0, session_expires = 0x0, min_se = 0x0, sipifmatch = 0x0, subscription_state = 0x0, date = 0x0, identity = 0x0, identity_info = 0x0, 

          pai = 0x0, ppi = 0x0, path = 0x0, privacy = 0x0, body = 0x7fc27e026c70, 

          eoh = 0x7fc2761efe45 "\r\nv=0\r\no=- 791306690 125312959 IN IP4 172.16.224.222\r\ns=-\r\nc=IN IP4 172.16.224.222\r\nt=0 0\r\na=tool:baresip 0.4.3\r\nm=audio 25940 RTP/AVP 96 97 98 8 0 101\r\nb=AS:125\r\na=rtpmap:96 opus/48000/2\r\na=rtpmap:97"..., 

          unparsed = 0x7fc2761efe45 "\r\nv=0\r\no=- 791306690 125312959 IN IP4 172.16.224.222\r\ns=-\r\nc=IN IP4 172.16.224.222\r\nt=0 0\r\na=tool:baresip 0.4.3\r\nm=audio 25940 RTP/AVP 96 97 98 8 0 101\r\nb=AS:125\r\na=rtpmap:96 opus/48000/2\r\na=rtpmap:97"..., rcv = {src_ip = {af = 2, len = 4, u = {addrl = {3334267998, 0}, addr32 = {3334267998, 0, 0, 0}, addr16 = {58462, 50876, 0, 0, 0, 0, 0, 0}, 

                addr = "^\344\274\306", '\000' <repeats 11 times>}}, dst_ip = {af = 2, len = 4, u = {addrl = {2667915013, 0}, addr32 = {2667915013, 0, 0, 0}, addr16 = {9989, 40709, 0, 0, 0, 0, 0, 0}, 

                addr = "\005'\005\237", '\000' <repeats 11 times>}}, src_port = 54927, dst_port = 5060, proto_reserved1 = 1, proto_reserved2 = 0, src_su = {s = {sa_family = 2, 

                sa_data = "\326\217^\344\274\306\000\000\000\000\000\000\000"}, sin = {sin_family = 2, sin_port = 36822, sin_addr = {s_addr = 3334267998}, sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {

                sin6_family = 2, sin6_port = 36822, sin6_flowinfo = 3334267998, sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, 

                sin6_scope_id = 1979090440}}, bind_address = 0x7fc27e03d9f0, proto = 2 '\002'}, 

          buf = 0x7fc2761efc08 "INVITE sip:toto4.toto.com.Thibault@sip-staging.serveur.com SIP/2.0\r\nVia: SIP/2.0/TCP 172.16.224.222:54924;branch=z9hG4bK44b87ead2c84b31a;rport\r\nContact: <sip:toto4.toto.com.Ipod_tgu@172.16.224.222:549"..., len = 959, new_uri = {s = 0x0, len = 0}, dst_uri = {s = 0x0, len = 0}, parsed_uri_ok = 0, parsed_uri = {user = {

              s = 0x7fc2761efba4 "toto4.toto.com.Thibault@172.16.230.61:52915;transport=tcpP/2sip:94.228.188.198:52919;transport=tcp92INVITE sip:toto4.toto.com.Thibault@sip-staging.serveur.com SIP/2.0\r\nVia: SIP/2.0/TCP 172.16.224.222:"..., len = 23}, passwd = {s = 0x0, len = 0}, host = {

              s = 0x7fc2761efbbc "172.16.230.61:52915;transport=tcpP/2sip:94.228.188.198:52919;transport=tcp92INVITE sip:toto4.toto.com.Thibault@sip-staging.serveur.com SIP/2.0\r\nVia: SIP/2.0/TCP 172.16.224.222:54924;branch=z9hG4bK44b8"..., len = 13}, port = {

              s = 0x7fc2761efbca "52915;transport=tcpP/2sip:94.228.188.198:52919;transport=tcp92INVITE sip:toto4.toto.com.Thibault@sip-staging.serveur.com SIP/2.0\r\nVia: SIP/2.0/TCP 172.16.224.222:54924;branch=z9hG4bK44b87ead2c84b31a;r"..., len = 5}, params = {

              s = 0x7fc2761efbd0 "transport=tcpP/2sip:94.228.188.198:52919;transport=tcp92INVITE sip:toto4.toto.com.Thibault@sip-staging.serveur.comSIP/2.0\r\nVia: SIP/2.0/TCP 172.16.224.222:54924;branch=z9hG4bK44b87ead2c84b31a;rport\r\n"..., len = 13}, sip_params = {s = 0x7fc27e02a260 ' ' <repeats 88 times>, "HK\001~\302\177", len = 13}, headers = {s = 0x0, len = 0}, port_no = 52915, proto = 2, type = SIP_URI_T, 

            flags = (unknown: 0), transport = {

              s = 0x7fc2761efbd0 "transport=tcpP/2sip:94.228.188.198:52919;transport=tcp92INVITE sip:toto4.toto.com.Thibault@sip-staging.serveur.comSIP/2.0\r\nVia: SIP/2.0/TCP 172.16.224.222:54924;branch=z9hG4bK44b87ead2c84b31a;rport\r\n"..., len = 13}, ttl = {s = 0x0, len = 0}, user_param = {s = 0x0, len = 0}, maddr = {s = 0x0, len = 0}, method = {s = 0x0, len = 0}, lr = {s = 0x0, len = 0}, r2 = {s = 0x0, len = 0}, gr = {

              s = 0x0, len = 0}, transport_val = {

              s = 0x7fc2761efbda "tcpP/2sip:94.228.188.198:52919;transport=tcp92INVITE sip:toto4.toto.com.Thibault@sip-staging.serveur.comSIP/2.0\r\nVia: SIP/2.0/TCP 172.16.224.222:54924;branch=z9hG4bK44b87ead2c84b31a;rport\r\nContact: <"..., len = 3}, ttl_val = {s = 0x0, len = 0}, user_param_val = {s = 0x0, len = 0}, maddr_val = {s = 0x0, len = 0}, method_val = {s = 0x0, len = 0}, lr_val = {s = 0x0, len = 0}, r2_val = {

              s = 0x0, len = 0}, gr_val = {s = 0x0, len = 0}}, parsed_orig_ruri_ok = 0, parsed_orig_ruri = {user = {s = 0x0, len = 0}, passwd = {s = 0x0, len = 0}, host = {s = 0x0, len = 0}, port = {s = 0x0, 

              len = 0}, params = {s = 0x0, len = 0}, sip_params = {s = 0x0, len = 0}, headers = {s = 0x0, len = 0}, port_no = 0, proto = 0, type = ERROR_URI_T, flags = (unknown: 0), transport = {s = 0x0, len = 0}, 

            ttl = {s = 0x0, len = 0}, user_param = {s = 0x0, len = 0}, maddr = {s = 0x0, len = 0}, method = {s = 0x0, len = 0}, lr = {s = 0x0, len = 0}, r2 = {s = 0x0, len = 0}, gr = {s = 0x0, len = 0}, 

            transport_val = {s = 0x0, len = 0}, ttl_val = {s = 0x0, len = 0}, user_param_val = {s = 0x0, len = 0}, maddr_val = {s = 0x0, len = 0}, method_val = {s = 0x0, len = 0}, lr_val = {s = 0x0, len = 0}, 

            r2_val = {s = 0x0, len = 0}, gr_val = {s = 0x0, len = 0}}, add_rm = 0x7fc2761f4ac8, body_lumps = 0x0, reply_lump = 0x0, 

          add_to_branch_s = "z9hG4bK4d8.005be33152cbbe2a3c79d27fff052452.3", '\000' <repeats 12 times>, add_to_branch_len = 45, hash_index = 2260, msg_flags = 266481, flags = 34, set_global_address = {s = 0x0, 

            len = 0}, set_global_port = {s = 0x0, len = 0}, force_send_socket = 0x7fc27e03d9f0, path_vec = {s = 0x0, len = 0}, instance = {s = 0x0, len = 0}, reg_id = 0, ruid = {s = 0x0, len = 0}, location_ua = {

            s = 0x0, len = 0}, ldv = {flow = {decoded = 0, rcv = {src_ip = {af = 0, len = 0, u = {addrl = {0, 0}, addr32 = {0, 0, 0, 0}, addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, addr = '\000' <repeats 15 times>}}, 

                dst_ip = {af = 0, len = 0, u = {addrl = {0, 0}, addr32 = {0, 0, 0, 0}, addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, addr = '\000' <repeats 15 times>}}, src_port = 0, dst_port = 0, proto_reserved1 = 0, 

        __FUNCTION__ = "run_failure_handlers"

#3  0x00007fc27cd7ba3b in t_should_relay_response (Trans=0x7fc2761f2fc0, new_code=408, branch=3, should_store=0x7fff9a9b9248, should_relay=0x7fff9a9b924c, cancel_data=0x7fff9a9b92e0, reply=0xffffffffffffffff)

    at t_reply.c:1342

        branch_cnt = 4

        picked_code = 408

        new_branch = 1

        inv_through = 0

        extra_flags = 96

        i = 32706

        replies_dropped = 0

        __FUNCTION__ = "t_should_relay_response"

#4  0x00007fc27cd7e7d6 in relay_reply (t=0x7fc2761f2fc0, p_msg=0xffffffffffffffff, branch=3, msg_status=408, cancel_data=0x7fff9a9b92e0, do_put_on_wait=0) at t_reply.c:1745

        relay = -1

        save_clone = 0

        buf = 0x0

        res_len = 0

        relayed_code = 0

        relayed_msg = 0x0

        reply_bak = 0x1

        bm = {to_tag_val = {s = 0x200000000 <error: Cannot access memory at address 0x200000000>, len = 1981755328}}

        totag_retr = 0

        reply_status = RPS_ERROR

        uas_rb = 0xffffffffffffffff

        to_tag = 0x7fc2761f3780

        reason = {s = 0x735c44 "Request Timeout", len = -1701080344}

        onsend_params = {req = 0x76203528, rpl = 0x7fc2761f35c0, param = 0x18f59272ffffffff, code = 418744713, flags = 320, branch = 0, t_rbuf = 0x3ef3ee10, dst = 0x415ed0 <_start>, send_buf = {

            s = 0x7fff9a9b92c0 "\020\223\233\232\377\177", len = 2094439873}}

        ip = {af = 2593886736, len = 32767, u = {addrl = {140473294816037, 18446744069414584320}, addr32 = {2094433061, 32706, 0, 4294967295}, addr16 = {33573, 31958, 32706, 0, 0, 0, 65535, 65535}, 

            addr = "%\203\326|\302\177\000\000\000\000\000\000\377\377\377\377"}}

        __FUNCTION__ = "relay_reply"

#5  0x00007fc27cda99d8 in fake_reply (t=0x7fc2761f2fc0, branch=3, code=408) at timer.c:328

        cancel_data = {cancel_bitmap = 0, reason = {cause = 0, u = {text = {s = 0x0, len = 1981755328}, e2e_cancel = 0x0, packed_hdrs = {s = 0x0, len = 1981755328}}}}

        do_cancel_branch = 1

        reply_status = 15561

#6  0x00007fc27cda9e5f in final_response_handler (r_buf=0x7fc2761f36e8, t=0x7fc2761f2fc0) at timer.c:500

        silent = 0

        branch_ret = 0

        prev_branch = 1056173584

        now = 0

#7  0x00007fc27cda9f02 in retr_buf_handler (ticks=418744835, tl=0x7fc2761f3708, p=0xfffffffe) at timer.c:558

        rbuf = 0x7fc2761f36e8

        fr_remainder = 2593887152

        retr_remainder = 32706

        retr_interval = 1979369672

        new_retr_interval_ms = 140473182140168

        crt_retr_interval_ms = 140473179752648

        t = 0x7fc2761f2fc0

        __FUNCTION__ = "retr_buf_handler"

#8  0x000000000048d82f in slow_timer_main () at timer.c:1130

        n = 12

        ret = 1

        tl = 0x7fc2761f3708

        i = 147

        __FUNCTION__ = "slow_timer_main"


2015-09-15 10:35 GMT+02:00 Federico Cabiddu <federico.cabiddu@gmail.com>:
Hi Thibault,
I have not been able to get the crash reproducing the scenario you described.
Could you try the last 4.3.x code? Are you still seeing the crash?

Regards,

Federico

On Fri, Sep 11, 2015 at 11:34 PM, Thibault Gueslin <thibault.gueslin@gmail.com> wrote:
Hi Federico,

I will try last code in 4.3.x branch.

The scenario is very easy: I am calling a SIP client (running on a mobile)
First the client is stopped. Then launch the app. As expected, the calll is presented after it has registered.
Then kill the application (before answering), then launching again the app, call is presented...
Then waiting for call timeout.
It works one or 2 times then call never timeouts on the client which initiates the call
(which means Kamailio is dead and does not send 408 Timeout)


2015-09-10 19:33 GMT+02:00 Federico Cabiddu <federico.cabiddu@gmail.com>:
Hi Thibault,
have you tried last tsilo code from 4.3.x branch?
Recently there has been a fix (https://github.com/kamailio/kamailio/commit/6ce6803d57dabe287d7d6fa859e93c1df402d821) for an issue that may be related to yours. 
I'll keep investigating to see if I can spot something else. In the meanwhile could you describe your scenario? Are you storing multiple transactions per ruri? Did any of them got a final reply before the crash?

Regards,

Federico

On Thu, Sep 10, 2015 at 3:00 PM, Daniel-Constantin Mierla <miconda@gmail.com> wrote:


On 10/09/15 14:36, Thibault Gueslin wrote:


2015-09-10 14:25 GMT+02:00 Daniel-Constantin Mierla <miconda@gmail.com>:
Do you have msrp enabled in configuration file

I don't think so

The last frames of backtrace indicates code related to msrp, but might be just some code lines mismatching.

The issue seems to be in tsilo. I looke over the code and I spotted some "unclear" mechanisms that can lead to race conditions, which may result in invalid access to memory, as it happens in this case, ptr becomes 0x8b08578b49642454 -- from my short investigation, that is likely to be due to following a ->next field in a freed structure.

Not being the author of tsilo module, I can't do much more right now. I will open an issue on bug tracker explaining what I found, assigning Federico (cc-ed, author of the module) to analyze and see if anything is wrong there.

Cheers,
Daniel