[SR-Users] t_reply in failure route with dialog module

Efelin Novak efelin.novak at gmail.com
Mon Nov 25 12:10:13 CET 2013


Hi Daniel,

sorry it took me more than I expected. Is this sufficient? Meanwhile I
found out that this happens when fr_inv_timer triggers and dialog module is
turned on.

Backtrace:

bt
#0  futex_release (lock=0xb4b90798) at ../../mem/../futexlock.h:137
#1  next_state_dlg (dlg=dlg at entry=0xb4bc5a38, event=event at entry=4,
old_state=old_state at entry=0xbfef08a0, new_state=new_state at entry=0xbfef089c,
unref=unref at entry=0xbfef08a4) at dlg_hash.c:950
#2  0xb6b05479 in dlg_onreply (t=0xb4bc5ce8, type=1048576,
param=0xbfef097c) at dlg_handlers.c:469
#3  0xb6cc2336 in run_trans_callbacks_internal (cb_lst=0xb4bc5d28,
type=type at entry=1048576, trans=0xb4bc5ce8, params=params at entry=0xbfef097c)
at t_hooks.c:290
#4  0xb6cc25da in run_trans_callbacks_with_buf (type=type at entry=1048576,
rbuf=rbuf at entry=0xb4bc5d54, req=req at entry=0xb4bc6a08,
repl=repl at entry=0xffffffff,
flags=flags at entry=505) at t_hooks.c:336
#5  0xb6cee953 in _reply_light (trans=trans at entry=0xb4bc5ce8,
    buf=0xb731f588 "SIP/2.0 505 Error\r\nVia: SIP/2.0/UDP
192.168.21.4;branch=z9hG4bKa6f5.8ca73657.0\r\nVia: SIP/2.0/UDP
10.0.0.95;rport=5060;branch=z9hG4bKBc143aFe81jDj\r\nFrom: \"USER\" <
sip:USER at 10.0.0.95>;t"..., len=len at entry=434, code=code at entry=505, lock=0,
bm=0xbfef0e48, to_tag_len=<error reading variable: Unhandled dwarf
expression opcode 0xfa>,
    to_tag=<error reading variable: Unhandled dwarf expression opcode
0xfa>) at t_reply.c:628
#6  0xb6ceedeb in _reply (trans=0xb4bc5ce8, trans at entry=0x1f9,
p_msg=p_msg at entry=0xb731f568, code=505, text=0xb731f568 "Error", lock=0) at
t_reply.c:782
#7  0xb6ceefd7 in t_reply_unsafe (t=0x1f9, t at entry=0xb4bc5ce8,
p_msg=0xb731f568, p_msg at entry=0xb6d196c0, code=0, text=text at entry=0xb731f568
"Error") at t_reply.c:1531
#8  0xb6cd6904 in w_t_reply (msg=0xb6d196c0, p1=0xb731c988 "8]0\267\001",
p2=0xb731c9b8 "8P0\267 ") at tm.c:1283
#9  0x080651cf in do_action (h=h at entry=0xbfef1258, a=a at entry=0xb7305d98,
msg=msg at entry=0xb6d196c0) at action.c:1080
#10 0x08064137 in run_actions (h=h at entry=0xbfef1258, a=a at entry=0xb73053c8,
msg=msg at entry=0xb6d196c0) at action.c:1573
#11 0x0806cedd in run_top_route (a=0xb73053c8, msg=msg at entry=0xb6d196c0,
c=c at entry=0x0) at action.c:1658
#12 0xb6ceba98 in run_failure_handlers (t=t at entry=0xb4bc5ce8,
rpl=0xffffffff, code=408, extra_flags=96) at t_reply.c:1028
#13 0xb6cece22 in t_should_relay_response (Trans=Trans at entry=0xb4bc5ce8,
new_code=new_code at entry=408, branch=branch at entry=0,
should_store=should_store at entry=0xbfef1464,
    should_relay=should_relay at entry=0xbfef1460,
cancel_data=cancel_data at entry=0xbfef1510, reply=reply at entry=0xffffffff) at
t_reply.c:1304
#14 0xb6cef076 in relay_reply (t=t at entry=0xb4bc5ce8,
p_msg=p_msg at entry=0xffffffff,
branch=branch at entry=0, msg_status=msg_status at entry=408,
cancel_data=cancel_data at entry=0xbfef1510,
    do_put_on_wait=do_put_on_wait at entry=0) at t_reply.c:1707
#15 0xb6cc28b4 in fake_reply (t=t at entry=0xb4bc5ce8, branch=0,
code=code at entry=408) at timer.c:354
#16 0xb6cc34b0 in final_response_handler (t=0xb4bc5ce8, r_buf=0xb4bc5dc8)
at timer.c:526
#17 retr_buf_handler (ticks=272914717, tl=0xb4bc5ddc, p=0x3e8) at
timer.c:584
#18 0x08183325 in timer_list_expire (slow_mark=12, slow_l=0xb49f2438,
h=0xb49f2338, t=272914717) at timer.c:894
#19 timer_handler () at timer.c:959
#20 timer_main () at timer.c:998
#21 0x080b8c65 in main_loop () at main.c:1709
#22 0x08063bfe in main (argc=1, argv=0xbfef1844) at main.c:2566

(I have changed  USER and IP address to hide my network :))

After this when I hit 'n' multiple times I got following output in infinite
loop:

22 0x08063bfe in main (argc=1, argv=0xbfef1844) at main.c:2566
(gdb) n
next_state_dlg (dlg=dlg at entry=0xb4bc5a38, event=event at entry=4,
old_state=old_state at entry=0xbfef08a0, new_state=new_state at entry=0xbfef089c,
unref=unref at entry=0xbfef08a4) at dlg_hash.c:952
952        LM_DBG("dialog %p changed from state %d to "
(gdb)
955    }
(gdb)
dlg_onreply (t=0xb4bc5ce8, type=1048576, param=0xbfef097c) at
dlg_handlers.c:470
470        dlg_run_event_route(dlg, (rpl==FAKED_REPLY)?NULL:rpl, old_state,
new_state);
(gdb)
472        if (new_state==DLG_STATE_EARLY) {
(gdb)
479        if (new_state==DLG_STATE_CONFIRMED_NA &&
(gdb)
540        if ( new_state==DLG_STATE_DELETED
(gdb)
542                        || old_state==DLG_STATE_EARLY) ) {
(gdb)
541                    && (old_state==DLG_STATE_UNCONFIRMED
(gdb)
559        if (unref) dlg_unref(dlg, unref);
(gdb)
563        dlg_release(dlg);
(gdb)
565    }
(gdb)
run_trans_callbacks_internal (cb_lst=0xb4bc5d28, type=type at entry=1048576,
trans=0xb4bc5ce8, params=params at entry=0xbfef097c) at t_hooks.c:292
292            cbp=cbp->next;
(gdb)
283        while(cbp){
(gdb)
286            if ( (cbp->types)&type ) {
(gdb)
292            cbp=cbp->next;
(gdb)
283        while(cbp){
(gdb)
286            if ( (cbp->types)&type ) {
(gdb)
287                DBG("DBG: trans=%p, callback type %d, id %d entered\n",
(gdb)
290                cbp->callback( trans, type, params );
(gdb)
289                params->param = &(cbp->param);
(gdb)
290                cbp->callback( trans, type, params );
(gdb)
292            cbp=cbp->next;
(gdb)
283        while(cbp){
(gdb)
286            if ( (cbp->types)&type ) {
(gdb)
292            cbp=cbp->next;
(gdb)
283        while(cbp){
(gdb)
286            if ( (cbp->types)&type ) {
(gdb)
287                DBG("DBG: trans=%p, callback type %d, id %d entered\n",
(gdb)
290                cbp->callback( trans, type, params );
(gdb)
289                params->param = &(cbp->param);
(gdb)
290                cbp->callback( trans, type, params );
(gdb)
292            cbp=cbp->next;
(gdb)
283        while(cbp){

Regards

Efelin


2013/11/20 Daniel-Constantin Mierla <miconda at gmail.com>

>  Hello,
>
> I will investigate more -- meanwhile had some traveling. It would speed up
> if you can send the backtrace of one process that blocks when you applied
> the patch.
>
> You need to connect with gdb to it:
>
> gdb /path/to/kamailio _PID_
>
> replace _PID_ with the pid of blocked kamailio process.
>
> Cheers,
> Daniel
>
>
> On 11/15/13 5:50 PM, Efelin Novak wrote:
>
>    Hi Daniel,
>
>  thanks for a reply. I applied the patch and now the Kamailio just prints
>
> WARNING: tm [t_lookup.c:1564]: t_unref(): WARNING: script writer didn't
> release transaction
>
>  and than freezes without any log. It does not resend the incoming
> "winning" failure reply neither response to any other messages, not even to
> a new calls. It just freezes. All kamailio processes are running and eating
> the whole 4-core processor.
>
>  Restart of Kamailio solves this problem.
>
>  Any ideas how to continue with debug?
>
>  Thanks
>
>  Efelin
>
>
>  2013/11/15 Daniel-Constantin Mierla <miconda at gmail.com>
>
>>  Hello,
>>
>> can you try attached patch?
>>
>> Let me know if all goes fine and I will commit it to the repository.
>>
>> Cheers,
>> Daniel
>>
>>
>> On 11/15/13 10:25 AM, Efelin Novak wrote:
>>
>>   Hi,
>>
>>  when I use t_reply("505", "Error"); in my failure route, the response is
>> not forwarded and following is written into a log:
>>
>> kamailio[26216]: WARNING: tm [t_lookup.c:1559]: t_unref(): WARNING:
>> script writer didn't release transaction
>>
>> plus next line is written exactly 416000 times into a log afterwards:
>>
>> kamailio[32685]: CRITICAL: dialog [dlg_hash.c:794]: log_next_state_dlg():
>> bogus event 4 in state 5 for dlg 0xb4af6588 [2575:7017] with clid
>> '121d44f0-6555f4c8' and tags 'd12546d053aadc68o2' ''
>>
>>  My point is to change the incoming code from users and append a Q.850
>> reason code.
>>  Is there any other way how to do this or a way how to fix this?
>>  I'm using Kamilio 4.0.4 on Debian 7.1
>>
>>
>>  The code is as follows:
>> failure_route[MANAGE_FAILURE]
>> {
>>         if (t_is_canceled()) {
>>                 exit;
>>         }
>>         if($T_reply_code == 408 && isflagset(10))
>>         {
>>                 xlog("Ringing timeout");
>>                 append_to_reply("Reason: Q.850;cause=28\r\n");
>>                 t_reply("505", "Error");
>>         }
>> }
>>
>>
>>  _______________________________________________
>> SIP Express Router (SER) and Kamailio (OpenSER) - sr-users mailing listsr-users at lists.sip-router.orghttp://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-users
>>
>>
>> --
>> Daniel-Constantin Mierla - http://www.asipto.comhttp://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda
>> Kamailio Advanced Trainings - Berlin, Nov 25-28
>>   - more details about Kamailio trainings at http://www.asipto.com -
>>
>>
>> _______________________________________________
>> SIP Express Router (SER) and Kamailio (OpenSER) - sr-users mailing list
>> sr-users at lists.sip-router.org
>> http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-users
>>
>>
>
> --
> Daniel-Constantin Mierla - http://www.asipto.comhttp://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda
> Kamailio Advanced Trainings - Berlin, Nov 25-28
>   - more details about Kamailio trainings at http://www.asipto.com -
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.sip-router.org/pipermail/sr-users/attachments/20131125/acfcc4f3/attachment.html>


More information about the sr-users mailing list