[sr-dev] Kamailio 3.1.4 Crash

Timo Reimann timo.reimann at 1und1.de
Tue Aug 9 18:25:50 CEST 2011


Hey Brandon,


On 09.08.2011 17:54, Brandon Armstead wrote:
>    Looks like I spoke too soon!   It is still happening.
> 
> Any additional thoughts?  All and any help is greatly appreciated.

My original theory about Anton's issue was (and still is) that the dialog
module is trying to touch a dialog which has already terminated. When we
deliberately provoked that condition by modifying the dialog module, we
saw crashes at similar locations in the code.

Is there any chance you can provide me with some (anonymized) SIP
traces? Dissecting this problem has proven to be very hard even when
given a lot of details, such as Anton's core dumps. Not being able to
reconstruct the call flow has been a show stopper so far.


> On Tue, Aug 9, 2011 at 8:37 AM, Brandon Armstead <brandon at cryy.com> wrote:
> 
>     Timo,
> 
>     Looks like that worked - going to keep watching it and see what happens.
> 
>     However I am now getting:
> 
>     Aug  9 15:34:00  /usr/local/sbin/kamailio[3040]: CRITICAL: dialog
>     [dlg_timer.c:138]: Trying to insert a bogus dlg tl=0x7f8dc8089368
>     tl->next=0x7f8dc80202e0 tl->prev=0x7f8dc8108b68
>     Aug  9 15:34:00  /usr/local/sbin/kamailio[3040]: CRITICAL: dialog
>     [dlg_handlers.c:373]: Unable to insert dlg 0x7f8dc8089318
>     [603:994585481] on event 3 [2->3] with clid
>     'CINMGC0320110809153354004076 at XXX.XXX.XXX.XXX' and tags
>     'VPSF506071629460' 'gK0cc7be82'
> 
>     Which looks similar to the original thread?  Not sure if there is
>     still an underlying issue that I should be wary of?

This seems to be related to setting up the dialog timer on reception of
a 200 OK message when the given dialog was about to transition from the
"early" state (18x reply seen) to the "confirmed without ACK" state (200
OK seen, ACK still outstanding).

I am not quite sure what this means, but it could be just another
manifestation of the same bug.
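
To make the log message a bit more concrete: the CRITICAL line looks like
what a guard would print when asked to link a timer entry whose next/prev
pointers are already set. What follows is only a minimal sketch of that
idea in C, not the actual dlg_timer.c code. The names timer_link,
timer_list_init, timer_insert and timer_remove are made up for
illustration, and the real module also takes a lock and keeps the list
ordered by timeout, which is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a dialog timer link; not the module's type. */
struct timer_link {
    struct timer_link *next;
    struct timer_link *prev;
};

/* Circular doubly linked list with a sentinel head. */
void timer_list_init(struct timer_link *head)
{
    head->next = head;
    head->prev = head;
}

/* Append tl to the list. Returns 0 on success, or -1 if tl already has
 * next/prev set, i.e. it is still (or already) linked somewhere. */
int timer_insert(struct timer_link *head, struct timer_link *tl)
{
    assert(head != NULL && tl != NULL);
    if (tl->next != NULL || tl->prev != NULL) {
        fprintf(stderr, "refusing bogus link tl=%p next=%p prev=%p\n",
                (void *)tl, (void *)tl->next, (void *)tl->prev);
        return -1;
    }
    tl->prev = head->prev;
    tl->next = head;
    head->prev->next = tl;
    head->prev = tl;
    return 0;
}

/* Unlink tl and clear its pointers so it may be inserted again.
 * Returns -1 if tl was not linked. */
int timer_remove(struct timer_link *tl)
{
    assert(tl != NULL);
    if (tl->next == NULL || tl->prev == NULL)
        return -1;
    tl->prev->next = tl->next;
    tl->next->prev = tl->prev;
    tl->next = NULL;
    tl->prev = NULL;
    return 0;
}
```

With a guard like this, calling timer_insert() twice on the same link
without a timer_remove() in between returns -1 on the second call. That
would match an "Unable to insert dlg" follow-up if a dialog's timer were
armed twice during the 2->3 transition, but whether that is what actually
happens here is still an open question.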


Cheers,

--Timo



>     On Tue, Aug 9, 2011 at 8:16 AM, Brandon Armstead <brandon at cryy.com> wrote:
> 
>         Timo,
> 
>            I have actually been researching that thread - it does look
>         *similar* however it does not look 100% related.  I am checking
>         out that commit now however - and will see if it resolves the
>         same issue.
> 
>         As for dlg_end_dlg - I am not calling this via FIFO or anything
>         - I am simply calling dlg_manage() in the routing config -
>         however I am not sure if this is being called internally (I
>         assume that it is) upon a dialog cleanup?
> 
>         Sincerely,
>         Brandon Armstead
> 
> 
>         On Tue, Aug 9, 2011 at 8:08 AM, Timo Reimann <timo.reimann at 1und1.de> wrote:
> 
>             Hello Brandon,
> 
> 
>             On 09.08.2011 16:17, Brandon Armstead wrote:
>             > Hello,
>             >
>             >    Any further insight any of you can provide is very much
>             > appreciated.
>             >
>             > Here is the core dump syslog:
> 
>             [snip!]
> 
>             A few months ago, Anton Roman provided a core dump looking
>             very similar to yours. I wasn't exactly able to pin down the
>             cause but suspected the dlg_end_dlg() function.
> 
>             Is there a chance you used that function around the time the
>             crash happened?
> 
> 
>             Cheers,
> 
>             --Timo


