[SR-Users] Dialogs not removed from memory, and occasionally persistent in DB as well

Timo Reimann sr at foo-lounge.de
Mon Feb 20 22:55:58 CET 2012


Hey all,


On 31.01.2012 at 14:06, Timo Reimann wrote:
> On 31.01.2012 at 11:32, Marius Pedersen wrote:
> 
>> On 01/31/2012 09:11 AM, Øyvind Kolbu wrote:
>>> On 2012-01-20 at 23:32, Timo Reimann wrote:
>>> 
>>> Hi and sorry for the late response!
>>> 
>>>> Another possible root cause: After calling dlg_manage() on an INVITE, you
>>>> do not forward the request (e.g., by calling exit() instead). Could that
>>>> be the case? If so, the solution would be (again) to defer dialog
>>>> tracking unless you're sure the INVITE will be routed.
>>> 
>>> Thanks, this was indeed at least the main problem! We have now replaced a
>>> lot of sl_send_reply("123", "message") with t_newtran() +
>>> t_reply("123","message") before exit(), and that solved most of the hanging
>>> dialogs. These were, by the way, hanging in state 1.
>>> 
>>>> If not, the last thing I can think of to try is to do some tracing (using
>>>> ngrep or tcpdump, for example) and attempt to catch a dialog that
>>>> dangles. If you succeed at that, analyzing the trace will probably help
>>>> in determining the issue.
>>> 
>>> We still have a few calls, probably one to two a day, which get stuck in
>>> state 4. They have a proper INVITE, 200 OK and later BYE. We have tracked
>>> the problem down to Asterisk, our bridge to the PSTN, once in a while
>>> issuing double INVITEs at the exact same time for the same call, and in this
>>> case there seems to be a race within Kamailio. The call is properly set up
>>> and terminated, but the entry is still in the dialog table.
>>> 
>>> Any ideas how to cope with the double INVITEs? We do btw use
>>> dlg_match_mode = 1, as we used that in Kamailio 1.5 and that worked like a
>>> charm. Have not tested altering it to either 0 or 2.
>>> 
>>> 
>> Some extra information:
>> 
>> We also still get some dialogs stuck in state 1 when we see these double INVITEs (in those cases the call is not set up, due to busy, hang-up, etc.).
>> 
>> For these events we also (always?) see one or more of these messages in the logs:
>> 
>> CRITICAL: dialog [dlg_hash.c:650]: bogus event 6 in state 1 for dlg 0xb5e88dc4 [2445:97666510] with clid '2adcd4b23355a3aa3a4ae5a73fe72631 at pstn-gateway-ip' and tags 'as485e20a7' ''
>> 
>> CRITICAL: dialog [dlg_hash.c:650]: bogus event 7 in state 1 for dlg 0xb5e88dc4 [2445:97666510] with clid '2adcd4b23355a3aa3a4ae5a73fe72631 at pstn-gateway-ip' and tags 'as485e20a7' ''
>> 
>> Also sometimes event 8.
>> 
>> This happens to a very small minority of calls (<1% I'd guess).
> 
> I'm currently in the process of investigating a dialog-related issue together with Uri (see CC). It may be related to your problem, so let's see if I can find something out that helps you as well. If not, I/we should take a dedicated look at your case.
> 
> Unfortunately, I'm currently short on time so I cannot give any guarantees as to when I'll find the time to get to these dialog-related things. I promise to get back to you folks ASAP though, so please hang on.

Uri's issue was fixed in the master branch. I don't think it has anything to do with your case, however.

Before getting back to you in detail: Is the problem still persisting?
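
For reference, the two workarounds discussed earlier in this thread (deferring dlg_manage() until the INVITE is sure to be routed, and replying via t_newtran() + t_reply() instead of sl_send_reply() when it is not) could look roughly like the fragment below. It is only a sketch under assumptions: $var(reject) is a placeholder for whatever check your config performs, the 403 reply is arbitrary, the route layout is not taken from anyone's actual config, and the tm, sl, dialog and textops modules are assumed to be loaded.

```cfg
# Sketch only: placeholder condition and reply code, not a complete config.
modparam("dialog", "dlg_match_mode", 1)    # the value reported in this thread

request_route {
    if (is_method("INVITE")) {
        # Decide about rejection BEFORE any dialog tracking starts.
        if ($var(reject) == 1) {           # hypothetical check
            # Stateful reply instead of sl_send_reply(), as described above.
            t_newtran();
            t_reply("403", "Forbidden");
            exit;
        }

        # Start dialog tracking only once the INVITE will really be relayed.
        dlg_manage();
    }

    if (!t_relay()) {
        sl_reply_error();
    }
    exit;
}
```

Whether this ordering is enough for the double-INVITE case is a separate question; it only covers the "dialog tracked but request never forwarded" scenario.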


Cheers,

--Timo
