[SR-Users] problem unreferencing dialog in dialog module
Timo Reimann
timo.reimann at 1und1.de
Thu Mar 3 11:11:15 CET 2011
Hey,
On 03.03.2011 10:19, Anton Roman wrote:
> Checking the timestamps from the acc and the crash log, the BYE for
> the dialog was before the crash but the To-tag is not printed from
> dlg_hash.c, although it is in the acc for INVITE and BYE. Do you
> have parallel forking in front of this SIP server? I mean, is there
> another proxy that can do parallel forking then send two or more
> branches to this instance?
>
> AFAIK the client that is sending those calls is not doing parallel
> forking; they are sending calls over a SIP trunk to our Kamailio. They
> are calling PSTN numbers and we are sending those calls to a gateway,
> so they shouldn't do parallel forking. I'll get some traces to check it.
Your trace shows that there are two worker processes dealing with the
segfault-triggering dialog, process IDs 32155 and 32158. I cannot see
from your trace what module caused the latter process to execute
unref_dlg() in dlg_hash.c, however.
What I can tell, though, is that the crash happens because the dialog
reference counter is decremented too often. Although I have no clue why,
I believe the implementation of unref_dlg_unsafe() (a macro) could be
somewhat more robust by not unlinking and destroying a dialog when the
counter drops below zero. That is, instead of running the following block
    if ((_dlg)->ref<=0) { \
        unlink_unsafe_dlg( _d_entry, _dlg);\
        LM_DBG("ref <=0 for dialog %p\n",_dlg);\
        destroy_dlg(_dlg);\
    }\
for _dlg->ref <= 0, I see no reason not to change the compare operator to ==,
so that the block only runs when the counter is exactly zero.
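To illustrate the idea, here is a minimal standalone sketch. It is not the dialog module's code: struct demo_dlg and demo_unref() are hypothetical names made up for this example, and the "destroy" step only sets a flag instead of unlinking and freeing anything. It shows the == comparison destroying the dialog exactly once, while an excess unref is merely reported.

```c
#include <stdio.h>

/* Hypothetical stand-in for a dialog; NOT Kamailio's struct dlg_cell. */
struct demo_dlg {
    int ref;        /* reference counter */
    int destroyed;  /* set instead of actually unlinking/freeing */
};

/*
 * Drop cnt references from dlg.
 * Returns  1 if this call brought the counter to exactly zero and
 *            "destroyed" the dialog,
 *          0 if references remain,
 *         -1 if the counter went negative (unbalanced unref); in that
 *            case the dialog is left alone and the problem is logged.
 */
static int demo_unref(struct demo_dlg *dlg, int cnt)
{
    dlg->ref -= cnt;

    if (dlg->ref == 0) {
        /* the real macro would unlink and destroy the dialog here */
        dlg->destroyed = 1;
        return 1;
    }

    if (dlg->ref < 0) {
        fprintf(stderr, "bogus ref %d for dialog %p\n",
                dlg->ref, (void *)dlg);
        return -1;
    }

    return 0;
}
```

Starting from ref = 2, two calls to demo_unref(&d, 1) return 0 and then 1; a third call returns -1 without touching the destroyed flag again. In the real module the dialog memory would already be gone at that point, so this only narrows the damage; it does not make an extra unref safe.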
Of course, that just cures the symptoms. A coredump would be really
helpful in identifying the root of the crash problem but I don't know
why it wasn't generated in your case. Your configuration looks good to me.
Cheers,
--Timo