On Monday 03 November 2014, Daniel-Constantin Mierla wrote:
> Reference counters are good as long as they are used in predictable
> circumstances. The problems encountered so far were related to the fact
> that not all calls have proper signaling (e.g., network issues, buggy
> clients), the reason for this cleanup routine as well
True. But even then refcounting shouldn't be bypassed. For each (cleanup)
situation the expected refcount should be compared to the actual refcount
before deleting.
For the situation at hand, if the refcount is >1, something is handling the
dlg (or has buggy code not unreffing after finishing).
I think my fix catches those situations though. When the ACK is missing, the
refcount should be 1. When another process is handling the dlg, the refcount
will be >1.
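As a rough illustration of the check described above, here is a minimal
sketch. All names in it (struct dlg_sketch, dlg_cleanup_may_delete, the
expected_ref parameter) are hypothetical and are not the dialog module's
actual types or API; locking of the dialog entry is assumed to be done by
the caller and is left out.

```c
/* Hypothetical, simplified stand-in for a dialog entry.
 * This is NOT the dialog module's real structure. */
struct dlg_sketch {
    int ref; /* current reference count */
};

/* Decide whether a cleanup path may delete the dialog.
 * expected_ref is the refcount the cleanup path expects when nobody
 * else holds the dialog (1 in the missing-ACK case discussed above).
 * Returns 1 if deletion looks safe, 0 if the actual refcount differs,
 * i.e. something else is still handling the dialog or a reference
 * was leaked.
 * Assumption: the caller already holds whatever lock protects ref. */
int dlg_cleanup_may_delete(const struct dlg_sketch *d, int expected_ref)
{
    if (d->ref != expected_ref) {
        return 0; /* leave the dialog alone; log and investigate */
    }
    return 1;
}
```

With this shape, a cleanup routine would call
dlg_cleanup_may_delete(d, 1) and skip the dialog when it returns 0,
instead of deleting it regardless of the refcount.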
> (i.e., the SIP protocol also relies on a time interval for not receiving
> ACK).
This specific situation is not about missing ACK. That would be
DLG_STATE_CONFIRMED_NA.
> My goal was to figure out what the situation was, to see if it is
> something predictable that can be fixed in source code -- dialogs staying
> too long in a state that shouldn't take too long are susceptible to
> issues and it is better if we know what the reason was.
It is weird that the TMCB_DESTROY callback didn't clean up the dialog. I
haven't investigated the cause as I could fix the segfault pretty easily.
--
Greetings,
Alex Hermann