On 03/11/14 16:32, Alex Hermann wrote:
On Monday 03 November 2014, Daniel-Constantin Mierla wrote:
What would be the situation for that to happen? Have you spotted a case
where a non-established dialog lasting for more than 5 minutes is still
referenced externally by pointer?
I had a segfault because another process was handling a very late reply. I
don't know the exact circumstances that led to it.
For cases when it can take many minutes to come back to a dialog, the
dialog ids should be cloned and used to look up the dialog again when
needed.
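As a rough illustration of the "clone the ids, look up later" idea, here is a minimal sketch. All names (idx_dlg_t, idx_dlg_id_t, idx_dlg_lookup, the fixed-size table) are hypothetical and are not the dialog module's actual structures or API; locking is left out to keep the example short. The point is only that a long-running operation keeps a copy of the id pair instead of a raw pointer, and a later lookup returns NULL if the dialog is gone or the slot was reused.

```c
#include <assert.h>
#include <stddef.h>

#define IDX_TABLE_SIZE 4

/* Hypothetical dialog id: table slot plus a per-dialog unique number. */
typedef struct idx_dlg_id {
    unsigned int h_entry;
    unsigned int h_id;
} idx_dlg_id_t;

/* Hypothetical stand-in for a dialog; one dialog per slot for brevity. */
typedef struct idx_dlg {
    idx_dlg_id_t id;
    int in_use;
} idx_dlg_t;

static idx_dlg_t idx_table[IDX_TABLE_SIZE];
static unsigned int idx_next_id = 1;

/* Create a dialog in the given slot and assign it a fresh h_id. */
static idx_dlg_t *idx_dlg_add(unsigned int h_entry)
{
    idx_dlg_t *d;

    if (h_entry >= IDX_TABLE_SIZE)
        return NULL;
    d = &idx_table[h_entry];
    d->id.h_entry = h_entry;
    d->id.h_id = idx_next_id++;
    d->in_use = 1;
    return d;
}

/* Destroy a dialog (e.g. from a cleanup timer). */
static void idx_dlg_remove(idx_dlg_t *d)
{
    assert(d != NULL);
    d->in_use = 0;
}

/* Find a dialog by a previously cloned id; NULL if it no longer exists. */
static idx_dlg_t *idx_dlg_lookup(idx_dlg_id_t id)
{
    idx_dlg_t *d;

    if (id.h_entry >= IDX_TABLE_SIZE)
        return NULL;
    d = &idx_table[id.h_entry];
    if (!d->in_use || d->id.h_id != id.h_id)
        return NULL;
    return d;
}
```

A caller would copy d->id into its own storage, drop the pointer, and call idx_dlg_lookup() with the saved id when the late event arrives, treating a NULL result as "dialog already gone".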
The whole idea of refcounting is not to hold a lock during processing, but
just while obtaining a pointer to the object. Refcounted objects should never
be deleted while they are still being referenced by others.
In this case, the process handling the reply behaved well and had upped the
refcounter. The timer process ignored the refcount and freed the object still
in use.
Now that I think of it, the proper thing to do here would probably be just to
unref the dlg instead of open-coding around it.
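To make the unref idea concrete, here is a minimal sketch of the pattern, not the dialog module's actual code. The names (rc_dlg_t, rc_dlg_get, rc_dlg_unref, rc_dlg_timer_expire), the one-slot table and the C11 spinlock are all assumptions made for illustration. The lock is held only while fetching the pointer and changing the counter, and the timer drops the table's reference instead of freeing the object directly, so a process still handling a late reply keeps a valid pointer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a dialog; ref is protected by rc_table_lock. */
typedef struct rc_dlg {
    int ref;
} rc_dlg_t;

static atomic_flag rc_table_lock = ATOMIC_FLAG_INIT;
static rc_dlg_t *rc_table_slot = NULL; /* one-slot "hash table" */
static int rc_freed_count = 0;         /* demo only: counts real frees */

static void rc_lock(void)
{
    while (atomic_flag_test_and_set(&rc_table_lock)) {
        /* spin until the lock is released */
    }
}

static void rc_unlock(void)
{
    atomic_flag_clear(&rc_table_lock);
}

/* Create a dialog; the table itself holds the first reference. */
static rc_dlg_t *rc_dlg_create(void)
{
    rc_dlg_t *d = calloc(1, sizeof(*d));

    if (d == NULL)
        return NULL;
    d->ref = 1;
    rc_lock();
    rc_table_slot = d;
    rc_unlock();
    return d;
}

/* Lookup: the lock covers only fetching the pointer and bumping ref. */
static rc_dlg_t *rc_dlg_get(void)
{
    rc_dlg_t *d;

    rc_lock();
    d = rc_table_slot;
    if (d != NULL)
        d->ref++;
    rc_unlock();
    return d;
}

/* Drop one reference; free only when it was the last one. */
static void rc_dlg_unref(rc_dlg_t *d)
{
    int last;

    rc_lock();
    assert(d->ref > 0);
    last = (--d->ref == 0);
    rc_unlock();
    if (last) {
        free(d);
        rc_freed_count++;
    }
}

/* Timer: unlink from the table, then drop the table's reference. */
static void rc_dlg_timer_expire(void)
{
    rc_dlg_t *d;

    rc_lock();
    d = rc_table_slot;
    rc_table_slot = NULL;
    rc_unlock();
    if (d != NULL)
        rc_dlg_unref(d);
}
```

With this shape, if a reply handler calls rc_dlg_get() before the timer fires, the timer only unlinks the dialog; the memory is released later, when the handler calls rc_dlg_unref().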
Reference counters are good as long as they are used in predictable
circumstances. The problems encountered so far were related to the fact
that not all calls have proper signaling (e.g., network issues, buggy
clients), which is also the reason for this cleanup routine (the SIP
protocol itself relies on a time interval to give up waiting for an ACK).
My goal was to figure out what the situation was and see if it is something
predictable that can be fixed in the source code. Dialogs staying too long
in a state that shouldn't last long are susceptible to issues, and it
is better if we know the reason.
Daniel
--
Daniel-Constantin Mierla
http://twitter.com/#!/miconda -
http://www.linkedin.com/in/miconda
Kamailio Advanced Training, Nov 24-27, Berlin -
http://www.asipto.com