On 03/11/14 16:32, Alex Hermann wrote:
On Monday 03 November 2014, Daniel-Constantin Mierla wrote:
What situation would lead to that? Have you spotted a case where a non-established dialog lasting more than 5 minutes is still referenced externally by pointer?
I had a segfault because another process was handling a very late reply. I don't know the exact circumstances that led to it.
For cases where it can take many minutes to come back to a dialog, the dialog ids should be cloned and used to look the dialog up again when needed.
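A minimal sketch of the "clone the ids, look up later" idea, under stated assumptions: every name here (`sketch_id_t`, `sketch_create`, `sketch_lookup`, the fixed-size table) is hypothetical and is not the dialog module's API. The slow path keeps a copy of the ids rather than a pointer, so a dialog that was destroyed in the meantime yields NULL instead of a dangling pointer.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SLOTS 4

/* Hypothetical dialog id: table slot plus a per-dialog serial number. */
typedef struct sketch_id {
    unsigned int entry;
    unsigned int id;
} sketch_id_t;

typedef struct sketch_dlg {
    sketch_id_t id;
    int in_use;
} sketch_dlg_t;

static sketch_dlg_t sketch_table[SKETCH_SLOTS];
static unsigned int sketch_next_id = 1;

/* Put a new dialog into the given slot; NULL if the slot is invalid or busy. */
sketch_dlg_t *sketch_create(unsigned int entry)
{
    if (entry >= SKETCH_SLOTS || sketch_table[entry].in_use)
        return NULL;
    sketch_table[entry].id.entry = entry;
    sketch_table[entry].id.id = sketch_next_id++;
    sketch_table[entry].in_use = 1;
    return &sketch_table[entry];
}

/* Destroy a dialog; the slot may be reused later with a different id. */
void sketch_destroy(sketch_dlg_t *d)
{
    d->in_use = 0;
}

/* Resolve a cloned id; NULL if that dialog no longer exists. */
sketch_dlg_t *sketch_lookup(sketch_id_t key)
{
    sketch_dlg_t *d;

    if (key.entry >= SKETCH_SLOTS)
        return NULL;
    d = &sketch_table[key.entry];
    if (!d->in_use || d->id.id != key.id)
        return NULL;
    return d;
}
```

In a multi-process setup the lookup would presumably run under the table lock and take a reference before returning; that part is left out to keep the sketch short.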
The whole idea of refcounting is not to hold a lock during processing, but just during obtaining a pointer to the object. Refcounted objects should never be deleted when they are still being referenced by others.
In this case, the process handling the reply behaved well and had upped the refcounter. The timer process ignored the refcount and freed the object still in use.
Now that I think of it, the proper thing to do here would probably be to just unref the dlg instead of open-coding around it.
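The refcounting pattern described above could be sketched roughly as follows. This is an illustration under assumptions, not the dialog module's code: `demo_dlg_t`, `demo_dlg_get`, `demo_dlg_unref` and `demo_timer_expire` are made-up names, and a one-entry slot stands in for the hash table. The lock is held only while obtaining the pointer or changing the counter, and the timer drops the table's reference instead of freeing the object directly.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a dialog; not Kamailio's real dlg_cell. */
typedef struct demo_dlg {
    int ref; /* protected by demo_lock */
} demo_dlg_t;

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static demo_dlg_t *demo_slot = NULL; /* one-entry "hash table" */
static int demo_freed = 0;           /* counts frees, for the demo only */

/* Create a dialog and link it; the table owns the first reference. */
demo_dlg_t *demo_dlg_new(void)
{
    demo_dlg_t *d = malloc(sizeof(*d));

    if (d == NULL)
        return NULL;
    d->ref = 1;
    pthread_mutex_lock(&demo_lock);
    if (demo_slot != NULL) {
        /* slot already taken: refuse rather than leak the old entry */
        pthread_mutex_unlock(&demo_lock);
        free(d);
        return NULL;
    }
    demo_slot = d;
    pthread_mutex_unlock(&demo_lock);
    return d;
}

/* Lock only long enough to fetch the pointer and bump the counter. */
demo_dlg_t *demo_dlg_get(void)
{
    demo_dlg_t *d;

    pthread_mutex_lock(&demo_lock);
    d = demo_slot;
    if (d != NULL)
        d->ref++;
    pthread_mutex_unlock(&demo_lock);
    return d;
}

/* Drop one reference; only the last holder frees the object. */
void demo_dlg_unref(demo_dlg_t *d)
{
    int last;

    pthread_mutex_lock(&demo_lock);
    assert(d->ref > 0);
    d->ref--;
    last = (d->ref == 0);
    if (last)
        demo_freed++;
    pthread_mutex_unlock(&demo_lock);
    if (last)
        free(d);
}

/* Timer cleanup: unlink and unref, never free directly. */
void demo_timer_expire(void)
{
    demo_dlg_t *d;

    pthread_mutex_lock(&demo_lock);
    d = demo_slot;
    demo_slot = NULL;
    pthread_mutex_unlock(&demo_lock);
    if (d != NULL)
        demo_dlg_unref(d);
}
```

With this shape, a process that called `demo_dlg_get()` before the timer fired can keep using the object safely; the memory is released only when that process calls `demo_dlg_unref()`.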
Reference counters are good as long as they are used in predictable circumstances. The problems encountered so far were related to the fact that not all calls have proper signaling (e.g., network issues, buggy clients), which is also the reason for this cleanup routine (i.e., the SIP protocol also relies on a time interval when no ACK is received).
My goal was to figure out what the situation was and to see whether it is something predictable that can be fixed in the source code. Dialogs staying too long in a state that shouldn't last long are susceptible to issues, and it is better if we know the reason.
Daniel