[sr-dev] dedicated mutex for t_continue()

Daniel-Constantin Mierla miconda at gmail.com
Tue May 12 10:34:12 CEST 2015


Hi Jason,

ok, would be great to sort it out properly, thanks for your time!

Cheers,
Daniel

On 12/05/15 10:10, Jason Penton wrote:
> Hey Daniel,
>
> Okay great, let me look into this. It will be great if we have one
> less mutex to worry about ;) - If not required I will remove and commit.
>
> Cheers
> Jason
>
> On Mon, 11 May 2015 at 09:55 Daniel-Constantin Mierla
> <miconda at gmail.com <mailto:miconda at gmail.com>> wrote:
>
>     Hi Jason,
>
>     over the weekend I pushed a patch that disables the use of dedicated
>     mutex for t_continue(). It can be enabled by defining
>     ENABLE_ASYNC_MUTEX.
>
>     While investigated some reports of crash when removing from time, I
>     found the potential of a race when t_coninue() is executed at the time
>     the fr_timer for suspended transaction elapsed. The timer process will
>     get the transaction out and remove it from timer under the reply lock
>     and the worker doing t_continue() will get it out under the async
>     lock.
>
>     I looked at the commit you did when introducing the dedicated async
>     mutex, the note being:
>
>             - "dedicated lock to prevent multiple invocations of
>     suspend on
>     tz (reply lock used to be used)"
>
>     Perhaps tz is tx and stands for transmission - however, the reply lock
>     should be safe for this case as well. Moreover, the continue is
>     like the
>     suspended branch got a reply and transaction continues processing,
>     which
>     implies the reply lock is aquired (like execution of failure_route,
>     which can also happen if fr_timer elapses before t_continue() is
>     executed).
>
>     Given those, I don't see anymore a reason for dedicated async mutex.
>     Also, it protects to races of using two mutexes, which can easily lead
>     to deadlocks (e.g., one process acquires the reply lock and tries
>     to get
>     the async lock while another one wanted first the reply lock and later
>     the async lock).
>
>     For now I disabled the code with defines, as I wanted to discuss
>     and be
>     sure I haven't overlooked any issue you tried to avoid with the
>     dedicated mutex. Let me know what you think about.
>
>     Cheers,
>     Daniel
>
>     --
>     Daniel-Constantin Mierla
>     http://twitter.com/#!/miconda <http://twitter.com/#%21/miconda> -
>     http://www.linkedin.com/in/miconda
>     Kamailio World Conference, May 27-29, 2015
>     Berlin, Germany - http://www.kamailioworld.com
>

-- 
Daniel-Constantin Mierla
http://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda
Kamailio World Conference, May 27-29, 2015
Berlin, Germany - http://www.kamailioworld.com

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.sip-router.org/pipermail/sr-dev/attachments/20150512/ad848153/attachment-0001.html>


More information about the sr-dev mailing list