[sr-dev] dedicated mutex for t_continue()

Jason Penton jason.penton at gmail.com
Wed May 13 09:10:11 CEST 2015


Apologies, committed the wrong code. I sort out now

On Wed, 13 May 2015 at 08:59 Daniel-Constantin Mierla <miconda at gmail.com>
wrote:

>  Hi Jason,
>
> the commit doesn't seem correct, because looks like you removed the
> locking completely. It should still stay with locking on reply mutex.
>
> Can you revert the last commit? Then I can take care of removing the
> unnecessary code -- note that the code before your commit was already using
> the reply lock, so was like it should just be, with dedicated mutex code
> being inside defines which were not enabled.
>
> Cheers,
> Daniel
>
>
> On 13/05/15 08:56, Jason Penton wrote:
>
> Hey Daniel,
>
> You are correct this is no longer needed - we actually only used this to
> be able to set a flag on the transaction that we later used for branch
> picking. We made a fix a while back for doing the correct branch picking
> which no longer required this extra flag and therefore no longer required
> the mutex. I have tested in our env and all looks good.
>
>  p.s. I have committed the cleanup.
>
>  Cheers
> Jason
>
> On Tue, 12 May 2015 at 10:34 Daniel-Constantin Mierla <miconda at gmail.com>
> wrote:
>
>>  Hi Jason,
>>
>> ok, would be great to sort it out properly, thanks for your time!
>>
>> Cheers,
>> Daniel
>>
>>
>> On 12/05/15 10:10, Jason Penton wrote:
>>
>> Hey Daniel,
>>
>> Okay great, let me look into this. It will be great if we have one less
>> mutex to worry about ;) - If not required I will remove and commit.
>>
>>  Cheers
>> Jason
>>
>> On Mon, 11 May 2015 at 09:55 Daniel-Constantin Mierla <miconda at gmail.com>
>> wrote:
>>
>>> Hi Jason,
>>>
>>> over the weekend I pushed a patch that disables the use of dedicated
>>> mutex for t_continue(). It can be enabled by defining ENABLE_ASYNC_MUTEX.
>>>
>>> While investigated some reports of crash when removing from time, I
>>> found the potential of a race when t_coninue() is executed at the time
>>> the fr_timer for suspended transaction elapsed. The timer process will
>>> get the transaction out and remove it from timer under the reply lock
>>> and the worker doing t_continue() will get it out under the async lock.
>>>
>>> I looked at the commit you did when introducing the dedicated async
>>> mutex, the note being:
>>>
>>>         - "dedicated lock to prevent multiple invocations of suspend on
>>> tz (reply lock used to be used)"
>>>
>>> Perhaps tz is tx and stands for transmission - however, the reply lock
>>> should be safe for this case as well. Moreover, the continue is like the
>>> suspended branch got a reply and transaction continues processing, which
>>> implies the reply lock is aquired (like execution of failure_route,
>>> which can also happen if fr_timer elapses before t_continue() is
>>> executed).
>>>
>>> Given those, I don't see anymore a reason for dedicated async mutex.
>>> Also, it protects to races of using two mutexes, which can easily lead
>>> to deadlocks (e.g., one process acquires the reply lock and tries to get
>>> the async lock while another one wanted first the reply lock and later
>>> the async lock).
>>>
>>> For now I disabled the code with defines, as I wanted to discuss and be
>>> sure I haven't overlooked any issue you tried to avoid with the
>>> dedicated mutex. Let me know what you think about.
>>>
>>> Cheers,
>>> Daniel
>>>
>>> --
>>> Daniel-Constantin Mierla
>>> http://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda
>>> Kamailio World Conference, May 27-29, 2015
>>> Berlin, Germany - http://www.kamailioworld.com
>>>
>>>
>> --
>> Daniel-Constantin Mierlahttp://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda
>> Kamailio World Conference, May 27-29, 2015
>> Berlin, Germany - http://www.kamailioworld.com
>>
>>
> --
> Daniel-Constantin Mierlahttp://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda
> Kamailio World Conference, May 27-29, 2015
> Berlin, Germany - http://www.kamailioworld.com
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.sip-router.org/pipermail/sr-dev/attachments/20150513/323b0f89/attachment-0001.html>


More information about the sr-dev mailing list