[sr-dev] RLS/PUA concurrency issue

Daniel-Constantin Mierla miconda at gmail.com
Fri Aug 12 17:55:25 CEST 2011


Hi Peter,

Indeed, the update of the tm function should be safe, and the rest looks OK.

I was thinking that it may make sense to have a generic function to get the dialog, with a parameter to control whether it looks for temporary or confirmed dialogs, and then make the other dialog lookup functions wrappers around it. That would save one more iteration through the hash table slot. Considering that in this case the dialog is found most of the time, there shouldn't be any relevant impact anyhow.
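
A minimal sketch of such a generic lookup, assuming a singly linked list per hash table slot and a Call-ID key. Every name here (struct dlg, dlg_find, the DLG_FIND_* flags) is hypothetical and not taken from the pua module, and the slot locking that real code would need is left out:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical dialog record; the real pua structures differ. */
enum dlg_state { DLG_TEMPORARY, DLG_CONFIRMED };

struct dlg {
    const char *call_id;
    enum dlg_state state;
    struct dlg *next;           /* next entry in the same hash slot */
};

/* Flags selecting which kinds of dialog the lookup may return. */
#define DLG_FIND_TEMPORARY (1u << 0)
#define DLG_FIND_CONFIRMED (1u << 1)
#define DLG_FIND_ANY       (DLG_FIND_TEMPORARY | DLG_FIND_CONFIRMED)

/* Generic lookup: a single pass over the slot, filtered by flags. */
struct dlg *dlg_find(struct dlg *slot, const char *call_id, unsigned flags)
{
    assert(call_id != NULL);
    for (struct dlg *d = slot; d != NULL; d = d->next) {
        unsigned kind = (d->state == DLG_TEMPORARY)
                            ? DLG_FIND_TEMPORARY
                            : DLG_FIND_CONFIRMED;
        if ((flags & kind) && strcmp(d->call_id, call_id) == 0)
            return d;
    }
    return NULL;
}

/* The specific lookups become thin wrappers around the generic one. */
struct dlg *dlg_find_temporary(struct dlg *slot, const char *call_id)
{
    return dlg_find(slot, call_id, DLG_FIND_TEMPORARY);
}

struct dlg *dlg_find_confirmed(struct dlg *slot, const char *call_id)
{
    return dlg_find(slot, call_id, DLG_FIND_CONFIRMED);
}
```

A caller that accepts either kind passes DLG_FIND_ANY and walks the slot once, instead of calling the temporary lookup and then the confirmed one.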

You can merge from my point of view. Btw, I noticed two branches, puafix and pua_fix.

Cheers,
Daniel

On Aug 11, 2011, at 6:47 PM, Peter Dunkley <peter.dunkley at crocodile-rcs.com> wrote:

> Hello,
> 
> I have committed my fix to the pd/pua_fix branch (commit id: b93149c756d3e983c70608938f1142ed43ee1834).
> 
> I would appreciate it if some others on this list could look it over before I push it into master.
> 
> Thanks,
> 
> Peter
> 
> On Thu, 2011-08-11 at 15:25 +0100, Peter Dunkley wrote:
>> Hi,
>> 
>> I have a candidate code fix that I have been testing today.  It looks good so far.
>> 
>> If things continue to go well I will check it into a branch before the end of the day.  I'd appreciate it if some others could look over it before I put it back into master.
>> 
>> Thanks,
>> 
>> Peter
>> 
>> 
>> On Thu, 2011-08-11 at 15:20 +0200, Daniel-Constantin Mierla wrote:
>>> Hello,
>>> 
>>> ... just a few more thoughts since I was offline -- I kind of understood that the issue was over UDP; with TCP the UA should re-use the connection (Kamailio does so unless it is configured to close the connection immediately), so the order should be ensured.
>>> 
>>> Meanwhile, the other solution would have been to use the new async module, like:
>>> - if the subscribe dialog does not exist, call async_route() with some sleep interval (1 sec) which should allow the 200ok to come and be processed
>>> 
>>> That is only a workaround; it would be nicer to fix it in the code. If Peter did it, that is great; otherwise I will look over it in the next days -- either by creating an 'early' dialog on SUBSCRIBE, accepting the NOTIFY and confirming on 200ok, or by using a queue to keep a list of pending NOTIFYs for processing.
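
The 'early' dialog variant can be pictured as a small state machine. This is only a sketch under assumed names (struct sub_dlg, on_subscribe_sent, on_notify, on_200ok are all hypothetical); it ignores the hash table, locking and the database binding done by the 200 handler:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subscription dialog states. */
enum sub_state { SUB_NONE, SUB_EARLY, SUB_CONFIRMED };

struct sub_dlg {
    enum sub_state state;
};

/* SUBSCRIBE is sent: create the early dialog right away. */
void on_subscribe_sent(struct sub_dlg *d)
{
    assert(d != NULL);
    d->state = SUB_EARLY;
}

/* NOTIFY arrives: accept it if an early or confirmed dialog exists. */
bool on_notify(const struct sub_dlg *d)
{
    assert(d != NULL);
    return d->state == SUB_EARLY || d->state == SUB_CONFIRMED;
}

/*
 * 200 OK arrives: confirm the early dialog. Returns true if this call
 * confirmed the dialog, false if there was nothing left to confirm.
 */
bool on_200ok(struct sub_dlg *d)
{
    assert(d != NULL);
    if (d->state != SUB_EARLY)
        return false;
    d->state = SUB_CONFIRMED;
    return true;
}
```

With this ordering a NOTIFY that overtakes the 200 OK is no longer rejected for lack of a dialog, because the dialog exists from the moment the SUBSCRIBE leaves.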
>>> 
>>> Cheers,
>>> Daniel
>>> 
>>> On Thu, Aug 11, 2011 at 9:51 AM, Andrew Miller <andrew.miller at crocodile-rcs.com> wrote:
>>> FYI, I have just seen an internal e-mail from Peter saying he thinks he has fixed it. I will test this as soon as I get into the office this morning and we will report back.
>>> 
>>> Didn't want anyone putting more time into this if Peter has fixed it.
>>> 
>>> Andy
>>> 
>>> On 11/08/2011 08:42, Andrew Miller wrote:
>>> Klaus,
>>> 
>>> Sorry, we should have mentioned that we did investigate this option; however, we do not think it will work.
>>> 
>>> I may have got the details below wrong, as Peter has been looking at this, however the reason is something like the following:
>>> 
>>> The main job of the 200 handler is to write a database entry that binds the incoming RLS subscription to the back-end presence subscription. A pointer to the RLS subscription is bound to the SUBSCRIBE transaction, and is therefore available to the 200 callback function. The same information is not available to the NOTIFY handler, which expects to get this information from the database. Therefore, we cannot (unfortunately) handle the NOTIFY in exactly the same way as the 200.
>>> 
>>> Is that right Peter?
>>> 
>>> Andy
>>> 
>>> 
>>> On 11/08/2011 08:30, Klaus Darilion wrote:
>>> 
>>> On 10.08.2011 18:54, Daniel-Constantin Mierla wrote:
>>> Hello,
>>> 
>>> I would like to look closer at the issue and figure out a possible
>>> solution, but I am traveling for the time being, so just quick thoughts.
>>> 
>>> One approach would be a solution similar to the one for the fast CANCEL
>>> (which gets to the server before the INVITE). What we do (in config) is
>>> check if there is an INVITE transaction for the CANCEL and, if not, just
>>> drop the CANCEL (no reply). That will force the UA to do retransmissions,
>>> which will eventually arrive after the INVITE is received/processed.
>>> Does not work with TCP requests.
>>> 
>>> The second idea would be to have a pending queue: keep the NOTIFY there
>>> for a while and, when the 200ok comes, look in the queue for anything
>>> belonging to the respective dialog. If no dialog is created after a
>>> while, the older requests in the queue will simply be discarded.
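
A rough sketch of such a pending queue, assuming a small fixed-size table keyed by Call-ID. The names, the table size and the 5 second lifetime are all made up for illustration; real code would store the NOTIFY itself and would need locking:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

#define PENDING_MAX 16  /* assumed capacity */
#define PENDING_TTL 5   /* assumed lifetime in seconds */

/* Hypothetical entry for a NOTIFY that arrived before its dialog existed. */
struct pending_notify {
    char call_id[64];
    time_t queued_at;
    int used;
};

struct pending_queue {
    struct pending_notify items[PENDING_MAX];
};

/* Park a NOTIFY. Returns 0 on success, -1 if the queue is full. */
int pending_add(struct pending_queue *q, const char *call_id, time_t now)
{
    assert(q != NULL && call_id != NULL);
    for (size_t i = 0; i < PENDING_MAX; i++) {
        struct pending_notify *p = &q->items[i];
        if (!p->used) {
            strncpy(p->call_id, call_id, sizeof(p->call_id) - 1);
            p->call_id[sizeof(p->call_id) - 1] = '\0';
            p->queued_at = now;
            p->used = 1;
            return 0;
        }
    }
    return -1;
}

/*
 * Called when the 200 OK has created the dialog: release every parked
 * NOTIFY for it. Returns how many were released for processing.
 */
int pending_take(struct pending_queue *q, const char *call_id)
{
    int released = 0;
    assert(q != NULL && call_id != NULL);
    for (size_t i = 0; i < PENDING_MAX; i++) {
        struct pending_notify *p = &q->items[i];
        if (p->used && strcmp(p->call_id, call_id) == 0) {
            p->used = 0;
            released++;
        }
    }
    return released;
}

/* Called periodically: discard entries older than PENDING_TTL seconds. */
int pending_expire(struct pending_queue *q, time_t now)
{
    int dropped = 0;
    assert(q != NULL);
    for (size_t i = 0; i < PENDING_MAX; i++) {
        struct pending_notify *p = &q->items[i];
        if (p->used && now - p->queued_at > PENDING_TTL) {
            p->used = 0;
            dropped++;
        }
    }
    return dropped;
}
```

The lifetime bounds the damage when the 200 OK never arrives: parked requests are dropped rather than kept forever.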
>>> The NOTIFY is an implicit 200 OK. So if there is an ongoing SUBSCRIBE
>>> transaction which matches the NOTIFY, the NOTIFY should trigger the same
>>> actions as the 200 OK. The later arriving 200 OK can then be ignored.
>>> 
>>> regards
>>> klaus
>>> 
>>> Cheers,
>>> Daniel
>>> 
>>> On 8/10/11 6:19 PM, Andrew Miller wrote:
>>> Sorry Pete,
>>> 
>>> That seems to make things better, but does not solve the issue for me.
>>> 
>>> Most times this is now clean when a client logs in, but about 1 time in
>>> 10 I am still getting an error message. In one case I had 9 error
>>> messages on one log-in.
>>> 
>>> Andy.
>>> 
>>> On 10/08/2011 15:58, Peter Dunkley wrote:
>>> I've been playing around with this here and making presence and rls
>>> use TCP instead of UDP seems to help with this problem.  Presumably
>>> this is because using TCP enforces in-order delivery of messages.
>>> 
>>> To make presence and rls use TCP I:
>>> 
>>>   * Added a ;transport=tcp parameter to the SIP URI I had set for
>>>     presence server_address
>>>   * Added a ;transport=tcp parameter to the SIP URI I had set for rls
>>>     server_address
>>>   * Set the rls outbound_proxy parameter to
>>>     "sip:127.0.0.1;transport=tcp"
>>> 
>>> 
>>> It's not a proper fix, but I think it works around the issue.
>>> 
>>> Regards,
>>> 
>>> Peter
>>> 
>>> On Mon, 2011-08-01 at 13:40 +0200, Klaus Darilion wrote:
>>> On 01.08.2011 12:28, Andrew Miller wrote:
>>> I attempted to insert a dialog entry in the hash table on sending the
>>> SUBSCRIBE; unfortunately, this did not cure the problem.
>>> 
>>> Has anyone any suggestions for the cleanest and easiest method to ensure
>>> that the 200 is handled before the NOTIFY?
>>> The cleanest solution would be to establish the dialog when the NOTIFY
>>> is received although the 200 OK is missing.
>>> 
>>> The NOTIFY can be seen as an implicit 200 OK.
>>> 
>>> regards
>>> Klaus
>>> 
>>> -- 
>>> Daniel-Constantin Mierla -- http://www.asipto.com
>>> Kamailio Advanced Training, Oct 10-13, Berlin: http://asipto.com/u/kat
>>> http://linkedin.com/in/miconda -- http://twitter.com/miconda 
>>> 
>>> _______________________________________________
>>> sr-dev mailing list
>>> sr-dev at lists.sip-router.org
>>> http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
>> 
> -- 
> Peter Dunkley
> Technical Director
> Crocodile RCS Ltd