[sr-dev] RLS/PUA concurrency issue

Peter Dunkley peter.dunkley at crocodile-rcs.com
Fri Aug 12 18:15:44 CEST 2011


Hi Daniel,

I will merge back to master now.

There are two branches because I had some finger trouble.  Sorry about
that.

Thanks,

Peter

On Fri, 2011-08-12 at 17:55 +0200, Daniel-Constantin Mierla wrote:
> Hi Peter,
> 
> 
> Indeed the update of tm function should be safe and the rest looks ok.
> 
> 
> I was thinking that it may make sense to have a generic function to
> get the dialog, with a parameter to control whether it looks for
> temporary or confirmed dialogs, and then make the other functions
> that get the dialog wrappers around this one. It will save one more
> iteration through the hash table slot. Considering that in this case
> the dialog is found most of the time, there shouldn't be any relevant
> impact, anyhow.
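
The generic lookup suggested above could look roughly like the sketch below. It is written in Python rather than the module's C, and every name in it (HashTable, DialogKind, get_dialog and the two wrappers) is hypothetical, not taken from the pua code. It also assumes that a temporary dialog is simply one whose remote tag is not yet known.

```python
from enum import Flag, auto


class DialogKind(Flag):
    TEMPORARY = auto()
    CONFIRMED = auto()
    ANY = TEMPORARY | CONFIRMED


class Dialog:
    def __init__(self, call_id, from_tag, to_tag=None):
        self.call_id = call_id
        self.from_tag = from_tag
        self.to_tag = to_tag  # assumption: unknown until the dialog is confirmed

    @property
    def kind(self):
        return DialogKind.CONFIRMED if self.to_tag else DialogKind.TEMPORARY


class HashTable:
    def __init__(self, size=16):
        self.slots = [[] for _ in range(size)]

    def _slot(self, call_id, from_tag):
        return self.slots[hash((call_id, from_tag)) % len(self.slots)]

    def insert(self, dialog):
        self._slot(dialog.call_id, dialog.from_tag).append(dialog)

    def get_dialog(self, call_id, from_tag, to_tag=None, kinds=DialogKind.ANY):
        """Generic lookup: one pass over the slot, filtered by `kinds`."""
        for d in self._slot(call_id, from_tag):
            if d.call_id != call_id or d.from_tag != from_tag:
                continue
            if not (d.kind & kinds):
                continue  # caller did not ask for this kind of dialog
            if d.to_tag and to_tag is not None and d.to_tag != to_tag:
                continue  # confirmed dialog, but for a different remote tag
            return d
        return None

    # The specialised getters become thin wrappers around get_dialog().
    def get_temporary_dialog(self, call_id, from_tag):
        return self.get_dialog(call_id, from_tag, kinds=DialogKind.TEMPORARY)

    def get_confirmed_dialog(self, call_id, from_tag, to_tag):
        return self.get_dialog(call_id, from_tag, to_tag, DialogKind.CONFIRMED)
```

A caller that does not care which kind it gets passes DialogKind.ANY and walks the slot once, instead of calling the temporary and the confirmed getter one after the other.
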
> 
> 
> You can merge from my point of view. Btw, I noticed two branches,
> puafix and pua_fix.
> 
> 
> Cheers,
> Daniel
> 
> On Aug 11, 2011, at 6:47 PM, Peter Dunkley
> <peter.dunkley at crocodile-rcs.com> wrote:
> 
> 
> 
> > Hello,
> > 
> > I have committed my fix to the pd/pua_fix branch (commit id:
> > b93149c756d3e983c70608938f1142ed43ee1834).
> > 
> > I would appreciate it if some others on this list could look it over
> > before I push it into master.
> > 
> > Thanks,
> > 
> > Peter
> > 
> > On Thu, 2011-08-11 at 15:25 +0100, Peter Dunkley wrote:
> > 
> > > Hi,
> > > 
> > > I have a candidate code fix that I have been testing today.  It
> > > looks good so far.
> > > 
> > > If things continue to go well I will check it into a branch
> > > before the end of the day.  I'd appreciate it if some others could
> > > look over it before I put it back into master.
> > > 
> > > Thanks,
> > > 
> > > Peter
> > > 
> > > 
> > > On Thu, 2011-08-11 at 15:20 +0200, Daniel-Constantin Mierla wrote:
> > > 
> > > > Hello,
> > > > 
> > > > ... just a few more thoughts since I was offline -- I kind of
> > > > understood that the issue was over UDP, with TCP the UA should
> > > > re-use the connection (kamailio does it if it is not configured
> > > > to close the connection immediately), so the order should be
> > > > ensured.
> > > > 
> > > > Meanwhile, the other solution would have been to use the new
> > > > async module, like:
> > > > - if the subscribe dialog does not exist, call async_route()
> > > > with some sleep interval (1 sec) which should allow the 200ok to
> > > > come and be processed
> > > > 
> > > > That is a workaround; it would be nicer to fix this in the code.
> > > > If Peter did it, that is great, otherwise I will look over it in
> > > > the next days -- either with creation of an 'early' dialog on
> > > > SUBSCRIBE, accepting the NOTIFY and confirming on 200ok, or with
> > > > a queue that keeps a list of pending NOTIFYs for processing.
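
The 'early' dialog variant mentioned above might work along the lines of this sketch. It is Python, not the module's C, and SubscriptionTable and its method names are made up for illustration. It assumes the dialog is keyed on the Call-ID plus the local tag from the SUBSCRIBE, and that the remote tag is learned from whichever of the NOTIFY or the 200 OK arrives first.

```python
class SubscriptionTable:
    """Tracks outgoing subscriptions keyed by (Call-ID, local tag)."""

    def __init__(self):
        self._dialogs = {}

    def on_subscribe_sent(self, call_id, local_tag):
        # Create the 'early' entry before any reply or NOTIFY can arrive.
        self._dialogs[(call_id, local_tag)] = {"state": "early", "remote_tag": None}

    def on_notify(self, call_id, local_tag, remote_tag):
        """Return True if the NOTIFY matches an early or confirmed dialog."""
        dialog = self._dialogs.get((call_id, local_tag))
        if dialog is None:
            return False  # no subscription at all: the caller would reject it
        if dialog["remote_tag"] is None:
            dialog["remote_tag"] = remote_tag  # learned from the NOTIFY
        return dialog["remote_tag"] == remote_tag

    def on_200_ok(self, call_id, local_tag, remote_tag):
        """Confirm the dialog; a tag already learned from a NOTIFY must match."""
        dialog = self._dialogs.get((call_id, local_tag))
        if dialog is None or dialog["remote_tag"] not in (None, remote_tag):
            return False
        dialog["remote_tag"] = remote_tag
        dialog["state"] = "confirmed"
        return True

    def state(self, call_id, local_tag):
        dialog = self._dialogs.get((call_id, local_tag))
        return dialog["state"] if dialog else None
```

With this shape a NOTIFY that overtakes the 200 OK is accepted against the early entry, and the later 200 OK only flips the state to confirmed.
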
> > > > 
> > > > Cheers,
> > > > Daniel
> > > > 
> > > > On Thu, Aug 11, 2011 at 9:51 AM, Andrew Miller
> > > > <andrew.miller at crocodile-rcs.com> wrote:
> > > > 
> > > >         FYI, Just seen an internal e-mail from Peter saying he
> > > >         thinks he has fixed it. I will test this as soon as I
> > > >         get into the office this morning and we will report back.
> > > >         
> > > >         Didn't want anyone putting more time into this if Peter
> > > >         has fixed it.
> > > >         
> > > >         Andy
> > > >         
> > > >         On 11/08/2011 08:42, Andrew Miller wrote:
> > > >         
> > > >                 Klaus,
> > > >                 
> > > >                 Sorry we should have mentioned that we did
> > > >                 investigate this option, however we do not think
> > > >                 it will work.
> > > >                 
> > > >                 I may have got the details below wrong, as Peter
> > > >                 has been looking at this, however the reason is
> > > >                 something like the following:
> > > >                 
> > > >                 The main job of the 200 handler is to write a
> > > >                 database entry that binds the incoming RLS
> > > >                 subscription to the back-end presence
> > > >                 subscription. A pointer to the RLS subscription
> > > >                 is bound to the SUBSCRIBE transaction, and is
> > > >                 therefore available to the 200 call back
> > > >                 function. The same information is not available
> > > >                 to the NOTIFY handler - it expects to get this
> > > >                 information FROM the database. Therefore, we
> > > >                 cannot (unfortunately) handle the NOTIFY in
> > > >                 exactly the same way as the 200.
> > > >                 
> > > >                 Is that right Peter?
> > > >                 
> > > >                 Andy
> > > >                 
> > > >                 
> > > >                 On 11/08/2011 08:30, Klaus Darilion wrote:
> > > >                 
> > > >                         
> > > >                         On 10.08.2011 18:54, Daniel-Constantin Mierla wrote:
> > > >                         
> > > >                                 Hello,
> > > >                                 
> > > >                                 I would like to look closer at the issue and figure out a
> > > >                                 possible solution, but I am traveling for the time being,
> > > >                                 so just quick thoughts.
> > > >                                 
> > > >                                 One approach would be a solution similar to the one for
> > > >                                 the fast CANCEL (which gets to the server before the
> > > >                                 INVITE). What we do (in config) is check if there is an
> > > >                                 INVITE transaction for the CANCEL and, if not, just drop
> > > >                                 the CANCEL (no reply). That will force the UA to do
> > > >                                 retransmissions, which eventually will come after the
> > > >                                 INVITE is received/processed.
> > > >                         
> > > >                         Does not work with TCP requests.
> > > >                         
> > > >                         
> > > >                                 The second idea would be to have a pending queue: keep the
> > > >                                 NOTIFY there for a while and, when the 200ok comes, look in
> > > >                                 the queue for anything belonging to the respective dialog.
> > > >                                 If no dialog is created after a while, the older requests
> > > >                                 in the queue will just be discarded.
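
A pending queue of the kind described above could be sketched as follows. This is Python for illustration only; PendingNotifyQueue, its methods and the max_age default are assumptions, not anything from the rls or pua modules. The clock is injectable so the expiry logic can be exercised without sleeping.

```python
import time
from collections import deque


class PendingNotifyQueue:
    """Holds NOTIFYs that arrived before their dialog was created."""

    def __init__(self, max_age=5.0, clock=time.monotonic):
        self._max_age = max_age  # seconds a NOTIFY may wait (assumed value)
        self._clock = clock
        self._pending = {}       # dialog key -> deque of (arrival time, notify)

    def hold(self, dialog_key, notify):
        """Park a NOTIFY whose dialog does not exist yet."""
        self._pending.setdefault(dialog_key, deque()).append((self._clock(), notify))

    def release(self, dialog_key):
        """Call when the 200 OK creates the dialog.

        Returns the held NOTIFYs in arrival order, skipping any that are
        already older than max_age.
        """
        now = self._clock()
        entries = self._pending.pop(dialog_key, ())
        return [n for (t, n) in entries if now - t <= self._max_age]

    def expire(self):
        """Discard NOTIFYs older than max_age; returns how many were dropped."""
        now = self._clock()
        dropped = 0
        for key in list(self._pending):
            queue = self._pending[key]
            while queue and now - queue[0][0] > self._max_age:
                queue.popleft()
                dropped += 1
            if not queue:
                del self._pending[key]
        return dropped
```

In this sketch the NOTIFY handler would call hold() when the dialog lookup fails, the 200 OK callback would call release() after writing the dialog, and a timer would call expire() periodically.
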
> > > >                         
> > > >                         The NOTIFY is an implicit 200 OK. So
> > > >                         if there is an ongoing SUBSCRIBE
> > > >                         transaction which matches the NOTIFY,
> > > >                         the NOTIFY should trigger the same
> > > >                         actions as the 200 OK. The later
> > > >                         arriving 200 OK can then be ignored.
> > > >                         
> > > >                         regards
> > > >                         klaus
> > > >                         
> > > >                         
> > > >                                 Cheers,
> > > >                                 Daniel
> > > >                                 
> > > >                                 On 8/10/11 6:19 PM, Andrew
> > > >                                 Miller wrote:
> > > >                                 
> > > >                                         Sorry Pete,
> > > >                                         
> > > >                                         That seems to make things better, but does not
> > > >                                         solve the issue for me.
> > > >                                         
> > > >                                         Most times this is now clean when a client logs
> > > >                                         in, but about 1 in 10 times I am still getting an
> > > >                                         error message. In one case I had 9 error messages
> > > >                                         on one log-in.
> > > >                                         
> > > >                                         Andy.
> > > >                                         
> > > >                                         On 10/08/2011 15:58,
> > > >                                         Peter Dunkley wrote:
> > > >                                         
> > > >                                                 I've been playing around with this here and making
> > > >                                                 presence and rls use TCP instead of UDP seems to
> > > >                                                 help with this problem. Presumably this is because
> > > >                                                 using TCP enforces in-order delivery of messages.
> > > >                                                 
> > > >                                                 To make presence and rls use TCP I:
> > > >                                                 
> > > >                                                   * Added a ;transport=tcp parameter to the SIP URI
> > > >                                                     I had set for presence server_address
> > > >                                                   * Added a ;transport=tcp parameter to the SIP URI
> > > >                                                     I had set for rls server_address
> > > >                                                   * Set the rls outbound_proxy parameter to
> > > >                                                     "sip:127.0.0.1;transport=tcp"
> > > >                                                 
> > > >                                                 It's not a proper fix, but I think it works around
> > > >                                                 the issue.
> > > >                                                 
> > > >                                                 Regards,
> > > >                                                 
> > > >                                                 Peter
> > > >                                                 
> > > >                                                 On Mon, 2011-08-01 at 13:40 +0200, Klaus Darilion wrote:
> > > >                                                 
> > > >                                                         On 01.08.2011 12:28, Andrew Miller wrote:
> > > >                                                         
> > > >                                                                 I attempted to insert a dialog entry in the hash table on sending the
> > > >                                                                 SUBSCRIBE. Unfortunately, this did not cure the problem.
> > > >                                                                 
> > > >                                                                 Has anyone any suggestions for the cleanest and easiest method to ensure
> > > >                                                                 that the 200 is handled before the NOTIFY?
> > > >                                                         
> > > >                                                         The cleanest solution would be to establish the dialog
> > > >                                                         when the NOTIFY is received although the 200 OK is
> > > >                                                         missing.
> > > >                                                         
> > > >                                                         The NOTIFY can be seen as an implicit 200 OK.
> > > >                                                         
> > > >                                                         regards
> > > >                                                         Klaus
> > > >                                                         
> > > > 
> > > > -- 
> > > > Daniel-Constantin Mierla -- http://www.asipto.com
> > > > Kamailio Advanced Training, Oct 10-13, Berlin:
> > > > http://asipto.com/u/kat
> > > > http://linkedin.com/in/miconda -- http://twitter.com/miconda 
> > > > 
> > > > 
> > > > _______________________________________________
> > > > sr-dev mailing list
> > > > sr-dev at lists.sip-router.org
> > > > http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
> > > 
> > > 
> > > 
> > 
> > -- 
> > Peter Dunkley
> > Technical Director
> > Crocodile RCS Ltd
> > 
> 


-- 
Peter Dunkley
Technical Director
Crocodile RCS Ltd