[sr-dev] RLS/PUA concurrency issue

Andrew Miller andrew.miller at crocodile-rcs.com
Mon Aug 1 12:28:33 CEST 2011


Folks,

I've been having a bit of a battle with a concurrency issue.

If we have a reasonable number of contacts in an RLS resource list 
(around 50 does it on my test server), we see the following error 
message between 2 and 6 times whenever the client logs in:
  ERROR: rls [resource_notify.c:663]: no presence dialog record for non-TERMINATED state uri pres_uri = sip:0033@lab8.croc.internal watcher_uri = sip:ernie@lab8.croc.internal
(I've extended the debug here to include the URIs, so I can see what is 
not being found)

It is not always the same URIs that go missing, nor is it always the 
same number of faults.

On investigation this turns out to be a race condition.

subs_cback_func (pua/send_subscribe.c) locks the presentity hash table 
and inserts a dialog entry when it receives a 200 to the subscribe.
rls_handle_notify (rls/resource_notify.c) calls pua_get_record_id 
(pua/hash.c get_record_id()), which also locks the presentity hash table 
and looks up the dialog.

It seems that in some cases the NOTIFY is getting the lock before the 
200 to the SUBSCRIBE. Thus the NOTIFY handler is looking for the dialog 
before the 200 handler has inserted it.
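
To make the ordering concrete, here is a toy model of the two handlers. It is only a sketch: the table, the event enum and every toy_* name are made up for illustration and are not the real pua/rls code, and the locking is left out because each handler runs to completion in this model. All it shows is that the lookup fails whenever the NOTIFY is processed before the 200.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the presentity hash table: a small array of
 * dialog keys. Hypothetical names throughout, not the pua API. */
#define TOY_MAX_DIALOGS 8

struct toy_table {
    const char *pres_uri[TOY_MAX_DIALOGS];
    size_t count;
};

enum toy_event { TOY_EV_200_OK, TOY_EV_NOTIFY };

/* Model of the 200 handler: insert the dialog entry. */
static void toy_on_200(struct toy_table *t, const char *pres_uri)
{
    assert(t->count < TOY_MAX_DIALOGS);
    t->pres_uri[t->count++] = pres_uri;
}

/* Model of the NOTIFY handler: look the dialog up.
 * Returns 1 if an entry exists, 0 otherwise. */
static int toy_on_notify(const struct toy_table *t, const char *pres_uri)
{
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->pres_uri[i], pres_uri) == 0)
            return 1;
    }
    return 0;
}

/* Process the events for one presentity in the given order and
 * return how many NOTIFYs found no dialog entry. */
static int toy_run(const enum toy_event *order, size_t n,
                   const char *pres_uri)
{
    struct toy_table t = { { NULL }, 0 };
    int misses = 0;

    for (size_t i = 0; i < n; i++) {
        if (order[i] == TOY_EV_200_OK)
            toy_on_200(&t, pres_uri);
        else if (!toy_on_notify(&t, pres_uri))
            misses++;
    }
    return misses;
}
```

Calling toy_run() with the order { TOY_EV_200_OK, TOY_EV_NOTIFY } gives no misses, while { TOY_EV_NOTIFY, TOY_EV_200_OK } gives one miss, which corresponds to the error in the log above.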

I attempted to insert a dialog entry in the hash table on sending the 
SUBSCRIBE. Unfortunately, this did not cure the problem.

Has anyone any suggestions for the cleanest and easiest method to ensure 
that the 200 is handled before the NOTIFY?

Andy Miller
Crocodile RCS



