[sr-dev] Throttling NOTIFY requests from presence

Anca Vamanu anca.vamanu at 1and1.ro
Mon Apr 2 16:39:17 CEST 2012


Hi Peter,


Sounds like a useful feature. Indeed, 5 seconds is immediate for normal 
presence (availability status) and we shouldn't require more frequent 
notifications. Klaus was talking about the 'dialog' event, where this 
feature is actually not useful as the status changes quite fast.

More than this, as you mentioned, having a special notifier process will 
also solve the problem of bad CSeq due to concurrency, which makes it 
even more useful.

From what I understand, this solution will work only when using DB_ONLY 
mode. WRITE_THROUGH mode is actually useless if this feature is used (as 
it was helping to avoid querying the active_watchers table when 
sending notifies). This should be kept in mind and mentioned in the 
documentation.


On 03/30/2012 03:10 PM, Peter Dunkley wrote:
> Hi,
>
> With further thinking I'd like to do something along these lines:
> - Leave the presentity table and handling alone.
> - When a new modparam (notifier_poll_rate) is non zero set a flag 
> (with value based on a hash of the Call-ID of the dialog) in the 
> dialog indicating a NOTIFY is required (either for all watchers of a 
> presentity after a PUBLISH, or just one dialog after a SUBSCRIBE), but 
> do not send NOTIFY requests immediately.
> - Have a notifier process (running on a timer as per my last change to 
> RLS) that checks for dialogs needing an update (a subset each time it 
> is run, based on flag value) and sends NOTIFY requests.
> - Each time the notifier runs 1/(waitn_time * notifier_poll_rate) of 
> the dialogs is checked.  waitn_time is another new modparam.
> - Sensible values of notifier_poll_rate and waitn_time are 10 and 5 
> (default) respectively.  This means the notifier checks for work 10 
> times a second, and NOTIFY requests take at most 5 seconds to be 
> sent.  The default for notifier_poll_rate will be 0 which means use 
> the existing behaviour.
>
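
To make the slicing above concrete, here is a minimal Python sketch (not
Kamailio code). Only notifier_poll_rate = 10 and waitn_time = 5 come from
the mail. The CRC32 hash, the convention that flag 0 means "no NOTIFY
pending", the dict standing in for the dialog table, and all function
names are assumptions made for illustration.

```python
import zlib

NOTIFIER_POLL_RATE = 10  # notifier runs per second (value from the mail)
WAITN_TIME = 5           # seconds within which a NOTIFY should go out
BUCKETS = WAITN_TIME * NOTIFIER_POLL_RATE  # each run checks 1/BUCKETS of the dialogs


def bucket_for(call_id: str) -> int:
    """Map a Call-ID to a flag in 1..BUCKETS (0 is assumed to mean 'nothing pending')."""
    return zlib.crc32(call_id.encode("utf-8")) % BUCKETS + 1


def mark_pending(dialogs: dict, call_id: str) -> None:
    """Record that a NOTIFY is required, without sending it now."""
    dialogs[call_id] = bucket_for(call_id)


def run_notifier(dialogs: dict, tick: int) -> list:
    """One notifier run: handle only the dialogs flagged for this tick's bucket."""
    current = tick % BUCKETS + 1
    due = sorted(cid for cid, flag in dialogs.items() if flag == current)
    for cid in due:
        dialogs[cid] = 0  # the NOTIFY would be sent here; clear the flag
    return due
```

With these values every bucket comes round once per 50 runs, i.e. once
every 5 seconds, so a flagged dialog waits at most waitn_time before its
NOTIFY is sent.
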
> I will need to make sure that, when in DB only mode and with the 
> notifier enabled, I do not retrieve the dialogs when doing the 
> immediate processing of a PUBLISH.  Simply doing an update() on the 
> flags and checking affected_rows() (if available) should be enough.  
> Similarly, when I receive a SUBSCRIBE I should update the remote CSeq 
> and the flags at the same time, and I can use affected_rows() (if 
> available) to confirm the dialog exists and send the right response.  
> This should minimise the DB impact of this change.

Indeed, there is no need to query the active_watchers table from anywhere 
except the notifier process. And actually, when a PUBLISH is received you 
don't even need to know whether there were rows to update or not.
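
A rough sketch of the two updates, using Python's sqlite3 purely as a
stand-in for the DB layer. The column names (presentity_uri, callid,
remote_cseq, updated) and the function names are assumptions for
illustration, and cursor.rowcount plays the role of affected_rows().

```python
import sqlite3


def open_db():
    """Create an in-memory stand-in for the active_watchers table (assumed columns)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE active_watchers ("
        " presentity_uri TEXT, callid TEXT,"
        " remote_cseq INTEGER, updated INTEGER DEFAULT 0)"
    )
    return db


def on_publish(db, presentity_uri, flag=1):
    """Flag every dialog watching this presentity; the row count is not needed."""
    db.execute(
        "UPDATE active_watchers SET updated = ? WHERE presentity_uri = ?",
        (flag, presentity_uri),
    )


def on_resubscribe(db, callid, remote_cseq, flag=1):
    """Update the remote CSeq and the flag in one statement.

    Returns True if a row was affected, i.e. the dialog exists.
    """
    cur = db.execute(
        "UPDATE active_watchers SET remote_cseq = ?, updated = ? WHERE callid = ?",
        (remote_cseq, flag, callid),
    )
    return cur.rowcount > 0
```

Neither handler reads any rows back; only the notifier process would
SELECT from the table.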

The solution sounds very good to me.

Good luck implementing it and thanks for sharing :)

Regards,
Anca


>
> This should have the effect of maintaining the existing behaviour for 
> those who need/want it, evening out the sending of NOTIFY requests as 
> much as possible, ensuring a NOTIFY is always sent in a timely fashion 
> (within 5 seconds should count as immediate for a NOTIFY after a 
> SUBSCRIBE).
>
> Once this works with a single notifier process it should be quite 
> simple to extend it to having a pool of notifier processes if that is 
> required.
>
> How does this sound?
>
> Regards,
>
> Peter
>
>
> On Fri, 2012-03-30 at 09:26 +0100, Peter Dunkley wrote:
>> Hi,
>>
>> When running a soak test on presence with RLS, NOTIFY requests are by 
>> far the most common.  Further, especially with RLS in place, performance 
>> of Kamailio presence can be quite limited.  The writers of the 
>> presence RFCs do seem to acknowledge that this is indeed an issue - 
>> hence the restrictions on the rate at which NOTIFY requests can be sent.
>>
>> I am looking for ways to sensibly reduce the number of NOTIFYs 
>> generated, and in this case, it seems that the presence module is 
>> actually sending them more frequently than the specification says it 
>> should anyway.
>>
>> Reducing the number of NOTIFYs sent doesn't just help with presence, 
>> but it should also reduce the amount of work that the RLS module has 
>> to do too - as it will receive (and have to process) fewer NOTIFYs as 
>> well.
>>
>> Of course I am open to other ways to approach this problem.  Another 
>> thing I have been considering is setting the flags and fields I have 
>> discussed below but not sending NOTIFYs right away at all.  I would 
>> then have a timer task (perhaps at the 100ms you suggest below) that 
>> generates outstanding NOTIFYs - but not at a rate of more than once 
>> per five seconds for each presentity.  Effectively, this timer task 
>> would be a notifier, which I think is one of the things suggested 
>> recently on the list.
>>
>> Regards,
>>
>> Peter
>>
>> On Fri, 2012-03-30 at 10:01 +0200, Klaus Darilion wrote:
>>> On 29.03.2012 23:19, Peter Dunkley wrote:
>>> >  Hi,
>>> >
>>> >  RFC 3856 section 6.10 states: "A PA SHOULD NOT generate notifications for
>>> >  a single presentity at a rate of more than once every five seconds."
>>>
>>> I wonder if this is useful. E.g. a user tries to call somebody, but the
>>> target is busy. Thus, the call will last maybe 3 seconds having the
>>> dialog states: trying, proceeding, early, terminated.
>>>
>>> How is it supposed to work? It just sends trying, but not the others? Or
>>> will they be queued, so only trying, and 5 seconds later, terminated is
>>> sent?
>> I don't understand what this has to do with calls.  I am just talking 
>> about NOTIFY requests from presence after a change in presentity or a 
>> SUBSCRIBE.
>>
>>> >
>>> >  I would like to add this to the presence module (making the rate
>>> >  configurable).
>>> >
>>> >  I have an idea as to how I would like to do it:
>>> >  - Add a last notified time-stamp field to each presentity
>>> >  - Add an updated since last notified flag field to each presentity
>>> >  - Add a notify required flag field to each active_watcher
>>> >
>>> >  - When a presentity is updated the last notified time-stamp is checked.
>>> >  If the time is far enough in the past the notifies are sent and the
>>> >  time-stamp is updated.  If enough time has not passed the updated since
>>> >  flag is set for the presentity and the notify required flag is set for all
>>> >  active_watchers of that presentity.
>>> >  - When a presentity is subscribed to (this includes re-subscribes) the
>>> >  last notified time-stamp is checked.  If the time is far enough in the
>>> >  past the (single) notify is sent and the time-stamp is updated.  If enough
>>> >  time has not passed the updated since flag is set for the presentity and
>>> >  the notify required flag is set for this active_watcher record.
>>>
>>> RFC3265 states that a NOTIFY MUST be sent immediately after every SUBSCRIBE.
>>>
>> Ah...  But define immediately.  A 2XX response should also be sent 
>> immediately after every SUBSCRIBE, but the timeout is 64*T1 (32 
>> seconds).  So a NOTIFY that, in many cases, comes out right away, or 
>> in the worst case within 5 to 10 seconds, should count as immediate.
>>
>>> >  - A timer (the minimum time between subscribes - default 5 seconds) is
>>> >  run.  On expiry a query is done on the presentity table for presentities
>>> >  that have been updated _AND_ the last notified time is more than the
>>> >  minimum time ago.  For each of these presentities, a query is done on the
>>> >  active_watchers for watchers of that presentity that have the notify
>>> >  required flag set.  Notifies containing the presentities are then sent to
>>> >  the watchers waiting on them.
>>> >
>>> >  This should ensure that no presentity notifies more than it should, while
>>> >  ensuring that all changes are (eventually) sent out, and all subscribes
>>> >  result in a notify (eventually) being sent.  Eventually being typically
>>> >  within 5 seconds and in under 10 seconds in the worst case (assuming the
>>> >  default setting of 5 seconds).
>>> >
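
For reference, the per-presentity check in this earlier proposal could be
sketched as follows. This is a simplified Python illustration under
assumed names: the Presentity class and both functions are hypothetical,
time is passed in as plain seconds, and the per-watcher "notify required"
flag is left out.

```python
MIN_INTERVAL = 5.0  # minimum seconds between NOTIFYs for one presentity


class Presentity:
    def __init__(self):
        self.last_notified = float("-inf")  # never notified yet
        self.updated_since = False          # change waiting to be notified


def on_update(p, now):
    """PUBLISH or SUBSCRIBE handling: True if NOTIFYs go out now, False if deferred."""
    if now - p.last_notified >= MIN_INTERVAL:
        p.last_notified = now
        return True
    p.updated_since = True
    return False


def on_timer(p, now):
    """Periodic timer: send deferred NOTIFYs once the interval has passed."""
    if p.updated_since and now - p.last_notified >= MIN_INTERVAL:
        p.last_notified = now
        p.updated_since = False
        return True
    return False
```
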
>>> >  Can anyone see any problems with this?
>>>
>>> I wonder why you delay NOTIFYs for (re)SUBSCRIBEs? On (re)SUBSCRIBEs
>>> just NOTIFY with last published state and do not set the "notify
>>> required" flag. Then it may happen that the minimum NOTIFY interval is
>>> not maintained the first few seconds after a (re)SUBSCRIBE but IMO this
>>> is not that bad and enables immediate notification after every
>>> (re)SUBSCRIBE.
>>>
>> As you say, if I receive a PUBLISH and then a re-SUBSCRIBE to a 
>> presentity then I will send out NOTIFYs too often.  I have seen (bad) 
>> clients that re-PUBLISH immediately before re-SUBSCRIBE and 
>> un-PUBLISH immediately before un-SUBSCRIBE.  If a lot of people use 
>> these bad clients then you have a problem.
>>
>>> >  Are there any objections to me implementing this?
>>>
>>> Make it configurable :-) E.g. setting "minimum_notification_delay" to 0
>>> disables the feature.
>>>
>> Of course :-)
>>
>>> It may also increase DB load and cause more race-conditions (to be
>>> resolved with DB-transactions) due to more DB lookups.
>>>
>> It also helps with a race condition I think I am seeing (and similar 
>> to that recently discussed on this list) where you can have more than 
>> one NOTIFY outstanding on a dialog at a time, and if these get out of 
>> sequence then you do definitely get problems.
>>
>> While there may be more DB load overall, I suspect that because this 
>> helps spread it out somewhat it should make the performance more 
>> even.  Also many of these queries will be much simpler and lighter as 
>> I do not need to always retrieve the presentity field, which is 
>> particularly large.
>>
>>> E.g. what if the timer finds a presentity for which NOTIFYs need to be
>>> sent and - while iterating over all the presentities - a new PUBLISH is
>>> received which triggers immediate NOTIFY. It may happen that both
>>> processes manipulate the active watchers table at the same time (notify
>>> required flag).
>> Is this not also a problem with rls_presentity which uses a very 
>> similar mechanism already?
>>
>>> >  In theory running a (5 second) timer could make presence "lumpy" in the
>>> >  same way RLS is (see my previous email).  However, if this proves to be
>>> >  the case I believe the same mechanism I have proposed for RLS can be used
>>> >  here.
>>>
>>> Why run the timer only every 5 seconds? Run it every second (or 100ms)
>>> then it should not be lumpy anymore.
>>
>>> regards
>>> Klaus
>>
>>
> -- 
> Peter Dunkley
> Technical Director
> Crocodile RCS Ltd
>
