Hi Daniel,

Calling srand() to re-seed frequently in RLS meant that in full-state NOTIFY requests the instance IDs within the resource nodes in the NOTIFYs were repeatable.  This is required because, when you have forking of back-end SUBSCRIBEs, you need to be able to uniquely (but consistently) identify presentities for different instances of the same resource.

Unfortunately, using srand() doesn't work here because you need to store the instance IDs so that they can be used in non-full-state NOTIFY requests (which may only contain updates for specific instances of resources rather than the full set).  If you had two instances of a resource A and B in a full-state NOTIFY A would get ID 1 and B would get ID 2.  If you then have a non-full-state NOTIFY updates for A must still have ID 1 and updates for B must still have ID 2.  However, using srand() to re-seed means that the first instance (which might be B is only B has been updated) gets ID 1.  So using srand() here fundamentally doesn't work.

The bad side-effect of re-seeding srand() frequently was that it broke other things that, quite rightly, relied on rand() giving random numbers.

This point is academic anyway.  PUA doesn't support multiple final responses/parallel-forking of back-end SUBSCRIBEs, and the structure and use of the rls_presentity table means that only a single presentity instance can be stored for a resource anyway.  Until these things change all that is needed for the instance ID is that it is unique within a resource node and consistent between NOTIFYs for that resource within a dialog.  A simple fixed string meets this requirement.  So I removed the srand() from PUA and fixed the branch-tag related problem it was causing us.

There is still a more general problem in Kamailio,  srand() is called during running (not just startup) in quite a lot of cases.  This is not ideal and could cause problems similar to the one we saw here at Crocodile in other use cases.  I think srand() should be called just once per process during init(), and probably with some seed value involving time and the process ID number.

It might also be worth having some Kamailio specific random number library functions in the core and simply telling developers that they should always use them and never srand(), rand() and so on.

Regards,

Peter

On Fri, 2012-04-20 at 12:39 +0200, Daniel-Constantin Mierla wrote:
Hello,

had not time yet to get and look properly at this thread, is the issue fixed by Peter's commit related to some srand() usage?

Cheers,
Daniel

On 4/16/12 1:15 PM, Hugh Waite wrote:
I now realise that rls was deliberately using the same sequence of random numbers to satisfy RFC4662 §5.5
5.5. Instance Attributes

Each resource element contains zero or more instance elements. These
instance elements are used to represent a single notifier for the
resource. For event packages that allow forking, multiple virtual
subscriptions may exist for a given resource. Multiple virtual
subscriptions are represented as multiple instance elements in the
corresponding resource element. For subscriptions in which forking
does not occur, at most one instance will be present for a given
resource.

The "id" attribute contains an opaque string used to uniquely
identify the instance of the resource. The "id" attribute is unique
only within the context of a resource. Construction of this string
is an implementation decision. Any mechanism for generating this
string is valid, as long as uniqueness within the resource is
assured.

Seeding srand() with the values 0, 1, 2... always gives the same sequence of ID strings. Maybe someone has an opinion on whether this should be implemented differently, or whether a value should be stored for situations where multiple resource instances are removed/reordered.

Maybe kamailio should use a unique ID (from srutils) in the Via header for uac requests (in modules/tm/h_table.h) anyway, to prevent a module from affecting Via branch params by using srand().

Regards,
Hugh

On 13/04/2012 15:26, Hugh Waite wrote:
Hi,
One of the places that uses rand() is modules/tm/h_table.c:init_synonym_id . Requests generated by kamailio (as a uac) use this to generate the Via branch tag. If any module calls srand(), it affects the Via branch for subsequent requests.

The actual bug we found is in rls/notify.c:generate_string(). The add_resource_instance() function will re-seed srand() with 0 (zero), leading to nearly every NOTIFY sent by rls having the same "random" number in the Via branch. I am sure this was the cause of lost replies, timeouts and dropped subscriptions that we were seeing (and appears to have gone away after removing it).

Although, I could just remove the srand from rls notify.c, I wondered if it should be using a different random function, and also whether init_synonym_id should use something more unique for the Via branch parameter.

A quick check has shown a few places that call srand() within the code, although they probably have less drastic consequences.

Regards,
Hugh

On 13/04/2012 14:01, Daniel-Constantin Mierla wrote:
Hello,

I haven't looked at the issue reported by Hugh, so by now just comments on srutils/sruid.

The idea was to have an unique id generator without linking to an external library -- my first purpose was to use it for temporary GRUU ids (RFC5627), I am just about to push to master the gruu support in registrar/usrloc.

Then I thought it might be useful in other places, such as dialog unique id.

I added it as part of lib, since its target usage was for modules so far, but if needed for some core processing, the two files (rather small by now) can be moved in the core.

Cheers,
Daniel

On 4/13/12 2:50 PM, Henning Westerholt wrote:
On Friday 13 April 2012, Hugh Waite wrote:
I have a question about random number generation within kamailio.

A number of modules use rand() to get a random value and in some places
is re-seeding with srand(). I believe this is dangerous because rand()
is used in the Via branch tag generator.
We have detected some real bugs (where srand is reseeding with 0 for
every message, causing transaction mis-matching) but I'm not sure of the
correct way to fix this (other than remove srand()).

Should all modules be using a 'core' random function (e.g. in srutils?)
? And if so, is this library documented?

Regards,
Hugh
Hi Hugh,

for the purpose getting a pseudo-random number (i.e. not for cryptographic
functionality) we should consolidate on a single random function. There is the
recent introduced srutils/sruid code, then there exists a (IMHO stronger)
pseudo-random number generator in rand/fastrand and then there is of course
rand().

Maybe Daniel can comment about the purpose of the srutils function, IMHO
consolidating on fastrand or one of the stronger function (d_rand etc..) from
stdlib.h would be fine.

The re-seeding the internal state of rand() with srand during runtime sounds
wrong toe me and should be removed/ fixed.

Viele Grüße/ best regards,

Henning Westerholt

_______________________________________________
sr-dev mailing list
sr-dev@lists.sip-router.org
http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev





-- 
Hugh Waite
Senior Design Engineer
Crocodile RCS Ltd.


_______________________________________________
sr-dev mailing list
sr-dev@lists.sip-router.org
http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev

-- 
Daniel-Constantin Mierla
Kamailio Advanced Training, April 23-26, 2012, Berlin, Germany
http://www.asipto.com/index.php/kamailio-advanced-training/
_______________________________________________
sr-dev mailing list
sr-dev@lists.sip-router.org
http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev

-- 
Peter Dunkley
Technical Director
Crocodile RCS Ltd