[Serusers] Re: usrloc loading

Thu Nov 30 23:33:17 CET 2006

samuel wrote:
> >Where is the time saving coming from then?
> 
> I think the idea behind was the following:
> The use case is for big providers with lots of entries in the usrloc
> database. A restart in such situation might lead to stop in the
> service for quite a few minutes (i don't recall the numbers) while the
> server is loading the data.

Numbers obviously depend on your hardware and whether you have a local
database. But they are somewhere in the range of 20 seconds for 50,000
entries, 2 minutes for 100,000 entries and 10 minutes for 500,000
entries.

Part of the problem, and also of the memory usage problem, is that the
database interface of SER requires the entire table to be slurped into
SER's process memory instead of being fetched and processed row by row.
This can cause funny behaviour during start-up and a near heart attack
for the sysadmin.
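
To illustrate the difference between slurping a table and processing it
in pieces, here is a minimal sketch in Python using the standard sqlite3
module. It is not SER's database interface; the table layout and the
names make_db, load_slurped and load_streamed are made up for the
example. Both loaders build the same in-memory cache, but the second one
never holds more than one batch of raw rows at a time.

```python
import sqlite3


def make_db(n):
    """Build an in-memory stand-in for a usrloc table with n rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE location (username TEXT, contact TEXT)")
    conn.executemany(
        "INSERT INTO location VALUES (?, ?)",
        ((f"user{i}", f"sip:user{i}@192.0.2.1") for i in range(n)),
    )
    return conn


def load_slurped(conn, cache):
    """Fetch the whole result set into memory first, then process it."""
    rows = conn.execute("SELECT username, contact FROM location").fetchall()
    for username, contact in rows:
        cache.setdefault(username, []).append(contact)
    return len(rows)  # number of raw rows held at once


def load_streamed(conn, cache, batch=1000):
    """Fetch and process bounded batches; returns the largest batch held at once."""
    cur = conn.execute("SELECT username, contact FROM location")
    peak = 0
    while True:
        rows = cur.fetchmany(batch)
        if not rows:
            break
        peak = max(peak, len(rows))
        for username, contact in rows:
            cache.setdefault(username, []).append(contact)
    return peak
```

With 5,000 rows, load_slurped holds all 5,000 raw rows before touching
the cache, while load_streamed with batch=1000 holds at most 1,000.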

> If you split the data in chunks and load it sequentially, you can start
> serving without interruption...

As far as I understand the announcement (haven't looked at the actual
code), the idea is to load everything inside an extra process. The
problem with that kind of speed-up is that your responses will not be
correct during the loading phase. I am not sure if this is better than
being down as it may cause support calls and false problem alerts. If
you are in a phase of troubles and have to restart often, this wrong
behaviour can go on for hours.

But anyways, in my experience with large scale installations, the whole
caching thing in usrloc is unnecessary. I have it on good authority that
a modern PC can handle more than 100,000 subscribers with a cacheless
usrloc and a local database. I once wrote a replacement module that did
lookup() directly to the database without any usrloc. It was able to
serve substantially more than 100,000 subscribers. (Disclaimer: This
actually depends on your usage patterns. I can't provide CPS values,
though.)
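
For what a cacheless lookup amounts to, here is a rough sketch in Python
with sqlite3. It is not the replacement module mentioned above and not
SER code; the schema and the functions open_location_db, save and lookup
are assumptions for illustration only. The point is that every lookup is
a single indexed query, and every registration is committed right away,
so any server sharing the database sees the same bindings.

```python
import sqlite3
import time


def open_location_db():
    """Create a hypothetical location table with an index on the lookup key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE location (username TEXT, contact TEXT, expires INTEGER)"
    )
    conn.execute("CREATE INDEX location_username ON location (username)")
    return conn


def save(conn, username, contact, ttl, now=None):
    """Store or refresh a binding and commit it immediately (no cache in between)."""
    now = int(time.time()) if now is None else now
    with conn:  # one transaction per registration, committed on exit
        conn.execute(
            "DELETE FROM location WHERE username = ? AND contact = ?",
            (username, contact),
        )
        conn.execute(
            "INSERT INTO location VALUES (?, ?, ?)",
            (username, contact, now + ttl),
        )


def lookup(conn, username, now=None):
    """Return the unexpired contacts for a user straight from the database."""
    now = int(time.time()) if now is None else now
    rows = conn.execute(
        "SELECT contact FROM location"
        " WHERE username = ? AND expires > ? ORDER BY expires DESC",
        (username, now),
    ).fetchall()
    return [row[0] for row in rows]
```

Whether this keeps up with a given load depends on the database and the
usage pattern, as the disclaimer above says.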

This leaves the registrar stuff. But that is writing to the database
anyways. What would be more important here is to have it transactional
in a sensible way. The way it works now is that if you have database
problems, you delay your response which makes your UAs re-send the
request which causes more database troubles. (This, BTW, is true for
INVITE processing as well -- here you process your request with all the
checks and database lookups and whatnots only to find out upon t_relay()
that, oops, re-sent INVITE, needs to be dropped, all for nothing).
True, this is not a problem if you use the right db_mode.
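
The re-sent INVITE case can be sketched as follows: check whether a
request belongs to a transaction you have already seen before doing any
of the expensive work. This is a toy model in Python, not SER's tm
module; the Proxy class, the message dictionaries and the simplified
transaction key (top Via branch plus method) are all assumptions made
for the example.

```python
def transaction_key(msg):
    """Simplified transaction identity: top Via branch plus request method."""
    return (msg["branch"], msg["cseq_method"])


class Proxy:
    """Toy request handler that absorbs retransmissions before expensive work."""

    def __init__(self, expensive_checks):
        self.seen = set()                  # keys of transactions already in progress
        self.expensive_checks = expensive_checks
        self.absorbed = 0                  # retransmissions dropped early

    def handle(self, msg):
        key = transaction_key(msg)
        if key in self.seen:
            # Retransmission: drop it here, before any database lookups.
            self.absorbed += 1
            return "absorbed"
        self.seen.add(key)
        self.expensive_checks(msg)         # authentication, lookups, and so on
        return "relayed"
```

In this model a second copy of the same INVITE costs one set lookup
instead of a full round of checks and database queries.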

But there is another issue and that is reliability. At a certain point,
you need to have a second SIP server because your superiors read about
the five-nine thing. IMHO the easiest way to set this up is by having
several servers doing the exact same thing and then load balancing
traffic between them.  This is only possible if you have a cacheless
usrloc and if registrations are written to the database ASAP.

So, I do think that this cache is one of those optimizations that look
good on paper but in practice are missing the point. Those, of course,
are just my sixteen øre. And just if someone cares to know, we are using
Andreas' usrloc-cl in production and apart from a segfault I introduced
while porting in our changes, it runs very smoothly.

Regards,
Martin