[Serusers] Re: usrloc loading

Martin Hoffmann hn at nvnc.de
Tue Dec 5 10:54:13 CET 2006


Salut,

Jiri Kuthan wrote:
> At 23:33 30/11/2006, Martin Hoffmann wrote:
> >
> >Part of the problem and also of the memory usage problem is that the
> >database interface of SER requires that the entire table is slurped into
> >SER's process memory instead of fetching and processing it row by row.
> >This can cause funny behaviour during start-up and a near heart attack
> >for the sysadmin.
>
> it's a trade-off. I recall quite some providers who would have had
> a heart attack if usrloc was not cached.

This comment wasn't about the caching per se. The database interface
allows you to access all rows as an array. This is rarely if ever
needed. If the interface instead had a function à la
dbf->get_next_row(), you wouldn't need to slurp a table of thousands of
rows into pkg_mem first.

Another shortcoming of the database API is that you can't do a "where
expires < now()". This, however, is only a problem if you teach SER not
to delete expired rows from the database and then forget to run the cron
job that does it. (Reminds me that I owe Atle a cookie for that one.)

> (think what happens when
> a popular IAD vendor sets its IADs to reregister at 3am)

If you have enough of those, the only thing you can do here is to start
sending them 503s. Just an idea: the problem really is that all UDP
processes are stuck waiting for the database, so new requests don't get
handled (which causes a retransmission storm that eventually kills you).
If one counts the processes that are stuck, one could write a function
that sends a 503 back once only one or two processes are left.

> The problem
> may not appear on SIP side but on DB side, though.
> 
> Basically, you can preload (which is what we do), not to cache 
> (which under some circumstances may cause real bad heart-attack)
> or perhaps something inbetween (less than 100% cache). Given
> other bottlenecks and price of memory, preloading seems feasible;
> the only down side is the loading time. This can be compensated
> by a reasonable network design with redundancy.

What you forget here is that your database has a query cache (or should
have). It is much better suited for this because it can cope with
changes to the database from somewhere else. (We had to use serctl to
update aliases, which sometimes didn't work. The resulting script, which
tries to insert the alias and then checks whether it is actually there,
is quite impressive.)

Plus, usrloc is actually only one of the two or three queries you do
per INVITE: does_uri_exist() is probably done on every one (at least if
you have call forwarding), and avp_load() is likely to be done for all
incoming calls. (That's 0.9, of course; dunno about 0.10 yet.)

What killed me once wasn't usrloc but the avp_load(). And that was only
because the indexes on the table were screwed and the select did a full
table scan every time.

> >This leaves the registrar stuff. But that is writing to the database
> >anyways. What would be more important here is to have it transactional
> in a sensible way. The way it works now is that if you have database
> >problems, you delay your response which makes your UAs re-send the
> >request which causes more database troubles. (This, BTW, is true for
> >INVITE processing as well -- here you process your request with all the
> >checks and database lookups and whatnots only to find out upon t_relay()
> >that, oops, re-sent INVITE, needs to be dropped, all for nothing).
> >True, this is not a problem if you use the right db_mode.
> 
> I think this is a good place for improvement indeed. We have been
> thinking of some aggregation of delayed writes but haven't moved
> forward on this yet.

I think a function "t_go_stateful()" might be enough (and using
t_reply() in the registrar). The function checks whether a transaction
for the request already exists and, if so, ends processing right away.
Otherwise it creates a transaction in a preliminary state.

> Well -- it is certainly possible but you actually just push the problem
> from SER cluster to a DB cluster, which may bring you other type of
> headache.

Probably, but in this scenario I have several options to solve this,
depending on my actual load. I can start with a central database that is
accessed over the net, later switch to an elaborate scheme with
replication and finally switch to a MySQL cluster-esque solution.
High-performance databases are necessary in other applications, too, and
do exist.

I am a follower of the old Unix strategy that everything does one thing
and one thing only. Providing that data fast enough is the job of the
database.

Regards,
Martin

PS: Should we move this to serdev?


