[Serdev] usrloc loading

Wed Jan 24 18:24:22 UTC 2007

On Jan 24, 2007 at 19:09, Andreas Granig <andreas.granig at inode.info> wrote:
> Hi all,
> 
> Greger V. Teigre wrote:
> >Good for a start, some design-principles...  Do you take my challenge? ;-)
> 
> I dare to jump in here for a comment :o)
> In my opinion, there are two reasonable concepts for scaling a system.
> 
> One is to use SIP proxies as balancers which route requests to the 
> appropriate proxies/registrars. Each proxy/registrar only hosts a subset 
> of the overall user-base, and the balancers know which proxy/registrar 
> is responsible for which users (e.g. by hashing the r-uri). This way, 
> there's no need to share the location table. There must be a way, though, 
> to keep each proxy/registrar redundant, e.g. by using a standby node. 
> Nonetheless, the active node has to synchronize its location table to 
> the standby node, e.g. by using database replication, in case the 
> standby node has to take over.

Or they can keep the usrloc info in memory only and use t_replicate() to
replicate REGISTERs between primary and standby/backup (I assume that each
node is in fact a master/backup pair). If the registration expiration
is limited to a low enough value, this solution works with minimal
interruption even in the unlikely case that both primary & backup fail:
the contacts are simply relearned at the next re-registration.
This has the advantage of not relying on the db for usrloc (a db
is in general very slow when it is shared and writes are involved) and
of allowing fast re-registrations.
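
To make the idea concrete, here is a rough sketch (Python, not SER code) of
an in-memory location table that forwards every REGISTER to its backup and
caps the expiration. The class name Usrloc, the peer attribute and the
max_expires value are assumptions made up for illustration; in SER the
forwarding step would be done with t_replicate() in the routing script.

```python
import time


class Usrloc:
    """In-memory location table: AOR -> (contact, absolute expiry time)."""

    def __init__(self, peer=None, max_expires=300):
        self.bindings = {}
        self.peer = peer                # backup node, if any (hypothetical)
        self.max_expires = max_expires  # upper bound on granted expiration, in seconds

    def register(self, aor, contact, expires, now=None, replicated=False):
        """Store a binding and forward it once to the peer."""
        now = time.time() if now is None else now
        granted = min(expires, self.max_expires)
        self.bindings[aor] = (contact, now + granted)
        # Replicate only requests that came from a UA, never replicated ones,
        # so that a pair pointing at each other does not loop.
        if self.peer is not None and not replicated:
            self.peer.register(aor, contact, expires, now=now, replicated=True)
        return granted

    def lookup(self, aor, now=None):
        """Return the contact for an AOR, or None if missing or expired."""
        now = time.time() if now is None else now
        entry = self.bindings.get(aor)
        if entry is None or entry[1] <= now:
            self.bindings.pop(aor, None)
            return None
        return entry[0]
```

With a low max_expires, a node that comes up empty has relearned all
contacts after at most max_expires seconds, which is why losing both
members of a pair only causes a short interruption.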

> 
> The second possibility is to share a location table across all 
> proxies/registrars, e.g. by using a database cluster. This way, a load 
> balancer could route requests to any of the proxies, because each node 
> knows the location of each user. There's no need for standby nodes, 
> because the remaining nodes take over the load of the failed node.
> 
> Each of the solutions has its advantages and disadvantages, and both 
> rely on external software to work. The first approach uses database 
> replication, but needs some failover-management-software for node 
> fail-over, it needs changes in the provisioning work-flow if a new 
> proxy/registrar pair is added, each proxy/registrar needs a standby 
> node, etc. The second approach heavily relies on some sort of database 
> cluster and needs a cache-less usrloc, but is easier to scale and 
> doesn't need standby nodes.
> 
> Jiri seems to go for the first solution, I like the second one very 
> much. Both have proved to work, so I think it's just a matter of taste, 
> and maybe also experience with one or another subject area the two 
> solutions rely on (like experience with node-failover-software versus 
> experience with mysql-cluster).

I prefer the first solution. Let's just say that we did not have
a good experience with usrloc + db cluster from a performance point
of view.
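
For reference, the balancer side of the first solution (hashing the r-uri to
pick the responsible proxy/registrar pair) could look roughly like the
Python sketch below. The function name pick_pair, the pair names and the
choice of SHA-1 are assumptions for illustration only, not what any existing
balancer does.

```python
import hashlib


def pick_pair(r_uri, pairs):
    """Map a request URI to one proxy/registrar pair from a fixed list."""
    # Drop URI parameters so that "sip:bob@example.com;transport=udp"
    # lands on the same pair as "sip:bob@example.com".
    key = r_uri.split(";", 1)[0]
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    # Use the first 4 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:4], "big") % len(pairs)
    return pairs[index]


# Hypothetical pair names; each entry stands for a primary/backup pair.
PAIRS = ["pair-a", "pair-b", "pair-c"]
target = pick_pair("sip:bob@example.com", PAIRS)
```

Note that with a plain modulo like this, adding a pair changes the mapping
for most users, which is one reason the provisioning work-flow has to change
when a new pair is added.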

Maybe we could slowly start documenting various scalability / high
availability scenarios.

Andrei