[Serusers] Carrier-grade framework for SER

Fri Jan 28 12:40:52 CET 2005

Dragos,

I completely agree with you.  When building architectures for scalability 
and redundancy, the correct match of technologies will often solve many of 
your problems.  And it does make sense to store as much subscriber 
information back-end as possible and let proven and standardized(?) systems 
handle replication etc.
    In my summary of issues I tried to separate the various issues from
each other without going as far as you do.  Depending on where you come from,
various combinations will look appealing.

    I have followed Diameter from a distance for a while, and while it does 
seem to take hold in the mobile world, I haven't seen many non-mobile 
implementations (it may be my limited perspective...). Most legacy subscriber
databases have a RADIUS interface, but rarely Diameter. In fact, I know that
Siemens have implemented a Mobile RADIUS with location-awareness in RADIUS.
I think that is an example of the continuing appeal of RADIUS.
    I would love to have Diameter capabilities, but deploying Diameter in my 
organization is not really feasible at this point. Hence, we try to do as 
much as possible with Juha's RADIUS modules.   And here is the point, I 
think: One thing is what we would like to have, another thing is the context 
we work in and thus what we have.  I have RADIUS and do all provisioning in 
an LDAP back-end, other people use mysql also for the subscriber database 
(which BTW is simple as serweb has been developed and it works).

    The fact that you (at Fraunhofer) spend time on Diameter and SIP is only 
a confirmation of how "cutting-edge" such an approach is. (Well, with legacy 
systems, things do not have to be very cutting-edge to be very far away from 
implementation...) Secondly, Diameter is the successor of RADIUS, which is a
very "simple" technology compared to Diameter.  Just as with our xmlrpc vs
soap discussion, you gain complexity when you get functionality, and an
installation without the need for a certain functionality will only get
complexity; and complexity = $$$.

    And do I think keeping location is a natural function of AAA? Hm, 
traditionally subscriber databases kept all the static information about the 
subscriber and the subscriber's services.  Location was application 
specific. In the mobile world, location is key to all services, so HLR (Home 
Location Registers) had location as a natural part of its functionality. For 
old/traditional dial-up services, location was more an attribute of the AAA 
request, so that authorization could be determined based on when and where 
the subscriber tried to get access.  Location keeps getting more and more 
important in TCP/IP based services as well, so one may (successfully) argue 
that location is a part of the AAA domain.  However, where do you stop?  Are
the capabilities of your end-user device also information that should be in
the AAA domain?  You need to know these capabilities in order to provide 
quality services...
    One inherent problem of using AAA for dynamic data is that back-end 
technologies that are very well adapted to AAA like LDAP (made for search 
speed) have a real problem with frequent writes.  Of course, if you use 
Oracle or some big SQL server as a back-end, you have solved that problem,
but you cannot take advantage of LDAP capabilities...

    Enough of this ranting... What I'm trying to say is that we need that 
"piece-meal" approach to scalability and redundancy: What can we do with 
existing installations today that gives us what we need and still leads us 
in the right direction?
    So, in order to better understand where we should be going, I would love
to look at what you are doing with 3GPP IMS. ;-) If you have some more
information, why don't you post a link?

g-)

Dragos Vingarzan wrote:
> Hi all!
>
> Well, maybe I don't know what I am talking about, I just saw now this
> message, don't know the entire discussion, so anyway, here it goes.
>
> I am working on implementing 3GPP's IMS using ser and by doing this I
> am implementing the 3GPP AAA scenarios, which, as I understood, will
> evolve into IETF's "Diameter SIP authentication" (currently a draft)
> http://www.potaroo.net/ietf/idref/draft-ietf-aaa-diameter-sip-app/ .
>
> Usrloc, although I use it locally, is no longer needed on a large
> distributed scale because the AAA servers keep track of where the
> users are registered and you can query them by standard request
> (Location Information Request, User Authorization Request) to find
> out where to forward messages.
>
> 1. more users, more servers. I think this should be the way to do it.
> I never liked replication because it's kind of too complicated a
> solution to work reliably.
> I really think that the Diameter SIP auth will scale very well. Keep
> in mind that it was designed to work in large mobile networks (3G,
> right?). The network should be split into different functional sip
> proxies, and in this case:
> - interrogating (finds where a user is/should be registered and forwards
> the msg there)
> - serving (actually services the user).
>
> 2.the replication problem has just moved from ser to the AAA server.
> But the AAA server can be distributed so that each one will hold a
> limited number of users, and a special AAA routing node can be used
> to route requests toward the right AAA, which will respond to them.
> The AAA server holding users could be a point of failure so maybe you
> will still need to duplicate its functionality but I think it's
> better to do a replication just to ensure availability/fail-safety
> than to do it in order to implement functionality.
> At SIP level the distribution can be done using the interrogating
> nodes, first to redirect REGISTERs in load balancing if you want, to
> different ser servers; and then the users can be found by
> interrogating the AAA servers which will track the users' registration
> status.
> 3. the provisioning needs to be done in the AAA server only and all
> the ser servers in the farm will obey.
> I would suggest a "farm" architecture for ser, where all nodes are
> configured the same and then you could dynamically introduce or take
> back some such servers depending on load. All this would be done by
> provisioning on the AAA.
>
> 4. Actually, 1. answered this. Load balancing is controlled in
> interrogating sip-proxies and in AAA servers and not by "patches" like
> DNS SRV (please don't flame me on this one ;-) ).
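
As a rough illustration of point 2 above (a routing node that sends each
request to the AAA server holding that user), here is a minimal Python
sketch. It is not Diameter code and is not taken from any implementation:
the server names, the aaa_for_user function and the hash-modulo scheme are
assumptions made only for illustration.

```python
import hashlib

# Hypothetical AAA back-ends; each one holds a subset of the users.
AAA_SERVERS = ["aaa1.example.net", "aaa2.example.net", "aaa3.example.net"]


def aaa_for_user(user_uri, servers=AAA_SERVERS):
    """Pick the AAA server responsible for a user.

    Assumed scheme: hash the (lower-cased) URI and take it modulo the
    number of servers, so every routing node makes the same choice.
    """
    digest = hashlib.sha1(user_uri.lower().encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]
```

Note that plain hash-modulo moves most users to a different server when the
number of servers changes, so a real routing node would more likely use a
lookup table or consistent hashing.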
>
> I personally consider that ser should be limited to what it does best
> - SIP - and other components should do provisioning, replication and
> user tracking. Anyway, user location is more of an AAA job, don't you
> think? This is my personal opinion and I might be wrong :)  . It is
> based on the patterns of the 3GPP and diameter-sip-app. Although I
> have reached the testing phase and it works, I have not yet performed
> large stress tests to see how it works because my target is to build
> a working 3GPP IMS, not a diameter-sip application as the draft says.
> So if there is interest for this, please let me know.
>
> Dragos
>
>
>
> Greger V. Teigre wrote:
>
>> Let me try to sort out the issues we are discussing here, so we at
>> least can see if we agree to the goals:
>>
>> 1. Reliability and scalability issues
>> -----------
>> Scenario: Tens of thousands or hundreds of thousands of users require
>> a reliable and scalable infrastructure
>> Goal: Find a good reference scenario for building a reliable and
>> scalable infrastructure of ser servers.
>> Problems: Everybody tries to solve this their own way and most keep
>> their solutions secret because it is a competitive advantage not
>> to tell anybody.
>>
>> **** I think that your solution to #1 will dominate the discussions
>> on the issues below.  Using RADIUS (and possibly LDAP back-ends) for
>> everything but usrloc is one solution that seems to be Juha's
>> scenario (and mine). Andreas uses mysql for subscriber info as well.
>> Do you have one server center with load balancing or
>> geographically-distributed server centers?  It will influence your
>> needs. So, let's sort out our scenarios before we discuss what is
>> the "best " solution.
>>
>> 2. Usrloc replication across standalone ser servers.
>> ------------
>> Scenario: Independent servers with independent databases run either
>> with some sort of load balancing or DNS SRV.
>> Goal: Make sure that all ser servers have updated usrloc information,
>> so each can handle any SIP message.
>> Problems: Distribute REGISTER messages to all servers; Make sure that
>> server unavailability does not corrupt the usrloc DB state
>>
>> *** We all have this issue.  It is my understanding that t_replicate:
>> a) uses SIP messages b) uses a best-effort algorithm (haven't looked
>> at the code...) c) can be used between several servers, but when you
>> introduce a new server, you need to change each server's ser.cfg
>> My suggestion for a simple solution based on the discussion so far:
>> Extend t_replicate with a guaranteed mode of replication.  mysql can
>> be used as a queue with replication states (or even a text-file for
>> that matter).  Whether SIP messages are used or TCP/IP-based FIFO is
>> really based on an estimation of network traffic.
>> Result: The least work and the code is an integrated part of ser.
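
A rough sketch of the "guaranteed mode" idea above: every update is queued
with a replication state per peer, and an entry is removed only when all
peers have acknowledged it. This is plain Python for illustration, not ser
code; the ReplicationQueue class and the send callback are hypothetical, and
an in-memory dict stands in for the mysql table or text file.

```python
import collections

PENDING, DONE = "pending", "done"


class ReplicationQueue:
    """Queue of usrloc updates with one replication state per peer."""

    def __init__(self, peers):
        self.peers = list(peers)
        # entry id -> {"contact": ..., "state": {peer: PENDING or DONE}}
        self.entries = collections.OrderedDict()
        self.next_id = 1

    def enqueue(self, contact):
        """Record an update that must reach every peer."""
        entry_id = self.next_id
        self.next_id += 1
        self.entries[entry_id] = {
            "contact": contact,
            "state": {peer: PENDING for peer in self.peers},
        }
        return entry_id

    def flush(self, send):
        """Retry every pending (entry, peer) pair.

        send(peer, contact) must return True once the peer has
        acknowledged the update. Returns the number of entries that
        still have at least one pending peer.
        """
        for entry_id in list(self.entries):
            entry = self.entries[entry_id]
            for peer, state in entry["state"].items():
                if state == PENDING and send(peer, entry["contact"]):
                    entry["state"][peer] = DONE
            # Drop the entry only once every peer has acknowledged it.
            if all(s == DONE for s in entry["state"].values()):
                del self.entries[entry_id]
        return len(self.entries)
```

Calling flush() from a timer means a peer that was down simply receives its
pending updates on a later pass, instead of being dropped after a few
retransmissions.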
>>
>> 3. Network-based provisioning of new users, aliases, etc
>> ------------
>> Scenario: One server needs to be provisioned from a web server or
>> process running on a remote server
>> Goal: Allow ser to receive TCP/IP based provisioning messages
>> Problems: ser's FIFO does not have a TCP/IP interface
>>
>> *** I think this is an extension to ser that would benefit many
>> people.  I also believe that a provisioning interface should be SOAP
>> based due to the sheer number of projects that probably will use the
>> interface for provisioning.
>>
>> 4. Replication of user database, aliases, etc across standalone ser
>> servers.
>> ------------
>> Scenario: Independent servers with independent databases run either
>> with some sort of load balancing or DNS SRV and subscriber
>> information is stored in sql tables
>> Goal: Make sure that each server recognizes all subscribers,
>> aliases, etc
>> Problems: Make sure that all servers have updated database tables
>>
>> *** RADIUS/LDAP solutions do not need to do this as RADIUS servers,
>> LDAP replication etc take care of both reliability and scalability.
>> However, I think ser support more than one RADIUS server. A defined
>> secondary server would be useful.
>> With SQL-based scenarios however, I see three natural solutions:
>> a) Rely on sql-based replication. Without checking this, I believe
>> ser always write such FIFO commands directly to the DB, so sql-level
>> replication should work
>> b) Extend ser's FIFO to also have a replication configuration, i.e.
>> in ser.cfg you define the peer servers that need replication. If the
>> extension to t_replicate uses TCP/IP based FIFO, the code can be
>> re-used.
>> c) Implement provisioning systems so that each ser server
>> is updated through the TCP/IP-based FIFO
>>
>> To be honest, I'm not sure if I see the value of such an effort (b).
>> Also, as usage of sql for storage is just one of several modes, it is
>> probably not right to integrate such code into FIFO.  a) and c) are
>> more natural choices.
>>
>> --------------------------------------------------------------------------
>>
>>
>> My summary and conclusions:
>> - I believe a TCP/IP-based FIFO (#3) is a core feature that we all
>> can agree would be useful and natural to implement;
>> - I don't know the details of how t_replicate functions, but Juha's
>> opinion is that it takes care of all the issues Andreas points out
>> except one: The amount of traffic SIP messages create.  I will not
>> interfere with this discussion, of course, if t_replicate can handle
>> unavailable servers etc, that would be great. Anyway, a reliable
>> replication of usrloc is essential to a carrier-grade architecture
>> - After this discussion, I now believe we should keep provisioning
>> (#3) and the two types of replication (#2 and #4) separate also in
>> implementation.
>>
>> Well, my attempt at sorting out issues.  Any success, you think? ;-)
>>
>> g-)
>>
>> Andreas Granig wrote:
>>
>>> Juha Heinanen wrote:
>>>
>>>> you can have any number of proxies participating in replication.
>>>
>>>
>>> What method are you thinking of? t_replicate() reports
>>>
>>>   ERROR: t_newtran: transaction already in process 0x4054d5ec
>>>
>>> if you call it twice, like
>>>
>>>   t_replicate("foohost", "5060");
>>>   t_replicate("barhost", "5060");
>>>
>>> Or do you mean something like
>>>
>>>   forward_tcp("foohost", "5060");
>>>   forward_tcp("barhost", "5060");
>>>
>>> and on the receiving hosts
>>>
>>>   if(/* register from replicating host */)
>>>     save_noreply("location");
>>>
>>> which would be a possibility, indeed...
>>>
>>>> > Beside that the domain tables (location etc) get out of synch if one of
>>>> > the SERs is down for a moment, because retransmission is only tried a
>>>> > few times.
>>>>
>>>> i don't see why this needs to be the case with db mode 2.  when ser
>>>> comes back up, it updates its location table from database.
>>>
>>>
>>> I think mode 1 (Write-Through) should be used because the SER could
>>> start up while some of the contacts aren't flushed to DB yet.
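
To make the db mode point concrete, here is a toy Python model of
write-through versus write-back storage of contacts. It is not ser's usrloc
code; the LocationCache class and the dict standing in for the location
table are assumptions for illustration only.

```python
class LocationCache:
    """In-memory contacts backed by a shared table (a dict stands in for the DB)."""

    def __init__(self, db, write_through):
        self.db = db
        self.write_through = write_through
        self.memory = {}
        self.dirty = set()  # users saved in memory but not yet in the DB

    def save(self, user, contact):
        self.memory[user] = contact
        if self.write_through:
            # Write-through: the DB is updated as part of every save.
            self.db[user] = contact
        else:
            # Write-back: the DB is updated later, when flush() runs.
            self.dirty.add(user)

    def flush(self):
        """Write all contacts that are still only in memory to the DB."""
        for user in self.dirty:
            self.db[user] = self.memory[user]
        self.dirty.clear()
```

With write-back, a host that reloads from the database before flush() has
run will miss the newest contacts, which is the concern raised above.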
>>>
>>> However, how would you set up your database connections here? Using
>>> a common usrloc database for all hosts (-> single point of failure)?
>>> This is the main point. _How_ do you share the contacts as reliably
>>> as possible so that a host can go down for a while without getting
>>> out of synch regarding the contacts?
>>>
>>> Andy
>>>
>>> _______________________________________________
>>> Serusers mailing list
>>> serusers at lists.iptel.org
>>> http://lists.iptel.org/mailman/listinfo/serusers