Re: [Users] Best practice for DNS failover using OpenSER?

T.R. Missner trmissner at bandwidth.com
Fri Jan 12 17:46:32 CET 2007


We are currently working on a bit of a hackish way to honor SRV records.
Please poke holes as appropriate.

We will run a separate process that looks up an SRV record and then generates
local A records based on the results. Using the dispatcher module in a
similar fashion to what Christian describes below, we then dispatch to the
pre-agreed domain, resolved locally.  When the SRV results change, we
re-resolve the pre-agreed domain to reflect the change.

Example:

The SRV lookup returns 2 targets, each resolving to an A record:

server1.carrier.com priority 1 resolves to 4.5.6.7
server2.carrier.com priority 2 resolves to 5.6.7.8

We then build a fake local domain to match
server1.carrier.fake 4.5.6.7
server2.carrier.fake 5.6.7.8

The dispatcher list looks like this:

1 server1.carrier.fake
1 server2.carrier.fake

Now openser will always pick the first server and fail over to the second
server using the mechanism Christian describes.

At some predefined interval our external process will check the SRV record
of the carrier.  Let's say it has now changed so that server1 is priority 2
and server2 is priority 1 (reversed from before). The external process
updates our local DNS to resolve server1.carrier.fake to 5.6.7.8 and
server2.carrier.fake to 4.5.6.7.
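
A minimal sketch of that remapping step, in Python. It assumes the SRV
lookup has already been done and reduced to (priority, ip) pairs; the
function name, the "carrier.fake" default and the 60 second TTL are
made up for illustration, and the output uses tinydns-data "+" lines
(A record only). It is not our actual process, just the idea:

```python
def build_tinydns_lines(srv_results, fake_domain="carrier.fake", ttl=60):
    """Map SRV results onto fixed fake names, ordered by SRV priority.

    srv_results: list of (priority, ip) tuples, already resolved (assumption).
    Returns tinydns-data '+fqdn:ip:ttl' lines; slot 1 is the lowest priority
    number, i.e. the most preferred target.
    """
    ordered = sorted(srv_results, key=lambda rec: rec[0])
    return [
        f"+server{slot}.{fake_domain}:{ip}:{ttl}"
        for slot, (_priority, ip) in enumerate(ordered, start=1)
    ]


# The example from this mail: priorities before and after the carrier's change.
before = build_tinydns_lines([(1, "4.5.6.7"), (2, "5.6.7.8")])
after = build_tinydns_lines([(2, "4.5.6.7"), (1, "5.6.7.8")])
```

The lines would still have to be written into the tinydns data file and
compiled with tinydns-data; that part is left out here.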


In this manner, using an external process and local DNS (in our case we use
DJB's tinydns), we are honoring dynamic priority changes in SRV records with
openser.

This is still a work in progress. Will update once we have it up and
running.

A couple of caveats: we assume the number of records returned from the SRV
lookup will be consistent, since we have to build the same number of fake
domains. We also assume we are dealing with priority routing, not round
robin, though round robin would work too; only the dispatcher algorithm
would need to change.
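
To make the first caveat concrete, here is a rough Python sketch of a
guard the external process could apply before touching local DNS. The
function name and the "keep the last known mapping when the count
differs" policy are assumptions for illustration only; we have not
settled on how to handle that case:

```python
def plan_update(slot_names, srv_results, current):
    """Return the slot -> IP mapping to publish in local DNS.

    slot_names:  fake names listed in the dispatcher file, in failover order.
    srv_results: list of (priority, ip) tuples from the latest SRV lookup.
    current:     mapping currently published.

    If the SRV answer does not have exactly one record per fake name, the
    current mapping is returned unchanged (hypothetical policy).
    """
    if len(srv_results) != len(slot_names):
        return dict(current)
    ordered = sorted(srv_results, key=lambda rec: rec[0])
    return {name: ip for name, (_priority, ip) in zip(slot_names, ordered)}


slots = ["server1.carrier.fake", "server2.carrier.fake"]
published = {"server1.carrier.fake": "4.5.6.7", "server2.carrier.fake": "5.6.7.8"}

swapped = plan_update(slots, [(2, "4.5.6.7"), (1, "5.6.7.8")], published)
unchanged = plan_update(slots, [(1, "4.5.6.7")], published)
```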

T.R.


On 1/11/07 11:36 AM, "Christian Schlatter" <cs at unc.edu> wrote:

> Staffan,
> 
> Kerker Staffan wrote:
> ...
>> Now, if I disable one of the Gateways, I hang every second call. OpenSER
>> does not try the second A record address if the first doesn't answer. How
>> can I solve this? Shouldn't OpenSER fail over to the second A record listed
>> in the NAPTR => SRV resolving? Or will OpenSER continue to resend all SIP
>> INVITES until timers fire? Would it help if the proxy received an ICMP
>> port/destination unreachable from the network? Is there any way to get
>> around this? In the other direction, from POTS to sip, the PGW2200 nicely
>> switches over to the second of my two OpenSER servers if I shut one of them
>> down. These servers have the same DNS entries (but for another SIP domain,
>> NAPTR => SRV => 2x A record).
> 
> Yes, OpenSER, or for that matter every transaction-stateful proxy, should
> do RFC 3263 based fail-over. But as you can imagine this is pretty
> complex to implement, which is why openser does not support it yet; it
> is listed on the development roadmap. The newest release of SER does
> support DNS failover.
> 
> But it is possible to implement failover with OpenSER, you just have to
> configure it manually on the proxy. And you have to adjust the SIP
> session timers of the tm module to achieve fast failovers.
> 
> Here is an overview of how I implemented failover with OpenSER (there
> are other ways to do that):
> 
> I use the dispatcher module with a non-random dispatcher algorithm to
> get deterministic failover.
> 
> dispatcher config file could look like:
> 
> 1 sip:gw-1.example.com
> 1 sip:gw-2.example.com
> 
> In the openser config file, I call the ds_select_domain() function just
> before t_relay. And in the failure_route I then use ds_next_domain() to
> select the next target from the dispatcher config file.
> 
> In order to get short failover times one has to adjust fr_timer for
> INVITE transactions. For INVITE transactions, fr_timer is the max time
> openser waits for a first reply from the downstream SIP entity. As soon as
> openser receives a provisional reply, it will use fr_inv_timer as the
> final response timer. By default fr_timer is 30 seconds, so openser would
> wait about 30 seconds before trying the next target.
> 
> An openser config that does failover between gw-1.example.com and
> gw-2.example.com for gw.example.com could look like:
> 
> 
> modparam("tm", "fr_timer_avp", "i:24") # AVP to set fr_timer
> modparam("avpops","avp_aliases","fr_timer=i:24")
> 
> # failover support --> store dests in avp value
> modparam("dispatcher", "flags", 2)
> 
> 
> route[0] {
> ...
>    if (is_method("INVITE") && uri=~"sip:.*@gw.example.com") {
> 
>      # replace domain part with first dispatcher target of group 1
>      ds_select_domain("1", "9"); # alg 9 --> use first, second, etc
> 
>      # set fr_timer to 3 seconds (3 seconds for failover)
>      avp_write("i:3", "$avp(fr_timer)");
> 
>      t_on_failure("1");
>      t_relay();
>      exit;
>    }
> ...
> }
> 
> failure_route[1] {
> ...
>    # status is 408 if openser session timer fires
>    if (t_check_status("408")) {
>      # replace domain part with next dispatcher target
>      if (ds_next_domain()) {
>        t_relay();
>        exit;
>      }
>    }
>    ...
> }
> 
> 
> 
> - Christian
> 
>> 
>> I would love some best practice implementation clues regarding OpenSER and
>> multiple GW fail over, if any of you have such knowledge or experience.
>> 
>> Best regards,
>> /Staffan
>> 
>> ---
>> Staffan Kerker, 
>> Saab Communication
>> Ljungadalsgatan 2, 35180 Växjö, Sweden
>> 
>> p. +46 470 42185
>> c. +46 705 391365
>> m. staffan.kerker at saabgroup.com
>> w. http://www.saabgroup.com
>> 
>> _______________________________________________
>> Users mailing list
>> Users at openser.org
>> http://openser.org/cgi-bin/mailman/listinfo/users
> 
> 




