[SR-Users] Kamailio failover - dialog (transaction) that is available on different proxies

Wed Aug 30 15:02:00 CEST 2017

So guys, I have good news.
I got it working as expected, but with some changes to the previous scheme.

I set NAPTR records for tcp/udp:
dig naptr sipnew.heremydomain.ua @ns2 +short
10 100 "S" "SIP+D2U" "" _sip._udp.sip.heremydomain.ua.
10 100 "S" "SIP+D2T" "" _sip._tcp.sip.heremydomain.ua.

I set SRV records for both types of transport:
dig srv _sip._udp.sip.heremydomain.ua +short @ns2
10 100 5060 kamailio1.heremydomain.ua.
10 100 5060 kamailio2.heremydomain.ua.

dig srv _sip._tcp.sip.heremydomain.ua +short @ns2
10 100 5060 kamailio1.heremydomain.ua.
10 100 5060 kamailio2.heremydomain.ua.

And of course A records for both fqdn names:
dig kamailio1.heremydomain.ua @ns2 +short
10.10.10.1
dig kamailio2.heremydomain.ua @ns2 +short
10.10.10.2
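
For illustration, the lookup chain a client follows with these records
(NAPTR, then SRV, then A) can be sketched in Python. This is only a sketch
under assumptions: the records are hard-coded from the dig output above
instead of being queried from DNS, the helper name resolve_sip_targets is
made up, and the weighted random choice among SRV records of equal priority
is left out.

```python
# Records copied from the dig output above (hard-coded; no real DNS queries).
NAPTR = {
    "sipnew.heremydomain.ua": [
        # (order, preference, flags, service, replacement)
        (10, 100, "S", "SIP+D2U", "_sip._udp.sip.heremydomain.ua."),
        (10, 100, "S", "SIP+D2T", "_sip._tcp.sip.heremydomain.ua."),
    ],
}
SRV_TARGETS = [
    # (priority, weight, port, target)
    (10, 100, 5060, "kamailio1.heremydomain.ua."),
    (10, 100, 5060, "kamailio2.heremydomain.ua."),
]
SRV = {
    "_sip._udp.sip.heremydomain.ua.": SRV_TARGETS,
    "_sip._tcp.sip.heremydomain.ua.": SRV_TARGETS,
}
A = {
    "kamailio1.heremydomain.ua.": ["10.10.10.1"],
    "kamailio2.heremydomain.ua.": ["10.10.10.2"],
}
SERVICE_FOR_TRANSPORT = {"udp": "SIP+D2U", "tcp": "SIP+D2T"}


def resolve_sip_targets(domain, transport):
    """Return [(ip, port), ...] for domain/transport via NAPTR -> SRV -> A."""
    service = SERVICE_FOR_TRANSPORT[transport]
    targets = []
    # NAPTR records are tried in (order, preference) order.
    for _order, _pref, flags, svc, replacement in sorted(NAPTR.get(domain, [])):
        if flags != "S" or svc != service:
            continue
        # Lowest SRV priority value first; weight-based random pick is omitted.
        for _prio, _weight, port, host in sorted(
            SRV.get(replacement, []), key=lambda rec: rec[0]
        ):
            for ip in A.get(host, []):
                targets.append((ip, port))
    return targets


if __name__ == "__main__":
    print(resolve_sip_targets("sipnew.heremydomain.ua", "tcp"))
```

With equal priority and weight on both SRV records, a real client is
expected to spread new requests roughly evenly over the two targets.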

Here I have to say that I moved away from the idea of setting two IP
addresses for each FQDN.
As Sebastian has already said, clients indeed sent half of the requests to
the first A record result and half to the second one.

So I decided to record-route sip.heremydomain.ua instead of the individual
FQDNs of the kamailio servers (kamailio1.heremydomain.ua /
kamailio2.heremydomain.ua).
And the result was excellent.
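
A rough way to see why the shared name helps: the client takes the host
from the Route header it learned and resolves it again for in-dialog
requests. The Python sketch below compares the two choices using the records
from this post. The header strings, the helper names and the set of live
IPs are made up for illustration; a real client also applies SRV
priorities and its own DNS caching.

```python
import re

# SRV targets and A records as published above (hard-coded for the sketch).
SRV_TARGETS = {
    "sip.heremydomain.ua": [
        "kamailio1.heremydomain.ua",
        "kamailio2.heremydomain.ua",
    ],
}
A_RECORDS = {
    "kamailio1.heremydomain.ua": ["10.10.10.1"],
    "kamailio2.heremydomain.ua": ["10.10.10.2"],
}


def route_host(route_header):
    """Extract the host from a header like 'Route: <sip:host;lr>'.

    Only URIs without a user part are handled, which is enough here.
    """
    match = re.search(r"<sips?:([^;>:]+)", route_header)
    if match is None:
        raise ValueError("no SIP URI in header: %r" % route_header)
    return match.group(1)


def candidate_ips(host):
    """IPs the client may try: via SRV targets if any, else plain A records."""
    hosts = SRV_TARGETS.get(host, [host])
    return [ip for h in hosts for ip in A_RECORDS.get(h, [])]


def usable_ips(route_header, alive):
    """Candidate IPs for the Route host that are currently alive."""
    return [ip for ip in candidate_ips(route_host(route_header)) if ip in alive]


if __name__ == "__main__":
    alive = {"10.10.10.2"}  # kamailio1 is down
    print(usable_ips("Route: <sip:kamailio1.heremydomain.ua;lr>", alive))
    print(usable_ips("Route: <sip:sip.heremydomain.ua;lr>", alive))
```

With the per-host name the client has nowhere to go once kamailio1 is down;
with the shared name it still has kamailio2.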

The configuration of the kamailio proxies was as follows:
1. dialog module working with an external database;
2. registrar module working with an external database;
3. usrloc module working with an external database;
4. both kamailio instances insert data into the same db, and the config files
are the same on both (except for the listening interfaces);
5. a b2b user agent acts as routing server and media proxy at the same time;
6. location records are visible from both kamailio proxies.
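
Points 1-4 boil down to both proxies reading and writing the same dialog
state. A toy Python model of that idea follows. An in-memory SQLite database
stands in for the external database, and the table layout, class and method
names are invented for the sketch; this is not Kamailio's real schema or API.

```python
import sqlite3


class Proxy:
    """Toy stand-in for one kamailio instance using a shared database."""

    def __init__(self, name, db):
        self.name = name
        self.db = db

    def on_invite(self, call_id):
        # Store the new dialog in the shared table.
        self.db.execute(
            "INSERT INTO dialog (callid, created_by) VALUES (?, ?)",
            (call_id, self.name),
        )

    def on_bye(self, call_id):
        # Any proxy can end the dialog as long as it finds it in the table.
        cur = self.db.execute("DELETE FROM dialog WHERE callid = ?", (call_id,))
        return cur.rowcount == 1


def make_shared_db():
    """Create the (hypothetical, simplified) shared dialog table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE dialog (callid TEXT PRIMARY KEY, created_by TEXT)")
    return db


if __name__ == "__main__":
    db = make_shared_db()
    k1, k2 = Proxy("kamailio1", db), Proxy("kamailio2", db)
    k1.on_invite("abc123")
    # The dialog was created via kamailio1 and is ended via kamailio2.
    print(k2.on_bye("abc123"))
```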

Here is the sequence of the call (I used TCP transport to test the
environment):
1. The client sets up a dialog through the first kamailio, so the three-way
handshake is done and the media stream is up.
2. Then kamailio1 goes down, but the dialog stays up.
2.1 My client noticed that port 5060 on the first kamailio was unreachable
and immediately re-registered on the second one (it resolves
sip.heremydomain.ua and gets the kamailio2 host).
2.2 The media stream stays up because of the external media proxy.
3. The client decides to end the call and sends a BYE request to the
kamailio2 host.
3.1 Kamailio2 knows everything about this client and dialog, so it processes
the BYE as usual.
3.2 The session is torn down completely and properly.
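
Step 2.1 is essentially "try the targets in order and skip whatever does not
answer". A minimal Python sketch of that client-side choice is below. The
function names are made up, and tcp_reachable is just one possible check (a
plain TCP connect with a timeout); it is not how any particular SIP client
is known to do it.

```python
import socket


def tcp_reachable(target, timeout=1.0):
    """True if a TCP connection to (host, port) can be opened within timeout."""
    try:
        with socket.create_connection(target, timeout=timeout):
            return True
    except OSError:
        return False


def pick_target(targets, is_reachable=tcp_reachable):
    """Return the first reachable (host, port) from targets, or None."""
    for target in targets:
        if is_reachable(target):
            return target
    return None


if __name__ == "__main__":
    targets = [("10.10.10.1", 5060), ("10.10.10.2", 5060)]
    down = {("10.10.10.1", 5060)}  # pretend kamailio1 has failed
    # A fake reachability check keeps the example off the network.
    print(pick_target(targets, lambda t: t not in down))
```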

With UDP transport the result was a bit different, because in a few cases
the client, following the SRV records, did not notice that port 5060 was
unreachable on the first kamailio, so it tried to send the BYE to the first
host.
But if the first host was completely down, it worked in 100% of cases.
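
A possible explanation, not verified in this test: UDP has no connection
setup, so a client learns about the failure only from an ICMP error or from
the transaction timing out. With the default RFC 3261 timers (T1 = 0.5 s,
T2 = 4 s, Timer F = 64*T1), a non-INVITE request such as BYE that gets no
response is retransmitted for 32 seconds before the client gives up. The
Python sketch below computes that schedule; the function name is made up.

```python
def non_invite_retransmit_times(t1=0.5, t2=4.0):
    """Retransmission times (seconds after the first send) over UDP.

    Follows RFC 3261 section 17.1.2.2 for a non-INVITE request that gets no
    response at all: Timer E starts at T1 and doubles up to T2, and
    Timer F (64*T1) ends the transaction.
    Returns (list_of_retransmit_times, timer_f).
    """
    timer_f = 64 * t1
    times = []
    elapsed = t1
    interval = t1
    while elapsed < timer_f:
        times.append(elapsed)
        interval = min(2 * interval, t2)
        elapsed += interval
    return times, timer_f


if __name__ == "__main__":
    times, timer_f = non_invite_retransmit_times()
    print("retransmissions at:", times)
    print("transaction times out after", timer_f, "seconds")
```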

So I can say that this topology works for me.

2017-08-27 12:37 GMT+03:00 Donat Zenichev <donat.zenichev at gmail.com>:

> Hi Igor.
> Well, indeed I've already built solutions with heartbeat, but the main
> idea now is to minimize the time the SIP server is unavailable.
> Heartbeat needs time (depending on your configuration) to understand
> that the primary is down, e.g. you have the dead interval set to 10
> seconds, so if no activity is noticed during this period, the node is
> considered dead.
> But if you set this interval lower, e.g. 2-3 seconds, you run the
> risk of flaps (e.g. there is a delay on the IP route from the slave to the
> primary node, so the slave brings up the shared IP and starts to process
> calls, while the real master is in fact working fine and able to
> communicate).
>
> So as for heartbeat, I decided to use it only inside the same
> physical domain, where ucast/bcast packets will reach the other node
> without any problems.
>
> As for my actual question, I've moved further and now think the
> following scheme will work fine:
> 1. NAPTR records for every transport protocol (e.g.
> "_sip._udp.domain.org").
>
> 2. SRV records for every NAPTR record (e.g. kamailio1.domain.org,
> kamailio2.domain.org) with the same priority/weight for both of them, to
> balance half of the INVITEs to the first one and half to the second one.
>
> 3. A records for every domain name (e.g. kamailio1.domain.org - 10.0.0.1,
> 10.0.0.2, where the second one is actually kamailio2;
> and the same for the FQDN kamailio2.domain.org - 10.0.0.2, 10.0.0.1).
>
> So the sequence of dialog actions will be:
>
> 1. The INVITE from the UAC is balanced to kamailio1;
> 2. The dialog is established and the media stream is up;
> 3. Then kamailio1 goes down;
> 4. The BYE tries to reach the host that was set in the Record-Route header
> field (kamailio1), but kamailio1 (10.0.0.1) is down, so the BYE is sent to
> 10.0.0.2 (kamailio2), because 10.0.0.2 is assigned to the kamailio1 FQDN
> as its second IP.
> 5. The message is processed by kamailio2 thanks to the common
> dialog/usrloc db.
>
> I will try to set it up next week.
> If it works, I will write a short report here.
>
>
> 2017-08-25 17:26 GMT+03:00 Donat Zenichev <donat.zenichev at gmail.com>:
>
>> I've searched through the sr-users list and found a few discussions on
>> this subject.
>>
>> So the approach that (I think) is most relevant for kamailio failover is
>> a DNS-based solution: NAPTR -> SRV records.
>>
>> Like:
>>
>> NAPTR record:
>> "IN NAPTR 10 10 "S" "SIP+D2U" "" _sip._udp.domain.org"
>>
>> SRV records:
>> "_sip._udp.domain.org  SRV  10  1  5060  kamailio1.domain.org"
>> "_sip._udp.domain.org  SRV  10  1  5060  kamailio2.domain.org"
>>
>> A records:
>> "kamailio1   IN  A  10.0.0.1"
>> "kamailio2   IN  A  10.0.0.2"
>>
>> So each kamailio will add an rr with its own hostname, e.g.
>> kamailio1.domain.org
>> So the client will send in-dialog requests to the route with the FQDN
>> kamailio1.domain.org
>> And I can't put sip.domain.org into the rr, because every new request
>> (whether initial or in-dialog) would be sent to either of the kamailio
>> servers, but I need in-dialog requests to go to the same kamailio.
>>
>> So for failover I need more A records, like:
>> "kamailio1   IN  A  10.0.0.1"
>> "kamailio1   IN  A  10.0.0.2"
>> "kamailio2   IN  A  10.0.0.2"
>> "kamailio2   IN  A  10.0.0.1"
>>
>> And when kamailio1 goes down, the UAC will have two destination IPs to
>> send the request to: 10.0.0.1 and 10.0.0.2 (where the second one is
>> actually kamailio2).
>> So as a result I will have one database for the user location and dialog
>> modules, and load balancing based on the SRV priority/weight fields.
>>
>> And as failover, A records that make it possible to send requests first
>> to 10.0.0.1 and then to 10.0.0.2 (if the rr was bound to kamailio1).
>> And conversely, if the rr was set to kamailio2, the request first tries
>> 10.0.0.2 (kamailio2) and then 10.0.0.1 (kamailio1).
>>
>> Am I right at this point?
>>
>>
>>
>>
>> 2017-08-22 21:57 GMT+03:00 Donat Zenichev <donat.zenichev at gmail.com>:
>>
>>> Hi.
>>>
>>> I came up with the idea of setting up a stand with two kamailio and one
>>> b2bua server (for routing).
>>>
>>> The idea is failover for dialogs and transactions.
>>> So if one of the kamailio nodes is down, the other one is able to pick up
>>> the dialog and let the users end the session properly.
>>>
>>> To make it clearer, I will try to describe the idea step by step:
>>> 1. UAC invites UAS, they complete the three-way handshake, media stream
>>> is up.
>>> 2. The kamailio that processed this dialog goes down.
>>> 3. The users decide to end the session with a BYE, but the proxy that
>>> processed their three-way handshake is now down, so one of the UAs sends
>>> the BYE to the route destination containing the domain name (that both
>>> kamailio serve), and the BYE reaches the second kamailio so that it can
>>> end the dialog properly.
>>>  But, and this is a big but, this second kamailio has never known about
>>> this dialog, it doesn't hold any transactions for it and furthermore it
>>> doesn't know anything about this call-id.
>>>
>>> So the solution, I think, lies in db mode for user
>>> location (columns that contain call-ids, branches etc.).
>>> But I need to be sure that I'm on the right track.
>>>
>>> For the case where one IP is served by two nodes, I have two solutions:
>>>
>>> -First one. I want to create a heartbeat cluster with two kamailio nodes;
>>> they will have one shared IP address, so when one node goes down, the
>>> other one brings up the shared IP interface and performs the same actions
>>> as the master did.
>>>
>>> -Another method is to assign several IP addresses to one domain name (IP
>>> addresses of the different kamailio proxies).
>>>
>>>
>>> So the goal looks simple; if someone has ever done something like that,
>>> I will be glad to read your ideas.
>>>
>>> --
>>> --
>>> BR, Donat Zenichev
>>> Wnet VoIP team
>>> Tel:  +380(44) 5-900-808
>>> http://wnet.ua
>>>

-- 
-- 
BR, Donat Zenichev
Wnet VoIP team
Tel:  +380(44) 5-900-808
http://wnet.ua