On Fri, Feb 27, 2015 at 9:59 AM, Olle E. Johansson <oej(a)edvina.net> wrote:
Actually, it's the latter. Our current high-availability setup relies on
anycast. And with TCP, this would mean a huge change in our setup.
That is in fact an interesting topic. Can you please elaborate a bit more
on this, as I would like to see what we can do in the software to make
things easier.
I guess I can. Maybe Krischan's presentation from last year's Kamailio
World will explain a lot:
http://www.kamailio.org/events/2014-KamailioWorld/day2/17-Krischan.Udelhove…
The setup on slide 9 is basically what our setup still looks like. We have a
few loadbalancers in different data centers sharing the same IP. And
depending on where the customer hits our network and how the call gets
routed, the loadbalancer handling the request and the one handling the
reply don't have to be the same. That's why UDP is so comfortable to work
with. We don't keep any state, we don't care about the number of open TCP
sessions on one machine, and if one machine goes offline, OSPF makes sure
the IP is still available.
When using TCP, we would have to make sure every request leaves our
network on the same machine where it came in, because that's the machine
where the TCP session is open. That would mean having a distinct OSPF
weight for each IP, so all packets always get routed to the same machine.
And that would mean we would probably have to do a DNS round robin thing
to loadbalance the incoming traffic, since we wouldn't want to have all
traffic coming in on only one machine.
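
To make the difference concrete, here is a small toy model in Python. It is
only a sketch of the reasoning above, not of our actual setup: the class, the
node names and the client address are made up for illustration, and it
assumes a message over TCP can only go out on a connection that is already
open on that node.

```python
from dataclasses import dataclass, field


@dataclass
class LoadBalancer:
    """Hypothetical anycast node; all nodes share the same service IP."""
    name: str
    # Client addresses that currently have an open TCP session on THIS node.
    tcp_connections: set = field(default_factory=set)

    def can_send(self, client: str, transport: str) -> bool:
        # UDP is connectionless: any node behind the shared IP can send.
        if transport == "udp":
            return True
        # TCP (assumption: existing connections only): just the node that
        # holds the client's session can send.
        return client in self.tcp_connections


client = "198.51.100.7:5060"  # documentation address, purely illustrative
lb_a = LoadBalancer("lb-a")
lb_b = LoadBalancer("lb-b")

# The request came in on lb-a, so the TCP session lives there.
lb_a.tcp_connections.add(client)

print(lb_b.can_send(client, "udp"))  # any node works for UDP
print(lb_b.can_send(client, "tcp"))  # lb-b has no session for this client
```

With UDP it does not matter which node the routing picks for the reply. With
TCP only lb-a qualifies, which is why the routing would have to pin each IP
to one machine.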
I had a similar discussion a while ago and it seems like failover handling
is easier in UDP and we will need to fix this in order to be able to
migrate more users to TLS.
Yes, it for sure is a failover thing. We have a TCP and TLS server, too,
but this is not HA and only for test purposes right now. We probably would
use something with DNS RR, but aren't sure about how many clients we could
handle on one TCP or TLS machine. In our tests, we had about 16k sessions
open at the same time, but that would mean a lot of machines for all our
customers.
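
As a rough back-of-the-envelope sketch of that sizing question: the 16,000
sessions per machine is the figure from our tests (what we had open, not a
proven limit), while the total client count below is a purely hypothetical
number chosen for illustration. The function name is made up as well.

```python
def machines_needed(total_clients: int, sessions_per_machine: int = 16_000) -> int:
    """Number of machines if every client holds one TCP/TLS session.

    Uses integer ceiling division, so a partially filled machine
    still counts as a whole machine.
    """
    if total_clients < 0 or sessions_per_machine <= 0:
        raise ValueError("need total_clients >= 0 and sessions_per_machine > 0")
    return (total_clients + sessions_per_machine - 1) // sessions_per_machine


# Hypothetical customer base, NOT a real figure from our network.
hypothetical_clients = 1_000_000
print(machines_needed(hypothetical_clients))
```

This ignores redundancy, so an HA setup would need spare capacity on top of
whatever number comes out.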
I haven't tested how different clients behave with regard to TCP if the
server closes a connection.
From what I saw with snom phones connected to the TLS machine, when we
restart the Kamailio process, the clients aren't reachable for inbound
calls until the next reregistration. They establish a new TCP connection
for an outbound call, but this connection can't be used for inbound calls.
After the next reregistration, everything is okay again.
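
One way to picture this behaviour is the toy model below. It is an
assumption about why it happens, not a description of Kamailio internals:
it supposes that a registration is tied to the connection it arrived on and
that the stored bindings survive the restart while the connections do not.
All names are hypothetical.

```python
class Registrar:
    """Toy registrar that ties each binding to one connection id."""

    def __init__(self):
        self.next_conn_id = 1
        self.open_conns = {}  # conn_id -> client
        self.bindings = {}    # client -> conn_id seen at REGISTER time

    def connect(self, client):
        # A client opens a new TCP connection and gets a fresh id.
        conn_id = self.next_conn_id
        self.next_conn_id += 1
        self.open_conns[conn_id] = client
        return conn_id

    def register(self, client, conn_id):
        # The binding remembers which connection the REGISTER came in on.
        self.bindings[client] = conn_id

    def restart(self):
        # Assumption: all connections are dropped, bindings are kept.
        self.open_conns.clear()

    def can_deliver_inbound(self, client):
        # Inbound calls only work if the remembered connection is still open.
        conn_id = self.bindings.get(client)
        return conn_id is not None and self.open_conns.get(conn_id) == client


reg = Registrar()
first = reg.connect("phone")
reg.register("phone", first)
before_restart = reg.can_deliver_inbound("phone")

reg.restart()
second = reg.connect("phone")  # new connection opened for an outbound call
after_outbound = reg.can_deliver_inbound("phone")

reg.register("phone", second)  # next reregistration
after_reregister = reg.can_deliver_inbound("phone")

print(before_restart, after_outbound, after_reregister)
```

In this model the new connection from the outbound call exists, but the
binding still points at the old one, so inbound delivery only recovers once
the phone registers again.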
Sebastian