[OpenSER-Devel] SF.net SVN: openser: [4242] trunk
Dan Pascu
dan at ag-projects.com
Mon May 26 21:47:40 CEST 2008
On Monday 26 May 2008, Juha Heinanen wrote:
> Dan Pascu writes:
> > When a proxy dies, the others will take over
> > the resources managed by the dead proxy and redistribute them among
> > themselves, thus the subscribers of the dead proxy are automatically
> > moved to a new proxy and become available again when they send their
> > next registration. Even before they re-register with the new home
> > proxy, they can still make outgoing calls immediately after their
> > home proxy failure, because them being mapped to a new proxy is
> > instantaneous once a dead proxy is removed from the network.
>
> dan,
>
> there is one more issue with this. if a UA makes or receives a call
> via one of the proxies and this proxy dies before the call ends, how
> can the UA be able to deliver or receive a bye, when the proxy in the
> route set is gone?
It can't. This is accepted as part of the design. Once any of the proxies
that added itself to the route set via Record-Route is gone, further
in-dialog messages cannot be sent. This is an issue with any design, not
only the distributed design I mentioned.
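
A minimal sketch of this constraint, assuming hypothetical proxy names and a
simple set of live proxies (neither comes from the thread). The route set is
fixed when the dialog is established, so an in-dialog request such as BYE
gets through only while every proxy in that set is still up:

```python
def can_deliver_in_dialog(route_set, alive_proxies):
    """Return True only if every proxy in the dialog's route set is alive."""
    return all(proxy in alive_proxies for proxy in route_set)


# Hypothetical names; the route set is learned from Record-Route at call setup.
route_set = ["proxy-a.example.net", "proxy-b.example.net"]
alive = {"proxy-a.example.net", "proxy-b.example.net", "proxy-c.example.net"}

before_failure = can_deliver_in_dialog(route_set, alive)

# proxy-b dies mid-call; proxy-c being alive does not help, because the
# route set of an established dialog does not change.
alive.discard("proxy-b.example.net")
after_failure = can_deliver_in_dialog(route_set, alive)
```

Here `before_failure` is true and `after_failure` is false, whichever other
proxies remain in the network.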
> this is not an issue with loadbalancer, because its
> address never changes even if is replaced by another one.
I'd say you have the same issue with the loadbalancer scheme as well. You
do not have it with the loadbalancer box itself, because it's clustered,
but you do have it with any of the proxies behind the loadbalancer. Unless
you double every box in your system, you have the same problem when one of
the proxies behind the loadbalancer dies. The distributed scheme can
protect itself against this problem too, by also doubling every proxy in
the network. However, this is not justified, as such failures are rare and
the disruption is very small.
Moreover, you cannot guarantee 100% resilience even if you make every
proxy a cluster, because there is a small time window when the master is
dead but the slave has not yet taken over. This window can be between 30
and 120 seconds depending on the cluster configuration, and during that
interval all the in-dialog messages are lost.
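
The takeover window can be sketched as follows. This is a simplified model
under stated assumptions: the timestamps are invented, each message is sent
once, and SIP retransmissions are ignored. A message counts as lost if it is
sent after the master died and before the slave has taken over:

```python
def lost_in_dialog_messages(send_times, master_died_at, takeover_delay):
    """Return the send times that fall inside the failover window.

    The window is [master_died_at, master_died_at + takeover_delay):
    the master is dead and the slave is not yet serving.
    """
    takeover_at = master_died_at + takeover_delay
    return [t for t in send_times if master_died_at <= t < takeover_at]


# Hypothetical send times in seconds; the master dies at t=10.
send_times = [5, 20, 45, 90, 130]

lost_fast = lost_in_dialog_messages(send_times, 10, 30)    # 30 s takeover
lost_slow = lost_in_dialog_messages(send_times, 10, 120)   # 120 s takeover
```

With these made-up numbers, one message is lost for the 30 second takeover
and three for the 120 second one.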
Also, there are cases where a clustered proxy loses network connectivity
completely, so even if the master is alive, it is not reachable and
in-dialog messages will be lost. The cluster doesn't help at all in this
case.
This issue is not new to the distributed design. The purpose of the
distributed design is to enhance scalability and eliminate single points
of failure and redundant boxes, at the small expense of losing some
resilience when catastrophic failures happen, while giving the network
the ability to recover quickly without human intervention.
--
Dan