Hi Daniel,
On 03/15/2016 04:04 AM, Daniel-Constantin Mierla wrote:
This was discussed a lot during the early days of SER, and many times since over the past 15 years.
I see!
Sorry, perhaps I should have dug deeper into list archives and folklore before suggesting it.
Persisting transactions is not easy, because of their complex relations with the timers for retransmissions over UDP, and also with connections for TCP/TLS. Each transaction has a lot of state, particularly state bound to each outgoing branch, and the branches can be in different phases.
I definitely had in mind a minimalistic solution which has few dependencies on other elements of state -- and certainly, in my mind, it was implicitly UDP-only. I was thinking of just dumping the current timer values and restoring them on the assumption that the passage of time was "stopped" while the server was down, or, if these timers are kept with reference to gettimeofday/wall-clock time, then, on the contrary, letting the downtime elapse and eat into the timer allowance. In other words, keep it simple, lazy.
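To make the two restore options concrete, here is a rough sketch in Python. It is not Kamailio or TM code; the `SavedTimer` structure, the `restore_remaining` function and the timer label are all made up for illustration, and it assumes a timer can be reduced to "seconds remaining at snapshot time", which may well be too simple for real transaction state.

```python
from dataclasses import dataclass


@dataclass
class SavedTimer:
    """Snapshot of one timer (hypothetical structure, not a TM type)."""
    name: str
    remaining: float  # seconds left when the snapshot was taken
    saved_at: float   # wall-clock time of the snapshot, in seconds


def restore_remaining(timer: SavedTimer, now: float, freeze: bool) -> float:
    """Return how many seconds a restored timer should still run.

    freeze=True  : pretend time was "stopped" while the server was down.
    freeze=False : let the downtime eat into the timer's allowance.
    """
    if freeze:
        return timer.remaining
    downtime = max(0.0, now - timer.saved_at)
    # A timer whose allowance was used up during the outage fires at once.
    return max(0.0, timer.remaining - downtime)


if __name__ == "__main__":
    # Illustrative values: 30 s left at snapshot, server back 12 s later.
    t = SavedTimer(name="final_response", remaining=30.0, saved_at=1000.0)
    print(restore_remaining(t, now=1012.0, freeze=True))
    print(restore_remaining(t, now=1012.0, freeze=False))
```

With the "frozen" policy the peer may already have given up by the time the timer fires; with the wall-clock policy a long outage simply expires everything on restart.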
However, I don't know what other dependencies it would have; are branches not a part of TM state as well? Or is it just a question of far too many data structures to dump/restore?
This is a matter of investing a lot of resources to get a solution for 0.01% of cases or less, most of which sort themselves out fairly nicely.
Fair enough. I think you're probably right on that, I just wondered if maybe there was a lower-hanging fruit option.
I do think you're right that the best mitigation for this problem is probably to have two Kamailio servers and an externally settable (i.e. via MI/RPC) $shv that allows one to take it "out of service", i.e. reject all new requests. Let the calls bleed off, then restart it.
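Roughly, the drain logic I have in mind looks like the following. This is a plain Python model of the idea, not Kamailio configuration; `DrainGate` and its methods are hypothetical names, the flag stands in for the shared variable, and the call counter assumes something (e.g. dialog tracking) tells us when calls start and end.

```python
class DrainGate:
    """Models an 'in service' flag plus a count of active calls (hypothetical)."""

    def __init__(self) -> None:
        self.in_service = True
        self.active_calls = 0

    def set_in_service(self, value: bool) -> None:
        # Stand-in for flipping the shared variable from an external command.
        self.in_service = value

    def on_new_call(self) -> int:
        """Return a SIP-style status: 503 while draining, 200 when accepted."""
        if not self.in_service:
            return 503
        self.active_calls += 1
        return 200

    def on_call_end(self) -> None:
        # Existing calls are allowed to finish normally while draining.
        self.active_calls = max(0, self.active_calls - 1)

    def safe_to_restart(self) -> bool:
        # Restart only once the node is out of service and fully bled off.
        return not self.in_service and self.active_calls == 0


if __name__ == "__main__":
    gate = DrainGate()
    print(gate.on_new_call())      # accepted while in service
    gate.set_in_service(False)
    print(gate.on_new_call())      # rejected while draining
    gate.on_call_end()
    print(gate.safe_to_restart())
```

The second server picks up whatever gets rejected, so nothing in flight has to survive the restart.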
-- Alex