Hi,
I've got a scenario like so:
UA A -----> Kamailio P ----> UA B
1. UA A initiates call through Kamailio P;
2. Dialog is established and confirmed, with Record-Route;
3. UA B sends reinvite #1 through P to A;
4. UA A sends 2xx reply;
5. UA B sends end-to-end ACK for reinvite #1 and almost simultaneously sends reinvite #2. The delta between the ACK for reinvite #1 and reinvite #2 on the wire is 3 ms.
So, the result (for all kinds of stochastic processing and userspace scheduling reasons) is that reinvite #2 is forwarded first, before the ACK. That leads to a 500 / 491 scenario at UA A.
The cause, as we all know, is that Kamailio's worker processes are loosely coupled, and incoming UDP datagrams are distributed among them directly by the kernel.
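
To make the race concrete, here is a minimal Python sketch. It is not Kamailio code: the two-worker model, the use of threads for brevity, the message names and the per-message delays are all assumptions made purely for illustration. Two independent workers pull from one shared receive queue, and whichever finishes processing first relays first, regardless of arrival order.

```python
import queue
import threading
import time


def relay_order(processing_delay):
    """Return the order in which two independent workers relay messages.

    processing_delay is a hypothetical per-message processing time in
    seconds, standing in for scheduling jitter and routing-script work.
    """
    inbox = queue.Queue()
    for msg in ("ACK #1", "re-INVITE #2"):  # arrival order on the wire
        inbox.put(msg)

    relayed = []
    relayed_guard = threading.Lock()

    def worker():
        msg = inbox.get()                   # each worker takes one message
        time.sleep(processing_delay[msg])   # simulated processing time
        with relayed_guard:
            relayed.append(msg)             # "forward" downstream

    workers = [threading.Thread(target=worker) for _ in range(2)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return relayed


if __name__ == "__main__":
    # The worker holding the ACK is slower, so re-INVITE #2 overtakes it.
    print(relay_order({"ACK #1": 0.05, "re-INVITE #2": 0.0}))
```

In the real case the gap is only 3 ms and the delays are not under anyone's control, which is what makes the outcome nondeterministic.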
Is there any general guidance on what to do with these scenarios? I looked at RFC 5407 § 3.1.4, which appears to describe a similar, but not identical scenario involving an initial INVITE and subsequent reinvite. As far as I can tell, the recommendation in that standard is "space the messaging out more in time".
Switching to TCP would presumably help, since any given flow would involve a single connection to a single worker process and the transport would guarantee ordering. However, that's not really feasible in this implementation for a host of reasons.
I know Kamailio has some config locking primitives, but I am extremely wary of complex synchronisation on dialog-identifying attributes, particularly if there is a possibility that such locks could stall a worker perpetually.
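
For the sake of discussion, the kind of synchronisation I would be less nervous about looks roughly like the Python sketch below: a lock keyed on a dialog identifier, with a bounded wait so that a worker gives up rather than stalling forever. This is only an illustration of the idea, not Kamailio's locking API; the `DialogLocks` class, its method names and the timeout values are hypothetical.

```python
import threading
from collections import defaultdict


class DialogLocks:
    """Per-dialog locks with a bounded wait (hypothetical helper)."""

    def __init__(self):
        self._guard = threading.Lock()             # protects the lock table
        self._locks = defaultdict(threading.Lock)  # one lock per Call-ID;
                                                   # never pruned in this sketch

    def acquire(self, call_id, timeout):
        """Try to take the dialog's lock; return False if the wait times out."""
        with self._guard:
            lock = self._locks[call_id]
        return lock.acquire(timeout=timeout)

    def release(self, call_id):
        """Release a lock previously taken with acquire()."""
        with self._guard:
            lock = self._locks[call_id]
        lock.release()


if __name__ == "__main__":
    locks = DialogLocks()
    if locks.acquire("call-id-1", timeout=0.1):
        try:
            pass  # process and relay the in-dialog request here
        finally:
            locks.release("call-id-1")
    else:
        pass  # timed out: relay anyway or reject, but do not block forever
```

Note that mutual exclusion by itself only serialises the two messages; it does not guarantee that the ACK goes out before reinvite #2, so something would still have to decide the order.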
Any other thoughts welcome!
Cheers,
-- Alex