Hi guys!
In Kamailio we had (have?) the problem described below. The same problem may still apply to ser-tm; if so, it needs this fix too.
regards klaus
-------- Original Message --------
Subject: [Kamailio-Devel] race condition
Date: Thu, 23 Apr 2009 10:34:26 +0200
From: Zappasodi Daniele Daniele.Zappasodi@seltatel.it
To: devel@lists.kamailio.org
References: mailman.9.1240394401.8388.devel@lists.kamailio.org
Hello, I have a problem with a race condition between reply processing and retransmission. Sometimes OpenSER (I'm still working with version 1.3.x) ignores the provisional reply and tears down the call after fr_timer expires. This happens only under particular conditions: I have a short value for fr_timer (5 sec), and the reply arrives very fast (I have the gateway and the proxy on the same box).
In this situation it can happen that a child sends the INVITE (SEND_BUFFER in t_forward_nonack) and is suspended by the scheduler before it can execute start_retr. Another child then gets the reply. When the first child finally executes start_retr, it immediately sends a retransmission of the INVITE, and after 5 seconds (the fr_timer) it tears down the call with a CANCEL, because the reply has been ignored.
I know that this is a very improbable condition (even though I did encounter it), but if you want to fix it, it should be easy to solve: in t_forward_nonack, before start_retr, test whether another child has already received a reply. Something like this (t_fwd.c on head):
@@ -719,7 +719,10 @@
 					p_msg->REQ_METHOD);
 			}
 
-			start_retr( &t->uac[i].request );
+			if(p_msg->REQ_METHOD==METHOD_INVITE && t->uac[i].last_received>=100)
+				LM_DBG("Last received %d\n",t->uac[i].last_received);
+			else
+				start_retr( &t->uac[i].request );
 			set_kr(REQ_FWDED);
 		}
 	}