[Kamailio-Devel] [ openser-Bugs-2037070 ] Maybe race condition: 200 OK for INVITE relayed after CANCEL

SourceForge.net noreply at sourceforge.net
Wed Aug 6 18:32:54 CEST 2008


Bugs item #2037070, was opened at 2008-08-03 15:39
Message generated for change (Comment added) made by abalashov
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=743020&aid=2037070&group_id=139143

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: modules
Group: ver 1.3.x
Status: Open
Resolution: None
Priority: 3
Private: No
Submitted By: Alex Balashov - Evariste System (abalashov)
Assigned to: Daniel-Constantin Mierla (miconda)
Summary: Maybe race condition: 200 OK for INVITE relayed after CANCEL

Initial Comment:
I have an SBC that sends calls to Kamailio, whose registrar module resolves multiple contacts for that AOR causing two call branches to be generated and sent toward the contacts, A and B.  A picks up first, then B, but both 200 OKs arrive in succession in the packet capture prior to the occurrence of any other SIP messages in any direction.

A and B pick up (200 OK) almost simultaneously.  As a result, Kamailio appears to pass back the second 200 OK to the SBC along with the first, even after sending a CANCEL to the B contact and receiving a 200 OK for that dialog.

Here is the precise sequence of events:

1. SBC sends call to Kamailio proxy.

2. Proxy does registrar dip and resolves two contacts - A and B.

3. Proxy forks the call into two branches: 'branch A' (to A) and 'branch B' (to B).  It rewrites the RURI and relays the INVITE on each.

4. A answers with 200 OK.

5. B answers with 200 OK.

6. Proxy passes back 200 OK to SBC for A.  Then for B.

7. SBC issues in-dialog end-to-end ACK for that 200 OK;  proxy decides to forward it only to A as per the ONREPLY-ROUTE.  No replies are forwarded to B.  It is here that I think things go wrong.

8. B keeps sending 200 OKs and getting no ACKs for them, and eventually gives up and kills the session.

So, it looks like not all replies are being statefully relayed to both branches.

Additionally, it looks like the following is happening:

- At step #6 above, the 200 OK passed to the SBC is for A only.

- The proxy elects to CANCEL the other branch to B between #6 and #7.

- After sending the CANCEL, the proxy decides to pass back the original 200 OK for the INVITE (with SDP) for B back to the SBC as well.

- After that, B replies with a 200 OK for the CANCEL issued by the proxy.  Why does it reply with a 200 OK?  Simply because it is after the INVITE was already OK'd?  Is that per the RFC?  I thought a call leg could not be CANCEL'd at this stage at all and requires a BYE?

- SBC ACKs the 200 OK (for INVITE) from A, and proxy relays to A.

- Meanwhile, B keeps sending 200 OKs for the INVITE (AFTER a CANCEL on that branch!) and the proxy keeps relaying them back to the SBC, which replies with ACKs.  But these ACKs keep getting forwarded back to A, not B, presumably because from the proxy's POV the B leg is now CANCEL'd and OK'd (in the penultimate step).
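
On the question of why B answers the CANCEL with a 200 OK: as far as I read RFC 3261 section 9.2, a UAS that can match the CANCEL to a transaction always answers the CANCEL itself with 200 OK, and the CANCEL simply has no effect on an INVITE that already received a final response. A minimal sketch of that rule follows; the class and function names are hypothetical and are not taken from Asterisk or Kamailio.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class InviteServerTransaction:
    """Hypothetical UAS-side state for one INVITE (illustration only)."""
    final_status: Optional[int] = None  # None until a final (>= 200) reply is sent


def handle_cancel(txn: Optional[InviteServerTransaction]) -> List[Tuple[str, int]]:
    """Return the (request method, status) replies a UAS sends on receiving CANCEL."""
    if txn is None:
        # No matching INVITE transaction: 481 Call/Transaction Does Not Exist.
        return [("CANCEL", 481)]
    # The CANCEL itself is answered with 200 OK whenever the transaction matches.
    replies = [("CANCEL", 200)]
    if txn.final_status is None:
        # INVITE still pending: terminate it with 487 Request Terminated.
        txn.final_status = 487
        replies.append(("INVITE", 487))
    # Otherwise the INVITE was already answered and the CANCEL changes nothing.
    return replies
```

With `final_status=200`, as on branch B here, the only reply is the 200 OK for the CANCEL, and the answered call leg stays up until someone sends a BYE.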

And here is the time-indexed sequence of events from the packet capture:

- Packet 9, time index 7.953711: 200 OK arrives from A.
- Packet 10, time index 7.954636: 200 OK arrives from B.
- Packet 11, time index 7.969227: Proxy passes 200 OK from A back to SBC.
- Packet 12, time index 7.969268: Proxy originates CANCEL for branch B.
- Packet 13, time index 7.970279: Proxy passes 200 OK from B back to SBC.
- Packet 14, time index 7.971508: 200 OK for CANCEL request arrives from B.  [1]
- Packet 15, time index 8.018730: SBC originates ACK for branch to A.
- Packet 16, time index 8.018895: Proxy passes ACK for branch A to A.
- Packet 17, time index 8.153957: B retransmits 200 OK for INVITE.
- Packet 18, time index 8.155309: Proxy forwards 200 OK from B to SBC.
- Packet 19, time index 8.155853: SBC sends ACK again to A's contact. This is really strange because the Contact address in Packet 18 is for B.

According to Juha Heinanen, this may be a race condition because:

"9 and 10 arrive to proxy very close to each other, which may result in a race condition bug causing proxy to send packet 13, which it should not do."

---

I am using OpenSER 1.3.2.  The sender UA is a Nextone SBC, and the two registrants are both Asterisk 1.4.21.2.  This problem occurs in the same way every single time, irrespective of any temporal variations.  It may be possible to reproduce by concurrently registering two Asterisk instances against the Kamailio registrar for one AOR and sending a call to them;  also, append_branches is turned on for the registrar.

----------------------------------------------------------------------

>Comment By: Alex Balashov - Evariste System (abalashov)
Date: 2008-08-06 12:32

Message:
Logged In: YES 
user_id=2167036
Originator: YES

I must disagree here:

- Packet 11, time index 7.969227: Proxy passes 200 OK from A back to SBC.
- Packet 12, time index 7.969268: Proxy originates CANCEL for branch B.
- Packet 13, time index 7.970279: Proxy passes 200 OK from B back to SBC.

Kamailio should not be passing back that second 200 OK *after* it has
CANCEL'd that same branch.

I suspect the problem may be that the thread charged with passing back the
200 OKs from the branches back to the SBC and the thread issuing the CANCEL
are different threads and unaware of what the other has done at this
precise moment.  I think that is the issue that needs to be fixed via some
form of locking / state machining.

Is this a naive, uninformed perspective?
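
To make the locking idea concrete, here is a rough sketch, assuming a hypothetical per-transaction lock shared by whatever receives replies and whatever generates CANCELs. It is not Kamailio's tm code, and all names are invented for illustration. It only covers whether a CANCEL is generated for a branch that has already answered; it does not decide whether the second 200 OK is relayed upstream.

```python
import threading
from typing import Dict, Iterable, List, Optional, Set


class ForkedInvite:
    """Hypothetical shared state for one forked INVITE (illustration only)."""

    def __init__(self, branches: Iterable[str]) -> None:
        self._lock = threading.Lock()
        # Branch name -> final status code, or None while still pending.
        self._final: Dict[str, Optional[int]] = {b: None for b in branches}
        self._cancel_sent: Set[str] = set()

    def record_final_reply(self, branch: str, status: int) -> None:
        """Called by the receiver as soon as a final reply arrives."""
        with self._lock:
            self._final[branch] = status

    def branches_to_cancel(self) -> List[str]:
        """Called when relaying a 2xx: pick branches that are still pending."""
        with self._lock:
            pending = [
                b for b, status in self._final.items()
                if status is None and b not in self._cancel_sent
            ]
            self._cancel_sent.update(pending)
            return pending
```

If both replies are recorded on arrival (packets 9 and 10) before the CANCEL decision is taken at packet 12, `branches_to_cancel()` returns an empty list and no CANCEL goes to B. If only A's reply has been recorded, it returns `["B"]`.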

----------------------------------------------------------------------

Comment By: Klaus Darilion (klaus_darilion)
Date: 2008-08-06 10:29

Message:
Logged In: YES 
user_id=1318360
Originator: NO

I think Kamailio can do nothing in such a scenario. The CANCEL to B will
be ignored as the call is already answered (the response code of the
CANCEL-reply is rather irrelevant).

Only the caller can handle such situations. That means the SBC should
detect that the second 200 OK has a different To tag. Thus, there are 2
dialogs created at the SBC: one with A and one with B. Thus, the SBC has to
decide what to do - usually it should terminate the second dialog - i.e.
send ACK to both, A and B, and then send BYE to B.
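
A minimal sketch of that caller-side behaviour, assuming dialogs are keyed on Call-ID, From tag and To tag, and that the first dialog to answer is the one kept. The class and method names are hypothetical, and route sets (Record-Route) are ignored for brevity, so the ACK and BYE simply target the Contact from the 200 OK.

```python
from typing import Dict, List, Optional, Tuple


class ForkAwareCaller:
    """Hypothetical UAC state for one outgoing INVITE (illustration only)."""

    def __init__(self, call_id: str, from_tag: str) -> None:
        self.call_id = call_id
        self.from_tag = from_tag
        self.contacts: Dict[str, str] = {}      # To tag -> Contact URI
        self.kept_to_tag: Optional[str] = None  # dialog that stays up

    def dialog_id(self, to_tag: str) -> Tuple[str, str, str]:
        """A dialog is identified by Call-ID, local tag and remote tag."""
        return (self.call_id, self.from_tag, to_tag)

    def on_invite_200(self, to_tag: str, contact: str) -> List[Tuple[str, str]]:
        """Return the (method, target URI) requests to send for this 200 OK."""
        retransmission = to_tag in self.contacts
        self.contacts.setdefault(to_tag, contact)
        target = self.contacts[to_tag]
        # Every 200 OK for the INVITE is ACKed toward its own dialog.
        actions = [("ACK", target)]
        if self.kept_to_tag is None:
            self.kept_to_tag = to_tag
        elif to_tag != self.kept_to_tag and not retransmission:
            # A second dialog from the fork: tear it down once, with BYE.
            actions.append(("BYE", target))
        return actions
```

For the trace in this report, the 200 OK from A would yield an ACK to A, the 200 OK from B would yield an ACK and a BYE to B, and B's later retransmissions would be ACKed toward B rather than A.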

IMO this can be closed.

----------------------------------------------------------------------

Comment By: Daniel-Constantin Mierla (miconda)
Date: 2008-08-06 04:14

Message:
Logged In: YES 
user_id=1246013
Originator: NO

It is a well-known race situation with 200 OK for INVITE and CANCEL. The
RFC3261 is not clear about dealing with it, I don't know if there were some
updates in other documents. I will search in the next days (quite busy now)
for discussions related to same topic, so we can continue this thread.

Lowering the priority now to skip it in release blockers counting.

----------------------------------------------------------------------
