[Kamailio-Devel] [ openser-Bugs-2037070 ] Maybe race condition: 200 OK for INVITE relayed after CANCEL
SourceForge.net
noreply at sourceforge.net
Thu Aug 7 08:12:24 CEST 2008
Bugs item #2037070, was opened at 2008-08-03 21:39
Message generated for change (Comment added) made by axlh
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=743020&aid=2037070&group_id=139143
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: modules
Group: ver 1.3.x
Status: Open
Resolution: None
Priority: 3
Private: No
Submitted By: Alex Balashov - Evariste System (abalashov)
Assigned to: Daniel-Constantin Mierla (miconda)
Summary: Maybe race condition: 200 OK for INVITE relayed after CANCEL
Initial Comment:
I have an SBC that sends calls to Kamailio, whose registrar module resolves multiple contacts for that AOR causing two call branches to be generated and sent toward the contacts, A and B. A picks up first, then B, but both 200 OKs arrive in succession in the packet capture prior to the occurrence of any other SIP messages in any direction.
A and B pick up (200 OK) almost simultaneously. As a result, Kamailio appears to pass back the second 200 OK to the SBC along with the first, even after sending a CANCEL to the B contact and receiving a 200 OK for that dialog.
Here is the precise sequence of events:
1. SBC sends call to Kamailio proxy.
2. Proxy does registrar dip and resolves two contacts - A and B.
3. Proxy bifurcates the call into two branches 'branch A' (to A) and 'branch B' (to B). Rewrites RURI, relays INVITE.
4. A answers with 200 OK.
5. B answers with 200 OK.
6. Proxy passes back 200 OK to SBC for A. Then for B.
7. SBC issues in-dialog end-to-end ACK for that 200 OK; proxy decides to forward it only to A as per the ONREPLY-ROUTE. No replies are forwarded to B. It is here that I think things go wrong.
8. B keeps sending 200 OKs and getting no ACKs for them, and eventually gives up and kills the session.
So, it looks like not all replies are being statefully relayed to both branches.
Additionally, it looks like the following is happening:
- At step #6 above, the 200 OK passed to the SBC is for A only.
- The proxy elects to CANCEL the other branch to B between #6 and #7.
- After sending the CANCEL, the proxy decides to pass back the original 200 OK for the INVITE (with SDP) for B back to the SBC as well.
- After that, B replies with a 200 OK for the CANCEL issued by the proxy. Why does it reply with a 200 OK? Simply because it is after the INVITE was already OK'd? Is that per the RFC? I thought a call leg could not be CANCEL'd at this stage at all and requires a BYE?
- SBC ACKs the 200 OK (for INVITE) from A, and proxy relays to A.
- Meanwhile, B keeps sending 200 OKs for the INVITE (AFTER a CANCEL on that branch!) and the proxy keeps relaying them back to the SBC, which replies with ACKs. But these ACKs keep getting forwarded back to A, not B, presumably because from the proxy's POV the B leg is now CANCEL'd and OK'd (in the penultimate step).
And here is the time-indexed sequence of events from the packet capture:
- Packet 9, time index 7.953711: 200 OK arrives from A.
- Packet 10, time index 7.954636: 200 OK arrives from B.
- Packet 11, time index 7.969227: Proxy passes 200 OK from A back to SBC.
- Packet 12, time index 7.969268: Proxy originates CANCEL for branch B.
- Packet 13, time index 7.970279: Proxy passes 200 OK from B back to SBC.
- Packet 14, time index 7.971508: 200 OK for CANCEL request arrives from B. [1]
- Packet 15, time index 8.018730: SBC originates ACK for branch to A.
- Packet 16, time index 8.018895: Proxy passes ACK for branch A to A.
- Packet 17, time index 8.153957: B retransmits 200 OK for INVITE.
- Packet 18, time index 8.155309: Proxy forwards 200 OK from B to SBC.
- Packet 19, time index 8.155853: SBC sends ACK again to A's contact. This is really strange because the Contact address in Packet 18 is for B.
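To make the timing in the list above easier to judge, here is a small sketch that computes the gaps between selected packets. The timestamps are copied from the capture summary; the helper name `gap_ms` is only illustrative.

```python
# Timestamps in seconds, copied from the packet list above.
packets = {
    9: 7.953711,   # 200 OK arrives from A
    10: 7.954636,  # 200 OK arrives from B
    11: 7.969227,  # proxy relays A's 200 OK to SBC
    12: 7.969268,  # proxy originates CANCEL for branch B
    13: 7.970279,  # proxy relays B's 200 OK to SBC
    14: 7.971508,  # 200 OK for CANCEL arrives from B
}


def gap_ms(earlier: int, later: int) -> float:
    """Return the time between two packets in milliseconds."""
    return round((packets[later] - packets[earlier]) * 1000, 3)


for a, b in [(9, 10), (10, 11), (11, 12), (12, 13), (13, 14)]:
    print(f"packet {a} -> {b}: {gap_ms(a, b)} ms")
```

The two 200 OKs (packets 9 and 10) arrive less than a millisecond apart, and both are already at the proxy well before it relays anything upstream in packet 11.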
According to Juha Heinanen, this may be a race condition because:
"9 and 10 arrive to proxy very close to each other, which may result in a race condition bug causing proxy to send packet 13, which it should not do."
---
I am using OpenSER 1.3.2. The sender UA is a Nextone SBC, and the two registrants are both Asterisk 1.4.21.2. This problem occurs in the same way every single time, irrespective of any temporal variations. It may be possible to reproduce by concurrently registering two Asterisk instances against the Kamailio registrar for one AOR and sending a call to them; also, append_branches is turned on for the registrar.
----------------------------------------------------------------------
Comment By: Alex Hermann (axlh)
Date: 2008-08-07 08:12
Message:
Logged In: YES
user_id=1212856
Originator: NO
Please quote the parts of the relevant RFCs that support your
statements.
The INVITE/CANCEL/200 OK race condition is well known, and the way to
handle it is to acknowledge the dialog with an ACK and terminate any
unwanted dialog with a BYE.
https://lists.cs.columbia.edu/pipermail/sip-implementors/2004-March/006217.html
> It seems to me that the problem here is that one hand of OpenSER is not
> aware of what the other hand is doing, or it would not pass a 200 OK that
> it has *previously* received *after* terminating a branch.
You do mean the 200 OK for the INVITE, don't you? If you mean the 200 OK for
the CANCEL, then you have a point: the proxy should absorb it.
The 200 OK for the INVITE MUST be passed to the UAC. The UAC should end
every dialog it is not interested in by sending a BYE. If your UAC fails to
do so, the UAC is broken, not the proxy.
RFC 3261 page 110, paragraph 2:
After a final response has been sent on the server transaction,
the following responses MUST be forwarded immediately:
- Any 2xx response to an INVITE request
RFC 3261 section 13.2.2.4:
Multiple 2xx responses may arrive at the UAC for a single INVITE
request due to a forking proxy.
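A minimal sketch of the forwarding rule quoted above, assuming a much-simplified proxy model. The function name and parameters are hypothetical and are not taken from OpenSER's tm module; the sketch only covers responses that arrive after a final response has already been sent upstream.

```python
def forward_after_final(cseq_method: str, status: int,
                        cancel_originated_by_proxy: bool = False) -> bool:
    """Decide whether a late response should still be relayed upstream.

    Assumes a final response has already been sent on the server
    transaction. Simplified illustration, not OpenSER's actual logic.
    """
    if cseq_method == "CANCEL" and cancel_originated_by_proxy:
        # The CANCEL was generated by the proxy itself, so its reply
        # ends at the proxy and is absorbed there.
        return False
    # Any 2xx response to an INVITE must still be forwarded.
    return cseq_method == "INVITE" and 200 <= status <= 299


# Packet 13: B's 200 OK for the INVITE, after A's 200 OK was relayed.
print(forward_after_final("INVITE", 200))
# Packet 14: B's 200 OK for the proxy-generated CANCEL.
print(forward_after_final("CANCEL", 200, cancel_originated_by_proxy=True))
```

Under this reading, packet 13 is required behaviour, and only the reply to the CANCEL stays at the proxy.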
----------------------------------------------------------------------
Comment By: Alex Balashov - Evariste System (abalashov)
Date: 2008-08-07 02:07
Message:
Logged In: YES
user_id=2167036
Originator: YES
Thank you, Anatoly.
While I agree that the SBC should handle this situation, the fact is,
OpenSER receives OK from branch A, receives OK from branch B, sends OK from
branch A, sends a CANCEL to branch B, then relays the OK from branch B. In
that order. Perhaps I am not understanding something about the specified
behaviour of a proxy, but I think that if it has taken it upon itself to
branch the call, it should be responsible for managing the outcome.
I understand that in principle, proxies are meant to properly relay
whatever they receive. But in this case, branching is performed by the
proxy, and the SBC is not aware of it. Furthermore, it is obvious that the
proxy is empowered to terminate one of the branches in response to a
certain set of conditions, so given that the situation above plays out in
the temporal order that it does, why would we say that it has no power to
intervene? The OKs are in-branch responses, not responses just to the
original dialog initiated by the SBC.
It seems to me that the problem here is that one hand of OpenSER is not
aware of what the other hand is doing, or it would not pass a 200 OK that
it has *previously* received *after* terminating a branch.
----------------------------------------------------------------------
Comment By: Anatoly Pidruchny (apidruchny)
Date: 2008-08-07 00:45
Message:
Logged In: YES
user_id=1759384
Originator: NO
Please excuse my barging in, but I just want to support Alex in the
opinion that it is OpenSER that has to take care of this. It is OpenSER,
not SBC, that forked the call to two branches. SBC has no idea that the
call was forked. IMO, when OpenSER receives second 200 OK from branch B, it
should not relay it to SBC, but it should send ACK immediately followed by
BYE to B, and later it should silently drop the 200 OK for that BYE. RFC is
not clear as to what to do in this case. And where is the definition of the
proxy saying that it can not send BYE?
/Anatoly.
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2008-08-07 00:24
Message:
Logged In: NO
I just wanted to verify whether the standard takes the same position:
16.7 Response Processing (Proxy)
10. Generate CANCELs
If the forwarded response was a final response, the proxy MUST
generate a CANCEL request for all pending client transactions
associated with this response context. A proxy SHOULD also
generate a CANCEL request for all pending client transactions
associated with this response context when it receives a 6xx
response. A pending client transaction is one that has
received a provisional response, but no final response (it is
in the proceeding state) and has not had an associated CANCEL
generated for it. Generating CANCEL requests is described in
Section 9.1.
The requirement to CANCEL pending client transactions upon
forwarding a final response does not guarantee that an endpoint
will not receive multiple 200 (OK) responses to an INVITE. 200
(OK) responses on more than one branch may be generated before
the CANCEL requests can be sent and processed. Further, it is
reasonable to expect that a future extension may override this
requirement to issue CANCEL requests.
IMO this is not 100% written down, but it says the "endpoint" receives
multiple 200 OK. Further, as a proxy by definition can not send a BYE, from a
logical point of view it has to forward the second 200 OK to the caller,
and the caller has to take care to terminate the second dialog.
Maybe the dialog could be used to fake BYE also in this scenario, but I
strongly suggest to fix the problem where it should be fixed -> in the SBC
- although this is probably the most complicated option ;-)
klaus
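The definition of a "pending client transaction" in the quoted step 10 can be sketched as a simple filter. The `Branch` type and its fields are hypothetical, and the assumption that both branches had already received a provisional response is mine, since the capture summary does not list the earlier packets. This illustrates the RFC wording only and makes no claim about how OpenSER tracks branch state internally.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Branch:
    """Hypothetical per-branch state kept by a forking proxy."""
    name: str
    got_provisional: bool
    got_final: bool
    cancel_sent: bool = False


def pending(branches: List[Branch]) -> List[str]:
    """Names of branches that step 10 says must be CANCELed."""
    return [
        b.name
        for b in branches
        if b.got_provisional and not b.got_final and not b.cancel_sent
    ]


# Both 200 OKs already recorded when A's 200 OK is forwarded:
both_answered = [Branch("A", True, True), Branch("B", True, True)]
# B's 200 OK not yet recorded at that moment (the suspected race):
b_not_recorded = [Branch("A", True, True), Branch("B", True, False)]

print(pending(both_answered))   # []
print(pending(b_not_recorded))  # ['B']
```

In either case the dialog at B already exists once B has sent its 200 OK, which is why the RFC text warns that the CANCEL cannot guarantee a single 200 OK at the endpoint.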
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2008-08-06 22:41
Message:
Logged In: NO
> Kamailio should not be passing back that second 200 OK *after* it has
> CANCEL'd that same branch.
That is wrong. The proxy MUST forward the 200 OK as the dialog is already
established at B and a proxy can not terminate a dialog. The SBC MUST deal
with this race condition.
klaus
----------------------------------------------------------------------
Comment By: Alex Balashov - Evariste System (abalashov)
Date: 2008-08-06 18:32
Message:
Logged In: YES
user_id=2167036
Originator: YES
I must disagree here:
- Packet 11, time index 7.969227: Proxy passes 200 OK from A back to SBC.
- Packet 12, time index 7.969268: Proxy originates CANCEL for branch B.
- Packet 13, time index 7.970279: Proxy passes 200 OK from B back to SBC.
Kamailio should not be passing back that second 200 OK *after* it has
CANCEL'd that same branch.
I suspect the problem may be that the thread charged with passing back the
200 OKs from the branches back to the SBC and the thread issuing the CANCEL
are different threads and unaware of what the other has done at this
precise moment. I think that is the issue that needs to be fixed via some
form of locking / state machining.
Is this a naive, uninformed perspective?
----------------------------------------------------------------------
Comment By: Klaus Darilion (klaus_darilion)
Date: 2008-08-06 16:29
Message:
Logged In: YES
user_id=1318360
Originator: NO
I think Kamailio can do nothing in such a scenario. The CANCEL to B will
be ignored as the call is already answered (the response code of the
CANCEL-reply is rather irrelevant).
Only the caller can handle such situations. That means the SBC should
detect that the second 200 OK has a different To tag. Thus, there are 2
dialogs created at the SBC: one with A and one with B. Thus, the SBC has to
decide what to do - usually it should terminate the second dialog - i.e.
send ACK to both, A and B, and then send BYE to B.
IMO this can be closed.
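The caller-side handling described in this comment can be sketched roughly as follows. The `Caller` class, the tag values and the contact URIs are all hypothetical, and real routing of the ACK and BYE (route sets, Record-Route) is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Caller:
    """Hypothetical caller-side (UAC) bookkeeping for one forked INVITE."""
    kept_tag: Optional[str] = None                          # To tag of the dialog to keep
    contacts: Dict[str, str] = field(default_factory=dict)  # To tag -> Contact URI
    byed: Set[str] = field(default_factory=set)             # To tags already sent a BYE

    def on_invite_2xx(self, to_tag: str, contact: str) -> List[str]:
        """Return the requests to send for one 200 OK (new or retransmitted)."""
        self.contacts[to_tag] = contact
        # Every 2xx is ACKed toward the Contact of its own dialog.
        actions = [f"ACK {contact}"]
        if self.kept_tag is None:
            self.kept_tag = to_tag            # first answer is the one kept
        elif to_tag != self.kept_tag and to_tag not in self.byed:
            self.byed.add(to_tag)
            actions.append(f"BYE {contact}")  # tear down the unwanted dialog
        return actions


caller = Caller()
print(caller.on_invite_2xx("tag-A", "sip:a@192.0.2.10"))
print(caller.on_invite_2xx("tag-B", "sip:b@192.0.2.20"))
```

Keying the dialogs on the To tag is what distinguishes this from the behaviour in packet 19, where the ACK for B's retransmitted 200 OK was sent to A's contact.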
----------------------------------------------------------------------
Comment By: Daniel-Constantin Mierla (miconda)
Date: 2008-08-06 10:14
Message:
Logged In: YES
user_id=1246013
Originator: NO
It is a well-known race situation with 200 OK for INVITE and CANCEL. The
RFC3261 is not clear about dealing with it, I don't know if there were some
updates in other documents. I will search in the next days (quite busy now)
for discussions related to same topic, so we can continue this thread.
Lowering the priority now to skip it in release blockers counting.
----------------------------------------------------------------------