Hi,
I'm wondering what the best approach is to handling a SIP dialog when one endpoint disappears or fails to send the BYE message.
I have Kamailio as a proxy for all mobile (iPhone/Android) SIP clients. Occasionally, the user hangs up the call but no BYE message is received. This means that Asterisk has an open channel even though there is no client. Kamailio also continues to receive successful registrations from the SIP client so the endpoint is not down completely.
Is Kamailio the appropriate place to handle this situation? What do you recommend? If not, could you point me in the right direction? RTP timeout? Asterisk? The SIP client itself?
Thanks for your help.
Benjamin Fitzgerald LETS Corporation (925) 235-1154 ben@letscorp.us
The RTP timeout in Asterisk is the best place to handle this situation. Another option is SIP session timers, but they could give false negatives with NATed clients.
Hi Benjamin,
To some extent, this is just a perennial, existential problem of using a proxy, so part of the answer is going to be that you need fundamentally reliable signalling, speaking from the vantage point of something which operates as a signalling relay (i.e. Kamailio).
However, I understand that reality does not mirror expectations. As the purveyor of a SIP service delivery platform based entirely on Kamailio, we run into this problem all the time, particularly since our system generates accounting records with billing involvement. There are some well-established and canonical solutions:
1. You make it sound like the Asterisk channel stays up indefinitely in such a situation. Why is that?
The normal behaviour is for Asterisk to hang up the call after some number of seconds without incoming RTP.
It's likely that tuning the RTP timeout setting to something conservative[1] would solve a lot of your problems off the bat.
2. The Kamailio 'dialog' module can spoof a BYE toward both endpoints based on an absolute dialog timeout (regardless of whether both dialog peers are still actively engaged), which can be set globally or on a per-dialog basis:
http://kamailio.org/docs/modules/4.3.x/modules/dialog.html#timeout-avp-id
http://kamailio.org/docs/modules/4.3.x/modules/dialog.html#default-timeout-i...
http://www.kamailio.org/wiki/cookbooks/4.3.x/pseudovariables#dlg_ctx_attr
3. The 'dialog' module also has a dead peer detection / keepalive scheme based on sequential OPTIONS pings:
http://kamailio.org/docs/modules/4.3.x/modules/dialog.html#idp1898328
If one or both of the peers don't respond to these, the dialog will be timed out, and if you've set $dlg_ctx(timeout_bye) = 1, this will result in a spoofed BYE toward both peers as well.
4. There are various other signalling-oriented UA-side mechanisms intended to solve this problem as well, such as SIP Session Timers (RFC 4028).
...
Of course, all this depends on the maintenance of dialog state in Kamailio, which is an additional complication and a potential wrinkle if that data were to be lost.
So, it's a bit hard to say whether Kamailio is the _best_ place to solve this problem. The first line of defence really should be at the endpoint level on both sides of the proxy. Beyond that, Kamailio does offer some pragmatic solutions.
-- Alex
[1] Notwithstanding RTP interruptions due to VAD, hold, etc.
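The RTP-timeout idea in point 1, together with the footnote's caveat about hold, can be sketched roughly as follows. This is a minimal illustration in Python, not Asterisk's implementation: the `Channel` class, the `should_hang_up` function and the 60 s / 300 s values are all hypothetical and are not claimed to be Asterisk defaults.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    """Minimal stand-in for a media channel (hypothetical, for illustration)."""
    last_rtp_at: float       # time in seconds at which the last RTP packet arrived
    on_hold: bool = False    # True while the call is legitimately on hold


def should_hang_up(ch: Channel, now: float,
                   rtp_timeout: float = 60.0,
                   hold_timeout: float = 300.0) -> bool:
    """Return True once no RTP has arrived for longer than the allowed silence.

    A held call gets the longer hold_timeout, so that a peer which stops
    sending media during hold is not mistaken for a dead peer.
    """
    limit = hold_timeout if ch.on_hold else rtp_timeout
    return (now - ch.last_rtp_at) > limit
```

The separate, longer limit for held calls is the reason a "conservative" value matters: too short a timeout would tear down calls that are merely silent.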
Hi Alex,
Thanks for your quick response.
1. Sorry to be unclear, the Asterisk channel does not stay up indefinitely. We do have a max timeout but since a large portion of our business is based on conference calling, the timeout is rather large. I will definitely change the RTP timeout as my first attempt.
2. Since Asterisk is also serving as a PSTN gateway, I like this because it allows me to control calls with SIP endpoints separately. We have no issues with PSTN calls and I'd like to keep it that way :)
3. I'm not sure this will work in my case because the endpoint is reachable, but client state is not in sync with the server: i.e. Kamailio/Asterisk think it's in a call but the endpoint does not. If sending OPTIONS could tell me if the endpoint thinks it's in a call or not, then this could potentially work. On a side note, is there a SIP message that I can send to a client to have it report its state? (Registered, Auth Failed, In a call, etc.)
4. I do know about SIP Session Timers but chose not to use them during the initial deployment (because of the Asterisk channel timeout, which I now realize is too large). Maybe this will help in conjunction with the above methods.
Would you mind expanding on endpoint defense? Specifically with mobile client applications? I agree this would be the ideal solution, I'm just not sure where to start here.
Benjamin Fitzgerald LETS Corporation (925) 235-1154 ben@letscorp.us
*******Confidential Notice: This message is intended only for the use of the individual or entity to which it is addressed and may contain information that is privileged, confidential and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this message in error, please delete this message from all computers and contact Orion Systems/LETS Corp immediately by return e-mail and/or telephone at (925) 566-5600
-- Alex Balashov | Principal | Evariste Systems LLC 303 Perimeter Center North, Suite 300 Atlanta, GA 30346 United States
Tel: +1-800-250-5920 (toll-free) / +1-678-954-0671 (direct) Web: http://www.evaristesys.com/, http://www.csrpswitch.com/
SIP Express Router (SER) and Kamailio (OpenSER) - sr-users mailing list sr-users@lists.sip-router.org http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-users
Benjamin,
On 01/08/2016 03:25 PM, Benjamin Fitzgerald wrote:
> 1. Sorry to be unclear, the Asterisk channel does not stay up indefinitely. We do have a max timeout but since a large portion of our business is based on conference calling, the timeout is rather large. I will definitely change the RTP timeout as my first attempt.
Yes, but I was referring specifically to the RTP timeout. If the mobile endpoint goes away, it will stop sending RTP. If Asterisk detects that no RTP has been received in x seconds, it should hang up the channel, after prophylactically sending a BYE for the call in the direction of Kamailio/the mobile peer.
I had been under the impression that Asterisk has a fairly conservative default RTP timeout anyway, but it seems I may be mistaken:
https://github.com/asterisk/asterisk/blob/master/configs/samples/pjsip.conf....
https://github.com/asterisk/asterisk/blob/master/configs/samples/sip.conf.sa...
(Not sure which SIP channel driver you're using.)
> 3. I'm not sure this will work in my case because the endpoint is reachable, but client state is not in sync with the server: i.e. Kamailio/Asterisk think it's in a call but the endpoint does not. If sending OPTIONS could tell me if the endpoint thinks it's in a call or not, then this could potentially work.
Would sending a BYE to both peers not have the effect of synchronising them forcefully to a state of "the call is hung up"?
If you're concerned about sending a BYE to an endpoint that thinks the call is already hung up, don't be. In that case, it'll simply be rejected (typically with a 481 Call/Transaction Does Not Exist response). You can't negatively affect the state of a dialog that's already dead.
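Why a redundant BYE is harmless can be shown with a toy model of the receiving side. This is only a sketch: the `UserAgent` class is hypothetical, and it models nothing beyond the RFC 3261 rule that a BYE matching no existing dialog gets a 481 response.

```python
class UserAgent:
    """Toy UA tracking live dialogs by (Call-ID, local tag, remote tag)."""

    def __init__(self):
        self.dialogs = set()

    def handle_bye(self, call_id, local_tag, remote_tag):
        """Return the SIP status code this UA would answer a BYE with."""
        key = (call_id, local_tag, remote_tag)
        if key in self.dialogs:
            # Known dialog: tear it down and confirm.
            self.dialogs.remove(key)
            return 200
        # Unknown or already-terminated dialog: reject, change nothing.
        return 481
```

Sending the same BYE twice yields 200 and then 481, and the UA's state is identical after both, which is why forcing both peers to "hung up" is safe.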
Curious, however: when you say "Kamailio/Asterisk think it's in a call", how does this apply to Kamailio?
Stateful SIP proxies are transaction-stateful, not dialog-stateful.
Thus, by default, Kamailio doesn't know anything about "calls", but only the SIP transactions of which they are made up, and only for so long as those transactions are active. The 'dialog' module allows Kamailio to be call-stateful, at the cost of additional statekeeping complexity, but you should only use this capability if you need it for something (e.g. limiting concurrent calls, keepalive/timeout as described previously, etc.)
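The distinction between transaction state and dialog state can be illustrated by the keys each one is tracked under. The following is a simplified sketch with hypothetical names: real transaction matching in RFC 3261 also involves the Via sent-by value and a special case for ACK, both of which are left out here.

```python
from typing import NamedTuple


class SipRequest(NamedTuple):
    """Only the header fields needed for this illustration."""
    method: str
    call_id: str
    from_tag: str
    to_tag: str
    via_branch: str


def transaction_key(req: SipRequest):
    # A transaction is one request plus its responses: keyed (roughly)
    # by the top Via branch and the method.
    return (req.via_branch, req.method)


def dialog_key(req: SipRequest):
    # A dialog spans many transactions: Call-ID plus both tags.
    # A frozenset makes the key independent of request direction.
    return (req.call_id, frozenset((req.from_tag, req.to_tag)))


# Two in-dialog requests of the same call, sent in opposite directions.
reinvite = SipRequest("INVITE", "abc@host", "tag-a", "tag-b", "z9hG4bK-1")
bye = SipRequest("BYE", "abc@host", "tag-b", "tag-a", "z9hG4bK-2")
```

Here `reinvite` and `bye` have different transaction keys but the same dialog key. A proxy that keeps only the former forgets the call as soon as each transaction completes.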
> On a side note, is there a SIP message that I can send to a client to have it report its state? (Registered, Auth Failed, In a call, etc.)
There's no standard query mechanism like this that I am aware of; the only way of disseminating such state information with which I'm familiar is presence, which is proactively pushed out by the endpoints and requires server-side support.
> 4. I do know about SIP Session Timers but chose not to use them during the initial deployment (because of the Asterisk channel timeout, which I now realize is too large). Maybe this will help in conjunction with the above methods.
SSTs are rather bureaucratic and, in my experience, often incorrectly implemented or unsupported. In the SST conception of things, the roles in keepalive ping-pong are negotiated entirely between the UAs, and it is up to the UAs to maintain those roles. This goes wrong easily enough that server-side solutions such as periodic reinvites and other "pings" (like the Kamailio dialog module's OPTIONS pings) are a rather popular alternative.
> Would you mind expanding on endpoint defense? Specifically with mobile client applications? I agree this would be the ideal solution, I'm just not sure where to start here.
By "endpoint defence" I simply meant that detecting dead peers should be up to the SIP endpoints (mobile SIP client and Asterisk, by the sound of it) first and foremost, and that any proxy-side measures should be a secondary layer.
-- Alex
Alex,
I think #1 fixed it for me! Thank you so much! I changed the RTP timeout on a test SIP account and it immediately resolved the issue.
You're right, sending a BYE would effectively synchronize them; however, I did not think the keepalive scheme using OPTIONS would send a BYE message in the event of a dead RTP session. That's why I thought this scheme might not work.
I was mistaken in referring to Kamailio as dialog-stateful; it's just easier for me to think about a call that way. When debugging this problem, I pulled up the SIP dialog on my Homer server and saw that the last message was the 200 OK sent to the SIP client (after INVITE/Trying), and the BYE was never sent back from the client. I suppose I phrased this incorrectly as Kamailio thinking the endpoint is in a call, when really it is just Asterisk, and I am personally associating the state with these transactions.
Yes, I recall when I initially read about SSTs, many people reported they had difficulty getting them to function properly. So far it looks like I will not have to implement any proxy-side measures.
Benjamin Fitzgerald LETS Corporation (925) 235-1154 ben@letscorp.us
On 01/08/2016 04:32 PM, Benjamin Fitzgerald wrote:
> I think #1 fixed it for me! Thank you so much! I changed the RTP timeout on a test SIP account and it immediately resolved the issue.
Excellent! Happy to help.
> You're right, sending a BYE would effectively synchronize them; however, I did not think the keepalive scheme using OPTIONS would send a BYE message in the event of a dead RTP session. That's why I thought this scheme might not work.
No, indeed it would not; Kamailio has no awareness of RTP whatsoever. All such schemes, including this one, as well as SSTs, are aimed at detecting dead peers purely from signalling (that is, SIP) alone.
The way the OPTIONS keepalive flow works, for example, is:
    UA A             Proxy              UA B
    ========================================
                       ---- OPTIONS ---->
                       <---- 200 OK -----
     <--- OPTIONS ----
     --- 200 OK ---->
If UA B goes away:
    UA A             Proxy              UA B
    ========================================
                       ---- OPTIONS ---->
                       [no response]
                       ...
                       ------- BYE ------->
     <---- BYE -----
     --- 200 OK --->
The BYEs are crafted by Kamailio to (a) look to UA A like they came from UA B and (b) to look to UA B like they came from UA A. This is because a proxy cannot, formally speaking, endogenously originate in-dialog requests. So, it's definitely a spoof-hack, but it works.
This mimics the more typical B2BUA-based approach of periodically hitting both ends with empty reinvites whose effect is nullary (i.e. no SDP amendments or remote dialog URI changes), and whose sole purpose is to see if a 200 OK response is returned. If not, you can assume the endpoint's dead or unreachable.
All proxy-side measures are going to operate on SIP solely.
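The OPTIONS keepalive flow described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the dialog module's logic: `check_dialog` and the `ping` callback are hypothetical, and a single unanswered ping ends the dialog here, whereas the thread does not say how many misses Kamailio tolerates.

```python
from typing import Callable, List, Tuple


def check_dialog(caller: str, callee: str,
                 ping: Callable[[str], bool]) -> List[Tuple[str, str]]:
    """Ping both dialog peers with OPTIONS and decide which BYEs to emit.

    ping(peer) returns True if the peer answered the OPTIONS request.
    The result is a list of (apparent_sender, recipient) pairs: empty while
    both peers answer, otherwise one BYE toward each peer, each made to
    look as if it came from the opposite peer.
    """
    if all(ping(peer) for peer in (caller, callee)):
        return []
    return [(callee, caller), (caller, callee)]
```

Note that the decision uses only signalling responses. Nothing in it looks at media, which is why this kind of check cannot notice a dead RTP stream on an endpoint that still answers SIP.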
> I suppose I phrased this incorrectly as Kamailio thinks the endpoint is in a call, when really it is just Asterisk and I am personally associating the state with these transactions.
Well, it's understandable that you think in call-centric terms; we all do. It just becomes important when illuminating distinctions tied up in protocol semantics. It's a task requiring some pedantry.
-- Alex