Hello,
(Sorry for cross-posting to -users and -dev; not really sure where this post belongs most.)
A few days ago, I ran into an issue with a Kamailio server being somewhat unresponsive, during moderate call volume, on account of a down rtpproxy--the only rtpproxy in the set. This is rtpproxy classic, not ngcp-mediaproxy-ng.
Rtpproxy was not actually engaged on any of the initial INVITEs going through the server; the server is configured to invoke it conditionally based on a setting, and the setting was not set for any endpoints. rtpproxy_manage() was never called.
However, I call unforce_rtp_proxy() unconditionally in my config when handling CANCELs, reasoning that it can't do any harm if rtpproxy_manage() was not called before[1].
Nevertheless, it seemed to be the case that this situation was clogging up SIP worker threads, because some SIP messages were definitely dropped. Periodic log messages about inability to reach the rtpproxy were echoed as well. This problem cleared up almost immediately when the rtpproxy instance was restored into service.
This raised some questions in my mind about the relationship between rtpproxy management and SIP worker thread utilisation. I assume it was my indiscriminate unforce_rtp_proxy() calls that were actually clogging up the worker threads, right? If so, why? I figured that in the unforce_rtp_proxy() case, the rtpproxy module simply sends fire-and-forget UDP messages down the UDP control socket without any sort of blocking for acknowledgement, since in this case the call must be released on the rtpproxy side without doing any rewriting of SDP on the Kamailio side (unlike in the case where rtpproxy is engaged). Thus, there should be no need to wait for ports to substitute into the message. Or is the same response-wait mechanism used regardless, even in the unforce_rtp_proxy() case, for programmatic reasons?
More broadly, is there any way that this scenario can be prevented? In other words, is there a way to work around an outage of all rtpproxies in the set without tying up workers, or at least tying them up less severely?
Thanks!
-- Alex
[1] Is this a reasonable assumption?
The reason I do this is that I don't see a way to tell from the CANCEL message itself whether rtpproxy was engaged. I do check for a ;proxy_media RR parameter when handling BYEs, but since a CANCEL is not an in-dialog request, I'm not sure what else to do except call unforce_rtp_proxy()/rtpproxy_manage() indiscriminately, short of storing state in htable or adding other complications I don't want.
Hello,
there are some parameters to control the timeout and retries when waiting for a reply from rtpproxy:
http://kamailio.org/docs/modules/stable/modules/rtpproxy.html#idp15243344
Looking at the code, it seems the value of the timeout parameter is in seconds, but it could easily be changed to milliseconds, because the function used internally is poll(), which takes its timeout in milliseconds.
Cheers, Daniel
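To illustrate the send-and-wait pattern described above, here is a minimal Python sketch (not the module's actual C code) of a request that blocks in poll() until a reply arrives or every attempt has timed out. The function name send_and_wait and its parameters timeout_ms and attempts are hypothetical; they only loosely mirror the module's timeout and retry settings. A bound UDP socket that never answers stands in for an unreachable rtpproxy.

```python
import select
import socket
import time


def send_and_wait(sock, payload, timeout_ms, attempts):
    """Send payload on a connected UDP socket and block for a reply.

    Returns (reply_bytes_or_None, attempts_made). Hypothetical helper,
    written only to show how poll() with a timeout blocks the caller.
    select.poll() is available on Unix-like systems, not on Windows.
    """
    poller = select.poll()
    poller.register(sock, select.POLLIN)
    for attempt in range(1, attempts + 1):
        try:
            sock.send(payload)
            # poll() takes its timeout in milliseconds; the caller can do
            # nothing else until a reply arrives or the timeout expires.
            events = poller.poll(timeout_ms)
            if any(flags & select.POLLIN for _, flags in events):
                return sock.recv(4096), attempt
        except ConnectionRefusedError:
            # An ICMP "port unreachable" came back; count it as a failed attempt.
            continue
    return None, attempts


if __name__ == "__main__":
    # A bound socket that never answers stands in for a down rtpproxy.
    silent_peer = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    silent_peer.bind(("127.0.0.1", 0))
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.connect(silent_peer.getsockname())
    start = time.monotonic()
    reply, tried = send_and_wait(client, b"ping", timeout_ms=200, attempts=3)
    blocked = time.monotonic() - start
    print(f"reply={reply!r} attempts={tried} blocked={blocked:.2f}s")
```

Under this model, a caller facing a silent peer is stuck for roughly the timeout multiplied by the number of attempts before it gives up, which is the time a SIP worker would be unavailable for other messages.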
On 11/6/13 2:05 PM, Alex Balashov wrote:
[snip]
On 11/06/2013 08:45 AM, Daniel-Constantin Mierla wrote:
there are some parameters to control the timeout and retries when waiting for a reply from rtpproxy:
http://kamailio.org/docs/modules/stable/modules/rtpproxy.html#idp15243344
Looking at the code, it seems the value of the timeout parameter is in seconds, but it could easily be changed to milliseconds, because the function used internally is poll(), which takes its timeout in milliseconds.
Thank you, Daniel.
1. So, am I right to assume that the unforce_rtp_proxy() call waits for timeout and blocks the worker while doing so?
2. Is there any harm in calling unforce_rtp_proxy() for Call-IDs rtpproxy doesn't know about? Is there a 'better' best practice for handling CANCELs where it is unknown whether rtpproxy was engaged on the initial call (because it is an option, nat_uac_detect, etc.)?
-- Alex
On 11/6/13 2:58 PM, Alex Balashov wrote:
- So, am I right to assume that the unforce_rtp_proxy() call waits for timeout and blocks the worker while doing so?
IIRC, yes: each command has a reply. You can put the control socket on UDP (network) and use ngrep for a quick check.
- Is there any harm in calling unforce_rtp_proxy() for Call-IDs rtpproxy doesn't know about? Is there a 'better' best practice for handling CANCELs where it is unknown whether rtpproxy was engaged on the initial call (because it is an option, nat_uac_detect, etc.)?
No, there is no harm in calling rtpproxy for non-existent sessions. You can even skip it: there is a session timeout in rtpproxy (I don't know the default value, but it can probably be set via a command-line parameter), so if you are not short on ports, you can just leave rtpproxy alone with closed calls, without calling the unforce command.
Cheers, Daniel
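Given the recollection above that each command waits for a reply, a back-of-the-envelope estimate shows why even moderate traffic can exhaust the workers while the only rtpproxy is down. The following Python sketch is an assumption-laden model, not a measurement: the worker count, timeout, and attempt count are illustrative values, the function names are hypothetical, and it ignores any mechanism by which the module might stop trying a proxy it considers failed.

```python
def worst_case_block_seconds(timeout_s, attempts):
    """Time one blocked command holds a worker if every attempt times out."""
    return timeout_s * attempts


def saturating_rate(workers, timeout_s, attempts):
    """Blocking commands per second that keep every worker busy."""
    return workers / worst_case_block_seconds(timeout_s, attempts)


if __name__ == "__main__":
    # Illustrative values only: 8 workers, 1-second timeout, 5 attempts.
    print(worst_case_block_seconds(1, 5))  # 5
    print(saturating_rate(8, 1, 5))        # 1.6
```

With these illustrative numbers, fewer than two CANCELs per second would be enough to keep all eight workers waiting on the dead control socket, which matches the symptom of dropped SIP messages. Lowering the timeout or the number of attempts raises that threshold proportionally.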
On Wed, Nov 6, 2013 at 12:03 PM, Daniel-Constantin Mierla <miconda@gmail.com> wrote:
On 11/6/13 2:58 PM, Alex Balashov wrote:
- Is there any harm in calling unforce_rtp_proxy() for Call-IDs rtpproxy doesn't know about? Is there a 'better' best practice for handling CANCELs where it is unknown whether rtpproxy was engaged on the initial call (because it is an option, nat_uac_detect, etc.)?
No, there is no harm in calling rtpproxy for non-existent sessions. You can even skip it: there is a session timeout in rtpproxy (I don't know the default value, but it can probably be set via a command-line parameter), so if you are not short on ports, you can just leave rtpproxy alone with closed calls, without calling the unforce command.
I seem to recall that the default is to close the session after 60 seconds of no RTP, but I'm not able to verify that right now.
Corey
On 11/06/2013 02:03 PM, Daniel-Constantin Mierla wrote:
On 11/6/13 2:58 PM, Alex Balashov wrote:
- Is there any harm in calling unforce_rtp_proxy() for Call-IDs rtpproxy doesn't know about? Is there a 'better' best practice for handling CANCELs where it is unknown whether rtpproxy was engaged on the initial call (because it is an option, nat_uac_detect, etc.)?
No, there is no harm in calling rtpproxy for non-existent sessions. You can even skip it: there is a session timeout in rtpproxy (I don't know the default value, but it can probably be set via a command-line parameter), so if you are not short on ports, you can just leave rtpproxy alone with closed calls, without calling the unforce command.
And, in the rtpproxy control protocol, the sessions are keyed by SIP Call-ID, right, not by tuples of IP endpoints and RTP ports? That is to say, there's no danger of stopping an existing conversation this way (assuming the Call-IDs are reasonable GUIDs and all that)?
-- Alex