[SR-Users] Mitigation of unavailable rtpproxy
abalashov at evaristesys.com
Wed Nov 6 14:05:28 CET 2013

(Sorry for cross-posting to -users and -dev; not really sure where this
post belongs most.)

A few days ago, I ran into an issue with a Kamailio server being
somewhat unresponsive, during moderate call volume, on account of a
down rtpproxy--the only rtpproxy in the set. This is rtpproxy classic,
Rtpproxy was not actually engaged on any of the initial INVITEs going
through the server; the server is configured to invoke it conditionally
based on a setting, and the setting was not set for any endpoints.
rtpproxy_manage() was never called.

However, I call unforce_rtp_proxy() unconditionally in my config when
handling CANCELs, reasoning that it can't do any harm if
rtpproxy_manage() was not called before.

Nevertheless, it seemed to be the case that this situation was clogging
up SIP worker threads, because some SIP messages were definitely
dropped. Periodic log messages about inability to reach the rtpproxy
were echoed as well. This problem cleared up almost immediately when
the rtpproxy instance was restored into service.

This raised some questions in my mind about the relationship between
rtpproxy management and SIP worker thread utilisation. I assume it was
my indiscriminate unforce_rtp_proxy() calls that were actually clogging
up the worker threads, right? If so, why? I figured that in the
unforce_rtp_proxy() case, the rtpproxy module simply sends
fire-and-forget UDP messages down the UDP control socket without any
sort of blocking for acknowledgement, since in this case the call must
be released on the rtpproxy side without doing any rewriting of SDP on
the Kamailio side (unlike in the case where rtpproxy is engaged). Thus,
there should be no need to wait for ports to substitute into the
message. Or is the same response-wait mechanism used regardless, even
in the unforce_rtp_proxy() case, for programmatic reasons?

More broadly, is there any way that this scenario can be prevented? In
other words, is there a way to work around an outage of all rtpproxies
in the set without tying up workers, or at least tying them up less?

Regarding the unconditional unforce_rtp_proxy() call on CANCEL: is it a
reasonable assumption that it does no harm?
The reason I do this is that I don't see a way to find out if
rtpproxy was engaged from the body of a CANCEL message. I do check
for a ;proxy_media RR parameter when handling BYEs, but since a
CANCEL is not an in-dialog request, I'm not sure what else to do,
short of storing state in htable or adding other complications I
don't want, except to call unforce_rtp_proxy()/rtpproxy_manage()
indiscriminately.
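
For concreteness, the pattern described above looks roughly like the sketch below. It is a simplified illustration rather than my actual config: the route layout, the t_check_trans()/t_relay() handling and the has_totag()/loose_route() check are assumptions, and it presumes that the tm, rr, siputils, textops and rtpproxy modules are loaded.

```
route {
    if (is_method("CANCEL")) {
        # A CANCEL carries no Route set, so there is no ;proxy_media
        # parameter to inspect. Release the session unconditionally.
        unforce_rtp_proxy();
        if (t_check_trans()) {
            t_relay();
        }
        exit;
    }

    if (has_totag() && loose_route()) {
        # In-dialog request: ;proxy_media is present in the Route set
        # only if rtpproxy was engaged on the initial INVITE.
        if (is_method("BYE") && check_route_param("proxy_media")) {
            unforce_rtp_proxy();
        }
        t_relay();
        exit;
    }

    # Initial requests are handled elsewhere (omitted here).
}
```

The BYE branch only contacts rtpproxy when the dialog actually used it, whereas the CANCEL branch contacts it every time, which is the part I suspect of tying up workers while the rtpproxy is down.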
Alex Balashov - Principal
Evariste Systems LLC
235 E Ponce de Leon Ave
Decatur, GA 30030
Web: http://www.evaristesys.com/, http://www.alexbalashov.com/