New subject: rtpengine port allocation & recycling [solved]

2 Feb 2016


      Hello,
I have a high-volume installation with 100% RTP relay using rtpengine.
We run rtpengine with a high and low port range of 8000 - 65000. 
Generally the peak concurrent channel count is around 2000-4000 ports, 
so in theory, so we should have enough ports for all calls currently in 
process.
In practice, however, we get a nearly constant stream of these errors 
from Kamailio:
Feb  2 15:52:58 sip /usr/local/sbin/kamailio[16438]: ERROR: rtpengine 
[rtpengine.c:1541]: rtpp_function_call(): proxy replied with error: 
Error rewriting SDP
When I trace this back to rtpengine logs on the server, I see:
---
Feb  2 15:55:47 rtphost rtpengine[14710]: 
[FSRIGZPJGXZWPNO.52364.907376806] Received command 'offer' from 
215.111.222.101:54100
Feb  2 15:55:47 rtphost rtpengine[14710]: 
[FSRIGZPJGXZWPNO.52364.907376806] Creating new call
Feb  2 15:55:47 rtphost rtpengine[14710]: 
[FSRIGZPJGXZWPNO.52364.907376806] Failed to get 2 consecutive UDP ports 
for relay
Feb  2 15:55:47 rtphost rtpengine[14710]: 
[FSRIGZPJGXZWPNO.52364.907376806] Error allocating media ports
Feb  2 15:55:47 rtphost rtpengine[14710]: 
[FSRIGZPJGXZWPNO.52364.907376806] Protocol error in packet from 
215.111.222.101:54100: Error rewriting SDP [d3:sdp285:v=0#015#012o=- 
1631132006999214513 4263823554000404972 IN IP4 
45.201.202.203#015#012s=VOS2009#015#012c=IN IP4 
45.201.202.203#015#012t=0 0#015#012m=audio 29034 RTP/AVP 18 8 0 
101#015#012a=rtpmap:18 G729/8000#015#012a=fmtp:18 
annexb=no#015#012a=rtpmap:8 PCMA/8000#015#012a=rtpmap:0 
PCMU/8000#015#012a=rtpmap:101 telephone-event/8000#015#012a=fmtp:101 
0-15#015#0123:ICE6:remove7:replacel6:origin18:session-connectione7:call-id31:FSRIGZPJGXZWPNO.52364.90737680613:received-froml3:IP415:215.111.222 
...
Feb  2 15:55:47 rtphost rtpengine[14710]: 
[FSRIGZPJGXZWPNO.52364.907376806] ... 
5.125e8:from-tag32:709abb985b675e1497597a5bd44ca2d67:command5:offere]
---
This is despite nearly no rtpengine engagement:
[root@csrprtpus1 log]# cat /proc/rtpengine/0/status
Refcount:    1
Control PID: 14709
Targets:     0
Buckets:     0
As far as I can see, the cause is that rtpengine is not deallocating ports:
---
[root@rtphost log]# lsof -n | grep -i rtp
rtpengine 14710        root *542u     IPv6         1082109949 
0t0        UDP *:39910
rtpengine 14710        root *543u     IPv6         1082109950 
0t0        UDP *:39911
rtpengine 14710        root *544u     IPv6         1082218768 
0t0        UDP *:41469
rtpengine 14710        root *545u     IPv6         1082058934 
0t0        UDP *:46424
rtpengine 14710        root *546u     IPv6         1082366799 
0t0        UDP *:22915
rtpengine 14710        root *547u     IPv6         1082288397 
0t0        UDP *:57245
rtpengine 14710        root *548u     IPv6         1082428020 
0t0        UDP *:24311
rtpengine 14710        root *549u     IPv6         1082366800 
0t0        UDP *:23266
rtpengine 14710        root *550u     IPv6         1082798144 
0t0        UDP *:22743
rtpengine 14710        root *551u     IPv6         1082605782 
0t0        UDP *:47845
rtpengine 14710        root *552u     IPv6         1082062515 
0t0        UDP *:52766
rtpengine 14710        root *553u     IPv6         1082648709 
0t0        UDP *:58266
rtpengine 14710        root *554u     IPv6         1082122218 
0t0        UDP *:54359
...
---
There are _lots_ of these:
[root@rtphost log]# lsof -n | grep -i rtpengine | grep 'UDP *:' | wc -l
56964
So, I suppose most of these just aren't being freed up.
These are short-duration calls, and the vast majority of them are 
CANCEL'd at various stages of completion. My approach to invocation of 
rtpengine in the config is:
1) rtpengine_offer() in branch_route[] on outbound call:
set_rtpengine_set("1");
    rtpengine_offer("replace-origin replace-session-connection ICE=remove");
    add_rr_param(";proxy_media=yes");
    setflag(PROXY_MEDIA);
This can happen multiple times, as many branches may be attempted before 
a gateway is found to take the call.
2) Symmetrical rtpengine_answer() in onreply_route:
if(isflagset(PROXY_MEDIA) && has_body("application/sdp")
       && $var(origin_b2bua) == 0) {
       set_rtpengine_set("1");
       rtpengine_answer("replace-origin replace-session-connection 
ICE=remove");
    }
3) rtpengine_delete() in event of CANCEL:
if(is_method("CANCEL")) {
       # Kill RTP stream if this call was using rtpengine.
set_rtpengine_set("1");
       rtpengine_delete();
# Implicitly forward the CANCEL and drop out of script
       # if the corresponding INVITE transaction exists.
if(!t_relay_cancel()) {
         # If we are still here, the INVITE transaction
         # was found but an error occurred.
sl_send_reply("500", "Internal Server Error");
         exit;
      }
# If we are still here, this is also bad because it means
      # the corresponding INVITE transaction no longer exists.
xlog("L_INFO", 'action=R-MAIN-REQUEST | ret=CANCEL_NO_TRANS_MATCH');
exit;
4) rtpengine_delete() in case of BYE:
if(has_totag()) {
       if(loose_route()) {
          if(is_method("BYE")) {
             if(check_route_param("proxy_media=yes")) {
                set_rtpengine_set("1");
                rtpengine_delete();
             }
          } else if(is_method("INVITE")) {
             # update rtpengine_offer() etc.
          }
...
       } else {
          ...
       }
    }
Any obvious problem here?
Thanks!
-- Alex
-- 
Alex Balashov | Principal | Evariste Systems LLC
303 Perimeter Center North, Suite 300
Atlanta, GA 30346
United States

Tel: +1-800-250-5920 (toll-free) / +1-678-954-0671 (direct)
Web: http://www.evaristesys.com/, http://www.csrpswitch.com/