[Serdev] Re: and another RTPPROXY outage

Thu Jun 30 18:03:57 UTC 2005

Can you start rtpproxy in the foreground (-f) with the -v option and log its 
output to a file? Then maybe I can spot the possible cause of this 
misbehaviour once it happens again.

Thanks!

Regards,

Maxim

Andres wrote:
> Hi Maxim,
> 
> We implemented rtpproxy with UDP sockets about 2 months ago according to 
> Jan's suggestion.  Today we had another outage, but this time at least 
> it only affected users needing the RTPPROXY (SER continued to operate 
> normally, which is great progress).
> 
> The symptom is the same as before.  The Recv-Q  buffer maxes out and 
> apparently locks up the process.  We simply see a whole bunch of lines 
> like this:
> 
> # netstat -pau
> Active Internet connections (servers and established)
> Proto Recv-Q Send-Q Local Address            Foreign Address  State  PID/Program name
> udp   255496      0 sitges.telesip.ne:48140  *:*                     974/rtpproxy
> udp        0      0 sitges.telesip.ne:48141  *:*                     974/rtpproxy
> udp   255496      0 sitges.telesip.ne:48142  *:*                     974/rtpproxy
> udp        0      0 sitges.telesip.ne:48143  *:*                     974/rtpproxy
> udp   141488      0 sitges.telesip.ne:48332  *:*                     974/rtpproxy
> udp   255496      0 sitges.telesip.ne:48334  *:*                     974/rtpproxy
> 
> Plus the SYSLOG gets filled with lines like:
> Jun 29 16:07:37 sitges /usr/local/sbin/ser[32267]: ERROR: send_rtpp_command: timeout waiting reply from a RTP proxy
> 
> 
> Killing the rtpproxy process and starting it again fixes the issue.  I 
> still do not see how we can reproduce it, since almost 2 months passed 
> since the last occurrence.  Do you have a version that can give more 
> debug info?  We would be glad to test it out in order to find a solution 
> to this.
> 
> Our version is
> # ./rtpproxy -v
> Basic version: 20040107
> Extension 20050322: Support for multiple RTP streams and MOH
> 
> Thanks,
> Andres.
> 
> 
> Andres wrote:
> 
>> Maxim Sobolev wrote:
>>
>>> Andres,
>>>
>>> What you are reporting is very strange indeed. Unfortunately your 
>>> report doesn't contain enough information to identify the source of 
>>> the problem. Is the problem reproducible?
>>
>>
>>
>> Hi Maxim,
>>
>> The problem is not reproducible but it has happened to someone else 
>> before.  Take a look at:
>> http://lists.iptel.org/pipermail/serusers/2005-April/017970.html
>>
>> The recommendation from Jan was to switch to UDP sockets.  We will be 
>> implementing that after a few tests so hopefully if rtpproxy blocks in 
>> the future, it will not take down SER with it.
>>
>> Thanks,
>> Andres
>>
>>>
>>> -Maxim
>>>
>>> Andres wrote:
>>>
>>>> Hi,
>>>>
>>>> After more than 2 years of flawlessly processing millions of calls 
>>>> and going through versions 0.8.10 all the way to 0.9.1, we had our 
>>>> first major SER outage yesterday.  One of our SER boxes stopped 
>>>> responding completely to any SIP messages (running on 0.9.1 from 
>>>> about 3 weeks ago).  We stopped and started SER multiple times, ran 
>>>> sniffer traces, turned on maximum debugging and all we could see was 
>>>> that SER did not respond to anything.  Not even "serctl moni" seemed 
>>>> to work.
>>>>
>>>> We finally ran the command "netstat -a -u -p", and saw about a dozen 
>>>> rtpproxy sockets that had exhausted the receive or send buffers 
>>>> (columns Recv-Q, or Send-Q).  After we killed and restarted 
>>>> rtpproxy, everything went back to normal.
>>>>
>>>> The questions now are:
>>>> 1.  Why would a problem in rtpproxy completely lock up SER?  Even 
>>>> after stopping and starting SER multiple times it was still blocked.
>>>> 2.  Why would rtpproxy lock up in the first place and exhaust the 
>>>> network buffers?  This is all UDP traffic so it's not like the other 
>>>> side was slow at sending ACKs or something.  UDP traffic should be 
>>>> received and sent out on the fly by rtpproxy.  (Network interface is 
>>>> 100Mbps full-duplex and it never went down or showed any problems).
>>>>
>>>> The box is running Red Hat ES3.0 on dual 3.6 Xeons and 2GB of 
>>>> memory.  We can confirm that no more than 20-25 calls were running 
>>>> via rtpproxy on it at the time of the incident and that no more than 
>>>> 2-3% of our total calls are handled by rtpproxy. If anybody can 
>>>> share any insights on what could be the cause to something like 
>>>> this, we would greatly appreciate it.
>>>>
>>>> Thanks,
>>>> Andres.
>>>> TeleSIP Network Admin
>>>>
>>>>
>>>>
>>>
>>
>> _______________________________________________
>> Serdev mailing list
>> serdev at lists.iptel.org
>> http://lists.iptel.org/mailman/listinfo/serdev
>>
>>
>
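
Since the symptom in this thread only shows up in netstat months apart, one
way to catch it early might be to scan the "netstat -pau" output periodically
for rtpproxy sockets whose Recv-Q stays high. The following is only a minimal
sketch and is not part of rtpproxy or SER: the names UdpSocket, parse_udp_rows
and backlogged, and the 65536-byte threshold, are assumptions made for
illustration. It also assumes each socket is printed on a single line, which
the wrapped output pasted in the mail does not guarantee.

```python
from typing import List, NamedTuple


class UdpSocket(NamedTuple):
    recv_q: int    # bytes queued for the process to read (Recv-Q column)
    send_q: int    # bytes queued for sending (Send-Q column)
    local: str     # local address:port
    program: str   # "PID/Program name" column, e.g. "974/rtpproxy"


def parse_udp_rows(netstat_output: str) -> List[UdpSocket]:
    """Extract the UDP rows from `netstat -pau`-style text.

    Header and banner lines are skipped because they do not start with "udp".
    """
    sockets = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # A UDP row has at least: proto, Recv-Q, Send-Q, local, foreign, pid/program
        if len(fields) < 6 or not fields[0].startswith("udp"):
            continue
        sockets.append(
            UdpSocket(int(fields[1]), int(fields[2]), fields[3], fields[-1])
        )
    return sockets


def backlogged(sockets: List[UdpSocket],
               program: str = "rtpproxy",
               threshold: int = 65536) -> List[UdpSocket]:
    """Return sockets owned by `program` whose Recv-Q is at or above `threshold`.

    The threshold is a hypothetical default; pick one that suits the
    receive buffer size on the box being monitored.
    """
    suffix = "/" + program
    return [s for s in sockets
            if s.program.endswith(suffix) and s.recv_q >= threshold]
```

Fed the six rows quoted above, backlogged() would flag the four sockets with
a non-zero Recv-Q. A cron job could pass it the live netstat output and raise
an alert, or restart rtpproxy, when the list is non-empty for several runs in
a row.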