[Serdev] Re: and another RTPPROXY outage
Maxim Sobolev
sobomax at portaone.com
Thu Jun 30 18:03:57 UTC 2005
Can you start rtpproxy in the foreground (-f) with the -v option and log
its output to a file? Then maybe I can spot the possible cause of this
misbehaviour once it happens again.
Thanks!
Regards,
Maxim
Andres wrote:
> Hi Maxim,
>
> We implemented rtpproxy with UDP sockets about 2 months ago according to
> Jan's suggestion. Today we had another outage, but this time at least
> it only affected users needing the RTPPROXY (SER continued to operate
> normally, which is great progress).
>
> The symptom is the same as before. The Recv-Q buffer maxes out and
> apparently locks up the process. We simply see a whole bunch of lines
> like this:
>
> # netstat -pau
> Active Internet connections (servers and established)
> Proto Recv-Q Send-Q Local Address            Foreign Address  State  PID/Program name
> udp   255496      0 sitges.telesip.ne:48140  *:*                     974/rtpproxy
> udp        0      0 sitges.telesip.ne:48141  *:*                     974/rtpproxy
> udp   255496      0 sitges.telesip.ne:48142  *:*                     974/rtpproxy
> udp        0      0 sitges.telesip.ne:48143  *:*                     974/rtpproxy
> udp   141488      0 sitges.telesip.ne:48332  *:*                     974/rtpproxy
> udp   255496      0 sitges.telesip.ne:48334  *:*                     974/rtpproxy
>
> Plus the SYSLOG gets filled with lines like:
> Jun 29 16:07:37 sitges /usr/local/sbin/ser[32267]: ERROR:
> send_rtpp_command: timeout waiting reply from a RTP proxy
>
>
> Killing the rtpproxy process and starting it again fixes the issue. I
> still do not see how we can reproduce it, since almost 2 months passed
> since the last occurrence. Do you have a version that can give more
> debug info? We would be glad to test it out in order to find a solution
> to this.
>
> Our version is
> # ./rtpproxy -v
> Basic version: 20040107
> Extension 20050322: Support for multiple RTP streams and MOH
>
> Thanks,
> Andres.
>
>
> Andres wrote:
>
>> Maxim Sobolev wrote:
>>
>>> Andres,
>>>
>>> What you are reporting is very strange indeed. Unfortunately your
>>> report doesn't contain enough information to identify the source of
>>> the problem. Is the problem reproducible?
>>
>>
>>
>> Hi Maxim,
>>
>> The problem is not reproducible but it has happened to someone else
>> before. Take a look at:
>> http://lists.iptel.org/pipermail/serusers/2005-April/017970.html
>>
>> The recommendation from Jan was to switch to UDP sockets. We will be
>> implementing that after a few tests so hopefully if rtpproxy blocks in
>> the future, it will not take down SER with it.
>>
>> Thanks,
>> Andres
>>
>>>
>>> -Maxim
>>>
>>> Andres wrote:
>>>
>>>> Hi,
>>>>
>>>> After more than 2 years of flawlessly processing millions of calls
>>>> and going through versions 0.8.10 all the way to 0.9.1, we had our
>>>> first major SER outage yesterday. One of our SER boxes stopped
>>>> responding completely to any SIP messages (running on 0.9.1 from
>>>> about 3 weeks ago). We stopped and started SER multiple times, ran
>>>> sniffer traces, turned on maximum debugging and all we could see was
>>>> that SER did not respond to anything. Not even "serctl moni" seemed
>>>> to work.
>>>>
>>>> We finally ran the command "netstat -a -u -p", and saw about a dozen
>>>> rtpproxy sockets that had exhausted the receive or send buffers
>>>> (columns Recv-Q, or Send-Q). After we killed and restarted
>>>> rtpproxy, everything went back to normal.
>>>>
>>>> The questions now are:
>>>> 1. Why would a problem in rtpproxy completely lock up SER? Even
>>>> after stopping and starting SER multiple times it was still blocked.
>>>> 2. Why would rtpproxy lock up in the first place and exhaust the
>>>> network buffers? This is all UDP traffic so it's not like the other
>>>> side was slow at sending ACKs or something. UDP traffic should be
>>>> received and sent out on the fly by rtpproxy. (Network interface is
>>>> 100Mbps full-duplex and it never went down or showed any problems).
>>>>
>>>> The box is running Red Hat ES3.0 on dual 3.6 Xeons and 2GB of
>>>> memory. We can confirm that no more than 20-25 calls were running
>>>> via rtpproxy on it at the time of the incident and that no more than
>>>> 2-3% of our total calls are handled by rtpproxy. If anybody can
>>>> share any insights on what could be the cause of something like
>>>> this, we would greatly appreciate it.
>>>>
>>>> Thanks,
>>>> Andres.
>>>> TeleSIP Network Admin
>>>>
>>>>
>>>>
>>>
>>
>> _______________________________________________
>> Serdev mailing list
>> serdev at lists.iptel.org
>> http://lists.iptel.org/mailman/listinfo/serdev
>>
>>
>
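The symptom reported in this thread (Recv-Q on rtpproxy UDP sockets climbing
towards the buffer limit) could be watched for with a small script, so the
lock-up is noticed before users are affected. The following is only a rough
sketch and is not part of rtpproxy or SER. It assumes netstat output in the
column layout quoted above. The function names, the SAMPLE text and the
200000-byte threshold are all made up for illustration and would need
tuning for a real system.

```python
# Hypothetical helper: flag UDP sockets whose queues look stuck.
# Assumes "netstat -pau"-style columns:
#   Proto Recv-Q Send-Q Local-Address Foreign-Address [State] PID/Program

SAMPLE = """\
Proto Recv-Q Send-Q Local Address            Foreign Address  State  PID/Program name
udp   255496      0 sitges.telesip.ne:48140  *:*                     974/rtpproxy
udp        0      0 sitges.telesip.ne:48141  *:*                     974/rtpproxy
udp   141488      0 sitges.telesip.ne:48332  *:*                     974/rtpproxy
"""


def parse_udp_line(line):
    """Return (recv_q, send_q, local_addr, program) for a udp line, else None."""
    fields = line.split()
    if len(fields) < 5 or not fields[0].startswith("udp"):
        return None
    try:
        recv_q, send_q = int(fields[1]), int(fields[2])
    except ValueError:
        return None
    # The PID/Program column is only present with -p; it contains a "/".
    program = fields[-1] if "/" in fields[-1] else ""
    return recv_q, send_q, fields[3], program


def stuck_sockets(netstat_output, program="rtpproxy", threshold=200000):
    """List (local_addr, recv_q, send_q) for sockets at or above threshold.

    The threshold is an illustrative guess, not a documented limit.
    """
    hits = []
    for line in netstat_output.splitlines():
        parsed = parse_udp_line(line)
        if parsed is None:
            continue
        recv_q, send_q, local_addr, prog = parsed
        if prog.endswith("/" + program) and max(recv_q, send_q) >= threshold:
            hits.append((local_addr, recv_q, send_q))
    return hits
```

Fed with real netstat output from cron, a non-empty result from
stuck_sockets() could trigger an alert or a restart of rtpproxy, which is
the manual workaround described above.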