[Serusers] Deadly embrace with rtpproxy - Is it necessary? Can it be turned off? - sr-users


Frank Durda IV

21 Aug 2008

8:23 p.m.

I'm running into a problem with rtpproxy on this point, quoting from the README:

- - - - - - - - - - - - after the session has been created, the proxy listens on the port it has allocated for that session and waits for receiving at least one UDP packet from each of two parties participating in the call. Once such packet is received, the proxy fills one of two ip:port structures associated with each call with source ip:port of that packet. When both structures are filled in, the proxy starts relaying UDP packets between parties; - - - - - - - - - - -
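Read literally, the README excerpt above describes a gate: nothing is relayed until both address slots are filled. A minimal Python sketch of that wording follows. It is a toy model of the quoted text only, not rtpproxy's actual code, and the class name, method name and addresses are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Addr = Tuple[str, int]  # (ip, port)


@dataclass
class ReadmeSession:
    """Toy model of the relay rule as the README excerpt words it (hypothetical)."""
    caller: Optional[Addr] = None  # filled by the first packet seen from the caller
    callee: Optional[Addr] = None  # filled by the first packet seen from the callee
    relayed: int = 0
    dropped: int = 0

    def on_packet(self, side: str, src: Addr) -> Optional[Addr]:
        """Record the sender's address; return the relay target, or None if dropped."""
        if side == "caller":
            self.caller = src
        else:
            self.callee = src
        # The gate: both slots must be filled before anything is relayed.
        if self.caller is None or self.callee is None:
            self.dropped += 1
            return None
        self.relayed += 1
        return self.callee if side == "caller" else self.caller


if __name__ == "__main__":
    s = ReadmeSession()
    for _ in range(3):
        s.on_packet("callee", ("10.0.0.2", 16004))  # ring-back arrives, caller is silent
    print("relayed:", s.relayed, "dropped:", s.dropped)
```

Under this reading, a caller that waits for audio before sending any never fills its slot, so every callee packet is dropped.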

However, a number of clients frequently fail to emit any audio when originating a call until they hear something from the TDM gateway, such as ring-back or the called party answering. So although rtpproxy is receiving a stream of audio, such as a voice mail menu robot, the calling party can't hear any of it unless they happen to make some noise or randomly and blindly press a DTMF key. This seems to be made worse on links with silence suppression, so there is no background noise to trigger two-way audio. This is being encountered between Class 4 carriers, so we don't have the option to get someone to adjust their phone/PBX settings or have them breathe heavier.

Is there a setting adjustment to get rtpproxy to just pass the RTP packets from directed calling and called sources even if one party hasn't happened to make noise yet?

I personally don't understand why this requirement for seeing audio from both sides before starting the flow in either direction if audio starts coming in even exists. It seems to have no benefit but is bound to cause this deadly embrace problem in many situations that may be beyond the control of the owners of the equipment passing traffic along to the site where rtpproxy is in use.

Suggestions? Fix? I have looked at the latest snapshot of rtpproxy and the README is unchanged since 1.1 so apparently this behavior is still the same.

Thanks in advance!


Valentin Nechayev

21 Aug

8:44 p.m.


Hi,

...

...
...
...
...
Frank Durda IV frank.durda@hypercube-llc.com wrote:

...

I'm running into a problem with rtpproxy on this point, quoting from the README:

...

after the session has been created, the proxy listens on the port it has allocated for that session and waits for receiving at least one UDP packet from each of two parties participating in the call. Once such packet is received, the proxy fills one of two ip:port structures associated with each call with source ip:port of that packet. When both structures are filled in, the proxy starts relaying UDP packets between parties;

...

However, a number of clients frequently fail to emit any audio when originating a call until they hear something from the TDM gateway, such as ring-back or the called party answering.

If rtpproxy receives a packet from one side, it relays the packet to other side. But target address for this can be incorrect. Your problems can be tied with NAT or asymmetric implementation, in which packets are waited on one address (declared in SDP) but sent from another address.

...

So although rtpproxy is receiving a stream of audio, such as a voice mail menu robot, the calling party can't hear any of it unless they happen to make some noise or randomly and blindly press a DTMF key.

Sounds very similar to incorrect SDP announce or NAT issue. You should verify this using packet dump.

It shall be also noted that some gateways have some strange policy against changing remote ("remote" - for them, "local" - for other side) address. I have seen this on Cisco gateways of 53xx/54xx series: if address declared in SDP has changed, Cisco rejects to send to new address (and continues to send to previously known one) until something is received from new address. Now PortaOne version of rtpproxy has special runtime option to enable "RTP invites" which are sent to remote side until some packet is received from it. This fixes work with Cisco.

...

Is there a setting adjustment to get rtpproxy to just pass the RTP packets from directed calling and called sources even if one party hasn't happened to make noise yet?

As declared above, rtpproxy does it by default.

-- Valentin Nechayev PortaOne Inc., Software Engineer mailto:netch@portaone.com

Frank Durda IV

9:27 p.m.

Valentin Nechayev wrote:

...

If rtpproxy receives a packet from one side, it relays the packet to other side. But target address for this can be incorrect. Your problems can be tied with NAT or asymmetric implementation, in which packets are waited on one address (declared in SDP) but sent from another address.

Preface, these Class 4 sites (TDM gateways) are configured as a NAT environment and have been working for some months without issue (or without this particular issue being detected and identified), until we happened to pick up traffic that is originating in a network where they use silence suppression (likely originally G.729/G.726 but converted to G.711 by the time we get it). Absolutely NO RTP packets are being received from the calling party until they make a lot of noise, so there is nothing for rtpproxy to forward to the called party, and so half of the rtpproxy proxy activation criteria is not met.

According to the rtpproxy README file that I quoted in the original mail, rtpproxy will not forward packets coming from one direction (in this case, the called/answering party) at all until audio packets are received from the calling party, which won't happen because the calling party can't hear the called party and won't grunt or cough or something because they only hear silence from the called party, not even ring-back. They then hang up and call their local carrier and complain that the service sucks, and here we are.

In this scenario, debug shows SER and rtpproxy have exchanged IP and port info and rtpproxy is listening. Tcpdump shows the ring-back and called party audio is reaching rtpproxy (in-progress ring-back audio), but is all being discarded by rtpproxy and not forwarded. Tcpdump also shows no packets coming from the calling party, because they don't think the called party has answered, much less been rung. It is a deadly embrace, and unfortunately the rtpproxy documentation basically says "Yes, it does that", even though this is improper behavior.

So the question remains, is there a setting or compile option in rtpproxy to force it to start forwarding packets from either party of a call once rtpproxy has received the addresses and port information from SER for that call? We already know rtpproxy is getting the right values, we just want it to start doing its job immediately even if one of the two parties doesn't generate any audio instantly.

If rtpproxy as it stands can't do this, I either have to rewrite rtpproxy or replace it with a commercial appliance that doesn't have this "Must see two-way audio before I allow two-way audio" criterion. That requirement of rtpproxy, and the behavior it causes, are not RFC compliant.

Valentin Nechayev

9:57 p.m.


Hi,

...

...
...
...
...
Frank Durda IV frank.durda@hypercube-llc.com wrote:

...

...
If rtpproxy receives a packet from one side, it relays the packet to other side. But target address for this can be incorrect. Your problems can be tied with NAT or asymmetric implementation, in which packets are waited on one address (declared in SDP) but sent from another address.

Preface, these Class 4 sites (TDM gateways) are configured as a NAT environment

What side is behind NAT and what is outside?

...

and have been working for some months without issue (or without this particular issue being detected and identified), until we happened to pick up traffic that is originating in a network where they use silence suppression (likely originally G.729/G.726 but converted to G.711 by the time we get it). Absolutely NO RTP packets are being received from the calling party until they make a lot of noise, so there is nothing for rtpproxy to forward to the called party, and so half of the rtpproxy proxy activation criteria is not met.

The problem with NAT and VAD is that if the party behind NAT doesn't conform to conditions for such source, the party outside NAT isn't able to determine address to send. To talk with party behind NAT, the following conditions shall be met:

1. The party behind NAT is symmetric both for signaling and for media. Here, a party is symmetric if the source address of its packets and its declared target address are the same.

2. Either 1) the party behind NAT is able to determine its external address (most probably by STUN requests), or 2) it shall send RTP immediately without delay - so the other party is able to determine external address from real packet.
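The two conditions above can be restated as a small decision function. This is only a sketch of the reasoning in this message. The function names, the `declared_is_external` flag and the sample addresses are assumptions made for illustration, not anything taken from rtpproxy or SER.

```python
from typing import Optional, Tuple

Addr = Tuple[str, int]  # (ip, port)


def is_symmetric(declared: Addr, observed_src: Addr) -> bool:
    """Condition 1: packets come from the same address that was declared."""
    return declared == observed_src


def media_target(declared: Addr,
                 declared_is_external: bool,
                 observed_src: Optional[Addr]) -> Optional[Addr]:
    """Where the outside party can send media to a peer behind NAT.

    Condition 2: either the declared address is already the external one
    (for example learned via STUN), or a real packet has been seen and its
    source address can be used. Otherwise there is nowhere to send yet.
    """
    if observed_src is not None:
        return observed_src          # learned from a real packet
    if declared_is_external:
        return declared              # peer advertised its external address
    return None                      # private address declared, nothing received


if __name__ == "__main__":
    private = ("10.0.0.5", 16004)    # hypothetical address declared in SDP
    # Peer uses VAD and has sent nothing: no usable target.
    print(media_target(private, declared_is_external=False, observed_src=None))
```

The `None` case is the situation described here: a peer behind NAT that declares a private address and stays silent leaves the other side with no usable destination.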

Your description is very close to situation if one party is behind NAT, doesn't use STUN or is unable to use it (i.e. NAT is truly symmetric). If the media originated by this party stops due to VAD or another similar technique, the effect will be exactly as you describe.

...

According to the rtpproxy README file that I quoted in the original mail, rtpproxy will not forward packets coming from one direction (in this case, the called/answering party) at all until audio packets are received from the calling party,

Seems your reading is wrong. I have analyzed rtpproxy code detailedly. Details in different versions can differ (Maxim refactored it very deeply some time ago), but the common algorithm is unchanged. You can see it in source code. On creating or updating session, in handle_command(), there is initial address filling (just after writeport: label) in all cases except address is null (0.0.0.0 for IPv4). You can also check it in log - there shall be messages like "Pre-filling caller/callee address with..."

...

which won't happen because the calling party can't hear the called party and won't grunt or cough or something because they only hear silence from the called party, not even ring-back. They then hang up and call their local carrier and complain that the service sucks, and here we are. In this scenario, debug shows SER and rtpproxy have exchanged IP and port info and rtpproxy is listening. Tcpdump shows the ring-back and called party audio is reaching rtpproxy (in-progress ring-back audio), but is all being discarded by rtpproxy and not forwarded. Tcpdump also shows no packets coming from the calling party, because they don't think the called party has answered, much less been rung.

Where did you get this dump? If rtpproxy sends packets to bogus address, this can be detected on the rtpproxy host itself, not some intermediate point.

...

It is a deadly embrace, and unfortunately the rtpproxy documentation basically says "Yes, it does that", even though this is improper behavior.

We work with rtpproxy 4+ years. There is no such desired behavior in it, even if you read such in its documentation. It can be bug, but not intentional behavior. Please try to investigate as detailedly as possible.

-- Valentin Nechayev PortaOne Inc., Software Engineer mailto:netch@portaone.com

Frank Durda IV

3 Dec

10:20 p.m.

Sorry for the long delay on this topic. Personal matters prevented me from investigating for a while and the original occurrence of the problem went away for the first client (because they changed something on their end), but now I have another apparently-identical problem with calls originating from a big-iron switch who isn't willing to change their settings like the first client did.

The problem is easily reproducible for those of you who wish to hear (or not hear) it themselves. (instructions below)

To recap, the problem is that per rtpproxy's own documentation rtpproxy won't pass a packet in either direction for a given session until it has received audio from both sides of a given session. The README says:

- after the session has been created, the proxy listens on the port it has allocated for that session and waits for receiving at least one UDP packet from each of two parties participating in the call. Once such packet is received, the proxy fills one of two ip:port structures associated with each call with source ip:port of that packet. When both structures are filled in, the proxy starts relaying UDP packets between parties;

This is exactly the behavior I am experiencing, but it is causing interoperability problems and really needs to change.

In my case, there is a new calling switch/SBC that won't send rtpproxy any audio packets when it places a call to us until it receives audio for that call from us first. I can see the switch on my side sending the RTP to the port that rtpproxy stipulated (following a 183 or 200 message), and absolutely none of that audio being sent out of rtpproxy and onto the calling switch/SBC. In other words, it is behaving exactly like what the README statement says.

Meanwhile, the equipment at other clients that send me calls usually send at least one RTP packet to rtpproxy shortly after receiving the 183 message (certainly by the time the 200 message shows up), and so two-way audio commences for that call. We have always noted a slight truncation of initial audio, but apparently no one is upset by that, but I suspect that was a result of rtpproxy behavior. This noted behavior also supports the statement in the README that audio must come from both directions before audio is sent in any direction by rtpproxy.

So, the original question was (and now still is), is there a way to disable this "both sides must send" criteria of rtpproxy? I don't see a technical reason for it, and I have a client with a NexTone or a Sonus (or perhaps both) and when they send us calls, there is no audio. Now, there may be some knob they can turn on their system to not wait to send audio to the called party (of which our rtpproxy is in the middle somewhere) until audio is received from the called party (as is happening now), but some of these companies see no need to alter their settings in such a way. It would be better if I could fix this behavior in rtpproxy to allow audio in both directions from the moment that it selects the ports to listen to for a given call and hands those back to SER.

NOTE: We do run rtpproxy with two interfaces, and did have to alter SER* to actually pay attention to passed address parameters passed to force_rtp_proxy(), as the stock logic in SER quietly ignored a provided IP address if two interfaces are in use. I also had to change fix_nated_contact() to accept and use an IP address provided if one was passed as a parameter because the value SER came up with was always wrong for our environment. We've had those two fixes in SER for eight months and it works fine with numerous clients. Rtpproxy is handed the right IP address for each interface and customer, along with desired settings to force_send_socket(), rewritehost() and fix_nated_contact(), and everybody gets the IP addresses that are correct for their network. It works, although I will admit ser.cfg is a bit klunky due to the lack of a way in SER to pass non-constants, eg "variables" to functions, but we manage. *These alterations were all in SER nathelper and rtpproxy is untouched. SER nathelper patches available on request.

However, we have this condition that rtpproxy (as claimed by its own documentation), won't pass traffic unless it receives at least one packet from both parties, which probably isn't going to happen in this situation because the maker of that equipment says they aren't doing anything that violates RFC in waiting to receive audio from the called system before sending audio to the called system and so see no need to change their stuff.

Let's see, someone asked earlier which rtpproxy interface is which? Well we ended up with "e" being the side facing our equipment and "i" being the side facing the rest of the world. That seemed backwards, but it worked first and the more logical combination did not for some reason that was never investigated. I suspect either way would work because interfaces are different netblocks in the 10.x.x.x space. The interface details are:

(internal/trusted side - no further NAT conversions to reach in-house switches)
em4: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    options=1b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING>
    inet 10.33.168.18 netmask 0xffffffc0 broadcast 10.33.168.63

(external/untrusted side)
em5: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
    options=1b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING>
    inet 10.83.168.2 netmask 0xffffffc0 broadcast 10.83.168.63

Destination        Gateway            Flags    Refs        Use  Netif Expire
default            10.83.168.1        UGS         0 6480075550    em5

(The default route causes packets with public IPaddrs destinations to flow to the router and out to the Internet. On outbound packets, the router replaces the source IP with a matched public IP addr, but the destination is not altered, nor is the packet content.)

/usr/local/bin/rtpproxy -l 10.33.168.18/10.83.168.20
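For readers who want to sanity-check the interface details above, here is a short sketch using Python's standard `ipaddress` module. It converts the hex netmask, derives the network and broadcast address, and tests whether an address falls inside that netblock. The helper name `describe` is made up. The check says nothing about which addresses are actually configured on the interfaces.

```python
import ipaddress


def describe(addr: str, hex_mask: int):
    """Return (network, broadcast) for an address and an ifconfig-style hex netmask."""
    mask = ipaddress.ip_address(hex_mask)             # 0xffffffc0 -> 255.255.255.192
    iface = ipaddress.ip_interface(f"{addr}/{mask}")  # netmask notation is accepted
    return iface.network, iface.network.broadcast_address


em4 = describe("10.33.168.18", 0xFFFFFFC0)  # internal/trusted side, as posted
em5 = describe("10.83.168.2", 0xFFFFFFC0)   # external/untrusted side, as posted

if __name__ == "__main__":
    for name, (net, bcast) in (("em4", em4), ("em5", em5)):
        print(name, net, "broadcast", bcast)
    # Default gateway and the second -l address from the command line, as posted.
    for probe in ("10.83.168.1", "10.83.168.20"):
        print(probe, "in em5 netblock:", ipaddress.ip_address(probe) in em5[0])
```

Both netblocks are /26, which matches the broadcast addresses in the ifconfig output. The second address on the `-l` command line (10.83.168.20, as posted) lies inside the em5 netblock but differs from the em5 address shown (10.83.168.2).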

As you see, the external/untrusted interface is also where the default route is, so that public IPaddrs will flow to the directly-connected router. Inbound RTP is NATed from a public address to an internal one on the rtpproxy interface em5. Again, all that has been passing millions of calls for months now, but now I've got one customer who won't send audio until we send audio, and apparently rtpproxy won't do this for some reason. All IP addresses check out as being reachable, it's just that nothing comes out of rtpproxy. Since the packet doesn't even come out of rtpproxy, the lack of audio can't be blamed on NAT, since there are no packets being emitted in the first place that a NAT conversion might mess up. That isn't it. The problem appears to be rtpproxy, and the documentation says it is supposed to behave like this. The question is, why?

I am able to easily replicate the problem using an Asterisk box as the PBX for a Cisco SIP phone, by having the router immediately in front of the SER/rtpproxy server block all incoming UDP but port 5060 for packets coming from the asterisk server. Allow outgoing everything.

Remove the port block/access-list rule and the phone works fine. Add the block of RTP and calls set up and answer, but both parties hear nothing, not even 183 ring-back, even though the caller should be able to hear the called party. (All other calls occurring simultaneously are unaffected since I only blocked incoming RTP from one call source.) tcpdump running on all the interfaces on the SER/rtpproxy server also shows audio coming in to rtpproxy on the one side but not being emitted on the other side.

Now, wait ten seconds or so after you know the called party has answered (so far no audio for either party), and remove the incoming RTP block and ding! You now have two-way audio, even though you should have had one-way audio up until the moment the inbound RTP block was removed. This basically emulates the behavior of the switch that won't send audio unless it receives audio and how this puts the calling switch and rtpproxy in a deadly embrace.

So that seems to demonstrate that rtpproxy is behaving exactly as documented, but that this is terrible behavior. What can be done to fix this?

(tcpdumps of the sample calls available on request)

Thanks in advance for staying awake and bonus points for suggesting a fix or coming up with one!

Andres

4 Dec

4:08 p.m.

Frank Durda IV wrote:

...

Sorry for the long delay on this topic. Personal matters prevented me from investigating for a while and the original occurrence of the problem went away for the first client (because they changed something on their end), but now I have another apparently-identical problem with calls originating from a big-iron switch who isn't willing to change their settings like the first client did.

The problem is easily reproducible for those of you who wish to hear (or not hear) it themselves. (instructions below)

To recap, the problem is that per rtpproxy's own documentation rtpproxy won't pass a packet in either direction for a given session until it has received audio from both sides of a given session. The README says:

- after the session has been created, the proxy listens on the port it has allocated for that session and waits for receiving at least one UDP packet from each of two parties participating in the call. Once such packet is received, the proxy fills one of two ip:port structures associated with each call with source ip:port of that packet. When both structures are filled in, the proxy starts relaying UDP packets between parties;

I do not think this readme reflects the code as it works today. At least that is my observation with version 1.0.2. As soon as the rtpproxy receives audio from one end, it will start relaying to the other end (to the port advertised in the SDP). Once it receives audio from that end it will update the session port and continue relaying the audio to the updated port. I know you have run tests that show otherwise, so the only thing I can suggest is to check the actual rtpproxy version you are running.

A great way to troubleshoot the issue is to run rtpproxy in the foreground and track the messages that show the port being updated (they start with the advertised SDP ports):

pre-filling caller's address with 192.116.246.234:41000
pre-filling callee's address with 192.116.246.235:20000

Then when it sees the actual packets coming in from a different source port, it updates the address and we see it in the log like:

callee's address filled in: 192.116.246.235:1024

You should track first of all that the 'pre-filling' is done per SDP Advertised Ports. Then when your end starts sending RTP, you should see the 'caller's address filled' message and audio should start flowing from that end to the advertised SDP port of the other end. Finally when the other end sends Audio, you will see the last 'callee's address filled message'.
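The behavior described in this reply (pre-fill from SDP, relay at once, then update when a real packet arrives) can be modelled in a few lines of Python. This is an illustrative toy and not rtpproxy's implementation. The class and method names are invented. The log strings imitate the messages quoted above, and the toy only logs a "filled in" line when the address actually changes, which may differ from the real daemon.

```python
from typing import Dict, List, Optional, Tuple

Addr = Tuple[str, int]  # (ip, port)


class PrefillSession:
    """Toy model: targets are pre-filled from SDP and updated from real packets."""

    def __init__(self, caller_sdp: Optional[Addr], callee_sdp: Optional[Addr]):
        self.addr: Dict[str, Optional[Addr]] = {"caller": caller_sdp, "callee": callee_sdp}
        self.log: List[str] = []
        for side, a in self.addr.items():
            if a is not None:
                self.log.append(f"pre-filling {side}'s address with {a[0]}:{a[1]}")

    def on_packet(self, side: str, src: Addr) -> Optional[Addr]:
        """Latch the sender's real address; return where the packet is relayed."""
        if self.addr[side] != src:
            self.addr[side] = src
            self.log.append(f"{side}'s address filled in: {src[0]}:{src[1]}")
        other = "callee" if side == "caller" else "caller"
        return self.addr[other]  # None means no target is known, so the packet is dropped


if __name__ == "__main__":
    s = PrefillSession(("192.116.246.234", 41000), ("192.116.246.235", 20000))
    print(s.on_packet("caller", ("192.116.246.234", 41000)))  # relayed to the SDP port
    print(s.on_packet("callee", ("192.116.246.235", 1024)))   # real source port differs
    print(s.on_packet("caller", ("192.116.246.234", 41000)))  # now goes to the updated port
    print("\n".join(s.log))
```

In this model one-way audio flows as soon as either side sends, provided the other side's address was pre-filled. If no pre-fill happened, packets have no target until the other side is heard from.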

Andres http://www.telesip.net

Frank Durda IV

5:22 p.m.

Andres wrote:

...

I do not think this readme reflects the code as it works today. At least that is my observation with version 1.0.2. As soon as the rtpproxy receives audio from one end, it will start relaying to the other end (to the port advertised in the SDP). Once it receives audio from that end it will update the session port and continue relaying the audio to the updated port. I know you have run tests that show otherwise, so the only thing I can suggest is to check the actual rtpproxy version you are running.

I am not entirely sure which version I have got. The tar ball extracted a directory with the name rtpproxy-1.1, the Makefile says:

PACKAGE_NAME = rtpproxy
PACKAGE_STRING = rtpproxy 1.1.beta.20071218
PACKAGE_TARNAME = rtpproxy
PACKAGE_VERSION = 1.1.beta.20071218

and there are other random CVS version tags like

 * $Id: main.c,v 1.62 2008/02/04 08:38:05 sobomax Exp $

but no obvious "THIS IS VERSION X" message, such as part of the usage() output.

Based on this I would suspect I have a newer version than what you have.

...

A great way to troubleshoot the issue is to run rtpproxy in the foreground and track the messages that show the port being updated (they start with the advertised SDP ports):

pre-filling caller's address with 192.116.246.234:41000
pre-filling callee's address with 192.116.246.235:20000

Then when it sees the actual packets coming in from a different source port, it updates the address and we see it in the log like:

callee's address filled in: 192.116.246.235:1024

You should track first of all that the 'pre-filling' is done per SDP Advertised Ports. Then when your end starts sending RTP, you should see the 'caller's address filled' message and audio should start flowing from that end to the advertised SDP port of the other end. Finally when the other end sends Audio, you will see the last 'callee's address filled message'.

I will see if I can run it this way on a test system. Thanks!

Andres

5:44 p.m.

...

but no obvious "THIS IS VERSION X" message, such as part of the usage() output.

Based on this I would suspect I have a newer version than what you have.

yes. This is what we have: * $Id: main.c,v 1.48.2.3 2008/03/18 05:19:20 sobomax Exp $

Andres http://www.telesip.net

...

Frank Durda IV

8:12 p.m.

Here is the output from rtpproxy interspersed with comments about what was going on externally, as well as lsof output:

System quiet, no calls in progress. Placing call:

:received command "UE 565aabb151708c9c46c744d2400a454b@72.66.211.59 72.66.211.59 19994 as3257bf0a;1"
:new session 565aabb151708c9c46c744d2400a454b@72.66.211.59, tag as3257bf0a;1 requested, type strong
:new session on a port 35008 created, tag as3257bf0a;1
:sending reply "35008 10.31.168.18 :"
:received command "L 565aabb151708c9c46c744d2400a454b@72.66.211.59 10.131.0.2 16004 as3257bf0a;1 000a0283+1+4e00002+48ad7f8b "
:lookup on ports 35008/35008, session timer restarted
:sending reply "35008 10.81.168.2 :"
:callee's address filled in: 10.31.168.2:16004 (RTP)
:guessing RTCP port for callee to be 16005
:
:(lsof says)
:rtpproxy 68032 root 5u IPv4 0xffffff0026117850 0t0 UDP 10.31.168.18:35008
:rtpproxy 68032 root 6u IPv4 0xffffff00113d6980 0t0 UDP 10.31.168.18:35009
:rtpproxy 68032 root 7u IPv4 0xffffff001fed65f0 0t0 UDP 10.81.168.2:35008
:rtpproxy 68032 root 8u IPv4 0xffffff0015324720 0t0 UDP 10.81.168.2:35009

10.81.168.2 faces the calling party (router, then asterisk then cisco phone) 10.31.168.18 faces the PSTN gateway switch (called party is there)
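To make the logged control commands easier to compare across calls, here is a rough Python sketch that splits them into fields. The field names (`call_id`, `ip`, `port`, `tags`) and the treatment of the letters after the first as modifiers are my reading of the lines logged above, not a statement of the rtpproxy control protocol.

```python
from typing import NamedTuple, Optional, Tuple


class Command(NamedTuple):
    op: str                 # first letter of the verb, e.g. "U", "L" or "D"
    modifiers: str          # any letters after it, e.g. "E" in "UE"
    call_id: str
    ip: Optional[str]       # address carried by U/L commands, as logged
    port: Optional[int]
    tags: Tuple[str, ...]   # remaining whitespace-separated fields


def parse_command(line: str) -> Command:
    """Split one logged command string into fields (assumed layout, see above)."""
    fields = line.split()
    verb, rest = fields[0], fields[1:]
    op, modifiers = verb[0], verb[1:]
    call_id = rest[0]
    if op in ("U", "L"):
        return Command(op, modifiers, call_id, rest[1], int(rest[2]), tuple(rest[3:]))
    return Command(op, modifiers, call_id, None, None, tuple(rest[1:]))


if __name__ == "__main__":
    cmd = parse_command(
        "UE 565aabb151708c9c46c744d2400a454b@72.66.211.59 72.66.211.59 19994 as3257bf0a;1"
    )
    print(cmd)
```

Parsing the "UE" and "L" lines this way puts the address and port that SER handed to rtpproxy for each leg side by side, so they can be compared with the "filled in" messages.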

Called number ringing. Despite letting called number ring several times, calling party hears no ring back audio or any other audio. The Ring-back audio is being sent to rtpproxy (tcpdump shows this), but is being thrown away, despite "filled in" thing above. Stats below from rtpproxy also show large discard.

Call is now answered, and neither party can hear audio from one another. No new messages are emitted from rtpproxy. Audio is being sent to rtpproxy by called system, but is being thrown away. (Calling system audio is blocked by the router as part of the test.)

This state demonstrates the "neither party will blink" deadly embrace scenario, if the calling system refused to send audio until it received audio from the called direction, which would be a rtpproxy daemon, or worse, two rtpproxy daemons facing each other.

Now inbound RTP audio from the calling party is allowed by removing the router restriction on incoming UDP packets to ports other than 5060:

:caller's address filled in: 72.66.211.59:19994 (RTP)
:guessing RTCP port for caller to be 19995
:
:(lsof says)
:rtpproxy 68032 root 5u IPv4 0xffffff0026117850 0t0 UDP 10.31.168.18:35008
:rtpproxy 68032 root 6u IPv4 0xffffff00113d6980 0t0 UDP 10.31.168.18:35009
:rtpproxy 68032 root 7u IPv4 0xffffff001fed65f0 0t0 UDP 10.81.168.2:35008
:rtpproxy 68032 root 8u IPv4 0xffffff0015324720 0t0 UDP 10.81.168.2:35009

Both parties can now instantly hear each other's audio. This demonstrates that BOTH sides must be filled in before the audio passes for either, just as documented (but terrible behavior).

After a minute or so, Calling party goes on hook:

:received command "D 565aabb151708c9c46c744d2400a454b@72.66.211.59 as3257bf0a 000a0283+1+4e00002+48ad7f8b"
:forcefully deleting session 1 on ports 35008/35008
:RTP stats: 2753 in from callee, 1459 in from caller, 2920 relayed, 1292 dropped
:RTCP stats: 10 in from callee, 6 in from caller, 12 relayed, 4 dropped
:session on ports 35008/35008 is cleaned up
:sending reply "0 :"
:
:lsof shows no IPV4 ports open.

Called phone goes on hook. End of session.

Note that even the rtpproxy stats show that it was throwing away most of the audio coming from called party (this is G.711 so packet counts after going off-hook should be about the same), which shouldn't be the case if it allows the audio to pass as soon as it gets the details from the called party, details that arrive before or just as the audio starts to arrive (when 183 is sent).

I think I have now demonstrated the problem in several convincing ways. What is needed is a fix. Or is it just as simple as ripping out the if (sp->complete != 0) { } in main.c? I feel a tad uncomfortable about the side-effects of doing that, but it looks somewhat underprotected anyway.
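The arithmetic behind the stats line can be checked with a short Python sketch. The regular expression is written against the one "RTP stats" line logged above and is not guaranteed to match other rtpproxy versions. The function name `summarize` is made up.

```python
import re

STATS = re.compile(
    r"RTP stats: (\d+) in from callee, (\d+) in from caller, "
    r"(\d+) relayed, (\d+) dropped"
)


def summarize(line: str) -> dict:
    """Extract the four counters and derive a few totals from them."""
    m = STATS.search(line)
    if m is None:
        raise ValueError("not an RTP stats line")
    callee, caller, relayed, dropped = map(int, m.groups())
    return {
        "received": callee + caller,
        "relayed": relayed,
        "dropped": dropped,
        "balanced": callee + caller == relayed + dropped,  # every packet accounted for
        "callee_surplus": callee - caller,                 # extra packets from the callee
        "dropped_vs_callee": dropped / callee,
    }


if __name__ == "__main__":
    line = ":RTP stats: 2753 in from callee, 1459 in from caller, 2920 relayed, 1292 dropped"
    for key, value in summarize(line).items():
        print(key, value)
```

For the call above the counters balance (2753 + 1459 = 2920 + 1292 = 4212). The callee sent 1294 more packets than the caller, close to the 1292 dropped, and the dropped count is roughly 47% of what the callee sent. The log does not say which side's packets were dropped, so attributing them to the callee is an inference from the test setup.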

Greger Viken Teigre

10:03 p.m.

Look for direction=active or passive in SDP. You signal to the UA whether they should expect an active or passive UA. See http://www.iptel.org/ser/doc/modules/nathelper

As I said, waiting for media makes sense if you believe that the user agent is behind a NAT and the NAT may change the original port. The only way to get through is to wait until you get a packet and send back to the src port, hoping the NAT is symmetric. g-)

On Thu, 04 Dec 2008 21:12:28 +0100, Frank Durda IV frank.durda@hypercube-llc.com wrote:

...

Here is the output from rtpproxy interspersed with comments about what was going on externally, as well as lsof output:

System quiet, no calls in progress. Placing call:

:received command "UE 565aabb151708c9c46c744d2400a454b@72.66.211.59 72.66.211.59 19994 as3257bf0a;1" :new session 565aabb151708c9c46c744d2400a454b@72.66.211.59, tag as3257bf0a;1 requested, type strong :new session on a port 35008 created, tag as3257bf0a;1 :sending reply "35008 10.31.168.18 :" :received command "L 565aabb151708c9c46c744d2400a454b@72.66.211.59 10.131.0.2 16004 as3257bf0a;1 000a0283+1+4e00002+48ad7f8b " :lookup on ports 35008/35008, session timer restarted :sending reply "35008 10.81.168.2 :" :callee's address filled in: 10.31.168.2:16004 (RTP) :guessing RTCP port for callee to be 16005 : :(lsof says) :rtpproxy 68032 root 5u IPv4 0xffffff0026117850 0t0 UDP 10.31.168.18:35008 :rtpproxy 68032 root 6u IPv4 0xffffff00113d6980 0t0 UDP 10.31.168.18:35009 :rtpproxy 68032 root 7u IPv4 0xffffff001fed65f0 0t0 UDP 10.81.168.2:35008 :rtpproxy 68032 root 8u IPv4 0xffffff0015324720 0t0 UDP 10.81.168.2:35009

10.81.168.2 faces the calling party (router, then asterisk then cisco phone) 10.31.168.18 faces the PSTN gateway switch (called party is there)

Called number ringing. Despite letting called number ring several times, calling party hears no ring back audio or any other audio. The Ring-back audio is being sent to rtpptroxy (tcpdump shows this), but is being thrown away, despite "filled in" thing above. Stats below from rtpproxy also show large discard.

Call is now answered, and neither party can hear audio from one another. No new messages are emitted from rtpproxy. Audio is being sent to rtpptroxy by called system, but is being thrown away. (Calling system audio is blocked by the router as part of the test.)

This state demonstrates the "neither party will blink" deadly embrace scenario, if the calling system refused to send audio until it received audio from the called direction, which would be a rtpproxy daemon, or worse, two rtpproxy daemons facing each other.

Now, allowing inbound RTP audio is allowed from calling party by removing the router restriction on incoming UDP packets to ports other than 5060:

:caller's address filled in: 72.66.211.59:19994 (RTP) :guessing RTCP port for caller to be 19995 : :(lsof says) :rtpproxy 68032 root 5u IPv4 0xffffff0026117850 0t0 UDP 10.31.168.18:35008 :rtpproxy 68032 root 6u IPv4 0xffffff00113d6980 0t0 UDP 10.31.168.18:35009 :rtpproxy 68032 root 7u IPv4 0xffffff001fed65f0 0t0 UDP 10.81.168.2:35008 :rtpproxy 68032 root 8u IPv4 0xffffff0015324720 0t0 UDP 10.81.168.2:35009

Both parties instantly can now hear each others' audio. This demsonatres that BOTH sides must be filled in before the audio passes for either, just as documented (but terrible behavior).

After a minute or so, Calling party goes on hook:

:received command "D 565aabb151708c9c46c744d2400a454b@72.66.211.59 as3257bf0a 000
:a0283+1+4e00002+48ad7f8b"
:forcefully deleting session 1 on ports 35008/35008
:RTP stats: 2753 in from callee, 1459 in from caller, 2920 relayed, 1292 dropped
:RTCP stats: 10 in from callee, 6 in from caller, 12 relayed, 4 dropped
:session on ports 35008/35008 is cleaned up
:sending reply "0
:"
:
:lsof shows no IPV4 ports open.

Called phone goes on hook. End of session.

Note that even the rtpproxy stats show that it was throwing away a large share of the audio coming from the called party (this is G.711, so packet counts after going off-hook should be about the same in each direction), which shouldn't be the case if it allowed the audio to pass as soon as it got the details from the called party, details that arrive before or just as the audio starts to arrive (when the 183 is sent).
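To put rough numbers on the RTP stats line above, here is a small C sketch. The function names are made up for illustration and are not part of rtpproxy. It assumes every dropped packet carried 20 ms of audio, as suggested by the a=ptime:20 lines in the SDP for this call; that is an assumption, since the stats line does not say which direction the dropped packets came from.

```c
/*
 * Hypothetical helpers (not rtpproxy code) for sanity-checking the
 * "RTP stats" line printed when the session is deleted.
 */

/* Seconds of audio carried by `packets` RTP packets of `ptime_ms` each. */
double audio_seconds(unsigned long packets, unsigned long ptime_ms)
{
    return (double)packets * (double)ptime_ms / 1000.0;
}

/* Every packet received should have been either relayed or dropped. */
int stats_balance(unsigned long in_callee, unsigned long in_caller,
                  unsigned long relayed, unsigned long dropped)
{
    return in_callee + in_caller == relayed + dropped;
}
```

With the figures from this call, stats_balance(2753, 1459, 2920, 1292) holds (both sides add up to 4212), and audio_seconds(1292, 20) comes to 25.84, so under the 20 ms assumption roughly 26 seconds of audio were discarded.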

I think I have now demonstrated the problem in several convincing ways. What is needed is a fix. Or is it as simple as ripping out the if (sp->complete != 0) { } in main.c? I feel a tad uncomfortable about the side effects of doing that, but it looks somewhat underprotected anyway.

Serusers mailing list Serusers@lists.iptel.org http://lists.iptel.org/mailman/listinfo/serusers

Andres

11:10 p.m.

I see this in your output:

...

:callee's address filled in: 10.31.168.2:16004 (RTP)

Which means it received audio from that IP.

What I don't see is rtpproxy "prefilling" the addresses (caller and callee) with the SDP data which is used to send audio until we know what the real ip/port is. You should be seeing some info like:

pre-filling caller's address with.... pre-filling callee's address with....

If you don't see that, then are you sure that data is coming in the SDP Invite? Without that info, rtpproxy has no way to relay your rtp before receiving rtp from the other end.

Andres http://www.telesip.net

Frank Durda IV

5 Dec

12:13 a.m.

Andres wrote:

...

I see this in your output:

...
:callee's address filled in: 10.31.168.2:16004 (RTP)

Which means it received audio from that IP.

What I don't see is rtpproxy "prefilling" the addresses (caller and callee) with the SDP data which is used to send audio until we know what the real ip/port is. You should be seeing some info like:

pre-filling caller's address with.... pre-filling callee's address with....

If you don't see that, then are you sure that data is coming in the SDP Invite? Without that info, rtpproxy has no way to relay your rtp before receiving rtp from the other end.

Yes, there is a functional SDP payload in the INVITE, the 183 Session Progress, and the 200 OK messages. That's why I have been unable to figure out why rtpproxy didn't have enough information to go on from the INVITE SDP, and was instead waiting for the first audio packet from the INVITE system. I mean, if there weren't an SDP payload, would rtpproxy have been contacted by SER at all, which it obviously was?

Since the SIP messages seemed fine, I didn't think to include them, but here they are (with some numbers obscured) for the same call that I showed rtpproxy debug for:

The INVITE, as received from the calling party by SER (monitored on the interface by ngrep)

U 2008/12/04 19:42:20.471166 72.66.211.59:5060 -> 10.81.90.1:5060
INVITE sip:6829995231@208.66.49.130 SIP/2.0.
Via: SIP/2.0/UDP 72.66.211.59:5060;branch=z9hG4bK230c41c1;rport.
From: "TROUBLESHOOTING 1000" sip:4692221700@72.66.211.59;tag=as3257bf0a.
To: sip:6829995231@208.66.49.130.
Contact: sip:4692221700@72.66.211.59.
Call-ID: 565aabb151708c9c46c744d2400a454b@72.66.211.59.
CSeq: 102 INVITE.
User-Agent: Asterisk PBX.
Max-Forwards: 70.
Date: Thu, 04 Dec 2008 19:42:22 GMT.
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY.
Supported: replaces.
Content-Type: application/sdp.
Content-Length: 415.
.
v=0.
o=root 58770 58770 IN IP4 72.66.211.59.
s=session.
c=IN IP4 72.66.211.59.
t=0 0.
m=audio 19994 RTP/AVP 0 3 8 112 5 10 7 97 111.
a=rtpmap:0 PCMU/8000.
a=rtpmap:3 GSM/8000.
a=rtpmap:8 PCMA/8000.
a=rtpmap:112 AAL2-G726-32/8000.
a=rtpmap:5 DVI4/8000.
a=rtpmap:10 L16/8000.
a=rtpmap:7 LPC/8000.
a=rtpmap:97 iLBC/8000.
a=fmtp:97 mode=30.
a=rtpmap:111 G726-32/8000.
a=silenceSupp:off - - - -.
a=ptime:20.
a=sendrecv.

The responses from the called switch (monitored on the interface facing the called switch by ngrep)

U 2008/12/04 19:42:20.478463 10.31.0.2:5060 -> 10.31.90.1:5060
SIP/2.0 100 Trying.
Call-ID: 565aabb151708c9c46c744d2400a454b@72.66.211.59.
CSeq: 102 INVITE.
From: "TROUBLESHOOTING 1000" sip:4692221700@72.66.211.59;tag=as3257bf0a.
To: sip:6829995231@208.66.49.130;tag=000a0283+1+4e00002+48ad7f8b.
Via: SIP/2.0/UDP 10.31.90.1;branch=z9hG4bKc44e.8eae2e65.0.
Via: SIP/2.0/UDP 72.66.211.59:5060;branch=z9hG4bK230c41c1;rport=5060.
Server: DC-SIP/2.0.
Content-Length: 0.
.

#
U 2008/12/04 19:42:26.926570 10.31.0.2:5060 -> 10.31.90.1:5060
SIP/2.0 183 Session Progress.
Call-ID: 565aabb151708c9c46c744d2400a454b@72.66.211.59.
CSeq: 102 INVITE.
From: "TROUBLESHOOTING 1000" sip:4692221700@72.66.211.59;tag=as3257bf0a.
To: sip:6829995231@208.66.49.130;tag=000a0283+1+4e00002+48ad7f8b.
Via: SIP/2.0/UDP 10.31.90.1;branch=z9hG4bKc44e.8eae2e65.0.
Via: SIP/2.0/UDP 72.66.211.59:5060;branch=z9hG4bK230c41c1;rport=5060.
Server: DC-SIP/2.0.
Contact: sip:6829995231@10.31.0.2.
Content-Type: application/sdp.
Content-Length: 173.
.
v=0
o=- 3437408546 3437408606 IN IP4 10.31.0.2
s=-
c=IN IP4 10.31.168.2
t=0 0
m=audio 16004 RTP/AVP 0
a=sendrecv
a=ptime:20
a=rtpmap:0 PCMU/8000
a=silenceSupp:off - - - -

#
U 2008/12/04 19:42:37.176531 10.31.0.2:5060 -> 10.31.90.1:5060
SIP/2.0 200 OK.
Call-ID: 565aabb151708c9c46c744d2400a454b@72.66.211.59.
CSeq: 102 INVITE.
From: "TROUBLESHOOTING 1000" sip:4692221700@72.66.211.59;tag=as3257bf0a.
To: sip:6829995231@208.66.49.130;tag=000a0283+1+4e00002+48ad7f8b.
Via: SIP/2.0/UDP 10.31.90.1;branch=z9hG4bKc44e.8eae2e65.0.
Via: SIP/2.0/UDP 72.66.211.59:5060;branch=z9hG4bK230c41c1;rport=5060.
Server: DC-SIP/2.0.
Contact: sip:6829995231@10.31.0.2.
Content-Type: application/sdp.
Content-Length: 173.
.
v=0
o=- 3437408546 3437408606 IN IP4 10.31.0.2
s=-
c=IN IP4 10.31.168.2
t=0 0
m=audio 16004 RTP/AVP 0
a=sendrecv
a=ptime:20
a=rtpmap:0 PCMU/8000
a=silenceSupp:off - - - -

Don't forget that SER is altering the IP addresses prior to rtpproxy seeing them and before they are emitted to the other party, so don't let that throw you. Take my word that all the IP address substitutions are correct and are handling calls fine.

Now it sounds like there is some bit of code that is getting skipped or doesn't work right when you use rtpproxy in the mode with two interfaces like I am. Is anybody else using that mode? Maybe the reason no one else is seeing the audio issue is because they only use one interface and the code path is different.

Based on this area of interest in the code, I added some quick debug to rtpproxy and here are the results for a normal call, where the caller is sending audio as soon as the 183 or 200 arrives and audio flows more or less normally:

received command "UE 0d9e8dfd12c8df171a33919e2a536894@72.66.211.59 72.66.211.59 15804 as5c0532d5;1"
new session 0d9e8dfd12c8df171a33919e2a536894@72.66.211.59, tag as5c0532d5;1 requested, type strong
new session on a port 35000 created, tag as5c0532d5;1
C1
C2
C5
C6
C9
sending reply "35000 10.31.168.18 "
received command "L 0d9e8dfd12c8df171a33919e2a536894@72.66.211.59 10.31.0.2 25006 as5c0532d5;1 000a0283+1+4d70 003+5a436147"
lookup on ports 35000/35000, session timer restarted
C1
C2
C5
C6
C9
sending reply "35000 10.81.168.2 "

As you say, the "pre-fill" message didn't come out, so note which log messages were emitted and that the code flowed around the section that would do the pre-fill ("C3"). Other than adding the rtpp_log_write calls and maybe a brace pair to get it within the same conditional, the code is stock.

...

writeport:
    rtpp_log_write(RTPP_LOG_INFO, sp->log,"C1");
    if (pidx >= 0) {
        if (ia[0] != NULL && ia[1] != NULL) {
            rtpp_log_write(RTPP_LOG_INFO, sp->log,"C2");
            /*
             * Unless the address provided by client historically
             * cannot be trusted and address is different from one
             * that we recorded update it.
             */
            if (spa->untrusted_addr == 0 && !(spa->addr[pidx] != NULL &&
              SA_LEN(ia[0]) == SA_LEN(spa->addr[pidx]) &&
              memcmp(ia[0], spa->addr[pidx], SA_LEN(ia[0])) == 0)) {
                rtpp_log_write(RTPP_LOG_INFO, sp->log,"C3");
                rtpp_log_write(RTPP_LOG_INFO, spa->log, "pre-filling %s's address "
                  "with %s:%s", (pidx == 0) ? "callee" : "caller", addr, port);
                if (spa->addr[pidx] != NULL)
                    free(spa->addr[pidx]);
                spa->addr[pidx] = ia[0];
                ia[0] = NULL;
            }
            if (spa->rtcp->untrusted_addr == 0 && !(spa->rtcp->addr[pidx] != NULL &&
              SA_LEN(ia[1]) == SA_LEN(spa->rtcp->addr[pidx]) &&
              memcmp(ia[1], spa->rtcp->addr[pidx], SA_LEN(ia[1])) == 0)) {
                rtpp_log_write(RTPP_LOG_INFO, sp->log,"C4");
                if (spa->rtcp->addr[pidx] != NULL)
                    free(spa->rtcp->addr[pidx]);
                spa->rtcp->addr[pidx] = ia[1];
                ia[1] = NULL;
            }
        }
        rtpp_log_write(RTPP_LOG_INFO, sp->log,"C5");
        spa->asymmetric[pidx] = spa->rtcp->asymmetric[pidx] = asymmetric;
        spa->canupdate[pidx] = spa->rtcp->canupdate[pidx] = NOT(asymmetric);
        if (request != 0 || response != 0) {
            rtpp_log_write(RTPP_LOG_INFO, sp->log,"C6");
            if (requested_nsamples > 0) {
                rtpp_log_write(RTPP_LOG_INFO, sp->log,"C7");
                rtpp_log_write(RTPP_LOG_INFO, spa->log, "RTP packets from %s "
                  "will be resized to %d milliseconds",
                  (pidx == 0) ? "callee" : "caller", requested_nsamples / 8);
            } else if (spa->resizers[pidx].output_nsamples > 0) {
                rtpp_log_write(RTPP_LOG_INFO, sp->log,"C8");
                rtpp_log_write(RTPP_LOG_INFO, spa->log, "Resizing of RTP "
                  "packets from %s has been disabled",
                  (pidx == 0) ? "callee" : "caller");
            }
            spa->resizers[pidx].output_nsamples = requested_nsamples;
        }
    }
    for (i = 0; i < 2; i++)
        if (ia[i] != NULL)
            free(ia[i]);
    cp = buf;
    len = 0;
    if (cookie != NULL) {
        len = sprintf(cp, "%s ", cookie);
        cp += len;
    }
    if (lia[0] == NULL || ishostnull(lia[0]))
        len += sprintf(cp, "%d\n", lport);
    else
        len += sprintf(cp, "%d %s%s\n", lport, addr2char(lia[0]),
          (lia[0]->sa_family == AF_INET) ? "" : " 6");
    rtpp_log_write(RTPP_LOG_INFO, sp->log,"C9");
doreply:
    doreply();
    return 0;
...

So, something in this "if" didn't allow the pre-fill to occur:

       if (spa->untrusted_addr == 0 && !(spa->addr[pidx] != NULL &&
         SA_LEN(ia[0]) == SA_LEN(spa->addr[pidx]) &&
         memcmp(ia[0], spa->addr[pidx], SA_LEN(ia[0])) == 0)) {

I'll add some additional debugging later and see if I can spot which of those conditions is the guilty party or parties.
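One way to find the guilty condition is to evaluate each operand of that if() separately and log the results. The following is only a standalone sketch of that idea, not a patch to rtpproxy: the struct, the function name, and the use of plain byte buffers in place of sockaddr structures and SA_LEN() are all my own stand-ins.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record of how each operand of the pre-fill test evaluated. */
struct prefill_check {
    int trusted;        /* result of the first operand of the && */
    int have_recorded;  /* an address was already recorded */
    int same_len;       /* recorded and offered lengths match */
    int same_bytes;     /* recorded and offered bytes match */
    int would_prefill;  /* overall result of the if() */
};

/*
 * `trusted` is whatever the first operand evaluated to at run time;
 * `recorded` may be NULL when no address has been stored yet.
 */
struct prefill_check explain_prefill(int trusted,
                                     const void *recorded, size_t recorded_len,
                                     const void *offered, size_t offered_len)
{
    struct prefill_check c;

    c.trusted = (trusted != 0);
    c.have_recorded = (recorded != NULL);
    c.same_len = c.have_recorded && recorded_len == offered_len;
    /* memcmp is reached only when both buffers exist and have equal length */
    c.same_bytes = c.same_len && memcmp(offered, recorded, offered_len) == 0;
    c.would_prefill = c.trusted &&
        !(c.have_recorded && c.same_len && c.same_bytes);
    return c;
}
```

Logging the five fields for each U and L command would show directly whether the pre-fill is skipped because of the trust test or because the offered address already matches the recorded one.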

Andres

3:34 p.m.

...

Now it sounds like there is some bit of code that is getting skipped or doesn't work right when you use rtpproxy in the mode with two interfaces like I am. Is anybody else using that mode?

No, we have never tried that mode. Maybe you could setup a test system with only one NIC card and compare the behavior of rtpproxy.

...

Maybe the reason no one else is seeing the audio issue is because they only use one interface and the code path is different.

...

So, something in this "if" didn't allow the pre-fill to occur:
       if (spa->untrusted_addr == 0 && !(spa->addr[pidx] != NULL &&
         SA_LEN(ia[0]) == SA_LEN(spa->addr[pidx]) &&
         memcmp(ia[0], spa->addr[pidx], SA_LEN(ia[0])) == 0)) {
I'll add some additional debugging later and see if I can spot which of those conditions is the guilty party or parties.

That looks like the right section to debug.

Andres http://www.telesip.net

...

Frank Durda IV

20 Mar

3:36 a.m.

New subject: [Serusers] Deadly embrace with rtpproxy - Caused by coding error? Possible fix attached

An update to an issue I brought up in December and had nothing new to report until now.

To recap, I have rtpproxy in a two-interface configuration (which based on comments here few people actually use), and the problem was that rtpproxy was refusing to forward audio to either party until it had received at least one audio packet from both parties. If either party also had a similar policy, you had a deadly embrace and a dead-air call. No ring-back (for 183 calls), and no call audio.

With the assistance of some people here, I got the stock debug plus some of my own going and determined that the "pre-filling caller/callee address with..." events never took place, so the address data didn't get filled in until the first RTP packets arrived from both parties, when a plain "filling caller/callee" would occur. Then the audio flow in both directions would commence.

Both the INVITE and the 183/200 messages always had perfectly sane SDP payloads, but this didn't seem to matter.

I finally got the time to go through the command-received code leading up to the decision to prefill or not with a high level of debugging. I noted what appears to be a code flaw in rtpproxy. The version that I am looking at is "main.c,v 1.62 2008/02/04 08:38:05 sobomax".

In particular, there is an array in the structure "rtpp_session" called "int untrusted_addr[2]". Despite being an array, two of the three references to untrusted_addr that exist (all in main.c) treat it as though it were a non-array element of the structure, so when it was tested for zero or non-zero state in two if() statements, what actually got tested was the memory address of the start of the array (always non-zero on my platforms), and not the content of either array element.

As a further confirmation that something was wacky here, the third reference to untrusted_addr did specify which element it was placing the value "1" into, like it expected the other pieces of code to be able to view that value later.

It appears that the impact of these possible incorrect tests would not really matter unless two interfaces are in use, so it is possible this is a coding error that has gone undetected since this bit of code was originally written.

I changed the two tests that looked improper to examine the same array element as that of the other items located in the same structure that the if() statement was also testing, and rtpproxy immediately started prefilling the addresses for the session in response to the INVITE, 183 and 200 SDP payloads when two interfaces are in use.
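For anyone who wants to see the C behaviour in isolation, here is a minimal sketch. The struct and function names are hypothetical; only the member name untrusted_addr and its type int[2] come from rtpproxy. In an expression, an array name is converted to a pointer to its first element, so comparing it with 0 is a null-pointer test that can never succeed for a real object.

```c
#include <stddef.h>

/* Hypothetical cut-down session; only the array member matters here. */
struct session_sketch {
    int untrusted_addr[2];
};

/*
 * Mirrors the stock test `spa->untrusted_addr == 0`. The explicit pointer
 * variable spells out the conversion that happens to the array name.
 */
int trusted_stock(const struct session_sketch *sp, int pidx)
{
    const int *decayed = sp->untrusted_addr;  /* address of element 0 */

    (void)pidx;               /* the stock test never looks at pidx */
    return decayed == NULL;   /* never true for an existing session */
}

/* Mirrors the patched test `spa->untrusted_addr[pidx] == 0`. */
int trusted_patched(const struct session_sketch *sp, int pidx)
{
    return sp->untrusted_addr[pidx] == 0;
}
```

With both flags zero, trusted_stock() still returns 0 for either index while trusted_patched() returns 1, which matches the pre-fill branch never being entered before the change and being entered after it.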

This appears to have corrected the problem I was encountering.

Attached is the two-line-change context diff. I would appreciate it if someone more familiar with rtpproxy would bless the change. The patched code certainly makes more sense than what was there previously, and does seem to do the right things now. Thanks!

*** main.c.STOCK Wed Feb 20 18:51:44 2008
--- main.c Thu Mar 19 21:14:40 2009
***************
*** 930,936 ****
       * cannot be trusted and address is different from one
       * that we recorded update it.
       */
!     if (spa->untrusted_addr == 0 && !(spa->addr[pidx] != NULL &&
        SA_LEN(ia[0]) == SA_LEN(spa->addr[pidx]) &&
        memcmp(ia[0], spa->addr[pidx], SA_LEN(ia[0])) == 0)) {
          rtpp_log_write(RTPP_LOG_INFO, spa->log, "pre-filling %s's address "
--- 930,936 ----
       * cannot be trusted and address is different from one
       * that we recorded update it.
       */
!     if (spa->untrusted_addr[pidx] == 0 && !(spa->addr[pidx] != NULL &&
        SA_LEN(ia[0]) == SA_LEN(spa->addr[pidx]) &&
        memcmp(ia[0], spa->addr[pidx], SA_LEN(ia[0])) == 0)) {
          rtpp_log_write(RTPP_LOG_INFO, spa->log, "pre-filling %s's address "
***************
*** 940,946 ****
          spa->addr[pidx] = ia[0];
          ia[0] = NULL;
      }
!     if (spa->rtcp->untrusted_addr == 0 && !(spa->rtcp->addr[pidx] != NULL &&
        SA_LEN(ia[1]) == SA_LEN(spa->rtcp->addr[pidx]) &&
        memcmp(ia[1], spa->rtcp->addr[pidx], SA_LEN(ia[1])) == 0)) {
          if (spa->rtcp->addr[pidx] != NULL)
--- 940,946 ----
          spa->addr[pidx] = ia[0];
          ia[0] = NULL;
      }
!     if (spa->rtcp->untrusted_addr[pidx] == 0 && !(spa->rtcp->addr[pidx] != NULL &&
        SA_LEN(ia[1]) == SA_LEN(spa->rtcp->addr[pidx]) &&
        memcmp(ia[1], spa->rtcp->addr[pidx], SA_LEN(ia[1])) == 0)) {
          if (spa->rtcp->addr[pidx] != NULL)

Valentin Nechayev

12:23 p.m.

New subject: [Serusers] Deadly embrace with rtpproxy - Caused by coding error? Possible fix attached

Hi,

...

Frank Durda IV frank.durda@hypercube-llc.com wrote:

...

To recap, I have rtpproxy in a two-interface configuration (which based on comments here few people actually use), and the problem was that rtpproxy was refusing to forward audio to either party until it had received at least one audio packet from both parties. If either party also had a similar policy, you had a deadly embrace and a dead-air call. No ring-back (for 183 calls), and no call audio.

We had the same problem with Cisco gateways. If a renegotiation happens, the gateway sometimes changes its port but doesn't start sending from the new port until it has received some packet on it. To fix this, we added an option to periodically send packets to a configured address (which can differ from the real address); that fixed the problem with Cisco. Later this customer reported that Cisco agreed this was their bug and fixed it.

...

Attached is the two-line-change context diff. I would appreciate if

For most readers it is better to use the "unified context" diff format (diff -u) than the "old context" one (diff -c).

...

! if (spa->untrusted_addr == 0 && !(spa->addr[pidx] != NULL &&
! if (spa->untrusted_addr[pidx] == 0 && !(spa->addr[pidx] != NULL &&

I agree - this looks like a simple typo.

-- Valentin Nechayev PortaOne Inc., Software Engineer mailto:netch@portaone.com

Frank Durda IV

21 Aug

9:48 p.m.

More on this: it appears that at least one of the parties whose calls get into this situation with our SER/rtpproxy setup is also running SER/rtpproxy, and if the rtpproxy documentation is correct, this will never work, regardless of the use of NAT.

Consider this situation. Caller sends a call to a nearby SER/rtpproxy (unit A). Unit A does the INVITE to a second SER/rtpproxy system (unit B) owned by another party which acts as a PSTN gateway.

The call is set up, and unit B receives the SDP data from the PSTN switch, passes it to his rtpproxy (unit B), and sends that SDP payload on to unit A SER, who passes it to his rtpproxy and then on to the calling party. The PSTN switch starts sending ring-back and maybe the called party answers. Unit B rtpproxy discards all the audio because it has not heard anything from Unit A.

Meanwhile, the calling party is screaming his lungs out and those audio packets are being received by Unit A rtpproxy, who discards them because he hasn't seen any audio from Unit B. B never sees audio from A and A never sees audio from B. Stalemate.

Neither A nor B rtpproxy can end the stalemate (aka a deadly embrace) because of this "must see two-way audio in order to allow at least one-way audio" rule. Each rtpproxy prevents the other from meeting that criterion. That rule needs to go away immediately.
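The stalemate can be shown with a toy model in C. This is a sketch of the rule as the README describes it, not of rtpproxy's actual code; the struct and function names are invented. Each relay forwards a packet only once it has seen traffic on both of its sides, and the two relays are chained caller -> A -> B -> PSTN switch.

```c
#include <stdbool.h>

/* Hypothetical model of one relay: side 0 faces upstream, side 1 downstream. */
struct relay_model {
    bool known[2];  /* true once that side's address has been learned */
};

/*
 * A packet arrives from `side`. The relay learns that side's address and
 * forwards the packet only if both sides are known (the README rule).
 */
bool relay_offer(struct relay_model *r, int side)
{
    r->known[side] = true;
    return r->known[0] && r->known[1];
}

/*
 * One round of traffic through the chain caller -> A -> B -> PSTN switch.
 * Returns true if a packet from either end made it across both relays.
 */
bool chain_round(struct relay_model *a, struct relay_model *b)
{
    /* Caller talks: the packet reaches B only if A forwards it. */
    bool caller_to_pstn = relay_offer(a, 0) && relay_offer(b, 0);
    /* PSTN switch talks: the packet reaches A only if B forwards it. */
    bool pstn_to_caller = relay_offer(b, 1) && relay_offer(a, 1);

    return caller_to_pstn || pstn_to_caller;
}
```

Starting with nothing known on either relay, chain_round() returns false no matter how many rounds are run: A only ever hears its side 0 and B only ever hears its side 1. If A's side 1 is marked known before the first round (standing in for an address taken from the SDP, which is an assumption of the model), the first round already lets the ring-back through.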

participants (4)

Andres
Frank Durda IV
Greger Viken Teigre
Valentin Nechayev