At 5:07 PM +0200 on 8/23/04, Jiri Kuthan wrote:
At 04:30 PM 8/23/2004, John Todd wrote:
The problem, as I see it from this discussion, is that some devices do not work correctly with the simple UDP packet sent to port 5060 of the remote UA, because there is no reply packet which is what keeps the NAT mapping of some NAT router/translators. I don't see this as a UA problem; if there is no NAT translation, then even the best-programmed UA can't receive an inbound INVITE.
It is a UA problem. NATs are based on the client-server paradigm and have been around for a while; applications that want to receive packets from outside should first send some out. This is a differentiator for many products today. Some implementations work (and typically have other shortcomings), and some are broken and don't. That's just the normal early-technology shopping process.
There is a NAT problem too -- NATs exhibit quite nondeterministic behaviour. That needs to get fixed too, and it is what the IETF BEHAVE effort is good for.
I agree. However, I don't see ICE or any of the other extensions becoming part of most UA implementations for at least another 1.5 years (assuming that approvals happen quickly at the IETF), and those of us with customers have to come out with solutions faster than that in this quickly-solidifying market.
The manner in which Asterisk handles this type of keepalive is somewhat simple but novel, and may be worth examination. Every X seconds, an OPTIONS request is made to the remote UA by the server. Even if the UA does not support the OPTIONS query, it typically hands back a SIP error, which serves the purpose of keeping the NAT translations open. If the device supports OPTIONS, then a "normal" SIP reply is sent, also serving the intended purpose.
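The OPTIONS keepalive described above can be sketched roughly as follows. This is a minimal illustration, not Asterisk's or SER's actual code: the function names, the "keepalive" user part, and the example addresses are hypothetical, and the header set is just the minimum that RFC 3261 requires for a request.

```python
import socket
import uuid


def build_options(target_host, target_port, local_host, local_port, user="keepalive"):
    """Build a minimal SIP OPTIONS request as bytes (hypothetical helper)."""
    # RFC 3261 requires the Via branch to start with the magic cookie "z9hG4bK".
    branch = "z9hG4bK" + uuid.uuid4().hex
    call_id = uuid.uuid4().hex + "@" + local_host
    tag = uuid.uuid4().hex[:8]
    lines = [
        f"OPTIONS sip:{target_host}:{target_port} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_host}:{local_port};branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:{user}@{local_host}>;tag={tag}",
        f"To: <sip:{target_host}:{target_port}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
        "",
        "",  # the two empty entries yield the terminating blank line
    ]
    return "\r\n".join(lines).encode("ascii")


def send_keepalive(sock, nat_addr, msg):
    """Send one keepalive to the UA's public (NAT-side) address.

    Assumption: `sock` is the UDP socket already bound to the server's SIP
    port, so the packet matches the binding the UA opened through the NAT.
    """
    sock.sendto(msg, nat_addr)
```

A caller would invoke send_keepalive() every X seconds for each registered contact, using the source address and port seen on the UA's last REGISTER. Any reply from the UA, whether a 200 OK or a SIP error, is traffic from inside the NAT.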
It's great that it mostly works, but it is a hack. It introduces a lot of brittleness -- it will fail whenever NAT bindings change: it will fail if the NAT reboots, it will fail if the NAT is not very deterministic, it will fail if a forced IP address change occurs, etc. Getting it robust is simply hard without client support. (Which is, BTW, a simple application of the e2e principle.)
I agree that it is a hack, but so is any solution that tries to solve this problem from the "outside" of the NAT. Using OPTIONS is perhaps just a slightly different hack that may make more NAT boxes do the right thing.
(Side note: has anyone generated a list of NAT boxes/software which require outbound packets for translations to stay open? In other words: where, exactly, does SER's method NOT work?)
Perhaps instead of a UDP packet with no content, a SIP OPTIONS request could be sent by SER. This could perhaps be a selectable flag associated with the NAT support in SER, so that either the dummy packet or the OPTIONS packet could be transmitted by the module.
There are other solutions here, like reducing the interval of REGISTER requests to serve the same purpose of refreshing NAT table mappings. However, one could argue that this method imposes a much higher load than an OPTIONS packet, especially when scaling across thousands or tens of thousands of clients in an environment where external databases (e.g. RADIUS, SQL, etc.) are used for authentication lookups.
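The load argument above can be made concrete with some back-of-the-envelope arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the per-refresh message and lookup counts are assumptions (an OPTIONS refresh as one request plus one reply with no backend lookup; a REGISTER refresh as a challenge/response exchange of four messages with one backend lookup), not measurements.

```python
def refresh_load(clients, interval_s, msgs_per_refresh, lookups_per_refresh):
    """Return (SIP messages per second, backend lookups per second)."""
    refreshes_per_s = clients / interval_s
    return (refreshes_per_s * msgs_per_refresh,
            refreshes_per_s * lookups_per_refresh)


# Assumed example figures: 10,000 clients refreshed every 30 seconds.
options_msgs, options_lookups = refresh_load(10_000, 30, 2, 0)
register_msgs, register_lookups = refresh_load(10_000, 30, 4, 1)

print(f"OPTIONS:  {options_msgs:.0f} msg/s, {options_lookups:.0f} lookups/s")
print(f"REGISTER: {register_msgs:.0f} msg/s, {register_lookups:.0f} lookups/s")
```

Under these assumptions the message rate doubles with REGISTER, but the more significant difference is that every refresh also reaches the authentication backend.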
I would like that better. We could perhaps mitigate the performance penalty by granting re-REGISTERs which don't change too much -- that could possibly be done as an authentication-less in-memory lookup.
I'm not familiar with what you reference here; are you talking about cached credentials, or some other method that isn't a full authentication lookup for REGISTER requests which "appear" to have the same characteristics as prior registrations? (danger! I've imagined what you might be talking about, and in my (perhaps incorrect) assumptions, I can see some security problems. With this method, it would probably be easy to "take over" a SIP UA's identity for inbound calls without password authentication if the attacker was behind the same NAT external address.)
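To make the concern above concrete, here is one guess at what such an authentication-less fast path might look like. Everything in it is hypothetical (the Binding record, the fast_path_ok function, the match_port switch, the example addresses); it is not how SER works, only a sketch of why matching on the NAT's external address alone would be too weak.

```python
from typing import NamedTuple, Optional


class Binding(NamedTuple):
    """What the registrar remembers from the last authenticated REGISTER."""
    contact: str
    src_ip: str
    src_port: int


def fast_path_ok(cached: Optional[Binding], contact: str,
                 src_ip: str, src_port: int, match_port: bool) -> bool:
    """Decide whether a re-REGISTER may skip the full authentication lookup."""
    if cached is None:
        return False  # nothing cached: must authenticate normally
    if contact != cached.contact or src_ip != cached.src_ip:
        return False  # the registration changed: must authenticate normally
    # Comparing only the IP accepts any host behind the same NAT.
    return src_port == cached.src_port if match_port else True


# The victim registered from behind a NAT whose external address is 203.0.113.7.
victim = Binding("sip:alice@10.0.0.5:5060", "203.0.113.7", 40001)

# A second host behind the same NAT copies the Contact; the NAT gives it
# the same external IP but a different external port.
ip_only = fast_path_ok(victim, victim.contact, "203.0.113.7", 40002, match_port=False)
ip_and_port = fast_path_ok(victim, victim.contact, "203.0.113.7", 40002, match_port=True)
```

With IP-only matching the second host is accepted without a password; also requiring the port to match rejects it in this example, though that narrows the exposure rather than replacing authentication.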
Note that there have been numerous examples of SIP stacks on UA devices so poorly written that they crash on an OPTIONS request. Their repair is outside the scope of SER or this discussion.
:) Well, broken end-devices are unfortunately a never-ending pain.
My flat forehead (from banging against a wall) is direct evidence of this issue.
So on the to-do list, I think there is an effort to educate UA vendors on how to get things right, and there is an option to force a short re-registration interval and try to invent some way to reduce the performance penalty.
While I think that education of UA vendors is noble, I think that these subtle and important issues often fall outside of their comprehension in the headlong drive to get as many products out the door as possible, with minimally qualified VoIP programming staff. (/me points to flat forehead again)
The short re-registration interval technique would be great, if it can be shown to scale and remain secure.
I still like the method of using the OPTIONS queries as a short-term hack, even if I have to use (as one other poster did) the sipsak method to produce those queries. I actually kind of like that method, except that now I have to extract the list of "registered" users into yet another place for yet another script to run against them... Incorporating this into the nathelper module does not seem to be "overkill" - it's a module, not a core component of SER, so feature creep is (in my opinion) more acceptable in certain places which we already recognize as being "hacks."