[Devel] Processing REGISTER requests

Wed Oct 12 10:14:59 CEST 2005

On Tuesday 11 October 2005 19:37, Bogdan-Andrei Iancu wrote:
> >1. the port changes as a result of the NAT box relocating the
> > mappings. 1st time 1.2.3.4:5060 next time 1.2.3.4:1025 then
> > 1.2.3.4:1031
>
> please read carefully the algorithm - *only* the IP is checked (only if
> more than one contact for the AOR matches); So this is not the case

That was just a description of one of the situations that can happen on 
the NAT box. The purpose was to list all possible cases to get the whole 
picture. It was not to say that it will not work in this case.

>
> >2. the ip address changes because the adsl connection is reset and you
> >receive a new IP address. in this case the port may change as well but
> > it may not.
> >   1st time 1.2.3.4:5060 then 1.2.7.8:5060 then 1.2.9.10:1031
>
> if the UAC uses a private IP, then the contact will match (it will be
> private); if more than one equal contact is found, the received IP
> and callid will be used. So, no problem.

It is a problem, see the example below.

>
> basically, it's very similar to what Klaus suggested - the difference
> is I give more credit to received IP than to call-id, in the sense that
> I use first received IP and only if needed I use the call-id. Why? I
> find it more reliable to identify a UAC based on its IP (its change is a
> corner case) than using call-id which is just a recommendation of the
> RFC.

The change of the NAT IP is not a corner case. I have a use case where the 
NAT address changes with every registration. In this case NAT IP is 
anything but reliable to identify the requestor.

Here is the scenario:

Each phone (denoted UA below) sits behind a NAT box (denoted CNB below, 
for Client NAT Box) that usually keeps the same IP address, but may 
occasionally get a new one (for example, if the ADSL provider resets the 
connection and assigns a new IP address). Then there is a second level of 
NAT consisting of N boxes, one of which is picked at random for every 
REGISTER message, based on routing decisions outside the scope of openser 
(these are denoted PN below, for Public NAT).
The 1st level NAT boxes don't matter in the process, even if they change 
IP addresses, because that IP is never seen by openser.

Now consider 1 AOR configured by a user on 2 phones that sit in 2 
different networks and use the same private IP. The first time, they 
register like this:

UA1 (10.0.0.1:5060) --> CNB1 --> PN1 (1.2.3.4) --> Proxy

UA2 (10.0.0.1:5060) --> CNB2 --> PN2 (5.6.7.8) --> Proxy

Proxy retains:
- UA1: contact 10.0.0.1:5060, received IP 1.2.3.4 and callid callid-UA1
- UA2: contact 10.0.0.1:5060, received IP 5.6.7.8 and callid callid-UA2

On the next REGISTER each of them goes through a different NAT. Assume UA2 
picks the NAT UA1 used before, while UA1 uses a completely different one:

UA2 (10.0.0.1:5060) --> CNB2 --> PN1 (1.2.3.4) --> Proxy

UA1 (10.0.0.1:5060) --> CNB1 --> PN7 (9.8.7.6) --> Proxy

Now the proxy receives:
- UA1: contact 10.0.0.1:5060, received IP 9.8.7.6 and callid callid-UA1
- UA2: contact 10.0.0.1:5060, received IP 1.2.3.4 and callid callid-UA2

The situation for UA1 is clear, but UA2 has a conflict. The IP check would 
make it overwrite the old UA1 contact, while the callid check would 
correctly identify it as long as the phone respects the RFC 
recommendation (which in my data happens in 98% of cases).
But since you favor IP over callid, the wrong contact will be updated.

And it can be even worse: if the 2 phones swap NAT boxes (each one picks 
the box the other used before), then they both overwrite each other's 
contacts.
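
To make the conflict concrete, here is a minimal Python sketch of the 
scenario above. The Contact record, its "owner" field and the two matcher 
functions are hypothetical names used only for illustration; this is not 
openser code, and match_ip_first only models the "received IP first, 
callid second" ordering as I understand the proposal.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    owner: str        # which phone really owns the binding (illustration only)
    uri: str          # private contact as sent by the phone
    received_ip: str  # public NAT address seen by the proxy
    callid: str


def match_ip_first(bindings, uri, received_ip, callid):
    """Hypothetical model: among equal contacts, try received IP, then callid."""
    same = [c for c in bindings if c.uri == uri]
    if len(same) <= 1:
        return same[0] if same else None
    by_ip = [c for c in same if c.received_ip == received_ip]
    if by_ip:
        return by_ip[0]
    by_callid = [c for c in same if c.callid == callid]
    return by_callid[0] if by_callid else None


def match_callid_first(bindings, uri, received_ip, callid):
    """Same as above, but the callid test runs before the received IP test."""
    same = [c for c in bindings if c.uri == uri]
    if len(same) <= 1:
        return same[0] if same else None
    by_callid = [c for c in same if c.callid == callid]
    if by_callid:
        return by_callid[0]
    by_ip = [c for c in same if c.received_ip == received_ip]
    return by_ip[0] if by_ip else None


# State retained by the proxy after the first round of registrations.
bindings = [
    Contact("UA1", "10.0.0.1:5060", "1.2.3.4", "callid-UA1"),
    Contact("UA2", "10.0.0.1:5060", "5.6.7.8", "callid-UA2"),
]

# Second round: UA2 re-registers through PN1 (1.2.3.4).
request = ("10.0.0.1:5060", "1.2.3.4", "callid-UA2")

ip_pick = match_ip_first(bindings, *request)
callid_pick = match_callid_first(bindings, *request)

print("IP first updates the binding of:", ip_pick.owner)
print("callid first updates the binding of:", callid_pick.owner)
```

With the IP test first, the REGISTER from UA2 lands on the binding that 
belongs to UA1; with the callid test first, it lands on its own binding.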

However, if we also check the NAT port, we may eliminate the conflict: 
the PN1 NAT box will give different ports to the 2 clients, so the IP 
check will fail in this case. Even though the IPs are the same, the ports 
will not be, because the port from the first connection is still open and 
can't be allocated to the second connection.
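
A small sketch of that comparison, in Python. The function name 
same_source and the port numbers 1025 and 1031 are assumptions made up for 
the example; the only point is that the NAT source is compared as an 
(IP, port) pair instead of the IP alone.

```python
def same_source(binding_src, request_src, check_port=True):
    """Compare two (ip, port) NAT sources; optionally ignore the port."""
    if check_port:
        return binding_src == request_src
    return binding_src[0] == request_src[0]


# Assumed values: UA1's mapping on PN1 is still open on port 1025,
# so PN1 has to hand a different port (here 1031) to UA2.
ua1_binding_src = ("1.2.3.4", 1025)
ua2_request_src = ("1.2.3.4", 1031)

ip_only = same_source(ua1_binding_src, ua2_request_src, check_port=False)
ip_and_port = same_source(ua1_binding_src, ua2_request_src, check_port=True)

print("IP only matches UA1's binding:", ip_only)
print("IP and port matches UA1's binding:", ip_and_port)
```

With the port included, the request from UA2 no longer matches the 
binding of UA1, so the decision falls through to the callid test.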

Now which test would you favor to avoid this situation? This is not just 
an example, it's a real-world scenario.

Using this new algorithm would be worse for me than the current behavior. 
Currently my only problem is that 1 phone can end up with 2 contacts for 
an overlapping period near the end of its registration period, when it 
sends a new REGISTER. With your proposed algorithm, I can get into the 
overwrite-me realm quickly.

IMO, we should combine the tests like this:

1. match contact (which is private). This yields 1 or more entries.
   If there is exactly 1, update it and stop; otherwise continue.
2. apply public NAT test and get a trimmed down list from the list 
   obtained at step 1
3. apply the callid test and get a second trimmed down list from the list 
   obtained at step 1
4. Interpret result:
   case both lists empty: 
        new registration
        stop
   case ip list empty, callid list has 1 element:
        update that element
        stop
   case ip list has 1 element, callid list is empty:
        update that element
        stop
   case ip list has 1 element, callid list has 1 element:
        update the element pointed by callid (favor callid)
        stop
   case ip list has multiple elements, callid list has 1 element:
        update the element pointed by callid
        stop
   case ip list has 1 element, callid list has multiple elements:
        # this is clearly a conflict situation that is not easy to solve
        # I'm not sure if we can pick something here that doesn't have
        # a high chance of breaking some contact
        # But I also think this has very low chances to appear
        _maybe_ update the element pointed by IP
        stop
   case ip list has multiple elements, callid list has multiple elements:
        # another conflicting situation
        # I have no idea what to pick here.
        # Also chances are this won't be seen in practice.
        ???
        stop
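
The steps above can be sketched in Python as follows. This is only an 
illustration of the decision table, not openser code: the Binding record, 
the resolve() function and the action names ("new", "update", 
"update_maybe", "undecided") are hypothetical. Combinations that the list 
above does not cover (no contact match at all in step 1, or an empty list 
on one side and multiple elements on the other) are marked as assumptions 
in the comments.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Binding:
    uri: str          # private contact
    received_ip: str  # public NAT address seen by the proxy
    callid: str


def resolve(bindings: List[Binding], uri: str, received_ip: str,
            callid: str) -> Tuple[str, Optional[Binding]]:
    # Step 1: match on the (private) contact.
    same = [b for b in bindings if b.uri == uri]
    if not same:
        return ("new", None)  # assumption: step 1 does not spell this out
    if len(same) == 1:
        return ("update", same[0])

    # Steps 2 and 3: trim the step 1 list by NAT IP and by callid.
    by_ip = [b for b in same if b.received_ip == received_ip]
    by_callid = [b for b in same if b.callid == callid]

    # Step 4: a unique callid match wins, whatever the IP list contains.
    if len(by_callid) == 1:
        return ("update", by_callid[0])

    if not by_callid:
        if not by_ip:
            return ("new", None)
        if len(by_ip) == 1:
            return ("update", by_ip[0])
        return ("undecided", None)  # assumption: case not listed above

    # The callid list has multiple elements from here on.
    if len(by_ip) == 1:
        return ("update_maybe", by_ip[0])  # the "_maybe_" case
    return ("undecided", None)  # the "???" case (and the unlisted empty IP list)


bindings = [
    Binding("10.0.0.1:5060", "1.2.3.4", "callid-UA1"),
    Binding("10.0.0.1:5060", "5.6.7.8", "callid-UA2"),
]

# UA2 re-registers through PN1 (1.2.3.4), as in the scenario.
action, target = resolve(bindings, "10.0.0.1:5060", "1.2.3.4", "callid-UA2")
print(action, target.callid)
```

For the scenario described earlier, the IP list points at the UA1 binding 
and the callid list at the UA2 binding, so the callid one is updated.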

Note: my real-life experience with a consistent contacts database shows 
that there are no duplicate callids in the database. This leads me to 
believe that the cases above where the callid check yields a list with 
multiple elements will not be seen in practice. Those are exactly the 
last 2 cases, the ones with clear conflicts.

Also, I would include the NAT port in the IP test, as mentioned above, 
which would avoid the problems in the described case.

-- 
Dan