George,
I commend you for thorough testing and analysis :-)
See inline.
I think I may have solved my problems with usrloc replication, so I'm
commenting my previous mail inline for anyone interested to see.
On Thursday 06 October 2005 19:01, George Perantinos wrote:
Dear list,
I'm trying to implement registration replication between 2 SERs (ver.
0.9.4, they are AA behind an F5 BigIP). Both SERs are using the same mysql
database and mediaproxy is used for RTP relaying (whenever needed). The
basis for my configuration is the mediaproxy config from onsip.org
(excellent work, many thanx guys).
I should add here that I'm using caller-id stickiness with the BigIP.
I must emphasize that caller-id stickiness is crucial to your other
conclusions, as well as my responses. I.e. they may not be valid without the
stickiness being handled by an external system.
My servers are 10.21.128.232 and 10.21.128.233. I'm using forward_tcp and
save_memory for replication purposes, as can be seen in the following
registration route block:
At first I realised that when replicating between SERs (whether using
forward_tcp or t_replicate) you lose the NAT info for a UA. The nat flag
will not be carried forward, so the SER that receives a replicated REGISTER
will handle that UA as a UA with a real IP in future requests (the result of
an INVITE is a timeout).
Yes, in 0.9.x (I believe) you can do this on the replicating server:
# If contact is NATed, add a received parameter for the other server to handle
if(isflagset(6)){
add_rcv_param();
}
This will be stored in the contact as received.
To further complicate things, debugging with serctl for the NAT flag is
worthless for this reason, as the output of the "Flags:" field is not
consistent between the 2 servers. (Some bug perhaps?)
Yes, I always find it difficult to interpret the flags :-)
route[2] {
# -----------------------------------------------------------------
# REGISTER Message Handler
# ----------------------------------------------------------------
sl_send_reply("100", "Trying");
if(src_ip==10.21.128.233) {
# If it is a forward from our brother then...
if (client_nat_test("3")) {
# check if the user is NATed
setflag(6);
fix_nated_register();
force_rport();
};
One needs to set the nat flag if the REGISTER request is for a NATed UA.
The above client_nat_test I was doing is worthless (stupid mistake). I was
checking if my server, that the replication originated from, was NATed... Of
course, both my servers always are NATed.
The only way that I can think of obtaining this info is by checking for
private IPs in the Contact header. Since the other server has already
performed the client_nat_test on the UA, the fix_nated_register information
exists (and is replicated correctly), so I think a plain search must be safe
(though I would be glad to hear any objections).
So, this block becomes:
if (search("^(Contact|m): .*@(192\.168\.|10\.|172\.16)")) {
setflag(6);
};
See above. BTW, nat_uac_test() has a test that only checks for private IPs
in Contact. I don't recall the number of the test right now, but they are
documented in the README. Why do you want the replicating server to have the
NAT flag set? Will both servers do pinging? BigIP, AFAIK, does a NATed
stickiness, right? So any server could ping with the same result. If each
server will have its own public IP (BigIP is not NATing), then the server
getting the replication has no chance of getting through a (port) restricted
NAT. I'm not sure if I understand your scenario here.
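As a side note on the plain search above: the pattern only tests the literal prefix 172\.16, so it would miss the rest of the 172.16.0.0/12 range (for example 172.20.x.x) and would also match a public address such as 172.160.x.x. A minimal Python sketch of the difference, with the pattern written with a literal "@" before the address (an assumption about what it is meant to match). The helper name contact_host_is_private and the sample headers are made up for illustration and are not part of SER:

```python
import ipaddress
import re

# The search pattern from the message, with "@" before the address (assumed).
THREAD_PATTERN = re.compile(r"^(Contact|m): .*@(192\.168\.|10\.|172\.16)")

# The three RFC 1918 private ranges.
RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def contact_host_is_private(header: str) -> bool:
    """Hypothetical helper: True if the IPv4 host after '@' is in RFC 1918 space."""
    match = re.search(r"@(\d{1,3}(?:\.\d{1,3}){3})", header)
    if match is None:
        return False
    try:
        addr = ipaddress.ip_address(match.group(1))
    except ValueError:
        # Looked like dotted-quad but is not a valid address (e.g. 999.1.1.1).
        return False
    return any(addr in net for net in RFC1918)


# Made-up sample headers comparing the two checks.
for hdr in (
    "Contact: <sip:bob@10.1.2.3:5060>",
    "Contact: <sip:bob@172.20.1.5:5060>",
    "Contact: <sip:bob@172.160.1.1>",
    "Contact: <sip:bob@81.2.3.4>",
):
    print(hdr, bool(THREAD_PATTERN.search(hdr)), contact_host_is_private(hdr))
```

Whether the stricter check matters depends on which private ranges the UAs actually sit in.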
save_memory("location");
break;
This "break" was redundant, since I'm always breaking in my main routing
block after processing in this block.
} else {
if (!search("^Contact:[ ]*\*") && client_nat_test("7"))
{
setflag(6);
fix_nated_register();
force_rport();
};
if (!radius_www_authorize("domain.com")) {
www_challenge("domain.com","0");
xlog("L_NOTICE","Could not authorize user %fu, IP %is \n");
break;
};
if (!check_to()) {
sl_send_reply("401", "Unauthorized");
xlog("L_NOTICE","From: does not match URI:, user %fu, IP %is \n");
break;
};
consume_credentials();
Despite doing all the above, the NAT device was still not keeping the bound
port open for the second SER. In my desperation, I removed the above
"consume_credentials();" and suddenly everything worked!!
The second server at last began to also send the magical natping to the
client in order to keep the NAT binding open.
But I don't get it... I'd be obliged to anyone that would step up and
explain this to me...
You must always check for your replicating server and exempt it from the
authorize section (or use allow_trusted())
>
> if (!save("location")) {
> xlog("L_WARN","MYSQL down? Could not save location\n");
> sl_reply_error();
> };
> xlog("L_INFO","Updating registration for user %fu, IP %is\n");
> add_rcv_param();
> if (!forward_tcp("10.21.128.233", 5060)) {
> xlog("L_WARN","Cannot replicate user %fu at server 10.21.128.233\n");
> };
> };
> }
Well well, here you have add_rcv_param, so even though the flag is not set,
the secondary ser will know the correct location.
When performing a "serctl ul show" on the servers, the results are as
follows:
1) For UAs coming from real IPs, serctl shows identical results on both
servers.
2) For NATed UAs there is difference in the "Flags:" field.
On the server that initially processed the request, the value of "Flags"
is 1 (OK!), while on the server that the registration was forwarded to,
the value of the "Flags" field is 257(??).
This issue still remains, just a little different.
"Flags: 0" is replicated as "Flags: 256", while "Flags: 1" as "Flags: 257".
(It looks as though {repl.flag} = {orig.flag} + 256)
Yes, I have observed that myself (not only for replication). I'm not sure
why. Have been thinking about going through the code, but...
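For what it's worth, the observed values are consistent with one extra bit (256, i.e. bit 8) being set on the replicated contact on top of the NAT bit. A small Python sketch of that reading follows. The names NAT_BIT and EXTRA_BIT are labels made up for this example, not taken from the usrloc source, and what bit 8 actually means is not established here:

```python
NAT_BIT = 0x001    # bit 0: the NAT flag ("1 if a UA is NATed"), per this thread
EXTRA_BIT = 0x100  # bit 8 (256): seen on replicated contacts, meaning unknown here


def split_flags(flags: int) -> dict:
    """Split a serctl "Flags:" value into the NAT bit, bit 8, and anything else."""
    return {
        "nat": bool(flags & NAT_BIT),
        "bit8": bool(flags & EXTRA_BIT),
        "other": flags & ~(NAT_BIT | EXTRA_BIT),
    }


# The four values reported in the thread.
for value in (0, 1, 256, 257):
    print(value, split_flags(value))
```

Read this way, 1 and 257 agree on the NAT bit, and 0 and 256 agree that it is clear; only bit 8 differs between the two servers.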
>
> My impression is that the Flags field should be 1 if a UA is NATed and 0
> if not. Am I wrong?
No.
> If I'm right on the above, then shouldn't both servers show the same
> result in Flags (i.e. 1) for NATed clients?
>
> On the other hand, if I restart SER (and the contacts are read from the
> database), then every "weird" 257 Flag is reset to 1, until the next UA
> re-registration (where it returns again to 257).
Yes, seen the same. Again, I'm not sure if this is purely cosmetic or a bug.
g-)