Hi, this is a common issue reported by other users. Imagine I do failover between two gateways based on request timeout:
------------------------------
failure_route[FAILURE_ROUTE_OUT] {
    # Locally generated 408 due to transaction timeout:
    if (t_local_replied("last") && $T_reply_code==408) {
        xlog("L_ERR", "_ERROR_ $T_reply_code local replied => failover\n");
        ... do failover ...
    }
}
------------------------------
The above code runs when gateway-1 doesn't reply at all, so "fr_timer" expires (let's say 5 seconds).
The problem is that the same code also runs when gateway-1 sends no final response within "fr_inv_timer" (let's say 150 seconds). That is, the 408 is locally generated even if the proxy has received provisional responses for that transaction.
The main problem this causes is that the list of gateways could be exhausted without any of them giving a final reply, and when no more gateways are available it's common to reply 500/503 to the client.
Two workarounds:
a) When no more gateways are available, check whether $T_reply_code==408 and then reply 408 rather than 500/503. I don't like it: if no gateway has sent a provisional response within "fr_timer" then I have a *real* problem and I should reply 500/503.
b) Enable a flag(PROVISIONAL_RECEIVED) when a provisional response is received and then reply 408/480 when no more gateways are available.
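Workaround (b) could be sketched roughly as follows. This is only an illustration under stated assumptions: flag number 10 and the reply route name REPLY_ROUTE_OUT are made up for the example, and it assumes that a flag set in an onreply_route is still visible in the failure_route of the same transaction, which should be verified on the version in use.

```
onreply_route[REPLY_ROUTE_OUT] {
    # Flag 10 is a hypothetical "provisional received" marker.
    # Any 1xx reply means some gateway is alive.
    if (status =~ "1[0-9][0-9]") {
        setflag(10);
    }
}

failure_route[FAILURE_ROUTE_OUT] {
    if (t_local_replied("last") && $T_reply_code == 408) {
        # ... try the next gateway here; once none is left:
        if (isflagset(10)) {
            # A gateway was alive, but nobody answered the call.
            t_reply("408", "Request Timeout");
        } else {
            # No gateway sent anything at all: a real problem.
            t_reply("503", "Service Unavailable");
        }
    }
}
```

The request route would need to arm both routes with t_on_reply("REPLY_ROUTE_OUT") and t_on_failure("FAILURE_ROUTE_OUT") before calling t_relay().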
Anyhow, there should be (IMHO) a built-in way to determine if the 408 occurs after "fr_timer" or "fr_inv_timer". Am I missing something? Do you consider this a real need?
Regards.
2010/3/25 Iñaki Baz Castillo ibc@aliax.net:
> Anyhow, there should be (IMHO) a built-in way to determine if the 408 occurs after "fr_timer" or "fr_inv_timer".
What I suggest is:
- Create a custom SIP status code (e.g. 498) which means "fr_timer expiration", that is: no provisional or final response has been received within "fr_timer" seconds (typically 5-10 seconds).
This 498 really means that the server didn't reply at all !! so we must do failover or inform the client about a real problem (500/503).
- In case the expiration occurs due to "fr_inv_timer" (that is, after receiving at least a 100 or other 1XX reply within "fr_timer") then the usual 408 would be locally generated (as it is now).
This 408 would mean that the server hasn't sent a final response within "fr_inv_timer" (100-200 seconds), which just indicates that nobody answered or rejected the call, but the server is alive.
Opinions?
Iñaki Baz Castillo writes:
> - Create a custom SIP status code (e.g. 498) which means "fr_timer expiration", that is: no provisional or final response has been received within "fr_timer" seconds (typically 5-10 seconds).
i simply set a flag if a provisional reply has been received.
-- juha
2010/3/25 Juha Heinanen jh@tutpro.com:
> Iñaki Baz Castillo writes:
> > - Create a custom SIP status code (e.g. 498) which means "fr_timer expiration", that is: no provisional or final response has been received within "fr_timer" seconds (typically 5-10 seconds).
> i simply set a flag if a provisional reply has been received.
Yes, this is the second workaround I described in my initial mail, but even if it works it still feels like a workaround to me. There should be some way built into the TM module to determine it.
Another solution would be for the TM module to set a pseudo-variable if the 408 is generated due to "fr_timer", e.g. $tm_fr_timeout = true.
Iñaki Baz Castillo writes:
> Yes, this is the second workaround I described in my initial mail, but even if it works it still feels like a workaround to me. There should be some way built into the TM module to determine it.
> Another solution would be for the TM module to set a pseudo-variable if the 408 is generated due to "fr_timer", e.g. $tm_fr_timeout = true.
fine with me, but the flag workaround has worked fine for years for me and is easy to implement in the script.
in my opinion, there are many more higher priority things to do in core/tm module.
-- juha
On Mar 25, 2010 at 13:44, Iñaki Baz Castillo ibc@aliax.net wrote:
> Hi, this is a common issue reported by other users. Imagine I do failover between two gateways based on request timeout:
> failure_route[FAILURE_ROUTE_OUT] {
>     # Locally generated 408 due to transaction timeout:
>     if (t_local_replied("last") && $T_reply_code==408) {
>         xlog("L_ERR", "_ERROR_ $T_reply_code local replied => failover\n");
>         ... do failover ...
>     }
> }
> The above code runs when gateway-1 doesn't reply at all, so "fr_timer" expires (let's say 5 seconds).
> The problem is that the same code also runs when gateway-1 sends no final response within "fr_inv_timer" (let's say 150 seconds). That is, the 408 is locally generated even if the proxy has received provisional responses for that transaction.
Try t_any_replied() (http://sip-router.org/docbook/sip-router/branch/master/modules/tm/tm.html#t_...)
E.g.:
    if (t_check_status("408")) {
        if (!t_any_replied()) {
            t_reply(503, "Try again later, busy gws");
            exit;
        }
    }
> The main problem this causes is that the list of gateways could be exhausted without any of them giving a final reply, and when no more gateways are available it's common to reply 500/503 to the client.
> Two workarounds:
> a) When no more gateways are available, check whether $T_reply_code==408 and then reply 408 rather than 500/503. I don't like it: if no gateway has sent a provisional response within "fr_timer" then I have a *real* problem and I should reply 500/503.
> b) Enable a flag(PROVISIONAL_RECEIVED) when a provisional response is received and then reply 408/480 when no more gateways are available.
> Anyhow, there should be (IMHO) a built-in way to determine if the 408 occurs after "fr_timer" or "fr_inv_timer". Am I missing something? Do you consider this a real need?
I don't consider this a real need, however it already can be done easily :-)
Andrei
Andrei Pelinescu-Onciul writes:
> > The problem is that the same code also runs when gateway-1 sends no final response within "fr_inv_timer" (let's say 150 seconds). That is, the 408 is locally generated even if the proxy has received provisional responses for that transaction.
> Try t_any_replied()
andrei,
description of the function has:
Returns true if at least one of the current transactions branches did receive some reply in the past.
in serial forking to gws, i reset my "gw alive" flag before each t_relay() call. does the above mean that if gw 1 has replied something, then t_any_replied() will be true also for gw 2 even if no reply comes back from it?
-- juha
2010/3/25 Juha Heinanen jh@tutpro.com:
> andrei,
> description of the function has:
> Returns true if at least one of the current transactions branches did receive some reply in the past.
> in serial forking to gws, i reset my "gw alive" flag before each t_relay() call. does the above mean that if gw 1 has replied something, then t_any_replied() will be true also for gw 2 even if no reply comes back from it?
+1
On Mar 25, 2010 at 20:06, Juha Heinanen jh@tutpro.com wrote:
> Andrei Pelinescu-Onciul writes:
> > > The problem is that the same code also runs when gateway-1 sends no final response within "fr_inv_timer" (let's say 150 seconds). That is, the 408 is locally generated even if the proxy has received provisional responses for that transaction.
> > Try t_any_replied()
> andrei,
> description of the function has:
> Returns true if at least one of the current transactions branches did receive some reply in the past.
> in serial forking to gws, i reset my "gw alive" flag before each t_relay() call. does the above mean that if gw 1 has replied something, then t_any_replied() will be true also for gw 2 even if no reply comes back from it?
t_any_replied() returns true if a reply was received on any branch, so yes it should return true in your example.
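To illustrate the difference in a serial-forking setup, here is a rough sketch. It assumes a hypothetical flag 10 that an onreply_route (here called REPLY_ROUTE_OUT, also made up) sets on any reply, and a hypothetical route[NEXT_GW] that selects the next gateway. Whether flags set in an onreply_route are visible in the failure_route should be checked on the version in use.

```
failure_route[FAILURE_ROUTE_OUT] {
    if (t_check_status("408")) {
        if (!isflagset(10)) {
            # Per-gateway view: the gateway tried last sent nothing.
            xlog("L_INFO", "last gateway did not reply\n");
        }
        if (t_any_replied()) {
            # Transaction-wide view: true if any branch got a reply,
            # so an earlier gateway's reply also counts here.
            xlog("L_INFO", "some gateway replied in this transaction\n");
        }
    }
    # Clear the per-gateway marker before the next attempt.
    resetflag(10);
    # route[NEXT_GW] is hypothetical: it would pick the next gateway.
    route(NEXT_GW);
    t_on_reply("REPLY_ROUTE_OUT");
    t_on_failure("FAILURE_ROUTE_OUT");
    t_relay();
}
```

With this layout the flag answers "did the current gateway reply?", while t_any_replied() answers "did any gateway reply during this transaction?".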
Andrei
2010/3/25 Andrei Pelinescu-Onciul andrei@iptel.org:
> Try t_any_replied() (http://sip-router.org/docbook/sip-router/branch/master/modules/tm/tm.html#t_...)
> E.g.:
>     if (t_check_status("408")) {
>         if (!t_any_replied()) {
>             t_reply(503, "Try again later, busy gws");
>             exit;
>         }
>     }
According to the documentation:
----------------------
t_any_replied()
Returns true if at least one of the current transactions branches did receive some reply in the past. If called from a failure or onreply route, the "current" reply is not taken into account.
-----------------------
Does it also return true if the only received response was 1XX?
Thanks.
Iñaki Baz Castillo writes:
> Returns true if at least one of the current transactions branches did receive some reply in the past. If called from a failure or onreply route, the "current" reply is not taken into account.
> Does it also return true if the only received response was 1XX?
i don't know, but "1xx" is "some response".
-- juha
On Mar 25, 2010 at 19:07, Iñaki Baz Castillo ibc@aliax.net wrote:
> 2010/3/25 Andrei Pelinescu-Onciul andrei@iptel.org:
> > Try t_any_replied() (http://sip-router.org/docbook/sip-router/branch/master/modules/tm/tm.html#t_...)
> > E.g.:
> >     if (t_check_status("408")) {
> >         if (!t_any_replied()) {
> >             t_reply(503, "Try again later, busy gws");
> >             exit;
> >         }
> >     }
> According to the documentation:
> t_any_replied()
> Returns true if at least one of the current transactions branches did receive some reply in the past. If called from a failure or onreply route, the "current" reply is not taken into account.
> Does it also return true if the only received response was 1XX?
Yes, it should (I don't remember if I ever tested it with provisional replies).
Andrei