while testing new version of lcr module, i got this kind of tm errors:
Jun 22 11:27:36 localhost /usr/sbin/sip-router[16330]: ERROR: tm [t_reply.c:1066]: ERROR: t_should_relay_response: status rewrite by UAS: stored: 408, received: 403 Jun 22 11:27:36 localhost /usr/sbin/sip-router[16329]: ERROR: tm [t_reply.c:1066]: ERROR: t_should_relay_response: status rewrite by UAS: stored: 408, received: 403 Jun 22 11:27:37 localhost /usr/sbin/sip-router[16331]: ERROR: tm [t_reply.c:1066]: ERROR: t_should_relay_response: status rewrite by UAS: stored: 408, received: 403 Jun 22 11:27:39 localhost /usr/sbin/sip-router[16331]: ERROR: tm [t_reply.c:1066]: ERROR: t_should_relay_response: status rewrite by UAS: stored: 408, received: 403
they came when i first called t_relay 4 times. first call was in route block and the rest in failure route block. there was no reply to any of these four calls and each time fr_timer hit. on fifth call, gw responded with 403.
i don't get these kind of errors in k. what do they mean and how to get rid of them?
-- juha
On Jun 22, 2009 at 11:38, Juha Heinanen jh@tutpro.com wrote:
while testing new version of lcr module, i got this kind of tm errors:
Jun 22 11:27:36 localhost /usr/sbin/sip-router[16330]: ERROR: tm [t_reply.c:1066]: ERROR: t_should_relay_response: status rewrite by UAS: stored: 408, received: 403 Jun 22 11:27:36 localhost /usr/sbin/sip-router[16329]: ERROR: tm [t_reply.c:1066]: ERROR: t_should_relay_response: status rewrite by UAS: stored: 408, received: 403 Jun 22 11:27:37 localhost /usr/sbin/sip-router[16331]: ERROR: tm [t_reply.c:1066]: ERROR: t_should_relay_response: status rewrite by UAS: stored: 408, received: 403 Jun 22 11:27:39 localhost /usr/sbin/sip-router[16331]: ERROR: tm [t_reply.c:1066]: ERROR: t_should_relay_response: status rewrite by UAS: stored: 408, received: 403
they came when i first called t_relay 4 times. first call was in route block and the rest in failure route block. there was no reply to any of these four calls and each time fr_timer hit. on fifth call, gw responded with 403.
i don't get these kind of errors in k. what do they mean and how to get rid of them?
They mean that after a final reply was received or "faked" (automatically generated locally) on a branch, a new different reply for the same branch was received. In all this cases the new reply is discarded. The error messages is printed for everything except the following 2 cases: initial final reply on branch was 487 or the final reply was 408 followed by a 487 (these are very common and happen in normal operation).
In your case the final reply is 408 (timeout) and the new received one is 403. What might have happened is either the gw sent the 403 after the fr_inv_timer did hit for the branch corresponding to your fifth call or the gw sent the 403 for one of the previous branches, but it's hard to say without more info.
You do not get this messages in k, because there the error message is enclosed by #ifdef EXTRA_DEBUG.
Andrei
Andrei Pelinescu-Onciul writes:
In your case the final reply is 408 (timeout) and the new received one is 403. What might have happened is either the gw sent the 403 after the fr_inv_timer did hit for the branch corresponding to your fifth call or the gw sent the 403 for one of the previous branches, but it's hard to say without more info.
andrei,
i set fr_timer_avp to 3000 (3 sec) before calling t_relay. i have
modparam("tm", "fr_inv_timer", 75000)
i have defined fr_inv_timer_avp, but it has not been given a value during processing of this request. could the explanation be this:
k:
1.3.11. fr_timer_avp (string)
If you set the AVP to an empty string, the whole mechanism for variable timeout will be disabled, falling back to the static value.
sr:
1.3.29. fr_inv_timer_avp (string)
The value of fr_inv_timer timer can be overriden on per-transaction basis. The administrator can provide a value to be used for a particular transaction in an AVP. This parameter contains the name of the AVP that will be checked. If the AVP exists then its value will be used for the fr_inv_timer timer, effectively overriding the value configured in fr_inv_timer parameter for the current transaction.
what does sr do if fr_inv_timer_avp exists, but is not set before t_relay is called?
below is what comes to syslog.
-- juha
Jun 22 16:02:54 localhost /usr/sbin/sip-router[31499]: INFO: Routing initial INVITE to sip:0407058050@192.98.101.24:5060 and <<null>> Jun 22 16:02:59 localhost /usr/sbin/sip-router[31514]: INFO: Routing next INVITE to sip:0407058050@192.98.101.21:5060 Jun 22 16:03:04 localhost /usr/sbin/sip-router[31514]: INFO: Routing next INVITE to sip:0407058050@192.98.101.23:5060 Jun 22 16:03:09 localhost /usr/sbin/sip-router[31514]: INFO: Routing next INVITE to sip:0407058050@192.98.101.22:5060 Jun 22 16:03:14 localhost /usr/sbin/sip-router[31514]: INFO: Routing next INVITE to sip:0407058050@192.98.100.10:5060 Jun 22 16:03:15 localhost /usr/sbin/sip-router[31499]: ERROR: tm [t_reply.c:1066]: ERROR: t_should_relay_response: status rewrite by UAS: stored: 408, received: 403 Jun 22 16:03:15 localhost /usr/sbin/sip-router[31498]: ERROR: tm [t_reply.c:1066]: ERROR: t_should_relay_response: status rewrite by UAS: stored: 408, received: 403 Jun 22 16:03:15 localhost /usr/sbin/sip-router[31500]: ERROR: tm [t_reply.c:1066]: ERROR: t_should_relay_response: status rewrite by UAS: stored: 408, received: 403 Jun 22 16:03:18 localhost /usr/sbin/sip-router[31498]: ERROR: tm [t_reply.c:1066]: ERROR: t_should_relay_response: status rewrite by UAS: stored: 408, received: 403 Jun 22 16:03:19 localhost /usr/sbin/sip-router[31514]: WARNING: No more gateways for INVITE sip:0407058050@192.98.101.24:5060
On Jun 22, 2009 at 16:12, Juha Heinanen jh@tutpro.com wrote:
Andrei Pelinescu-Onciul writes:
In your case the final reply is 408 (timeout) and the new received one is 403. What might have happened is either the gw sent the 403 after the fr_inv_timer did hit for the branch corresponding to your fifth call or the gw sent the 403 for one of the previous branches, but it's hard to say without more info.
andrei,
i set fr_timer_avp to 3000 (3 sec) before calling t_relay. i have
^^^^^^ the avp timers are in seconds, so you should set them it to 3. All the other timeouts are in ms (I'll add a note to the docs).
modparam("tm", "fr_inv_timer", 75000)
i have defined fr_inv_timer_avp, but it has not been given a value during processing of this request. could the explanation be this:
k:
1.3.11. fr_timer_avp (string)
If you set the AVP to an empty string, the whole mechanism for variable timeout will be disabled, falling back to the static value.
sr:
1.3.29. fr_inv_timer_avp (string)
The value of fr_inv_timer timer can be overriden on per-transaction basis. The administrator can provide a value to be used for a particular transaction in an AVP. This parameter contains the name of the AVP that will be checked. If the AVP exists then its value will be used for the fr_inv_timer timer, effectively overriding the value configured in fr_inv_timer parameter for the current transaction.
Yes, this is it.
what does sr do if fr_inv_timer_avp exists, but is not set before t_relay is called?
An avp always has a value if it exists. If its value is the empty string or 0 => the avp integer value will be 0 => in this case the timer will be set to 0.
I'll add a check for 0 in sr too (so that it will be compatible with the k way).
Note that in ser the modern way to use avps for timers is via t_set_fr() & friends (you can set everything from fr timers to retransmission and lifetime, you can later reset to default via t_reset_fr() and you can use it only when needed).
Andrei
Andrei Pelinescu-Onciul writes:
i set fr_timer_avp to 3000 (3 sec) before calling t_relay. i have
^^^^^^ the avp timers are in seconds, so you should
set them it to 3. All the other timeouts are in ms (I'll add a note to the docs).
i did so, but it didn't affect the error messages. while doing it i noticed that fr_timer is not very accurate. i call next_gw() when previous call times out and syslog reports 5 second intervals:
Jun 22 17:56:46 localhost /usr/sbin/sip-router[5881]: INFO: Routing initial INVITE to sip:0407058050@192.98.101.21:5060 and <<null>> Jun 22 17:56:51 localhost /usr/sbin/sip-router[5909]: INFO: Routing next INVITE to sip:0407058050@192.98.101.24:5060 Jun 22 17:56:56 localhost /usr/sbin/sip-router[5909]: INFO: Routing next INVITE to sip:0407058050@192.98.101.23:5060 Jun 22 17:57:01 localhost /usr/sbin/sip-router[5909]: INFO: Routing next INVITE to sip:0407058050@192.98.101.22:5060 Jun 22 17:57:06 localhost /usr/sbin/sip-router[5909]: INFO: Routing next INVITE to sip:0407058050@192.98.100.10:5060
An avp always has a value if it exists. If its value is the empty string or 0 => the avp integer value will be 0 => in this case the timer will be set to 0.
i added xlog statement before first next_gw() call and it reports that fr_inv_timer_avp has value null:
Jun 22 17:56:46 localhost /usr/sbin/sip-router[5881]: INFO: fr_inv_timer_avp <null>
I'll add a check for 0 in sr too (so that it will be compatible with the k way).
that would be nice so that script writer can leave the avp unset if there is no need to change the default set by the module parameter variable.
Note that in ser the modern way to use avps for timers is via t_set_fr() & friends (you can set everything from fr timers to retransmission and lifetime, you can later reset to default via t_reset_fr() and you can use it only when needed).
thanks for the pointer. i'll take a look at those.
-- juha
i updated my tm module from git and replaced tm timer avps with t_set_fr calls. before i start calling next_gw()/t_relay(), i do
t_set_fr(75000, 3000)
accuracy of fr_timeout became good, but i'm still getting the error messages:
Jun 22 19:38:53 localhost /usr/sbin/sip-router[11411]: INFO: Routing initial INVITE to sip:0407058050@192.98.101.21:5060 and <<null>> Jun 22 19:38:56 localhost /usr/sbin/sip-router[11437]: INFO: Routing next INVITE to sip:0407058050@192.98.101.24:5060 Jun 22 19:38:59 localhost /usr/sbin/sip-router[11437]: INFO: Routing next INVITE to sip:0407058050@192.98.101.23:5060 Jun 22 19:39:02 localhost /usr/sbin/sip-router[11437]: INFO: Routing next INVITE to sip:0407058050@192.98.101.22:5060 Jun 22 19:39:05 localhost /usr/sbin/sip-router[11437]: INFO: Routing next INVITE to sip:0407058050@192.98.100.10:5060 Jun 22 19:39:07 localhost /usr/sbin/sip-router[11412]: ERROR: tm [t_reply.c:1066]: ERROR: t_should_relay_response: status rewrite by UAS: stored: 408, received: 403 Jun 22 19:39:07 localhost /usr/sbin/sip-router[11411]: ERROR: tm [t_reply.c:1066]: ERROR: t_should_relay_response: status rewrite by UAS: stored: 408, received: 403 Jun 22 19:39:07 localhost /usr/sbin/sip-router[11410]: ERROR: tm [t_reply.c:1066]: ERROR: t_should_relay_response: status rewrite by UAS: stored: 408, received: 403
like before, in this test the first four t_relays resulted in fr_timeout (gw didn't respond) and the last returned 403 (forbidden).
what is the next step, i.e., how to get rid of the error messages? i don't see that i would be doing anything wrong here in my cfg file.
-- juha
On Jun 22, 2009 at 19:44, Juha Heinanen jh@tutpro.com wrote:
i updated my tm module from git and replaced tm timer avps with t_set_fr calls. before i start calling next_gw()/t_relay(), i do
t_set_fr(75000, 3000)
accuracy of fr_timeout became good, but i'm still getting the error messages:
It's strange that the accuracy became good, because the timeout is set in a similar way when fr_timer_avp is used (so it should be always good, always at max. 125 ms drift with default TIMER_TICKS_HZ).
Jun 22 19:38:53 localhost /usr/sbin/sip-router[11411]: INFO: Routing initial INVITE to sip:0407058050@192.98.101.21:5060 and <<null>> Jun 22 19:38:56 localhost /usr/sbin/sip-router[11437]: INFO: Routing next INVITE to sip:0407058050@192.98.101.24:5060 Jun 22 19:38:59 localhost /usr/sbin/sip-router[11437]: INFO: Routing next INVITE to sip:0407058050@192.98.101.23:5060 Jun 22 19:39:02 localhost /usr/sbin/sip-router[11437]: INFO: Routing next INVITE to sip:0407058050@192.98.101.22:5060 Jun 22 19:39:05 localhost /usr/sbin/sip-router[11437]: INFO: Routing next INVITE to sip:0407058050@192.98.100.10:5060 Jun 22 19:39:07 localhost /usr/sbin/sip-router[11412]: ERROR: tm [t_reply.c:1066]: ERROR: t_should_relay_response: status rewrite by UAS: stored: 408, received: 403 Jun 22 19:39:07 localhost /usr/sbin/sip-router[11411]: ERROR: tm [t_reply.c:1066]: ERROR: t_should_relay_response: status rewrite by UAS: stored: 408, received: 403 Jun 22 19:39:07 localhost /usr/sbin/sip-router[11410]: ERROR: tm [t_reply.c:1066]: ERROR: t_should_relay_response: status rewrite by UAS: stored: 408, received: 403
like before, in this test the first four t_relays resulted in fr_timeout (gw didn't respond) and the last returned 403 (forbidden).
what is the next step, i.e., how to get rid of the error messages? i don't see that i would be doing anything wrong here in my cfg file.
We could log it at a lower level or add a config option for skipping the log message, but first we should see if it doesn't point to some other more serious error. I still don't understand why you get them.
Are you sure that the gateway does not respond later with 403 for the initial branches (after more then 14s)? Do you add a new branch in all the cases?
Could you send me a packet dump along with the log (preferably at a high debug level) and a short example on how you are doing this gateway failover from the failure route? It's most likely a too verbose message for your setup, but I'd like to be sure of that and that it does not hide a more important problem.
Andrei
On 22-06 15:53, Andrei Pelinescu-Onciul wrote:
On Jun 22, 2009 at 16:12, Juha Heinanen jh@tutpro.com wrote:
Andrei Pelinescu-Onciul writes:
In your case the final reply is 408 (timeout) and the new received one is 403. What might have happened is either the gw sent the 403 after the fr_inv_timer did hit for the branch corresponding to your fifth call or the gw sent the 403 for one of the previous branches, but it's hard to say without more info.
andrei,
i set fr_timer_avp to 3000 (3 sec) before calling t_relay. i have
^^^^^^ the avp timers are in seconds, so you should
set them it to 3. All the other timeouts are in ms (I'll add a note to the docs).
Shouldn't we change the unit of AVP timers to ms too, just to be consistent? I've bumped into this inconsistency many times in the past and it was always a great source of confusion for me.
Jan.
On Jun 22, 2009 at 18:55, Jan Janak jan@iptel.org wrote:
On 22-06 15:53, Andrei Pelinescu-Onciul wrote:
On Jun 22, 2009 at 16:12, Juha Heinanen jh@tutpro.com wrote:
Andrei Pelinescu-Onciul writes:
In your case the final reply is 408 (timeout) and the new received one is 403. What might have happened is either the gw sent the 403 after the fr_inv_timer did hit for the branch corresponding to your fifth call or the gw sent the 403 for one of the previous branches, but it's hard to say without more info.
andrei,
i set fr_timer_avp to 3000 (3 sec) before calling t_relay. i have
^^^^^^ the avp timers are in seconds, so you should
set them it to 3. All the other timeouts are in ms (I'll add a note to the docs).
Shouldn't we change the unit of AVP timers to ms too, just to be consistent? I've bumped into this inconsistency many times in the past and it was always a great source of confusion for me.
Since I think new scripts should use t_set_fr() I don't have any preference for fr_timer_avp unit. However if we change it, both ser and kamailio users that have it set in seconds somewhere in some DB, will have problems. We could change its name to fr_timer_avp_ms (so at least users will get an error) or leave it the way it is (at least is documented now, and while inconsistent, it will have the minimum migration impact).
Andrei
Jan Janak writes:
Shouldn't we change the unit of AVP timers to ms too, just to be consistent? I've bumped into this inconsistency many times in the past and it was always a great source of confusion for me.
i agree with that proposal. perhaps we could remove of the tm timer avps altogether. that way it would be easier for k user to recognize that changes are needed.
-- juha