Hello everyone,
I am racking my brain over a peculiar behaviour of the TM timers. I managed
to narrow it down after some trial and error, but I still do not fully
understand its causes and implications, and I am curious whether any of the
devs or other users with more knowledge than me can offer a clarification
or suggestion.
My scenario is the following, running Kamailio 5.8.5 from the Debian
bookworm packages:
1. An INVITE for a local user is received
2. Preferences for the local user are read. The user has a certain
timeout after which call forwarding should trigger. My setup aims to engage
this timeout only if the user has provided a "proper" progress indication
(180 or 183), and use a default value of 30 s if no progress is received.
For this example, we assume this customized timeout is 10 s.
3. Standard usrloc lookup. A single location is found, behind an edge
proxy (registered with Path).
4. INVITE gets sent (statefully) to the user through the edge proxy, UDP
transport only. Before routing, the fr_inv_timer is set to the default
value mentioned above using t_set_fr(30000).
5. The user replies with a 180. In the onreply_route, since this is the
first proper progress indication, the fr_inv_timer is changed to the
customized value by calling t_set_fr(10000).
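
The steps above can be sketched roughly as the following configuration
fragment. This is only an illustration of the described flow, not the
actual config: the route names TO_USER, USER_REPLY and USER_FAILURE are
hypothetical, the timeouts are hard-coded instead of being read from user
preferences, and the check that the 180/183 is the first progress
indication is left out.

```cfg
# Hypothetical route names; timeouts hard-coded for the example.
route[TO_USER] {
    # Step 3: standard usrloc lookup
    if (!lookup("location")) {
        sl_send_reply("404", "Not Found");
        exit;
    }

    # Step 4: default fr_inv_timer of 30 s, used if no progress arrives
    t_set_fr(30000);

    t_on_reply("USER_REPLY");
    t_on_failure("USER_FAILURE");
    t_relay();
    exit;
}

onreply_route[USER_REPLY] {
    # Step 5: on a proper progress indication (180 or 183), switch to the
    # customized timeout (10 s in this example)
    if (status =~ "^18[03]$") {
        t_set_fr(10000);
    }
}

failure_route[USER_FAILURE] {
    if (t_is_canceled()) {
        exit;
    }
    # Call forwarding logic would go here once the INVITE times out.
}
```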
At this point, I would expect the INVITE transaction to time out and go
into call forwarding from the failure_route after the 10 s set in the
onreply_route. However, I am getting feedback from users that this does
not always happen, and that forwarding sometimes triggers only after the
30 s set initially.
I tried to reproduce the issue in a controlled setup, and the reliability
of t_set_fr() in onreply_route seems to depend on the delay with which the
180 is received. Introducing pauses before the 180 in a simulated client
with sipp, I get:
- With 400 ms, call forwarding always triggers as expected (after 10 s)
- With 440 ms, I see 1/10 failures (call forwarding triggers after 30 s)
- With 450 ms, I see 5/10 failures
- With >500 ms, I seem to get 10/10 failures
So it seems that t_set_fr() in onreply_route is sensitive to timing.
Regarding this sensitivity, I noticed that 500 ms is the default
retr_timer1 I am using, and that changing it changes how long t_set_fr()
remains effective in onreply_route: for example, t_set_retr(1000) makes
it effective for delays of up to 1000 ms in the reception of the 180.
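
As a rough sketch, the per-transaction retransmission change mentioned
above would sit next to the initial t_set_fr() call, before relaying. One
assumption to flag: as far as I recall from the TM module documentation,
t_set_retr() takes two arguments (the T1 and T2 retransmission intervals
in ms, with 0 leaving a value unchanged), so the fragment uses that form
rather than the single-argument shorthand written above.

```cfg
# Placed in the request route, before t_relay().
# Assumption: two-argument form; 0 keeps the current T2 interval.
t_set_retr(1000, 0);   # first retransmission interval raised to 1000 ms
t_set_fr(30000);       # default fr_inv_timer, as in step 4
t_relay();
```

This only widens the window observed in the tests; it also delays the
first INVITE retransmission over UDP, so it is a trade-off rather than a
fix.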
Is this known and/or is there any workaround to get the behaviour I hoped
to implement? If the limitation cannot be worked around, could it be useful
to mention it in the t_set_fr() documentation?
Thanks in advance for your help!