Thank you, Henning,

Sorry for complicating the question!

Including 480 in t_check_status will indeed catch the Python script crashes, but it will also catch a bunch of legitimate cases (480 Temporarily Unavailable – Callee currently unavailable). So it won't work! Personally, I would try to change the return value to 503 in FreeSWITCH for the crashing Python scripts.

Let me clarify another part of my question: will the condition (t_branch_timeout() and !t_branch_replied()) work if FreeSWITCH blocks (and so times out) after a first reply (such as 100 Trying)?
The sequence:
 - Kamailio -> INVITE -> FreeSWITCH
 - Kamailio <- 100 Trying <- FreeSWITCH
 - FreeSWITCH blocks
 - failure_route signaled because t_branch_timeout
 - BUT t_branch_replied, so no alternative FS tried
Shouldn't all timeouts re-arm the failure route?

failure_route[REROUTE] {
    if (t_is_canceled()) {
        exit;
    }
    //  also 6xx ?
    if (t_check_status("5[0-9][0-9]") or (t_branch_timeout() and !t_branch_replied())) {
        // re-route to another available FreeSWITCH server
    }
    else if (t_branch_timeout()) {
        t_on_failure("REROUTE");
    }
}
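
An alternative that treats both timeout cases as reroutable might look like the untested sketch below. It assumes the dispatcher module is loaded and set up for failover (ds_select_dst() already called before the first relay) and that xlog is loaded; route[NEXT_FS] is a hypothetical name I made up for illustration. It also assumes that a timeout after a provisional reply means FreeSWITCH is stuck, which is not always true: it can just as well be a callee who never answers.

```
failure_route[REROUTE] {
    if (t_is_canceled()) {
        exit;
    }
    # Any timeout on this branch: no reply at all, or only provisional
    # replies (e.g. 100 Trying) before the final-response timer fired.
    if (t_branch_timeout()) {
        if (t_branch_replied()) {
            xlog("L_WARN", "REROUTE: timeout after a provisional reply\n");
        } else {
            xlog("L_WARN", "REROUTE: timeout, no reply at all\n");
        }
        route(NEXT_FS);
        exit;
    }
    if (t_check_status("5[0-9][0-9]")) {
        route(NEXT_FS);
        exit;
    }
}

# Hypothetical helper route: try the next FreeSWITCH from the dispatcher set.
route[NEXT_FS] {
    if (!ds_next_dst()) {
        # No candidate left: let tm send the final reply upstream.
        return;
    }
    # The failure route has to be armed again for the new branch.
    t_on_failure("REROUTE");
    t_relay();
}
```

Note that t_on_failure() only matters when a new branch is actually relayed afterwards, which is why it sits right before t_relay() here.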

Thanks,
Liviu

On Fri, Oct 29, 2021 at 11:17 AM Henning Westerholt <hw@gilawa.com> wrote:

Hello,

Not sure if I understood your question 100%. If you want to work on 480 replies in the failure_route, you should also include this in the t_check_status(..) function call.

Cheers,

Henning

--
Henning Westerholt – https://skalatan.de/blog/
Kamailio services – https://gilawa.com

From: sr-users <sr-users-bounces@lists.kamailio.org> On Behalf Of Liviu ANDRON
Sent: Thursday, October 28, 2021 11:51 AM
To: sr-users@lists.kamailio.org
Subject: [SR-Users] Does failure_route for redirection covers all error cases ?

Hello,

We are routing calls to FreeSWITCH servers.
We have a failure routing mechanism in place, which looks fairly common from what I have found:

route[INVITE] {

...
t_on_failure("REROUTE");

and
failure_route[REROUTE] {
    if (t_is_canceled()) {
        exit;
    }
    //  also 6xx ?
    if (t_check_status("5[0-9][0-9]") or (t_branch_timeout() and !t_branch_replied())) {
        // re-route to another available FreeSWITCH server
    }
}

The problem is that we don't capture all the failures we would like to. One example is 480, which FreeSWITCH sends in various cases (it is even the default in hangup_cause_to_sip, mod_sofia.c, https://github.com/signalwire/freeswitch/blob/master/src/mod/endpoints/mod_sofia/mod_sofia.c#L369): for a Python script crash, but also for legitimate cases like "user not registered" (USER_NOT_REGISTERED).
Are there other cases? For example, a branch that timed out after already replying (t_branch_timeout() and t_branch_replied()) and never recovers? Shouldn't any t_branch_timeout() re-arm the failure route?
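
On the timers, as far as I understand the tm module: fr_timer covers the case where no reply arrives at all, and fr_inv_timer covers an INVITE that got a provisional reply but no final one, which is the "blocked after 100 Trying" scenario. Below is an untested sketch of setting both per transaction with t_set_fr(). The values are examples I made up, not recommendations, and destination selection is left out.

```
route[INVITE] {
    # Destination selection omitted.
    # t_set_fr(fr_inv_timeout, fr_timeout), both in milliseconds:
    #   60000 = max wait for a final reply after a provisional one
    #    3000 = max wait for any reply at all
    t_set_fr(60000, 3000);
    t_on_failure("REROUTE");
    if (!t_relay()) {
        sl_reply_error();
    }
    exit;
}
```

Keep in mind that the first value also limits how long a call can ring, so it cannot be made arbitrarily short.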

Thanks,

Liviu