[Kamailio-Users] [LCR] About ping


Iñaki Baz Castillo

20 Mar 2009

6:29 p.m.

Hi, some questions about LCR module ping mechanism:

- How can I know which gws are down? The only thing I see is a NOTICE log:
  ---------
  NOTICE:lcr:gw_set_state: trunk "99.99.99.99:5060" from group: <2> is OFFLINE
  ---------
  Is there no MI command to list the down gateways?

- Why can't ping_interval be less than 180 seconds?

- In case of failure_route and "next_gw()", is the used gw (failing gw) automatically marked as down? (it would be useful so we don't need to wait "fr_timer" seconds for each request during "ping_interval").

Thanks.

-- Iñaki Baz Castillo


Iñaki Baz Castillo

20 Mar

6:49 p.m.

On Friday, 20 March 2009, Iñaki Baz Castillo wrote:

...

Why can't ping_interval be less than 180 seconds?

I see in the code:
-----------------
if (ping_interval < DEF_PING_TIMER) {
    ping_interval = DEF_PING_TIMER;
    LM_DBG("set OPTIONS timer to default value <%d>\n", DEF_PING_TIMER);
}
-----------------

I see no reason for such a big minimum value. I need this value to be 5-10 seconds.

-- Iñaki Baz Castillo
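
For illustration, the check quoted above can be isolated as a small function. This is only a sketch: `clamp_ping_interval` and its `floor_secs` argument are hypothetical names, not part of the lcr module, and 180 is simply the floor reported in this thread.

```c
#include <assert.h>

/* Hypothetical helper: returns the interval that would actually be used
 * when the configured value is compared against a minimum ("floor").
 * In the quoted module code the floor is DEF_PING_TIMER. */
int clamp_ping_interval(int requested, int floor_secs)
{
    assert(floor_secs > 0);
    return requested < floor_secs ? floor_secs : requested;
}
```

With a floor of 180, a requested interval of 10 seconds is silently raised to 180. With a much smaller floor (say 1, an assumed value), the requested 10 seconds would be kept.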

Andreas Heise

9:21 p.m.

Hello Iñaki,

you should ask Alexandr; he introduced this feature in rev5452, but I'm not sure whether he follows the lists all the time, so I'll forward your question to him...

regards, Andreas

2009/3/20 Iñaki Baz Castillo ibc@aliax.net

...

On Friday, 20 March 2009, Iñaki Baz Castillo wrote:

...

Why can't ping_interval be less than 180 seconds?

I see in the code:

if (ping_interval < DEF_PING_TIMER) {
    ping_interval = DEF_PING_TIMER;
    LM_DBG("set OPTIONS timer to default value <%d>\n", DEF_PING_TIMER);
}

I see no reason for such a big minimum value. I need this value to be 5-10 seconds.

-- Iñaki Baz Castillo

Kamailio (OpenSER) - Users mailing list Users@lists.kamailio.org http://lists.kamailio.org/cgi-bin/mailman/listinfo/users http://lists.openser-project.org/cgi-bin/mailman/listinfo/users

Iñaki Baz Castillo

7:48 p.m.

On Friday, 20 March 2009, Iñaki Baz Castillo wrote:

...

In case of failure_route and "next_gw()", is the used gw (failing gw) automatically marked as down? (it would be useful so we don't need to wait "fr_timer" seconds for each request during "ping_interval").

Unfortunately, I've checked and this is not the case: the gateway is not automatically set as "offline" when t_relay fails or when "next_gw" is called again.

Suggestion: it would be really great if "next_gw" automatically marked the previously used gw as down (since it didn't succeed). Or perhaps a new function "mark_previous_gw_offline()" which could be called manually before "next_gw()".

Opinions?

-- Iñaki Baz Castillo

jh＠tutpro.com

8:37 p.m.

Iñaki Baz Castillo writes:

...

Suggestion: it would be really great if "next_gw" automatically marked the previously used gw as down (since it didn't succeed). Or perhaps a new function "mark_previous_gw_offline()" which could be called manually before "next_gw()".

i think there is currently no information left about previous gw (it is removed from the avp). either it would need to be stored to another avp or the marking function would need to be given the ip address of the previous gw as argument.

-- juha

Iñaki Baz Castillo

8:37 p.m.

On Friday, 20 March 2009, Juha Heinanen wrote:

...

Iñaki Baz Castillo writes:

...
Suggestion: it would be really great if "next_gw" automatically marked the previously used gw as down (since it didn't succeed). Or perhaps a new function "mark_previous_gw_offline()" which could be called manually before "next_gw()".

i think there is currently no information left about previous gw (it is removed from the avp). either it would need to be stored to another avp or the marking function would need to be given the ip address of the previous gw as argument.

Yes, it's not as easy as I thought initially... but not impossible :)

-- Iñaki Baz Castillo

jh＠tutpro.com

8:26 p.m.

New subject: [Kamailio-Users] [LCR] About ping

Iñaki Baz Castillo writes:

...

How can I know which gws are down? The only thing I see is a NOTICE log:

NOTICE:lcr:gw_set_state: trunk "99.99.99.99:5060" from group: <2> is OFFLINE

Is there no MI command to list the down gateways?

lcr_gw_dump should include information if gw is down:

p = int2str((unsigned long)(*gws)[i].ping, &len);
attr = add_mi_attr(node, MI_DUP_VALUE, "PING", 4, p, len);
if (attr == NULL) goto err;

value 2 is offline.

...

Why can't ping_interval be less than 180 seconds?

i don't know. the ping stuff was contributed by another author.

...

In case of failure_route and "next_gw()", is the used gw (failing gw) automatically marked as down? (it would be useful so we don't need to wait "fr_timer" seconds for each request during "ping_interval").

it might be possible to write a function that you could call (from failure route) to mark the current gw offline.

-- juha

Ovidiu Sas

8:29 p.m.

On Fri, Mar 20, 2009 at 4:26 PM, Juha Heinanen jh@tutpro.com wrote:

...

Iñaki Baz Castillo writes:

> - How can I know which gws are down? The only thing I see is a NOTICE log:
> ---------
> NOTICE:lcr:gw_set_state: trunk "99.99.99.99:5060" from group: <2> is OFFLINE
> ---------
> Is there no MI command to list the down gateways?

lcr_gw_dump should include information if gw is down:

p = int2str((unsigned long)(*gws)[i].ping, &len);
attr = add_mi_attr(node, MI_DUP_VALUE, "PING", 4, p, len);
if (attr == NULL) goto err;

value 2 is offline.

> - Why can't ping_interval be less than 180 seconds?

i don't know. the ping stuff was contributed by another author.

It should be configurable via a param

...

> - In case of failure_route and "next_gw()", is the used gw (failing gw)
> automatically marked as down? (it would be useful so we don't need to
> wait "fr_timer" seconds for each request during "ping_interval").

it might be possible to write a function that you could call (from failure route) to mark the current gw offline.

And/or a function that returns the status of a particular gw. Before relaying, the script can check the availability of the gw and call next_gw() if necessary.

Regards, Ovidiu Sas

Iñaki Baz Castillo

8:36 p.m.

On Friday, 20 March 2009, Ovidiu Sas wrote:

...

...
> - Why can't ping_interval be less than 180 seconds?

i don't know. the ping stuff was contributed by another author.

It should be configurable via a param

Well, there is already a param "ping_interval" to set the... ping interval. But for now, if you set a value less than 180, it is set to 180. IMHO it makes no sense to have both parameters:
- ping_interval = 100
- max_ping_interval = 80
XD

...

...
> - In case of failure_route and "next_gw()", is the used gw (failing gw)
> automatically marked as down? (it would be useful so we don't need to
> wait "fr_timer" seconds for each request during "ping_interval").

it might be possible to write a function that you could call (from failure route) to mark the current gw offline.

And/or a function that returns the status of a particular gw. Before relaying, the script can check the availability of the gw and call next_gw() if necessary.

That's already done by the "load_gws()" function: it only loads gws marked as active. What I mean is that a gw is set online/offline only when the ping succeeds/fails, and it would be nice to be able to set a gw as "offline" manually from failure_route and so on (without waiting for the next ping to be executed, which could still take many seconds).

Regards.

-- Iñaki Baz Castillo

jh＠tutpro.com

8:45 p.m.

Ovidiu Sas writes:

...

And/or a function that returns the status of a particular gw. Before relaying, the script can check the availability of the gw and call next_gw() if necessary.

load_gws only loads gws that are currently known to be alive.

-- juha

jh＠tutpro.com

8:48 p.m.

if gw could be marked dead from failure route, i don't see much point in pinging the gws, because invite would be the ping.

-- juha

Iñaki Baz Castillo

9:16 p.m.

On Friday, 20 March 2009, Juha Heinanen wrote:

...

if gw could be marked dead from failure route, i don't see much point in pinging the gws, because invite would be the ping.

The previous ping would avoid "fr_timer" seconds of waiting in the INVITE.

-- Iñaki Baz Castillo

jh＠tutpro.com

21 Mar

5:26 a.m.

Iñaki Baz Castillo writes:

...

The previous ping would avoid "fr_timer" seconds of waiting in the INVITE.

inaki,

in a typical case, if ping interval is, say, 10 seconds, and fr_timer, say 3 seconds, i fail to see any big advantage in pinging the gws provided that if fr_timer fires, there is a possibility to mark the gw as being dead.

i'll try to implement this as an alternative/complement for the pinging for the next release.

-- juha

Iñaki Baz Castillo

5:51 p.m.

On Saturday, 21 March 2009, Juha Heinanen wrote:

...

Iñaki Baz Castillo writes:

...
The previous ping would avoid "fr_timer" seconds of waiting in the INVITE.

inaki,

in a typical case, if ping interval is, say, 10 seconds, and fr_timer, say 3 seconds, i fail to see any big advantage in pinging the gws provided that if fr_timer fires, there is a possibility to mark the gw as being dead.

True, but note that default fr_timer is 30 seconds :)

...

i'll try to implement this as an alternative/complement for the pinging for the next release.

That's great, thanks a lot.

-- Iñaki Baz Castillo

jh＠tutpro.com

22 Mar

5:59 a.m.

Iñaki Baz Castillo writes:

...

...
in a typical case, if ping interval is, say, 10 seconds, and fr_timer, say 3 seconds, i fail to see any big advantage in pinging the gws provided that if fr_timer fires, there is a possibility to mark the gw as being dead.

True, but note that default fr_timer is 30 seconds :)

when i forward a request to a gw, i set fr_timer to 3 sec.

...

...
i'll try to implement this as an alternative/complement for the pinging for the next release.

That's great, thanks a lot.

how about waking a gw up from death? there could be a timestamp telling when a gw died and then a module parameter telling how long to wait before the gw is tried again. that way there would be no need for a timer process.

-- juha
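
The timestamp idea above could look roughly like the following. This is a sketch under assumptions, not the lcr module's code: `struct gw_health`, `gw_mark_dead`, `gw_is_usable` and the `retry_after` argument (standing in for the proposed module parameter) are all hypothetical names.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical per-gateway health record. */
struct gw_health {
    int dead;        /* 1 after a failure was reported, 0 otherwise */
    time_t died_at;  /* when the failure was reported */
};

/* Would be called from failure handling, e.g. when fr_timer fires. */
void gw_mark_dead(struct gw_health *gw, time_t now)
{
    assert(gw != NULL);
    gw->dead = 1;
    gw->died_at = now;
}

/* Evaluated lazily when gateways are selected, so no timer process is
 * needed: a dead gw becomes eligible again once retry_after seconds
 * have passed since it was marked dead. */
int gw_is_usable(struct gw_health *gw, time_t now, int retry_after)
{
    assert(gw != NULL);
    if (!gw->dead)
        return 1;
    if (now - gw->died_at >= retry_after) {
        gw->dead = 0;  /* give the gateway another chance */
        return 1;
    }
    return 0;
}
```

If the revived gateway fails again, the failure handling would simply mark it dead once more, which restarts the waiting period.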

Iñaki Baz Castillo

7:38 p.m.

On Sunday, 22 March 2009, Juha Heinanen wrote:

...

how about waking a gw up from death? there could be a timestamp telling when a gw died and then a module parameter telling how long to wait before the gw is tried again. that way there would be no need for a timer process.

Good idea.

-- Iñaki Baz Castillo

Iñaki Baz Castillo

20 Mar

8:33 p.m.

On Friday, 20 March 2009, Juha Heinanen wrote:

...

Iñaki Baz Castillo writes:

...

How can I know which gws are down? The only thing I see is a NOTICE log:

NOTICE:lcr:gw_set_state: trunk "99.99.99.99:5060" from group: <2> is OFFLINE

Is there no MI command to list the down gateways?

lcr_gw_dump should include information if gw is down:

p = int2str((unsigned long)(*gws)[i].ping, &len);
attr = add_mi_attr(node, MI_DUP_VALUE, "PING", 4, p, len);
if (attr == NULL) goto err;

value 2 is offline.

Right, I get the following line:

GW:: GRP_ID=2 IP_ADD=x.x.x.x HOSTNAME= PORT=5060 SCHEME=sip TRANSPORT= STRIP=0 TAG= WEIGHT=1 FLAGS=0 PING=2

However, it requires parsing, and so on...
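
As a rough idea of the parsing involved, here is a minimal sketch that pulls the PING value out of one such line. The function names and the `GW_PING_OFFLINE` constant are made up for illustration; only the line layout and "value 2 is offline" come from this thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Per this thread, PING=2 means the gateway is offline.
 * The constant name is illustrative, not taken from the lcr source. */
#define GW_PING_OFFLINE 2

/* Return the numeric PING attribute of one "GW::" dump line,
 * or -1 if the line has no well-formed PING=<number> field. */
int gw_line_ping(const char *line)
{
    assert(line != NULL);
    const char *p = line;
    while ((p = strstr(p, "PING=")) != NULL) {
        /* Accept the key only at line start or after a space. */
        if (p == line || p[-1] == ' ') {
            char *end;
            long v = strtol(p + 5, &end, 10);
            if (end != p + 5 &&
                (*end == '\0' || *end == ' ' || *end == '\n'))
                return (int)v;
            return -1;
        }
        p += 5;
    }
    return -1;
}

/* Convenience wrapper: 1 if the line reports an offline gateway. */
int gw_line_is_offline(const char *line)
{
    return gw_line_ping(line) == GW_PING_OFFLINE;
}
```

A script could feed each line of the lcr_gw_dump output to `gw_line_is_offline` and print the matching lines to get a list of down gateways.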

...

...

Why can't ping_interval be less than 180 seconds?

i don't know. the ping stuff was contributed by another author.

OK, so could I submit a patch that just eliminates that requirement?

...

...

In case of failure_route and "next_gw()", is the used gw (failing gw) automatically marked as down? (it would be useful so we don't need to wait "fr_timer" seconds for each request during "ping_interval").

it might be possible to write a function that you could call (from failure route) to mark the current gw offline.

ok, that's exactly what I was thinking about. I will try it.

Thanks.

...

-- juha

-- Iñaki Baz Castillo

jh＠tutpro.com

8:40 p.m.

Iñaki Baz Castillo writes:

...

OK, so could I submit a patch that just eliminates that requirement?

yes, please do so.

-- juha
