On 28.10.2009 12:55, Iñaki Baz Castillo wrote:
2009/10/28 Daniel-Constantin Mierla <miconda(a)gmail.com>:
The error remains and it affects only the dialog module. I log to a file
whenever the Munin plugin performing the fifo command gets a dialog_xxxx value
greater than 200 (note that there is no traffic in this Kamailio!), and
this is what I've got since last Friday (3 days ago), with the latest 1.5
version installed:
Sat Oct 17 00:50:03 CEST 2009: dialogs_early = 134217728
Sat Oct 17 00:50:03 CEST 2009: dialogs_total = 134217728
Sat Oct 17 16:50:03 CEST 2009: dialogs_early = 2820392
Sat Oct 17 16:50:03 CEST 2009: dialogs_total = 2820392
This is interesting, as you say there is no traffic on the server. Since it is a
testing system, I guess we can play a bit with it.
Well, no, it's already in production; anyhow, I can experiment a bit on it :)
ok.
The error might be due to a bug in FIFO command parsing or a bug in
printing the command to the fifo file.
If it helps, I could reproduce the problem without the munin plugin,
just by running the command "kamctl fifo get_statistics active_dialogs"
several times. Randomly it returns an error "fifo command must begin
with :: total_size".
I tried (about 100 times) and it returns 0 every time. I had no traffic at
all. Shall I do something else to reproduce?
Thanks,
Daniel
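The repeated-run check discussed above can be scripted rather than typed by hand. What follows is only a minimal sketch: it assumes a healthy reply contains a line ending in "= <number>" (that reply format is an assumption, not something confirmed in this thread), and stress_stat is a hypothetical helper name, not part of kamctl.

```shell
#!/bin/sh
# stress_stat COUNT CMD [ARGS...]
# Hypothetical helper: runs CMD COUNT times and prints every run whose exit
# status is non-zero or whose output has no line ending in "= <number>".
# The "= <number>" reply format is an assumption made for this sketch.
stress_stat() (
    count=$1
    shift
    i=1
    bad=0
    while [ "$i" -le "$count" ]; do
        # Capture stdout and stderr together so error replies are kept too.
        out=$("$@" 2>&1)
        rc=$?
        if [ "$rc" -ne 0 ] || ! printf '%s\n' "$out" | grep -Eq '= *[0-9]+ *$'; then
            bad=$((bad + 1))
            printf 'run %d: rc=%d output=[%s]\n' "$i" "$rc" "$out"
        fi
        i=$((i + 1))
    done
    printf 'suspicious runs: %d of %d\n' "$bad" "$count"
)
```

It could be invoked as: stress_stat 100 kamctl fifo get_statistics active_dialogs. Because the raw reply is printed for each suspicious run, an intermittent error string would show up even when the exit status is 0.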
The best option is to add a debug message that will print the entire fifo
command received by kamailio. Should I do a patch for you?
I will try by myself, thanks.
What is stranger is the dialog value: as there is no traffic, that value
should not be affected at all and should be zero. If you run "kamctl
get_statistics all" from the command line, do you get the same value?
The main problem is that this issue occurs just *some* times. Note
that munin plugins are executed every 5 minutes and the MI command
fails just "some times", so it would be really impossible to check if
"kamctl get_statistics" also fails.
However, about the big dialog numbers the command returns, I extract
that value as follows (note that I did some modifications in the munin
plugin trying to detect the issue):
-------------------
dialogs_early=$(kamctl fifo get_statistics early_dialogs) 2>> /tmp/munin-kamailio.error
retcode=$?
if [ $retcode -ne 0 ] ; then
    echo "$(date): kamctl fifo get_statistics early_dialogs -> retcode = $retcode" >> /tmp/munin-kamailio.error
fi
# This is not a server in production so dialog_early must be near 0:
if [ $dialogs_early -ge 200 ] ; then
    echo "$(date): dialogs_early = $dialogs_early" 2>&1 >> /tmp/munin-kamailio.error
fi
------------------
And the output I get in /tmp/munin-kamailio.error is
Sat Oct 17 00:50:03 CEST 2009: dialogs_early = 134217728
Sat Oct 17 16:50:03 CEST 2009: dialogs_early = 2820392
Fri Oct 23 12:45:04 CEST 2009: dialogs_early = 2863080
That is:
The command "dialogs_early=$(kamctl fifo get_statistics early_dialogs) 2>> /tmp/munin-kamailio.error"
ALWAYS returns 0 (correct return code in bash), so it never "fails".
The value the above command returns is obviously wrong (134217728, 2820392...).
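Since the exit status is 0 even when the number is bogus, it may help to log the complete raw reply next to the parsed value. The following is a rough sketch only: log_if_suspicious is a hypothetical helper, and it assumes the first line of the reply is either a bare number or ends in "= <number>" (neither format is confirmed in this thread).

```shell
#!/bin/sh
# log_if_suspicious LOGFILE THRESHOLD CMD [ARGS...]
# Hypothetical helper: runs CMD, parses a number from the first output line,
# and appends the full raw output to LOGFILE when the exit status is non-zero,
# no number could be parsed, or the number is >= THRESHOLD.
# Prints the parsed number (if any); returns 1 when something was logged.
log_if_suspicious() (
    logfile=$1
    threshold=$2
    shift 2
    raw=$("$@" 2>&1)
    rc=$?
    # Assumed reply format: "<number>" or "<anything> = <number>" on line 1.
    value=$(printf '%s\n' "$raw" |
        sed -n '1s/^\(.*= *\)\{0,1\}\([0-9][0-9]*\) *$/\2/p')
    status=0
    if [ "$rc" -ne 0 ] || [ -z "$value" ] || [ "$value" -ge "$threshold" ]; then
        printf '%s: rc=%s raw=[%s]\n' "$(date)" "$rc" "$raw" >> "$logfile"
        status=1
    fi
    [ -n "$value" ] && printf '%s\n' "$value"
    return "$status"
)
```

A call such as log_if_suspicious /tmp/munin-kamailio.error 200 kamctl fifo get_statistics early_dialogs would then leave the exact text that came back in the log, which makes it possible to tell a wrong number from a malformed reply.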
--
Daniel-Constantin Mierla
* Kamailio SIP Masterclass, Nov 9-13, 2009, Berlin
* http://www.asipto.com/index.php/sip-router-masterclass/