[sr-dev] [kamailio/kamailio] RPC command stats.get_statistics randomly reporting 9223372036854776000 for current active/early dialogs (#1591)
Joel Serrano
notifications at github.com
Fri Jul 13 20:13:26 CEST 2018
Hi Daniel,
On my initial test in prod, I can see already discrepancies:
```
root at kamailio2:~# kamctl rpc dlg.stats_active
{
"jsonrpc": "2.0",
"result": {
"starting": 4,
"connecting": 55,
"answering": 2,
"ongoing": 253,
"all": 314
},
"id": 23360
}
root at kamailio2:~# kamctl rpc stats.get_statistics all | grep dialog
"dialog:active_dialogs = 18446744073709551604",
"dialog:early_dialogs = 15",
"dialog:expired_dialogs = 0",
"dialog:failed_dialogs = 120",
"dialog:processed_dialogs = 340",
root at kamailio2:~#
```
Now we can go past 9223372036854776000 ??? I don't understand that.
That was right after restarting (green active dialogs, yellow early dialogs):

Here is multiple runs of the stats.get_statistics:
```
root at kamailio2:~# kamctl rpc stats.get_statistics all | grep dialog
"dialog:active_dialogs = 18446744073709551614",
"dialog:early_dialogs = 8",
"dialog:expired_dialogs = 0",
"dialog:failed_dialogs = 315",
"dialog:processed_dialogs = 863",
root at kamailio2:~# kamctl rpc stats.get_statistics all | grep dialog
"dialog:active_dialogs = 18446744073709551614",
"dialog:early_dialogs = 8",
"dialog:expired_dialogs = 0",
"dialog:failed_dialogs = 315",
"dialog:processed_dialogs = 864",
root at kamailio2:~# kamctl rpc stats.get_statistics all | grep dialog
"dialog:active_dialogs = 18446744073709551614",
"dialog:early_dialogs = 8",
"dialog:expired_dialogs = 0",
"dialog:failed_dialogs = 315",
"dialog:processed_dialogs = 864",
root at kamailio2:~# kamctl rpc stats.get_statistics all | grep dialog
"dialog:active_dialogs = 18446744073709551613",
"dialog:early_dialogs = 10",
"dialog:expired_dialogs = 0",
"dialog:failed_dialogs = 320",
"dialog:processed_dialogs = 866",
root at kamailio2:~# kamctl rpc stats.get_statistics all | grep dialog
"dialog:active_dialogs = 0",
"dialog:early_dialogs = 8",
"dialog:expired_dialogs = 0",
"dialog:failed_dialogs = 322",
"dialog:processed_dialogs = 870",
root at kamailio2:~# kamctl rpc stats.get_statistics all | grep dialog
"dialog:active_dialogs = 1",
"dialog:early_dialogs = 9",
"dialog:expired_dialogs = 0",
"dialog:failed_dialogs = 323",
"dialog:processed_dialogs = 873",
root at kamailio2:~# kamctl rpc stats.get_statistics all | grep dialog
"dialog:active_dialogs = 18446744073709551615",
"dialog:early_dialogs = 9",
"dialog:expired_dialogs = 0",
"dialog:failed_dialogs = 324",
"dialog:processed_dialogs = 875",
root at kamailio2:~# kamctl rpc stats.get_statistics all | grep dialog
"dialog:active_dialogs = 18446744073709551615",
"dialog:early_dialogs = 10",
"dialog:expired_dialogs = 0",
"dialog:failed_dialogs = 325",
"dialog:processed_dialogs = 876",
root at kamailio2:~#
```
I tried to have a script gathering metrics:
```
root at kamailio2:~# cat dialogs.sh
#!/bin/bash
while true; do
DIALOGS=`kamctl rpc dlg.list | grep h_entry | sort | wc -l`
METRICS=`kamctl rpc stats.get_statistics all | grep -e "dialog:active" -e "dialog:early" | tr -d " " | tr -d "\n"`
NEWMETRICS=`kamctl rpc dlg.stats_active | jq -c .result`
echo "`date +'%Y-%m-%d %H:%M:%S.%N'` - dlg.list:$DIALOGS - stats.get_statistics:$METRICS - dlg.stats_active:$NEWMETRICS"
sleep 1
done
root at kamailio2:~#
```
But, after a little kamailio "starts slowing down" until it dies (had to completely reboot the server).
```
root at kamailio2:~# ./dialogs.sh
2018-07-13 10:57:14.204158716 - dlg.list:419 - stats.get_statistics:"dialog:active_dialogs=13","dialog:early_dialogs=12", - dlg.stats_active:{"starting":5,"connecting":62,"answering":7,"ongoing":288,"all":362}
2018-07-13 10:57:15.437165579 - dlg.list:422 - stats.get_statistics:"dialog:active_dialogs=15","dialog:early_dialogs=11", - dlg.stats_active:{"starting":4,"connecting":59,"answering":10,"ongoing":289,"all":362}
2018-07-13 10:57:16.678432892 - dlg.list:423 - stats.get_statistics:"dialog:active_dialogs=14","dialog:early_dialogs=9", - dlg.stats_active:{"starting":2,"connecting":58,"answering":9,"ongoing":289,"all":358}
2018-07-13 10:57:17.915458776 - dlg.list:422 - stats.get_statistics:"dialog:active_dialogs=13","dialog:early_dialogs=9", - dlg.stats_active:{"starting":4,"connecting":60,"answering":7,"ongoing":289,"all":360}
2018-07-13 10:57:19.149467641 - dlg.list:420 - stats.get_statistics:"dialog:active_dialogs=13","dialog:early_dialogs=8", - dlg.stats_active:{"starting":3,"connecting":60,"answering":5,"ongoing":290,"all":358}
2018-07-13 10:57:20.386457513 - dlg.list:415 - stats.get_statistics:"dialog:active_dialogs=14","dialog:early_dialogs=7", - dlg.stats_active:{"starting":4,"connecting":59,"answering":5,"ongoing":290,"all":358}
2018-07-13 10:57:21.627241266 - dlg.list:412 - stats.get_statistics:"dialog:active_dialogs=15","dialog:early_dialogs=7", - dlg.stats_active:{"starting":3,"connecting":60,"answering":6,"ongoing":292,"all":361}
2018-07-13 10:57:22.867957903 - dlg.list:415 - stats.get_statistics:"dialog:active_dialogs=14","dialog:early_dialogs=9", - dlg.stats_active:{"starting":2,"connecting":64,"answering":5,"ongoing":291,"all":362}
2018-07-13 10:57:24.099604068 - dlg.list:418 - stats.get_statistics:"dialog:active_dialogs=13","dialog:early_dialogs=10", - dlg.stats_active:{"starting":2,"connecting":65,"answering":3,"ongoing":294,"all":364}
2018-07-13 10:57:33.049706118 - dlg.list:411 - stats.get_statistics:"dialog:active_dialogs=8","dialog:early_dialogs=6", - dlg.stats_active:{"starting":4,"connecting":62,"answering":5,"ongoing":287,"all":358}
2018-07-13 10:57:34.290398926 - dlg.list:418 - stats.get_statistics:"dialog:active_dialogs=8","dialog:early_dialogs=6", - dlg.stats_active:{"starting":9,"connecting":62,"answering":5,"ongoing":287,"all":363}
2018-07-13 10:57:35.531092078 - dlg.list:419 - stats.get_statistics:"dialog:active_dialogs=8","dialog:early_dialogs=6", - dlg.stats_active:{"starting":9,"connecting":62,"answering":5,"ongoing":287,"all":363}
2018-07-13 10:57:36.762143019 - dlg.list:419 - stats.get_statistics:"dialog:active_dialogs=8","dialog:early_dialogs=6", - dlg.stats_active:{"starting":9,"connecting":62,"answering":5,"ongoing":287,"all":363}
2018-07-13 10:57:38.006519396 - dlg.list:420 - stats.get_statistics:"dialog:active_dialogs=8","dialog:early_dialogs=6", - dlg.stats_active:{"starting":10,"connecting":62,"answering":5,"ongoing":287,"all":364}
2018-07-13 10:57:39.254513350 - dlg.list:420 - stats.get_statistics:"dialog:active_dialogs=8","dialog:early_dialogs=6", - dlg.stats_active:{"starting":10,"connecting":62,"answering":5,"ongoing":287,"all":364}
2018-07-13 10:57:55.779414400 - dlg.list:431 - stats.get_statistics:"dialog:active_dialogs=2","dialog:early_dialogs=6", - dlg.stats_active:{"starting":8,"connecting":72,"answering":10,"ongoing":272,"all":362}
^C
root at kamailio2:~#
```
As you can see it starts taking >10s to reply some times, which I believe means it is dying, leaving a core and restarting:
```
root at kamailio2:/var/tmp# ls -lrt
total 47781984
-rw------- 1 root kamailio 8607031296 Jul 13 10:59 core.kamailio.24376.13cn4.1531504766
-rw------- 1 root kamailio 2128039936 Jul 13 11:01 core.kamailio.28422.13cn4.1531504860
-rw------- 1 root kamailio 2081087488 Jul 13 11:01 core.kamailio.28420.13cn4.1531504860
-rw------- 1 root kamailio 2136645632 Jul 13 11:01 core.kamailio.28421.13cn4.1531504860
-rw------- 1 root kamailio 2174545920 Jul 13 11:01 core.kamailio.28423.13cn4.1531504860
-rw------- 1 root kamailio 5265956864 Jul 13 11:07 core.kamailio.1335.13cn4.1531505030
-rw------- 1 root kamailio 5309550592 Jul 13 11:07 core.kamailio.1339.13cn4.1531505030
-rw------- 1 root kamailio 5294923776 Jul 13 11:07 core.kamailio.1336.13cn4.1531505030
-rw------- 1 root kamailio 5329113088 Jul 13 11:07 core.kamailio.1334.13cn4.1531505030
-rw------- 1 root kamailio 5338894336 Jul 13 11:07 core.kamailio.1338.13cn4.1531505030
-rw------- 1 root kamailio 5262925824 Jul 13 11:07 core.kamailio.1337.13cn4.1531505030
root at kamailio2:/var/tmp#
```
I'm grabbing the `bt full` and `info locals` from all and I'll add them here shortly. I will have to rollback until you have time to check all this.
--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/kamailio/kamailio/issues/1591#issuecomment-404911844
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.kamailio.org/pipermail/sr-dev/attachments/20180713/97abd1d9/attachment-0001.html>
More information about the sr-dev
mailing list