[sr-dev] [kamailio/kamailio] RPC command stats.get_statistics randomly reporting 9223372036854776000 for current active/early dialogs (#1591)

Joel Serrano notifications at github.com
Fri Jul 13 08:23:45 CEST 2018


Some screenshots to illustrate what I mean:

![image](https://user-images.githubusercontent.com/16212586/42674283-f2d6c2da-8623-11e8-8df4-cf5b9859efd6.png)

Same graph, with a Y-axe limited to 500:

![image](https://user-images.githubusercontent.com/16212586/42674318-112dc756-8624-11e8-8c06-c2fc345d08cf.png)

Same as above, but shortening the time frame and with Y-limit of 100:

![image](https://user-images.githubusercontent.com/16212586/42674387-586dfa28-8624-11e8-956d-f9b1805883da.png)

I see a type of pattern going further back in time is the following:

- all is working
- something happens (*see below note*)
- dialogs number reported goes to 0 or close to 0 for a bit
- dialogs report the 9223372036854776000
- things start working again?



And, looking in the logs, at the ~time the spikes happen, I can see this in the logs:

```
root at kamailio2:~# cat /var/log/kamailio/kamailio.log-20180712 | grep -v NOTICE | grep -v "core/tcp_"
...
Jul 12 02:54:24 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'orcr9cRwLT' and tags 'AV-f9MK~v' 'tetjZeg24aZeS'
Jul 12 02:54:45 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'IRO6Y5Rw53' and tags 'cIaxfaScL' '9jBBFcc1NaUNQ'
Jul 12 02:54:48 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'fw3hjesWFN' and tags '-3xLiat-a' 't1r1KrNrH0BcQ'
Jul 12 02:54:50 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'w1rHrdpK2f' and tags 'HoJgX22ya' '4XmtD99FgyyDa'
Jul 12 02:54:58 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'mrnlQ8OYuX' and tags 'yum6HTlKK' '766FZ05HaQXmj'
Jul 12 02:55:02 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'lrbwFVn7Q8' and tags 'xOubxsLmV' '77Q419N7eHy6N'
Jul 12 02:55:03 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID '14yaUN~xGG' and tags '6nSXFwBSb' 'mp1U2N57Ncg3p'
Jul 12 02:55:05 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'hgI~lJdBJ3' and tags 'FzXjp4-YJ' 'a2jt4Hrv1H0cN'
Jul 12 02:55:09 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'oBof16H0~A' and tags 'ZepyE21EK' '3ZXHF271aU7pH'
Jul 12 02:55:13 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'XIvos7QACD' and tags 'UPsNWynuM' 'X3jF108jcX89N'
Jul 12 02:55:13 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'ygaFfVdLn0' and tags '3Qbx8dkvK' '0yUyZrX83aSpa'
Jul 12 02:55:18 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'm9G~N0LU9d' and tags 'wQ~cwqKHN' 'cNp0aHaS0XD4F'
Jul 12 02:55:20 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'OUaCSKP2PZ' and tags 'W2-o~Knwc' 'epjHQN6yQj91e'
Jul 12 02:55:22 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'wUtsFit9wE' and tags 'CmPHrLXm9' 'KKpSm5Fv718mD'
Jul 12 02:55:24 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'i5FZlDb2Jt' and tags 'c7AO4enu~' '423vDyyBrU4cQ'
Jul 12 02:55:26 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'XLKRf6e5Xq' and tags 'rSTom2iS3' '020pQS21HyQUK'
Jul 12 02:55:29 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'nfBiavwuHu' and tags 'P3cHBXiea' 'NQ2ZBN8BZNSBF'
Jul 12 02:55:31 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID '699iA2AdSO' and tags 'eZSvyxAYC' 'tj76025mK04Qe'
Jul 12 02:55:31 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'XUuh-7CmGt' and tags 'GyhSzl-c7' 'r0mNXc4DSerjQ'
Jul 12 02:55:33 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'oMau5CTVYF' and tags '42SPhgBwh' 'XQeBjgrpNUFpp'
Jul 12 02:55:34 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'l0nKr0oj5c' and tags '2ZzNMM0b8' 'B6p7a3U8H0c9j'
Jul 12 02:55:35 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'Kl49oLgGHV' and tags 'E5E5P9rFX' 'NB9ZFyjFN2cvD'
Jul 12 02:55:35 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'i2lEbuiAFm' and tags 'TDExnc7Bx' 'Urv4SBvp5gyQQ'
Jul 12 02:55:39 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'S5k252ptQ7' and tags 'VfPxDw0Bw' '7p13arg6BmDZD'
Jul 12 02:55:39 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'fjqooHvA~I' and tags 'NbHBVvU9w' 'XZc6SDF87vB1B'
Jul 12 02:55:40 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID '7b~UWpjiY2' and tags 'eVOoK-UjU' 'QXUHKmmpFmS1m'
Jul 12 02:55:41 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'eOSkqIsjd0' and tags '8chfVD~av' 'Zv170QZ4SmQ2N'
Jul 12 02:55:42 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'FgI7PLr3km' and tags 'HomQTapK0' 'HvevN50X0NjSr'
Jul 12 02:55:42 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID '9tKCteAnPq' and tags '9QSRb9F3r' '0yyS6HUX3QB2r'
Jul 12 02:55:44 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'w0mRMEBDPQ' and tags 'mqY5bERQ1' 'yQXyarSrcyymg'
Jul 12 02:55:46 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'AP1gTeQl6U' and tags 'YUz~HwPjE' 'eX0mZc7eUm37a'
Jul 12 02:55:46 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'lhxI8T1rXr' and tags 'yb92ex~AK' 'SyZHS3m48eN8F'
Jul 12 02:55:49 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID '2RqFuZyUmj' and tags 'owLepp0dU' '7v1cjy0N0BDej'
Jul 12 02:55:49 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'DkuPmsRFZz' and tags '~HNqKE7E2' 'Z90vN6SXFDvUD'
Jul 12 02:55:50 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'Zdz48Xh~co' and tags 'hQ4p6Ad8a' 'F6pjtetXrg7er'
Jul 12 02:55:50 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'CNoQG-88vC' and tags 'Y-2W95k9F' '0eZg37BpN419c'
Jul 12 02:55:52 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'IVtYwdhMVV' and tags 'tpw7O0Hxz' 'p0UrDgSFvyFya'
...
root at kamailio2:~#
```

That seemed weird, all from the same process (`27795`) and almost at the same time?

So I wanted to check randomly for one of the calls, if I take for example the first one (`orcr9cRwLT`) I can't find anything in the logs:

```
root at kamailio2:~# grep orcr9cRwLT /var/log/kamailio/kamailio.log-20180712
Jul 12 02:54:24 kamailio2 sbc[27795]: WARNING: dialog [dlg_handlers.c:1596]: dlg_ontimeout(): timeout for dlg with CallID 'orcr9cRwLT' and tags 'AV-f9MK~v' 'tetjZeg24aZeS'
root at kamailio2:~#
```

The interesting thing is:


```
root at kamailio1:~# grep orcr9cRwLT /var/log/kamailio/kamailio.log-20180712
Jul 11 14:54:24 kamailio1 sbc[29467]: NOTICE: <script>: New request - M=INVITE F=sip:5cfafa09ac2a45428cff23271e15079b at my.domain.com T=sip:inbound at sip.mydomain.com R=sip:inbound at sip.mydomain.com:50814 IP=99.203.1.225:31849 UA='HIDDEN_USER_AGENT' ID=orcr9cRwLT
Jul 11 14:54:24 kamailio1 sbc[29467]: NOTICE: <script>: TLS information: SNI=sip.mydomain.com Ver=TLSv1.2 Cipher=AES256-GCM-SHA384 Bits=256 - M=INVITE R=sip:inbound at sip.mydomain.com:50814 ID=orcr9cRwLT
Jul 11 14:54:24 kamailio1 sbc[29467]: NOTICE: <script>: NAT detected - M=INVITE R=sip:inbound at sip.mydomain.com:50814 ID=orcr9cRwLT
Jul 11 14:54:24 kamailio1 sbc[29467]: NOTICE: <script>: Setting updated contact - M=INVITE ID=orcr9cRwLT
Jul 11 14:54:24 kamailio1 sbc[29467]: NOTICE: <script>: Authentication successful - M=INVITE R=sip:inbound at sip.mydomain.com:50814 ID=orcr9cRwLT
Jul 11 14:54:24 kamailio1 sbc[29467]: NOTICE: <script>: Selected dispatcher node: sip:A.B.C.D:5060 - M=INVITE R=sip:inbound at A.B.C.D:5060 ID=orcr9cRwLT
Jul 11 14:54:24 kamailio1 sbc[29467]: NOTICE: <script>: Forwarding request - M=INVITE R=sip:inbound at A.B.C.D:5060 ID=orcr9cRwLT
Jul 11 14:54:24 kamailio1 sbc[29436]: NOTICE: <script>: Reply - M=INVITE S=100 trying -- your call is important to us ID=orcr9cRwLT
Jul 11 14:54:24 kamailio1 sbc[29430]: NOTICE: <script>: Reply - M=INVITE S=200 OK ID=orcr9cRwLT
Jul 11 14:54:24 kamailio1 sbc[29433]: NOTICE: <script>: Reply - M=INVITE S=200 OK ID=orcr9cRwLT
Jul 11 14:54:24 kamailio1 sbc[29467]: NOTICE: <script>: Setting updated contact - M=ACK ID=orcr9cRwLT
Jul 11 14:54:24 kamailio1 sbc[29467]: NOTICE: <script>: In-dialog request from App - M=ACK R=sip:inbound at E.F.G.H:6061;transport=udp ID=orcr9cRwLT
Jul 11 14:54:24 kamailio1 sbc[29467]: NOTICE: <script>: Forwarding request - M=ACK R=sip:inbound at E.F.G.H:6061;transport=udp ID=orcr9cRwLT
Jul 11 14:54:24 kamailio1 sbc[29467]: NOTICE: <script>: Setting updated contact - M=ACK ID=orcr9cRwLT
Jul 11 14:54:24 kamailio1 sbc[29467]: NOTICE: <script>: In-dialog request from App - M=ACK R=sip:inbound at E.F.G.H:6061;transport=udp ID=orcr9cRwLT
Jul 11 14:54:24 kamailio1 sbc[29467]: NOTICE: <script>: Forwarding request - M=ACK R=sip:inbound at E.F.G.H:6061;transport=udp ID=orcr9cRwLT
root at kamailio1:~#
```


So the call, in fact was started through `kamailio1` (this is a cluster of 2 nodes, `kamailio1` and `kamailio2`, and they replicate dialog information using KDMQ).

The error msgs in `kamailio2` appear exactly 12hr after the call started in `kamailio1`.

I'm starting to wonder if this can have anything to do with KDMQ replication? 

When we moved from v5.0.2 to v5.1.4, we also enabled KMDQ for dialog module (which in v5.0.2 we were not using).

I hope this info helps narrow down the issue.

My next steps unless you guys give me other suggestions:

a) Enable debug logs and wait for issue to reproduce, maybe we see something there.
b) Disable KDMQ replication on dialog module and see if the issue is solved or continues.

I don't like either of them, but I rather start with the most helpful of the two.





 

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/kamailio/kamailio/issues/1591#issuecomment-404738386
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.kamailio.org/pipermail/sr-dev/attachments/20180713/b75df948/attachment-0001.html>


More information about the sr-dev mailing list