More info:
On another cluster with the same setup, I disabled DMQ during this troubleshooting and enabled MySQL for dialog replication. I also left one node out of rotation to observe the replication behavior.
Well, with 0 traffic I can see this:
root@sbc01:~# kamctl rpc dlg.stats_active
{
"jsonrpc": "2.0",
"result": {
"starting": 0,
"connecting": 0,
"answering": 0,
"ongoing": 0,
"all": 0
},
"id": 2634
}
root@sbc01:~#
Which is correct, but:
root@sbc01:~# kamctl rpc stats.fetch all | grep dialog
"dialog.active_dialogs": "38",
"dialog.early_dialogs": "0",
"dialog.expired_dialogs": "38",
"dialog.failed_dialogs": "0",
"dialog.processed_dialogs": "1",
root@sbc01:~#
I wonder whether it is correct that those 38 expired dialogs still count towards the active counter. Could that be related to the problem? Maybe those expired dialogs, which no longer exist but are still included somewhere in the "active" counter, only get replicated when DMQ is enabled?
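To make that comparison repeatable on other nodes, here is a rough Python sketch that takes the JSON printed by the two commands above and reports how far the statistics counter has drifted from the live dialog table. The function name `compare_dialog_counters` is my own, and I am assuming that the unfiltered `stats.fetch all` reply wraps the counters in a `"result"` object, which the grep output above does not show. The sample values are the ones from this post.

```python
import json


def compare_dialog_counters(stats_active_json: str, stats_fetch_json: str) -> dict:
    """Compare the live dialog count with the dialog statistics counters.

    stats_active_json: output of `kamctl rpc dlg.stats_active`
    stats_fetch_json:  output of `kamctl rpc stats.fetch all`
                       (assumed to carry the counters under "result")
    """
    live_all = int(json.loads(stats_active_json)["result"]["all"])
    stats = json.loads(stats_fetch_json)["result"]

    # stats.fetch prints counter values as strings, so convert them.
    stat_active = int(stats["dialog.active_dialogs"])
    stat_expired = int(stats["dialog.expired_dialogs"])

    drift = stat_active - live_all
    return {
        "live_all": live_all,
        "stat_active": stat_active,
        "stat_expired": stat_expired,
        "drift": drift,
        # True when the whole drift is accounted for by expired dialogs.
        "drift_equals_expired": drift == stat_expired,
    }


# Sample input built from the values in this post (wrapper shape assumed).
sample_stats_active = """
{"jsonrpc": "2.0",
 "result": {"starting": 0, "connecting": 0, "answering": 0,
            "ongoing": 0, "all": 0},
 "id": 2634}
"""
sample_stats_fetch = """
{"jsonrpc": "2.0",
 "result": {"dialog.active_dialogs": "38",
            "dialog.early_dialogs": "0",
            "dialog.expired_dialogs": "38",
            "dialog.failed_dialogs": "0",
            "dialog.processed_dialogs": "1"}}
"""

report = compare_dialog_counters(sample_stats_active, sample_stats_fetch)
print(report)
```

If `drift_equals_expired` is true on a node with no traffic, that would at least be consistent with the idea that expired dialogs are never subtracted from the active counter, although it does not prove it.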
So far, this much is clear to me:
1- There is definitely a scenario where counters go below 0.
2- So far, it only happens when DMQ is enabled.
3- In my initial look, I found logs showing that dialogs created on node1 timed out on node2 (see previous posts).
4- Now I see a discrepancy between active and expired dialogs on a node with 0 active dialogs.
I feel we are slowly narrowing down what the problem might be, but I still don't have a clear picture of what causes what.