Could it have something to do with "old-non-expired-non-removed" dialogs?
Today I applied the latest patch from @charlesrchance. To avoid losing dialogs, what I normally do is:
1- Restart one node
2- Wait for DMQ replication dialog sync
3- Restart the other node
Right after restarting both nodes in that sequence, I could already see values close to ULONG_MAX, so I ran a watch every second and could see the values moving around.
I decided to compare the dialogs to see if I could find anything interesting. What I found is that there were a lot of dialogs with an init_ts from around the 11th and 13th of July. With today being the 28th, those dialogs should not be there.
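
To make the comparison concrete, here is a minimal sketch of how stale dialogs could be picked out of a dump by their init_ts. The record layout (a list of dicts with `callid` and a Unix-timestamp `init_ts`), the 12-hour `max_age`, and the year in the sample dates are all assumptions for illustration, not values taken from my setup.

```python
from datetime import datetime, timedelta, timezone


def find_stale_dialogs(dialogs, now, max_age):
    """Return the dialogs whose init_ts is older than now - max_age."""
    cutoff = (now - max_age).timestamp()
    return [d for d in dialogs if d["init_ts"] < cutoff]


def ts(day, hour=0, minute=0):
    # Helper for the sample data; the year is arbitrary (hypothetical).
    return datetime(2000, 7, day, hour, minute, tzinfo=timezone.utc).timestamp()


# Hypothetical sample records, not real dialog data.
dialogs = [
    {"callid": "a", "init_ts": ts(11)},
    {"callid": "b", "init_ts": ts(13)},
    {"callid": "c", "init_ts": ts(28, 11, 30)},
]

now = datetime(2000, 7, 28, 12, 0, tzinfo=timezone.utc)
max_age = timedelta(hours=12)  # assumed maximum dialog lifetime

stale = find_stale_dialogs(dialogs, now, max_age)
print([d["callid"] for d in stale])
```

With this sample, only the dialogs started on the 11th and 13th are reported as stale.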
So I wonder if those "bad" dialogs could be the cause of the counters going below 0.
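
As an illustration of why a counter "below 0" would show up as a value near ULONG_MAX, here is a small sketch. It assumes the counter is stored as a 64-bit unsigned long, and that a stale replicated dialog gets decremented on removal without ever having been counted on this node. Both are guesses on my part, not something I have confirmed in the code.

```python
ULONG_MAX = 2**64 - 1  # assuming a 64-bit unsigned long


def as_unsigned_long(value):
    """Show how a signed value appears when stored in an unsigned long."""
    return value % (ULONG_MAX + 1)


active = 0
active += 1  # a dialog is created and counted
active -= 1  # the same dialog ends normally
active -= 1  # hypothetical: a stale dialog is removed but was never counted

print(active)                    # -1
print(as_unsigned_long(active))  # 18446744073709551615, i.e. ULONG_MAX
```

A handful of such unmatched decrements would be enough to keep the reported value hovering just under ULONG_MAX.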
Maybe a timer that should have removed those dialogs never fired? And because of the way I perform restarts to avoid losing dialogs, they were always there, being replicated back and forth between the nodes...
That said, I did a full restart (both nodes at the same time, losing dialogs) and since then the metrics are looking good.
I hope this info helps.