Could it have something to do with "old-non-expired-non-removed" dialogs?

Today I applied the latest patch from @charlesrchance. To avoid losing dialogs, I normally restart in this sequence:

1- Restart one node
2- Wait for DMQ replication dialog sync
3- Restart other node

Right after restarting both nodes in that sequence, I could already see values close to ULONG_MAX, so I ran a watch every second and could see the values moving around.

I decided to compare the dialogs to see if I could find anything interesting. I found a lot of dialogs with an init_ts from around the 11th and 13th of July; since today is the 28th, those dialogs should not be there.

So I wonder if those "bad" dialogs could be the cause of the counters going below 0.
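For reference, this is how an unsigned counter ends up near ULONG_MAX when it is decremented more often than it was incremented. It is only a minimal sketch: `counter_dec` and `counter_dec_saturating` are hypothetical names for illustration, not the actual statistics code in the dialog module.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical stand-in for an unsigned statistics counter decrement.
 * Unsigned arithmetic in C is modular, so 0 - 1 wraps to ULONG_MAX
 * instead of going negative. */
unsigned long counter_dec(unsigned long value)
{
    return value - 1UL;
}

/* Saturating variant (an assumption about one possible guard, not a
 * proposed patch): never go below zero. */
unsigned long counter_dec_saturating(unsigned long value)
{
    return value > 0UL ? value - 1UL : 0UL;
}

/* Prints both behaviours for a counter that is already at zero. */
void show_wraparound(void)
{
    printf("plain:      %lu\n", counter_dec(0UL));
    printf("saturating: %lu\n", counter_dec_saturating(0UL));
}
```

If a stale dialog gets removed without ever having been counted on that node, a plain decrement like the first one would produce exactly the kind of values I was seeing.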

Maybe a timer that should have removed those dialogs never fired? And because of the way I perform restarts to avoid losing dialogs, they were constantly being replicated back and forth between the nodes...

That said, I did a full restart (both nodes at the same time, losing dialogs) and since then the metrics are looking good.

I hope this info helps.

