#### Pre-Submission Checklist - [X] Commit message has the format required by CONTRIBUTING guide - [X] Commits are split per component (core, individual modules, libs, utils, ...) - [X] Each component has a single commit (if not, squash them into one commit) - [X] No commits to README files for modules (changes must be done to docbook files in `doc/` subfolder, the README file is autogenerated)
#### Type Of Change - [X] Small bug fix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds new functionality) - [ ] Breaking change (fix or feature that would change existing functionality)
#### Checklist: - [X] PR should be backported to stable branches - [X] Tested changes locally - [ ] Related to issue #XXXX (replace XXXX with an open issue number)
#### Description I've recently being experiencing a loop in nodes removal/addition leading to "ghost nodes". Suppose to have three servers A,B,C. Server C goes down not cleanly, so DMQ doesn't notify the other nodes. Server A is the first to send its ping, with a nodelist including node C. After fr_timer, the transaction for the message to node C times out and the node is removed from node A nodelist. Then node B sends its ping with a nodelist including node C (still alive for A), node A sees node C as a new node and adds it back to its nodelist. Now node B reaching fr_timer timeout removes node C, until next node's A ping, and so on. This does not occur if the delta between node A and node B pings is less than fr_timer. What I propose here is that, upon a failed ping, the failing node is put in disabled state and we wait a 2nd failed ping before removing it from the nodelist. This should prevent dead nodes to come back. You can view, comment on, or merge this pull request online at:
https://github.com/kamailio/kamailio/pull/1840
-- Commit Summary --
* dmq: wait for a 2nd failed ping before deleting a node
-- File Changes --
M src/modules/dmq/notification_peer.c (23)
-- Patch Links --
https://github.com/kamailio/kamailio/pull/1840.patch https://github.com/kamailio/kamailio/pull/1840.diff