<h4>Pre-Submission Checklist</h4>
<ul class="contains-task-list">
<li class="task-list-item"><input type="checkbox" id="" disabled="" class="task-list-item-checkbox" checked=""> Commit message has the format required by CONTRIBUTING guide</li>
<li class="task-list-item"><input type="checkbox" id="" disabled="" class="task-list-item-checkbox" checked=""> Commits are split per component (core, individual modules, libs, utils, ...)</li>
<li class="task-list-item"><input type="checkbox" id="" disabled="" class="task-list-item-checkbox" checked=""> Each component has a single commit (if not, squash them into one commit)</li>
<li class="task-list-item"><input type="checkbox" id="" disabled="" class="task-list-item-checkbox" checked=""> No commits to README files for modules (changes must be done to docbook files<br>
in <code>doc/</code> subfolder, the README file is autogenerated)</li>
</ul>
<h4>Type Of Change</h4>
<ul class="contains-task-list">
<li class="task-list-item"><input type="checkbox" id="" disabled="" class="task-list-item-checkbox" checked=""> Small bug fix (non-breaking change which fixes an issue)</li>
<li class="task-list-item"><input type="checkbox" id="" disabled="" class="task-list-item-checkbox"> New feature (non-breaking change which adds new functionality)</li>
<li class="task-list-item"><input type="checkbox" id="" disabled="" class="task-list-item-checkbox"> Breaking change (fix or feature that would change existing functionality)</li>
</ul>
<h4>Checklist:</h4>
<ul class="contains-task-list">
<li class="task-list-item"><input type="checkbox" id="" disabled="" class="task-list-item-checkbox" checked=""> PR should be backported to stable branches</li>
<li class="task-list-item"><input type="checkbox" id="" disabled="" class="task-list-item-checkbox" checked=""> Tested changes locally</li>
<li class="task-list-item"><input type="checkbox" id="" disabled="" class="task-list-item-checkbox"> Related to issue #XXXX (replace XXXX with an open issue number)</li>
</ul>
<h4>Description</h4>
<p>I've recently being experiencing a loop in nodes removal/addition leading to "ghost nodes".<br>
Suppose to have three servers A,B,C.<br>
Server C goes down not cleanly, so DMQ doesn't notify the other nodes. Server A is the first to send its ping, with a nodelist including node C. After fr_timer, the transaction for the message to node C times out and the node is removed from node A nodelist.<br>
Then node B sends its ping with a nodelist including node C (still alive for A), node A sees node C as a new node and adds it back to its nodelist. Now node B reaching fr_timer timeout removes node C, until next node's A ping, and so on. This does not occur if the delta between node A and node B pings is less than fr_timer.<br>
What I propose here is that, upon a failed ping, the failing node is put in disabled state and we wait a 2nd failed ping before removing it from the nodelist. This should prevent dead nodes to come back.</p>
<hr>
<h4>You can view, comment on, or merge this pull request online at:</h4>
<p> <a href='https://github.com/kamailio/kamailio/pull/1840'>https://github.com/kamailio/kamailio/pull/1840</a></p>
<h4>Commit Summary</h4>
<ul>
<li>dmq: wait for a 2nd failed ping before deleting a node</li>
</ul>
<h4>File Changes</h4>
<ul>
<li>
<strong>M</strong>
<a href="https://github.com/kamailio/kamailio/pull/1840/files#diff-0">src/modules/dmq/notification_peer.c</a>
(23)
</li>
</ul>
<h4>Patch Links:</h4>
<ul>
<li><a href='https://github.com/kamailio/kamailio/pull/1840.patch'>https://github.com/kamailio/kamailio/pull/1840.patch</a></li>
<li><a href='https://github.com/kamailio/kamailio/pull/1840.diff'>https://github.com/kamailio/kamailio/pull/1840.diff</a></li>
</ul>
<p style="font-size:small;-webkit-text-size-adjust:none;color:#666;">—<br />You are receiving this because you are subscribed to this thread.<br />Reply to this email directly, <a href="https://github.com/kamailio/kamailio/pull/1840">view it on GitHub</a>, or <a href="https://github.com/notifications/unsubscribe-auth/AF36ZcmsQ1ba4N_3izq3Cn_h7PI2IgDZks5vLALcgaJpZM4anBLT">mute the thread</a>.<img src="https://github.com/notifications/beacon/AF36Zcs7z7StaVGs2Ppk6jRJbXHjZ9Wkks5vLALcgaJpZM4anBLT.gif" height="1" width="1" alt="" /></p>
<script type="application/json" data-scope="inboxmarkup">{"api_version":"1.0","publisher":{"api_key":"05dde50f1d1a384dd78767c55493e4bb","name":"GitHub"},"entity":{"external_key":"github/kamailio/kamailio","title":"kamailio/kamailio","subtitle":"GitHub repository","main_image_url":"https://github.githubassets.com/images/email/message_cards/header.png","avatar_image_url":"https://github.githubassets.com/images/email/message_cards/avatar.png","action":{"name":"Open in GitHub","url":"https://github.com/kamailio/kamailio"}},"updates":{"snippets":[{"icon":"DESCRIPTION","message":"dmq: wait for a 2nd failed ping before deleting a node (#1840)"}],"action":{"name":"View Pull Request","url":"https://github.com/kamailio/kamailio/pull/1840"}}}</script>
<script type="application/ld+json">[
{
"@context": "http://schema.org",
"@type": "EmailMessage",
"potentialAction": {
"@type": "ViewAction",
"target": "https://github.com/kamailio/kamailio/pull/1840",
"url": "https://github.com/kamailio/kamailio/pull/1840",
"name": "View Pull Request"
},
"description": "View this Pull Request on GitHub",
"publisher": {
"@type": "Organization",
"name": "GitHub",
"url": "https://github.com"
}
}
]</script>