Hi,
Currently there is a ping mechanism in DMQ for detecting inactive nodes. However, it only runs once every "ping_interval".
DMQ exposes API functions to other modules for sending DMQ messages or broadcasting them. These API functions require a "dmq_resp_cback_t" to be passed as a parameter. Most of the other modules just log a DBG message in it.
There are two improvement ideas:
1. Let the other modules that use DMQ detect node failures too, as they send/broadcast messages. This would be in addition to the DMQ module's "ping_interval" mechanism.
2. Don't set a node to inactive on the first failure; instead keep a per-node counter of failures, with a modparam to tune this check.
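To make idea 2 concrete, here is a minimal standalone sketch in C. It does not use the real DMQ structures: "node_health_t", "fail_threshold" and "node_health_update" are hypothetical names, and treating any non-2xx response code as a failure is an assumption for the illustration only.

```c
#include <stdbool.h>

/* Hypothetical per-node state; NOT the real dmq_node_t layout. */
typedef struct node_health {
    int fail_count; /* consecutive failed requests */
    bool active;    /* false once the threshold is reached */
} node_health_t;

/* Hypothetical modparam: failures tolerated before a node goes inactive. */
static int fail_threshold = 3;

/* Update the node state from a response code.
 * Assumption: any non-2xx code counts as a failure. */
static void node_health_update(node_health_t *n, int resp_code)
{
    if (resp_code >= 200 && resp_code < 300) {
        n->fail_count = 0; /* a success resets the counter */
        return;
    }
    if (++n->fail_count >= fail_threshold)
        n->active = false;
}
```

With the threshold set to 1 this behaves like marking the node inactive on the first failure, so such a parameter could keep today's behaviour as an option.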
To do this, either:
A. Implement the response callback in each module that uses DMQ, to check the response code and set the inactive state of the DMQ node => duplicates the same code across different modules.
B. Pass NULL for that "dmq_resp_cback_t" from the other modules that use DMQ. In the DMQ module itself, check for NULL and, if so, use a default callback that checks the response code and sets the inactive state of DMQ nodes => cleaner approach.
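Option B could look roughly like the sketch below. It is a simplified, self-contained illustration and not the actual DMQ code: "peer_node_t", "resp_cb_t", "default_resp_cb" and "deliver_response" are hypothetical stand-ins, and the callback signature is reduced to a node and a response code.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the DMQ node and response callback types. */
typedef struct peer_node {
    int active; /* 1 = active, 0 = inactive */
} peer_node_t;

typedef int (*resp_cb_t)(peer_node_t *node, int code);

/* Default handler owned by the central module.
 * Assumption: any non-2xx code marks the node inactive. */
static int default_resp_cb(peer_node_t *node, int code)
{
    if (code < 200 || code >= 300)
        node->active = 0;
    return 0;
}

/* Called when a response arrives: a NULL callback from the API user
 * falls back to the central default handler. */
static int deliver_response(peer_node_t *node, int code, resp_cb_t cb)
{
    resp_cb_t effective = (cb != NULL) ? cb : default_resp_cb;
    return effective(node, code);
}
```

Modules that need special handling could still pass their own callback; only the NULL case would get the shared behaviour.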
Any opinions on this? Do you see any possible problems with this?
Thanks, Stefan
Hi Stefan,
I have not looked at the code yet, but the approach of having the DMQ module do the active/inactive handling centrally, instead of duplicating it for all DMQ API users, sounds better to me. We also do it in a similar way elsewhere, e.g. for specific database module errors.
Cheers,
Henning
From: Stefan Mititelu via sr-dev <sr-dev@lists.kamailio.org>
Sent: Monday, 17 March 2025 16:38
To: Kamailio (SER) - Development Mailing List <sr-dev@lists.kamailio.org>
Cc: Stefan Mititelu <stefan.mititelu@net2phone.com>
Subject: [sr-dev] DMQ improvements for detecting inactive nodes