#### Description
This PR introduces `event_routes` that will trigger when `dmq peers` go up and down during the time.
Really useful to detect problems between `peers` and allows addition of prometheus metrics for example, to add automatic visibility.
#### Pre-Submission Checklist <!-- Go over all points below, and after creating the PR, tick all the checkboxes that apply --> <!-- All points should be verified, otherwise, read the CONTRIBUTING guidelines from above--> <!-- If you're unsure about any of these, don't hesitate to ask on sr-dev mailing list --> - [x] Commit message has the format required by CONTRIBUTING guide - [x] Commits are split per component (core, individual modules, libs, utils, ...) - [x] Each component has a single commit (if not, squash them into one commit) - [x] No commits to README files for modules (changes must be done to docbook files in `doc/` subfolder, the README file is autogenerated)
#### Type Of Change - [ ] Small bug fix (non-breaking change which fixes an issue) - [x] New feature (non-breaking change which adds new functionality) - [ ] Breaking change (fix or feature that would change existing functionality)
#### Checklist: <!-- Go over all points below, and after creating the PR, tick the checkboxes that apply --> - [x] PR should be backported to stable branches - [x] Tested changes locally - [ ] Related to issue #XXXX (replace XXXX with an open issue number)
#### Tests
Tested locally simulating crashes and startups to evaluate the trigger of routes.
Couldn't found any problems or strange cases, on ping intervals the status validation, gives the correct output. You can view, comment on, or merge this pull request online at:
https://github.com/kamailio/kamailio/pull/4661
-- Commit Summary --
* dmqueue: add peer up and down event routes
-- File Changes --
M src/modules/dmq/dmq.c (68) M src/modules/dmq/dmq.h (3) M src/modules/dmq/dmqnode.c (30) M src/modules/dmq/dmqnode.h (2) M src/modules/dmq/doc/dmq_admin.xml (79) M src/modules/dmq/notification_peer.c (22)
-- Patch Links --
https://github.com/kamailio/kamailio/pull/4661.patchhttps://github.com/kamai...
@bundasmanu pushed 1 commit.
ef162aff10a47499ce1e5ed19ca6a2d81d2ddd49 dmq: fix clang-format
henningw left a comment (kamailio/kamailio#4661)
Thanks for the PR. Just had a quick look to the code. One question, what is the purpose of the DMQ_EXTRACT_PEER_UP_MAX 128 ?
bundasmanu left a comment (kamailio/kamailio#4661)
Thanks for the PR. Just had a quick look to the code. One question, what is the purpose of the DMQ_EXTRACT_PEER_UP_MAX 128 ?
Hi @henningw
DMQ_EXTRACT_PEER_UP_MAX is basically a safety cap for one extract_node_list() run. It limits how many peers are queued for peer-up callbacks after parsing a notification body (the callback runs after unlocking the list). So it avoids unbounded stack/callback bursts; if more than 128 peers need peer-up in one pass, the extra ones are skipped and a warning is logged.
@bundasmanu pushed 1 commit.
96a7304326984fa56a0821278574fbba807246e2 dmq: trigger peer event routes for status changes from extract_node_list
bundasmanu left a comment (kamailio/kamailio#4661)
@henningw, @miconda can you check, please?
vlitvinov left a comment (kamailio/kamailio#4661)
Hello, Gustavo and thank you for the PR!
I see two new even routes were added, but there are currently 3 different statuses for dmq node: ACTIVE, NOT_ACTIVE, DISABLED. Wouldn't it be better to have single event route for any status change with some new "status" variable available in it? Also naming is confusing. "node_up\down, node_status_change" sounds more reasonable for me because "peer" is used mainly as a role definition in module itself (for example htable or usrloc are dmq peers).
henningw left a comment (kamailio/kamailio#4661)
I agree that "node" is a better fit for the naming of the event routes, e.g. dmq:node-up, dmq:node-down. The same logic is also used in the dispatcher module, so would be fine for me to have two routes like this.
miconda left a comment (kamailio/kamailio#4661)
The overview of the module docs says: `replication between multiple instances, referred as "nodes" (or "peers").`
And the apps are referred as: `data flow between nodes is grouped in a logical entity referred as "channel" (or "bus")`.
Based on these, peer is considered same as node.
Just for clarification, I don't have a preference for a particular name.
bundasmanu left a comment (kamailio/kamailio#4661)
Hi guys,
Thanks for the feedback, really appreciate it.
The topic regarding `node` vs `peer` was also discussed in my head :)
But, in my conscious, i prefer to use `peer` to avoid misunderstandings. `Node` seems to be more generic, and `peer` points to my notification address, except me (actual node), and the events manage only peers. In contrast with `dispatcher`, that i will not include actual server address there, here in `dmq` i can do that on `notification address`.
But, i can switch the naming, if you guys prefer.
Regarding `disable` state, if you guys prefer, i also can include a new `event_route` for it. But i prefer to keep isolated `event_routes` instead of aggregating into a single one, i think its the main logic on other modules.
@bundasmanu pushed 1 commit.
0e78f4ac32c66ffc201c63ab4f914040189fb65c dmq: add peer-disabled event_route
bundasmanu left a comment (kamailio/kamailio#4661)
Included support for `dmqueue` (and tested).
`peer` naming was kept.
bundasmanu left a comment (kamailio/kamailio#4661)
@miconda Its ok, to be merged?
bundasmanu left a comment (kamailio/kamailio#4661)
A re-test was applied and events are triggering in the right way, also no event duplications on the same state were observed.
PS: It will be useful to me, to get this new feature available in the next: 6.1.2 release.
henningw left a comment (kamailio/kamailio#4661)
A re-test was applied and events are triggering in the right way, also no event duplications on the same state were observed.
PS: It will be useful to me, to get this new feature available in the next: 6.1.2 release.
Thank you for the update. Can you please have a look and to the clang format error in this commit? Commit b2729b6 dmq: fix disabled timer state trigger
About the 6.1.2 release - this is not how the Kamailio project manages the stable branches. This feature will be included after the merge in the master branch and then part of the upcoming 6.2.0 release.
@bundasmanu pushed 1 commit.
ae894a30008d4b8a75962b6d9c29c386a412248c Merge branch 'master' into bundasmanu/dmqueue-add_peer_up_down_event_routes
@bundasmanu pushed 1 commit.
694823222218dcdc3b726c72d220b8b671a00849 dmq: fix clang-format
@bundasmanu pushed 1 commit.
f9343e1c2b4038a026b6ea5b27a8f8bf39f0fd52 dmq: fix clang
@bundasmanu pushed 1 commit.
8b4444329887e0082e71622e8b8789c6057af222 dmq: add peer up and down event routes
bundasmanu left a comment (kamailio/kamailio#4661)
A re-test was applied and events are triggering in the right way, also no event duplications on the same state were observed. PS: It will be useful to me, to get this new feature available in the next: 6.1.2 release.
Thank you for the update. Can you please have a look and to the clang format error in this commit? Commit [b2729b6](https://github.com/kamailio/kamailio/commit/b2729b663502f5259c2f1de35f8daa36...) dmq: fix disabled timer state trigger
About the 6.1.2 release - this is not how the Kamailio project manages the stable branches. This feature will be included after the merge in the master branch and then part of the upcoming 6.2.0 release.
Thank you @henningw
The problems on the commits check, are not regarding to my commit, but for something from `tls` module.
henningw left a comment (kamailio/kamailio#4661)
A re-test was applied and events are triggering in the right way, also no event duplications on the same state were observed. PS: It will be useful to me, to get this new feature available in the next: 6.1.2 release.
Thank you for the update. Can you please have a look and to the clang format error in this commit? Commit [b2729b6](https://github.com/kamailio/kamailio/commit/b2729b663502f5259c2f1de35f8daa36...) dmq: fix disabled timer state trigger About the 6.1.2 release - this is not how the Kamailio project manages the stable branches. This feature will be included after the merge in the master branch and then part of the upcoming 6.2.0 release.
Thank you @henningw
The problems on the commits check, are not regarding to my commit, but for something from `tls` module.
Yes, it complains about unrelated other commits. Your commits introduced in this PR are now fine regarding the clang checks.
miconda left a comment (kamailio/kamailio#4661)
@henningw @vlitvinov : you still want to review more, or it should be considered for merging if no other comments soon?
henningw left a comment (kamailio/kamailio#4661)
The naming was discussed earlier, but I don't mind to have it either way. Just one thing - there is now a merge conflict due to the merging of other PRs related to DMQ. @bundasmanu can you please re-base this and force-push to the PR?
@bundasmanu pushed 1 commit.
e3d2ceb94ae16380dbe788e918e2bcfc37f73d33 dmq: merge changes
@bundasmanu pushed 0 commits.
vlitvinov left a comment (kamailio/kamailio#4661)
I'll check it later today and comment here. Thanks.
@bundasmanu pushed 1 commit.
af7a0387a332cbe6547aedd032945fdabc58c8fb dmq: add peer up and down event routes
@vlitvinov commented on this pull request.
if(sr_kemi_route(
+ keng, fmsg, EVENT_ROUTE, &dmq_event_callback, &evname) + < 0) + LM_ERR("dmq kemi event route failed\n"); + } + set_route_type(backup_rt); + } + reset_uri(fmsg); +} + +/** + * Run peer event route for a notification keyed by DMQ_NODE_* status: + * DMQ_NODE_ACTIVE -> dmq:peer-up, DMQ_NODE_NOT_ACTIVE -> dmq:peer-down, + * DMQ_NODE_DISABLED -> dmq:peer-disabled. + */ +void dmq_peer_run_event_route(int status, dmq_node_t *node)
This question is still open. Why do we need "status" when we have it in "node" structure.
@vlitvinov commented on this pull request.
*prev = cur->next;
+ lock_release(&list->lock); + LM_DBG("released dmq_node_list->lock\n"); + if(down) + dmq_peer_run_event_route(DMQ_NODE_NOT_ACTIVE, cur);
yes, in my opinion we should align this and consider that node removing done for DISABLED status always.
int down = (!cur->local && cur->status == DMQ_NODE_ACTIVE);
+ *prev = cur->next; + lock_release(&list->lock); + LM_DBG("released dmq_node_list->lock\n"); + if(down) + dmq_peer_run_event_route(DMQ_NODE_NOT_ACTIVE, cur);
Are this new routes executed for node status change? If so on deleting only ACTIVE and NOT_ACTIVE nodes will change their status to DISABLED. If node is already in DISABLED status, then this event route was already fired.
@@ -427,6 +434,8 @@ dmq_node_t *add_dmq_node(dmq_node_list_t *list, str *uri)
list->count++; lock_release(&list->lock); LM_DBG("released dmq_node_list->lock\n"); + if(!no_peer_evt && !newnode->local && newnode->status == DMQ_NODE_ACTIVE)
uri can contain params, for now only one parameter is allowed and this parameter is status. So node in any available status can be added by this function.
By the way, were these comments generated by AI ?
LM_DBG("updating status on %.*s from %d to %d\n", STR_FMT(&tmp_uri),
ret->status, find->status); ret->status = find->status; total_nodes++; + if(old_status != DMQ_NODE_ACTIVE
Sorry, I didn't get your comment. We want to execute event route when status changed, there are 3 status, if we are firing event route execution we node new status, node->status.
vlitvinov left a comment (kamailio/kamailio#4661)
@bundasmanu was AI involved in writing the code (and comments) in this PR? I almost sure that it was. Please check my comments above.
By the way I agree with @henningw and would rename event routes to be node instead of peer, one more reason is module's rpc commands, we already have "node_list" there.
bundasmanu left a comment (kamailio/kamailio#4661)
@bundasmanu was AI involved in writing the code (and comments) in this PR? I almost sure that it was. Please check my comments above.
By the way I agree with @henningw and would rename event routes to be node instead of peer, one more reason is module's rpc commands, we already have "node_list" there.
Hi @vlitvinov, yes i have used AI (in mixed terms, AI + manual changes) to guide me, as this is my first contribution to dmqueue module, and i'm using that module only a few time ago.
The PR's objective is to detect changes and trigger the PR's, yes, like in dispatcher logic. So, old state logic, from my point of view makes sense to allow comparison between states, and also help avoid retrigger same state in some ocassions. Dispatcher uses some related logic: https://github.com/kamailio/kamailio/pull/4460/changes.
But feel free to also contribute here, as you have more experience with the module.
@bundasmanu pushed 1 commit.
9cc98065535f7708e8018c33edb25412c5bf9370 dmq: feedback improvements
vlitvinov left a comment (kamailio/kamailio#4661)
Hello, Gustavo!
Looks like I misunderstood your original idea. After reading dispatcher module implementation I see that there is only two event routes: up and down, even for more statuses.
I'm sorry for moving your PR to the wrong direction. I prepared some changes based on your feature branch. Renamed event-routes and left only two of them: node-up\node-down.
I just built it but not tested yet. It would be great if you can test it.
Here is the link to my branch (there is only one commit on top of your changes): https://github.com/vlitvinov/kamailio/tree/dmq_event_routes_peer_up_down
miconda left a comment (kamailio/kamailio#4661)
@vlitvinov: the renaming from peer to node is out of the scope for the review. As noted in a comment above, the documentation of the module refer node and peer as the same concept, while channel would be more like application/module endpoint.
- https://github.com/kamailio/kamailio/pull/4661#issuecomment-4168706914
I haven't checked, but if your review/follow up patch fixes some other issues, that is fine, if it is just renaming the event routes, then it is just fine what @bundasmanu has in the PR.
bundasmanu left a comment (kamailio/kamailio#4661)
Hello, Gustavo!
Looks like I misunderstood your original idea. After reading dispatcher module implementation I see that there is only two event routes: up and down, even for more statuses.
I'm sorry for moving your PR to the wrong direction. I prepared some changes based on your feature branch. Renamed event-routes and left only two of them: node-up\node-down.
I just built it but not tested yet. It would be great if you can test it.
Here is the link to my branch (there is only one commit on top of your changes): https://github.com/vlitvinov/kamailio/tree/dmq_event_routes_peer_up_down
Hi @vlitvinov,
Not a problem, maybe my initial explanation could be better :(
I’ve reviewed your commit on the branch and have a quick question: do you intend to remove the `disabled event_route`? I don’t need it for my use case, but it might still be useful for others.
And absolutely, i can help on testing.
vlitvinov left a comment (kamailio/kamailio#4661)
Hello @miconda . There are other changes, not only routes renaming.
@bundasmanu yes, I removed disabled route and kept only two event routes to make it similar to dispatcher module. No matter how many internal statuses we have the node can be up or down. I would appreciate if you can test it. I'll be able to do this later this week.