Hi Henning, 

I used a maximum of 4 nodes, in one AWS region across 2 availability zones (which also support jumbo frames).
This way there is always a registrar available to process requests, and all nodes can be restarted without concern and without any dependency on a database backend.
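For reference, a minimal sketch of a DMQ + dmq_usrloc setup along these lines (the IP addresses and worker count are hypothetical placeholders, not the actual production values):

```cfg
loadmodule "dmq.so"
loadmodule "dmq_usrloc.so"

# address this node listens on for DMQ traffic (placeholder IP)
modparam("dmq", "server_address", "sip:10.0.1.10:5090")
# any existing node of the cluster, used to discover the others (placeholder IP)
modparam("dmq", "notification_address", "sip:10.0.2.10:5090")
modparam("dmq", "num_workers", 4)

# replicate usrloc over DMQ and request a full sync from peers at startup,
# so a restarted node rebuilds its registrations without a database backend
modparam("dmq_usrloc", "enable", 1)
modparam("dmq_usrloc", "sync", 1)
```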



On Wed, Dec 1, 2021 at 11:59 AM Henning Westerholt <hw@gilawa.com> wrote:

Hi Julien,

 

jumping into this thread to also ask a related question – what are your experiences with regards to cluster size for DMQ?

 

What is the largest cluster you have run, or been told about, and what size would you recommend as an upper bound? I'd guess that once you grow beyond 4 or 5 nodes, the replication protocol overhead becomes significant.

 

Thanks,

 

Henning

 

--

Henning Westerholt – https://skalatan.de/blog/

Kamailio services – https://gilawa.com

 

From: sr-users <sr-users-bounces@lists.kamailio.org> On Behalf Of Julien Chavanton
Sent: Tuesday, November 30, 2021 10:37 PM
To: Kamailio (SER) - Users Mailing List <sr-users@lists.kamailio.org>
Subject: Re: [SR-Users] DMQ Usrloc/Dialog - Experiences

 

Hi Carsten, likewise on everything you said!


Good questions: a transaction is used, so there will be retransmissions.

In that case we get the normal transaction retransmission mechanism. If a node becomes disconnected for a while, it is possible to request a full sync from all the other nodes; this is not very efficient, but it is good for consistency, and a restart will do that.

This is why I added batching: on a LAN with jumbo frame support it makes a huge difference (far fewer transactions, which can be expensive).

Even with an MTU of 1400 it greatly reduces the number of transactions.
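As a rough illustration, the batching mentioned above is controlled by the dmq_usrloc batch parameters; the values below are illustrative only and should be tuned to your network's MTU:

```cfg
# pack multiple contact updates into a single DMQ message
modparam("dmq_usrloc", "batch_msg_contacts", 50)
# cap the message size so it still fits one frame
# (e.g. ~8000 bytes with jumbo frames, smaller for a 1400-byte MTU)
modparam("dmq_usrloc", "batch_msg_size", 8000)
# how many records to send per batch run, and the pause between runs (usec)
modparam("dmq_usrloc", "batch_size", 4000)
modparam("dmq_usrloc", "batch_usleep", 500000)
```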

 

There can be race conditions when we sync/replicate while a registration/unregistration is in progress (something Torrey highlighted at Kamailio World).

Search the mailing list for "DMQ re-ordering concern"; we raised the question there of whether this is a real, significant concern.

 

Let us know what you think.

I think we should iterate and address the race-condition concerns; this native replication protocol is great, IMHO.

 

 

 

 

On Tue, Nov 30, 2021 at 1:31 AM Carsten Bock <carsten@ng-voice.com> wrote:

Hi Julien,

 

thanks for your reply - I really appreciate your feedback and your presentations at KamailioWorld, I hope to see you again in person soon.

 

Can you share some details on how many records you synchronize using DMQ, and whether you have ever experienced any loss of records while synchronizing with DMQ? The issue is that in IMS the registration expiry is typically set to 600,000 seconds (about 1 week), as we have other mechanisms to be notified if a user drops out of the network (we get notifications from the LTE network itself), so a loss of registration data would leave a user unreachable for a long period of time.

 

Thanks,

Carsten

--

Carsten Bock I CTO & Founder



ng-voice GmbH

Trostbrücke 1 I 20457 Hamburg I Germany
T +49 179 2021244 I www.ng-voice.com

Registry Office at Local Court Hamburg, HRB 120189
Managing Directors: Dr. David Bachmann, Carsten Bock

 

 

Am Mo., 29. Nov. 2021 um 18:09 Uhr schrieb Julien Chavanton <jchavanton@gmail.com>:

Hi Carsten, in my experience usrloc + DMQ works very well; the rare replication race conditions are insignificant, since the state is quite volatile anyway.

However, you can only have so many nodes; within that limit it is great for clustering.

 

On Mon, Nov 29, 2021 at 6:47 AM Carsten Bock <carsten@ng-voice.com> wrote:

Hi,

 

I wanted to quickly ask the group about experiences with DMQ and Usrloc/Dialog.

 

I don't expect any issues with such an approach, especially since a message would be retransmitted (fr_timer = 30 seconds) if we did not receive any answer, or in similar scenarios. So I assume that short communication "outages" (less than 30 seconds) are just fine.
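For context, the 30-second value corresponds to the tm module's final response timer, which is configured in milliseconds (shown here with its stock default):

```cfg
loadmodule "tm.so"
# final response timer: how long to wait for any reply before
# the transaction times out (milliseconds; 30000 = 30 seconds)
modparam("tm", "fr_timer", 30000)
```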

 

So far I have used DMQ in a non-IMS setup to synchronize usrloc with ~15k subscribers, and it worked really well.

 

One concern brought up (not by me) was that synchronizing usrloc with DMQ might not be reliable at scale. I would also love to understand how to handle situations where communication is lost for a longer period of time (e.g., if communication between nodes is lost for 5 minutes).

 

Can anyone share some real-life experiences?

 

Thanks,

Carsten

 

 

 


--

Carsten Bock I CTO & Founder



ng-voice GmbH

Trostbrücke 1 I 20457 Hamburg I Germany
T +49 179 2021244 I www.ng-voice.com

Registry Office at Local Court Hamburg, HRB 120189
Managing Directors: Dr. David Bachmann, Carsten Bock

__________________________________________________________
Kamailio - Users Mailing List - Non Commercial Discussions
  * sr-users@lists.kamailio.org
Important: keep the mailing list in the recipients, do not reply only to the sender!
Edit mailing list options or unsubscribe:
  * https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
