Apologies for not including sr-users in the previous email. The solution suggested by Social Boh did work; as a side effect, we no longer run into the issue described.

NOTE: keepalived needed a somewhat peculiar configuration for this to work properly: both nodes are configured as BACKUP, and the 'nopreempt' flag is set only on the node intended to be the MASTER (the one with the higher 'priority').
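
For anyone finding this thread later, here is a rough sketch of what such a keepalived configuration might look like. The instance name, interface, virtual_router_id, priority values and VIP address are placeholders I have assumed for illustration, not our actual values; only the BACKUP/BACKUP states and the placement of 'nopreempt' reflect the setup described above.

```
# Node A (intended primary): higher priority, starts as BACKUP, nopreempt set
vrrp_instance VI_KAMAILIO {        # instance name is an assumption
    state BACKUP                   # nopreempt requires the initial state to be BACKUP
    interface eth0                 # assumed interface
    virtual_router_id 51           # assumed; must match on both nodes
    priority 150                   # assumed; higher than Node B
    advert_int 1
    nopreempt                      # do not take the VIP back when this node returns
    virtual_ipaddress {
        192.0.2.10/24              # placeholder VIP
    }
}

# Node B (intended secondary): lower priority, also starts as BACKUP
vrrp_instance VI_KAMAILIO {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100                   # assumed; lower than Node A
    advert_int 1
    virtual_ipaddress {
        192.0.2.10/24
    }
}
```

With this arrangement the VIP should stay on whichever node currently holds it and move only when that node fails, so an in-progress call's BYE is not redirected to a node that never saw the INVITE.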

However, could someone confirm whether this issue has been fixed in the source code itself? I am using an older release, v5.3.2.

Regards,
Harneet Singh

On Thu, Dec 17, 2020 at 9:52 PM harneet singh <hbilling@gmail.com> wrote:
Thanks Social Boh,

I will investigate this possibility; if it works in our case, it should circumvent the issue.
In addition, is there a known fix that makes the Dispatcher module sync its state between nodes, the way the Dialog module syncs via DMQ? I ask because I am using v5.3.2 and do not know whether newer releases have fixed this issue.

Regards,
Harneet Singh

On Wed, Dec 16, 2020 at 10:46 PM Social Boh <social@bohboh.info> wrote:

Maybe an option is to leave the secondary acting as primary when the primary comes back up.

Example: Primary goes down, Secondary becomes the new Primary. Primary comes back up, but Secondary still holds the VIP and still processes the calls.

Only if Secondary goes down does Primary take over the VIP again.

The keepalived parameter for this behaviour is:

nopreempt

---
I'm SoCIaL, MayBe
On 16/12/2020 at 11:57 a.m., harneet singh wrote:
Hi All,

I am using Kamailio in HA mode, with Keepalived providing the VIP (Virtual IP) functionality, and have run into a rather peculiar issue.

Setup:

Caller ------------ KamailioVIP(Primary and Secondary Kamailio)------- Callee

    HA provided by Keepalived
    DMQ is used between the Primary and Secondary Kamailio for dialog sync

Issue Seen:

Suppose the Primary Kamailio has been brought down and the Secondary one is active and holds the VIP.

When a call is placed from the Caller, it traverses the Secondary Kamailio and reaches the callee. The dialogs are updated properly. At this point, the Primary Kamailio is brought up, and the dialog state is synced to it by the DMQ module.
Keepalived will now attach the VIP to the Primary instance. If the caller hangs up at this point, the SIP BYE message traverses the Primary Kamailio instance to the callee and the call is cleared, but there are two issues here.

  1. The Primary Kamailio throws an error in ds_load_remove() saying that it cannot find the load for the specific Call-ID. This is apparently because the dispatcher data is not synced between the two instances, while the dialog data is. So dialog-wise, things are fine. Can this be fixed somehow?

  2. The above is not that grave a problem, since the dialogs are still cleared. BUT, if we check 'kamcmd dispatcher.list' on the SECONDARY Kamailio, even well after the call is cleared, DLGLOAD shows 1. Since we are using call-load-based distribution for dispatching, this is effectively one call stuck on the dispatcher, which leads to a resource leak.

Is this a known issue, and if so, is there a way to work around it?

Regards,
Harneet Singh 
--
"Once you eliminate the impossible, whatever remains, no matter how improbable, must be the truth" - Sir Arthur Conan Doyle

_______________________________________________
Kamailio (SER) - Users Mailing List
sr-users@lists.kamailio.org
https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users

