[SR-Users] DMQ Cluster - hosts lose network connectivity

José Seabra joseseabra4 at gmail.com
Mon Feb 6 16:22:06 CET 2017


Hello Charles,
Two of them were active during the network failure and at the time of
reconnection.

As the issue was noticed in the production environment, I don't have enough
logs to report, but I have reproduced it in my lab environment, which has
exactly the same DMQ configuration except that it has 3 DMQ hosts instead
of 4.

Steps to reproduce the issue:

Start all 3 Kamailio nodes.
At this stage all of them are active in dmq.list_nodes.
Server A
{
    host: 172.112.10.243
    port: 5060
    resolved_ip: 172.112.10.243
    status: 2
    last_notification: 0
    local: 0
}
{
    host: 172.112.10.246
    port: 5060
    resolved_ip: 172.112.10.246
    status: 2
    last_notification: 0
    local: 0
}
{
    host: 172.112.10.207
    port: 5060
    resolved_ip: 172.112.10.207
    status: 2
    last_notification: 0
    local: 0
}
{
    host: 172.112.10.206
    port: 5060
    resolved_ip: 172.112.10.206
    status: 2
    last_notification: 0
    local: 1
}

Server B
{
    host: 172.112.10.243
    port: 5060
    resolved_ip: 172.112.10.243
    status: 2
    last_notification: 0
    local: 0
}
{
    host: 172.112.10.206
    port: 5060
    resolved_ip: 172.112.10.206
    status: 2
    last_notification: 0
    local: 0
}
{
    host: 172.112.10.207
    port: 5060
    resolved_ip: 172.112.10.207
    status: 2
    last_notification: 0
    local: 1
}


Server C

{
    host: 172.112.10.246
    port: 5060
    resolved_ip: 172.112.10.246
    status: 2
    last_notification: 0
    local: 0
}
{
    host: 172.112.10.207
    port: 5060
    resolved_ip: 172.112.10.207
    status: 2
    last_notification: 0
    local: 0
}
{
    host: 172.112.10.206
    port: 5060
    resolved_ip: 172.112.10.206
    status: 2
    last_notification: 0
    local: 0
}
{
    host: 172.112.10.243
    port: 5060
    resolved_ip: 172.112.10.243
    status: 2
    last_notification: 0
    local: 1
}
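
For anyone comparing these listings across nodes, here is a minimal sketch of turning the dmq.list_nodes text output into records. It assumes the brace-delimited "key: value" layout shown above. The names parse_nodes and SAMPLE are made up for this illustration, and treating "status: 2" as active follows what this message says, not the module documentation.

```python
import re


def parse_nodes(text):
    """Parse brace-delimited 'key: value' blocks into a list of dicts."""
    nodes = []
    for block in re.findall(r"\{(.*?)\}", text, flags=re.DOTALL):
        node = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:  # skip blank or malformed lines
                node[key.strip()] = value.strip()
        nodes.append(node)
    return nodes


# Two entries copied from the Server B listing above.
SAMPLE = """
{
    host: 172.112.10.206
    port: 5060
    resolved_ip: 172.112.10.206
    status: 2
    last_notification: 0
    local: 0
}
{
    host: 172.112.10.207
    port: 5060
    resolved_ip: 172.112.10.207
    status: 2
    last_notification: 0
    local: 1
}
"""

nodes = parse_nodes(SAMPLE)
# The entry flagged local: 1 is the node that produced the listing.
local_host = next(n["host"] for n in nodes if n["local"] == "1")
# Assumption: status 2 means "active", as this message implies.
active_peers = [
    n["host"] for n in nodes if n["local"] == "0" and n["status"] == "2"
]
print(local_host, active_peers)
```

All values are kept as strings, so the comparisons are against "1" and "2" and not against integers.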


Then, after a few minutes, I inserted an iptables rule on Server C to drop all
packets to port 5060.

After that, Server A and Server B can see each other:

{
        host: 172.112.10.206
        port: 5060
        resolved_ip: 172.112.10.206
        status: 2
        last_notification: 0
        local: 0
}
{
        host: 172.112.10.207
        port: 5060
        resolved_ip: 172.112.10.207
        status: 2
        last_notification: 0
        local: 1
}

Server C can only see itself:

{
    host: 172.112.10.243
    port: 5060
    resolved_ip: 172.112.10.243
    status: 2
    last_notification: 0
    local: 1
}

This behavior persists after network connectivity is restored.
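
The split described above can be checked mechanically. The sketch below is only an illustration: the views dictionary is transcribed by hand from this message (Server A's view is taken from the statement that A and B see each other), and missing_peers is a hypothetical helper, not part of Kamailio.

```python
def missing_peers(views):
    """Map each server to the cluster members absent from its node list."""
    members = set(views)  # every server that reported a view
    return {server: sorted(members - seen) for server, seen in views.items()}


# Hosts each lab server lists after the iptables rule was added.
views = {
    "172.112.10.206": {"172.112.10.206", "172.112.10.207"},  # Server A
    "172.112.10.207": {"172.112.10.206", "172.112.10.207"},  # Server B
    "172.112.10.243": {"172.112.10.243"},                    # Server C
}

gaps = missing_peers(views)
for server, absent in gaps.items():
    print(server, "is missing", absent or "nothing")
```

In a healthy cluster every list in gaps would be empty; here A and B are each missing C, and C is missing both A and B.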


Please find the log files for each server attached to this email.

Server A - 172.112.10.206
Server B - 172.112.10.207
Server C - 172.112.10.243

Let me know if you need further information.

Regards
José





2017-02-06 14:04 GMT+00:00 Charles Chance <charles.chance at sipcentric.com>:

> Hello,
>
> DMQ will remove nodes from its internal list if they fail to respond to
> its pings - with the exception of the original notification peer
> specified in config. This way, if the network connection is lost, DMQ will
> continue to try the original peer indefinitely until connectivity is
> restored, and rebuild its list of other nodes from there.
>
> Was the original peer (or one of them if multiple defined A/SRV records)
> still active at the time of reconnection?
>
> It would help to diagnose if you can send your log from around the
> time of disconnection, and also at the time of reconnect.
>
> Regards,
>
> Charles
>
>
> On 6 February 2017 at 10:03, José Seabra <joseseabra4 at gmail.com> wrote:
>
>> Hello Daniel,
>>
>> The parameters that I have configured on my Kamailio server are:
>>
>> #!ifdef ENABLE_REG_SYNC
>> modparam("registrar", "sock_flag", 18)
>> modparam("registrar", "sock_hdr_name", "Sock-Info")
>> ####### SIP registrar replication to other nodes ######
>> loadmodule "dmq.so"
>>
>> #######  distributed message queue module parameters #####
>> modparam("dmq", "server_address", "sip:MY_IP_ADDRESS:MY_PORT_ADDRESS")
>> modparam("dmq", "notification_address", "DMQ_HOSTS")
>> modparam("dmq", "multi_notify", 1)
>> modparam("dmq", "num_workers", 4)
>>
>> loadmodule "dmq_usrloc.so"
>> modparam("dmq_usrloc", "enable", 1)
>> modparam("dmq_usrloc", "sync", 1)
>> modparam("dmq_usrloc", "batch_size", DMQ_BATCH_SIZE)
>> modparam("dmq_usrloc", "batch_usleep", DMQ_BATCH_USLEEP)
>> #!endif
>>
>>
>> Let me know if you need further information.
>>
>> Thank you
>> Regards
>> José Seabra
>>
>> 2017-02-06 7:01 GMT+00:00 Daniel-Constantin Mierla <miconda at gmail.com>:
>>
>>> Hello,
>>>
>>> what are the parameters for dmq module?
>>>
>>> Cheers,
>>> Daniel
>>>
>>> On 01/02/2017 11:32, José Seabra wrote:
>>>
>>> Hello there,
>>> My DMQ cluster has 4 nodes and, for some reason, 2 of them lost
>>> network connectivity for a long time (~4 hours). After the network on these 2
>>> nodes came back, they did not rejoin the DMQ cluster automatically; I
>>> had to restart Kamailio to get them back on the DMQ list.
>>> My questions are:
>>>
>>>    - Is this the expected behavior of the DMQ module?
>>>    - Is there any way to put them back on the DMQ bus without restarting
>>>    Kamailio?
>>>
>>>
>>> Thank you
>>> Best Regards
>>> José Seabra
>>>
>>>
>>> _______________________________________________
>>> SIP Express Router (SER) and Kamailio (OpenSER) - sr-users mailing list
>>> sr-users at lists.sip-router.org
>>> http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-users
>>>
>>>
>>> --
>>> Daniel-Constantin Mierla
>>> www.twitter.com/miconda -- www.linkedin.com/in/miconda
>>> Kamailio Advanced Training - Mar 6-8 (Europe) and Mar 20-22 (USA) - www.asipto.com
>>> Kamailio World Conference - May 8-10, 2017 - www.kamailioworld.com
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Cumprimentos
>> José Seabra
>>
>>
>>
>
>
> Sipcentric Ltd. Company registered in England & Wales no. 7365592. Registered
> office: Faraday Wharf, Innovation Birmingham Campus, Holt Street,
> Birmingham Science Park, Birmingham B7 4BB.
>
>
>


-- 
Cumprimentos
José Seabra
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dmq_logs.zip
Type: application/zip
Size: 407177 bytes
Desc: not available
URL: <http://lists.sip-router.org/pipermail/sr-users/attachments/20170206/067fd263/attachment.zip>

