[SR-Users] DMQ Cluster - hosts lose network connectivity

Charles Chance charles.chance at sipcentric.com
Tue Feb 7 09:49:21 CET 2017


Hello José,

The issue may have been introduced by the recent multi-notify option.

To test the theory, could you try setting multi_notify to 0 and the
notification address to the IP address of either server A or server B?

e.g.

modparam("dmq", "notification_address", "sip:172.112.10.206:5060")
modparam("dmq", "multi_notify", 0)


If the issue does not occur in this case, I will look into fixing it for the
multi-notify scenario.
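
For reference, the intended behaviour described earlier in this thread (nodes
that fail to answer pings are dropped, except the notification peer) can be
modelled roughly as below. This is only a simplified Python sketch, not the
module's actual code; the class and method names (NodeList, ping_round,
learn_from_peer) are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NodeList:
    """Hypothetical model of one server's view of the DMQ cluster."""
    notification_peer: str
    nodes: set = field(default_factory=set)

    def __post_init__(self):
        # The configured notification peer is always part of the list.
        self.nodes.add(self.notification_peer)

    def ping_round(self, reachable):
        # Drop nodes that did not answer, but never the notification peer.
        self.nodes = {
            n for n in self.nodes
            if n in reachable or n == self.notification_peer
        }

    def learn_from_peer(self, reachable, peer_view):
        # Once the notification peer answers again, merge the list it advertises.
        if self.notification_peer in reachable:
            self.nodes |= set(peer_view)


# Server C's view, with server A (172.112.10.206) as its notification peer.
server_c = NodeList("172.112.10.206", {"172.112.10.207"})

server_c.ping_round(reachable=set())   # network outage: nobody answers
print(sorted(server_c.nodes))          # ['172.112.10.206']

server_c.learn_from_peer(              # connectivity restored
    reachable={"172.112.10.206", "172.112.10.207"},
    peer_view={"172.112.10.206", "172.112.10.207"},
)
print(sorted(server_c.nodes))          # ['172.112.10.206', '172.112.10.207']
```

In this model a node that ends up seeing only itself, with no notification
peer left in its list, has nothing to rebuild from.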

Either way, please include the server C log for comparison.
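
If it helps when comparing the node lists, here is a rough Python sketch that
parses pasted dmq.list_nodes output and reports which expected hosts are
missing from a server's list. It assumes the output format matches what you
posted; the function names are hypothetical.

```python
import re


def parse_nodes(text):
    """Parse pasted dmq.list_nodes output into a list of dicts."""
    nodes = []
    for block in re.findall(r"\{(.*?)\}", text, flags=re.S):
        node = {}
        for line in block.splitlines():
            # Tolerate mail quoting prefixes such as "> " in pasted text.
            line = line.strip().lstrip(">").strip()
            if ":" in line:
                key, value = line.split(":", 1)
                node[key.strip()] = value.strip()
        if node:
            nodes.append(node)
    return nodes


def missing_peers(text, expected):
    """Return the expected cluster members absent from a node list."""
    listed = {n["host"] for n in parse_nodes(text)}
    return set(expected) - listed


# Node list reported for 172.112.10.243 after the firewall test.
server_c_list = """
{
    host: 172.112.10.243
    port: 5060
    resolved_ip: 172.112.10.243
    status: 2
    last_notification: 0
    local: 1
}
"""
cluster = ["172.112.10.206", "172.112.10.207", "172.112.10.243"]
print(sorted(missing_peers(server_c_list, cluster)))
# ['172.112.10.206', '172.112.10.207']
```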

Cheers,
Charles

On 7 Feb 2017 07:06, "José Seabra" <joseseabra4 at gmail.com> wrote:

> Hello Charles,
> 2 of them were active during the network failure and at the time
> of reconnection.
>
> As the issue was noticed in the production environment, I don't have enough
> logs to send you, but I have reproduced the issue in my lab environment,
> which has exactly the same DMQ configuration, except that it has 3 DMQ
> hosts instead of 4.
>
> Steps to reproduce the issue:
>
> Start all 3 Kamailio nodes.
> At this stage, all of them are active in dmq.list_nodes.
> Server A
> {
>     host: 172.112.10.243
>     port: 5060
>     resolved_ip: 172.112.10.243
>     status: 2
>     last_notification: 0
>     local: 0
> }
> {
>     host: 172.112.10.246
>     port: 5060
>     resolved_ip: 172.112.10.246
>     status: 2
>     last_notification: 0
>     local: 0
> }
> {
>     host: 172.112.10.207
>     port: 5060
>     resolved_ip: 172.112.10.207
>     status: 2
>     last_notification: 0
>     local: 0
> }
> {
>     host: 172.112.10.206
>     port: 5060
>     resolved_ip: 172.112.10.206
>     status: 2
>     last_notification: 0
>     local: 1
> }
>
> Server B
> {
>     host: 172.112.10.243
>     port: 5060
>     resolved_ip: 172.112.10.243
>     status: 2
>     last_notification: 0
>     local: 0
> }
> {
>     host: 172.112.10.206
>     port: 5060
>     resolved_ip: 172.112.10.206
>     status: 2
>     last_notification: 0
>     local: 0
> }
> {
>     host: 172.112.10.207
>     port: 5060
>     resolved_ip: 172.112.10.207
>     status: 2
>     last_notification: 0
>     local: 1
> }
>
>
> Server C
>
> {
>     host: 172.112.10.246
>     port: 5060
>     resolved_ip: 172.112.10.246
>     status: 2
>     last_notification: 0
>     local: 0
> }
> {
>     host: 172.112.10.207
>     port: 5060
>     resolved_ip: 172.112.10.207
>     status: 2
>     last_notification: 0
>     local: 0
> }
> {
>     host: 172.112.10.206
>     port: 5060
>     resolved_ip: 172.112.10.206
>     status: 2
>     last_notification: 0
>     local: 0
> }
> {
>     host: 172.112.10.243
>     port: 5060
>     resolved_ip: 172.112.10.243
>     status: 2
>     last_notification: 0
>     local: 1
> }
>
>
> Then, after a few minutes, I inserted an iptables rule on server C to drop
> all packets to port 5060.
>
> After that, Server A and Server B can see each other:
>
> {
>         host: 172.112.10.206
>         port: 5060
>         resolved_ip: 172.112.10.206
>         status: 2
>         last_notification: 0
>         local: 0
> }
> {
>         host: 172.112.10.207
>         port: 5060
>         resolved_ip: 172.112.10.207
>         status: 2
>         last_notification: 0
>         local: 1
> }
>
> Server C can only see itself:
>
> {
>     host: 172.112.10.243
>     port: 5060
>     resolved_ip: 172.112.10.243
>     status: 2
>     last_notification: 0
>     local: 1
> }
>
> This behavior persists after network connectivity is restored.
>
>
> Please find the log files for each server attached to this email.
>
> Server A - 172.112.10.206
> Server B - 172.112.10.207
> Server C - 172.112.10.243
>
> Let me know if you need further information.
>
> Regards
> José
>
>
>
>
>
> 2017-02-06 14:04 GMT+00:00 Charles Chance <charles.chance at sipcentric.com>:
>
>> Hello,
>>
>> DMQ will remove nodes from its internal list if they fail to respond to
>> its pings - with the exception of the original notification peer
>> specified in config. This way, if the network connection is lost, DMQ will
>> continue to try the original peer indefinitely until connectivity is
>> restored, and rebuild its list of other nodes from there.
>>
>> Was the original peer (or one of them, if multiple are defined via A/SRV
>> records) still active at the time of reconnection?
>>
>> It would help with diagnosis if you could send your log from around the
>> time of disconnection, and also from the time of reconnection.
>>
>> Regards,
>>
>> Charles
>>
>>
>> On 6 February 2017 at 10:03, José Seabra <joseseabra4 at gmail.com> wrote:
>>
>>> Hello Daniel,
>>>
>>> The parameters that I have configured on my Kamailio server are:
>>>
>>> #!ifdef ENABLE_REG_SYNC
>>> modparam("registrar", "sock_flag", 18)
>>> modparam("registrar", "sock_hdr_name", "Sock-Info")
>>> ####### SIP registrar replication to other nodes ######
>>> loadmodule "dmq.so"
>>>
>>> #######  distributed message queue module parameters #####
>>> modparam("dmq", "server_address", "sip:MY_IP_ADDRESS:MY_PORT_ADDRESS")
>>> modparam("dmq", "notification_address", "DMQ_HOSTS")
>>> modparam("dmq", "multi_notify", 1)
>>> modparam("dmq", "num_workers", 4)
>>>
>>> loadmodule "dmq_usrloc.so"
>>> modparam("dmq_usrloc", "enable", 1)
>>> modparam("dmq_usrloc", "sync", 1)
>>> modparam("dmq_usrloc", "batch_size", DMQ_BATCH_SIZE)
>>> modparam("dmq_usrloc", "batch_usleep", DMQ_BATCH_USLEEP)
>>> #!endif
>>>
>>>
>>> Let me know if you need further information.
>>>
>>> Thank you
>>> Regards
>>> José Seabra
>>>
>>> 2017-02-06 7:01 GMT+00:00 Daniel-Constantin Mierla <miconda at gmail.com>:
>>>
>>>> Hello,
>>>>
>>>> What are the parameters for the dmq module?
>>>>
>>>> Cheers,
>>>> Daniel
>>>>
>>>> On 01/02/2017 11:32, José Seabra wrote:
>>>>
>>>> Hello there,
>>>> My DMQ cluster has 4 nodes, and for some reason 2 of them lost
>>>> network connectivity for a long time (~4 hours). After the network of
>>>> these 2 nodes came back, they did not rejoin the DMQ cluster
>>>> automatically; I had to restart Kamailio to get them back on the DMQ list.
>>>> My questions are:
>>>>
>>>>    - Is this expected behavior of the DMQ module?
>>>>    - Is there any way to put them back on the DMQ bus without needing to
>>>>    restart Kamailio?
>>>>
>>>>
>>>> Thank you
>>>> Best Regards
>>>> José Seabra
>>>>
>>>>
>>>> _______________________________________________
>>>> SIP Express Router (SER) and Kamailio (OpenSER) - sr-users mailing list
>>>> sr-users at lists.sip-router.org
>>>> http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-users
>>>>
>>>>
>>>> --
>>>> Daniel-Constantin Mierla
>>>> www.twitter.com/miconda -- www.linkedin.com/in/miconda
>>>> Kamailio Advanced Training - Mar 6-8 (Europe) and Mar 20-22 (USA) - www.asipto.com
>>>> Kamailio World Conference - May 8-10, 2017 - www.kamailioworld.com
>>>>
>>>>
>>>
>>>
>>> --
>>> Regards
>>> José Seabra
>>>
>>
>>
>> Sipcentric Ltd. Company registered in England & Wales no. 7365592. Registered
>> office: Faraday Wharf, Innovation Birmingham Campus, Holt Street,
>> Birmingham Science Park, Birmingham B7 4BB.
>>
>
>
> --
> Regards
> José Seabra
>
