[sr-dev] [kamailio/kamailio] coredump: dmq_usrloc sync from node in "status: disabled" (#2451)

marcinkowalczyk notifications at github.com
Thu Aug 20 11:25:38 CEST 2020


### Description

I'm running two Kamailio registrars with DMQ replication:
```
modparam("usrloc", "db_mode", 0)
modparam("usrloc", "use_domain", 1)

loadmodule "dmq.so"
modparam("dmq", "server_address", "sip:own_ip:5062" )
modparam("dmq", "notification_address",  "sip:peer_ip:5062")
modparam("dmq", "ping_interval", 15)

modparam("dmq_usrloc", "enable", 1)
modparam("dmq_usrloc", "sync", 1)
modparam("dmq_usrloc", "replicate_socket_info", 1)
modparam("dmq_usrloc", "usrloc_delete", 1)

```

Replication works fine: I can see contacts on both registrars. The problem starts when I restart one of the registrars; the other one then crashes as soon as it tries to sync with the first one.

I did some more investigation: if the first node starts while it is still in "active" status in DMQ, the donor node crashes.

```
# kamcmd dmq.list_nodes
{
        host: 10.0.210.67
        port: 5062
        resolved_ip: 10.0.210.67
        status: disabled
        last_notification: 0
        local: 0
}
{
        host: 10.0.210.58
        port: 5062
        resolved_ip: 10.0.210.58
        status: active
        last_notification: 0
        local: 1
}
```

If the first node stays down a bit longer (so that DMQ marks it as pending), the crash does not happen and the sync is successful.

```
# kamcmd dmq.list_nodes
{
        host: 10.0.210.67
        port: 5062
        resolved_ip: 10.0.210.67
        status: pending
        last_notification: 0
        local: 0
}
{
        host: 10.0.210.58
        port: 5062
        resolved_ip: 10.0.210.58
        status: active
        last_notification: 0
        local: 1
}
```
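
To compare the two cases above without reading the listings by eye, the `kamcmd dmq.list_nodes` output can be parsed into a host-to-status map. This is only a rough sketch: the function names `parse_dmq_nodes` and `remote_statuses` are made up for illustration, and it assumes the brace-delimited `key: value` layout shown in this report.

```python
from typing import Dict, List

# Sample copied from the first listing in this report.
SAMPLE = """\
# kamcmd dmq.list_nodes
{
        host: 10.0.210.67
        port: 5062
        resolved_ip: 10.0.210.67
        status: disabled
        last_notification: 0
        local: 0
}
{
        host: 10.0.210.58
        port: 5062
        resolved_ip: 10.0.210.58
        status: active
        last_notification: 0
        local: 1
}
"""


def parse_dmq_nodes(output: str) -> List[Dict[str, str]]:
    """Turn each { ... } block of the listing into a dict of strings."""
    nodes: List[Dict[str, str]] = []
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if line == "{":
            current = {}
        elif line == "}":
            if current is not None:
                nodes.append(current)
            current = None
        elif current is not None and ":" in line:
            # Split on the first colon only, so values keep any later colons.
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return nodes


def remote_statuses(output: str) -> Dict[str, str]:
    """Map host -> status for every node that is not the local one."""
    return {
        node["host"]: node["status"]
        for node in parse_dmq_nodes(output)
        if node.get("local") == "0"
    }


if __name__ == "__main__":
    print(remote_statuses(SAMPLE))
```

Based on the observation above, a peer showing `pending` on the donor before it is started again did not trigger the crash, while a peer showing `disabled` did; a check like this could be used to tell the two situations apart.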


### Troubleshooting

#### Reproduction

Restart one of the nodes while all cluster members are in the active state.

#### Debugging Data
```
[root /]# gdb /usr/sbin/kamailio /core.50795
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-119.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/kamailio...Reading symbols from /usr/sbin/kamailio...(no debugging symbols found)...done.
(no debugging symbols found)...done.
[New LWP 50795]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/kamailio -DD -P /run/kamailio/kamailio.pid -f /etc/kamailio/kamailio.'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f7027eaca38 in usrloc_dmq_send_contact () from /usr/lib64/kamailio/modules/dmq_usrloc.so
Missing separate debuginfos, use: debuginfo-install kamailio-5.4.0-0.el7.centos.x86_64
(gdb) backtrace
#0  0x00007f7027eaca38 in usrloc_dmq_send_contact () from /usr/lib64/kamailio/modules/dmq_usrloc.so
#1  0x00007f7027ea29de in usrloc_get_all_ucontact () from /usr/lib64/kamailio/modules/dmq_usrloc.so
#2  0x00007f7027ea5e3c in usrloc_dmq_execute_action () from /usr/lib64/kamailio/modules/dmq_usrloc.so
#3  0x00007f7027ea8937 in usrloc_dmq_handle_msg () from /usr/lib64/kamailio/modules/dmq_usrloc.so
#4  0x00007f7028c43b88 in worker_loop () from /usr/lib64/kamailio/modules/dmq.so
#5  0x00007f7028c41304 in child_init () from /usr/lib64/kamailio/modules/dmq.so
#6  0x000000000057c313 in init_mod_child ()
#7  0x000000000057bf88 in init_mod_child ()
#8  0x000000000057bf88 in init_mod_child ()
#9  0x000000000057bf88 in init_mod_child ()
#10 0x000000000057bf88 in init_mod_child ()
#11 0x000000000057bf88 in init_mod_child ()
#12 0x000000000057bf88 in init_mod_child ()
#13 0x000000000057cab2 in init_child ()
#14 0x000000000042ab0d in main_loop ()
#15 0x0000000000433a76 in main ()
(gdb) bt full
#0  0x00007f7027eaca38 in usrloc_dmq_send_contact () from /usr/lib64/kamailio/modules/dmq_usrloc.so
No symbol table info available.
#1  0x00007f7027ea29de in usrloc_get_all_ucontact () from /usr/lib64/kamailio/modules/dmq_usrloc.so
No symbol table info available.
#2  0x00007f7027ea5e3c in usrloc_dmq_execute_action () from /usr/lib64/kamailio/modules/dmq_usrloc.so
No symbol table info available.
#3  0x00007f7027ea8937 in usrloc_dmq_handle_msg () from /usr/lib64/kamailio/modules/dmq_usrloc.so
No symbol table info available.
#4  0x00007f7028c43b88 in worker_loop () from /usr/lib64/kamailio/modules/dmq.so
No symbol table info available.
#5  0x00007f7028c41304 in child_init () from /usr/lib64/kamailio/modules/dmq.so
No symbol table info available.
#6  0x000000000057c313 in init_mod_child ()
No symbol table info available.
#7  0x000000000057bf88 in init_mod_child ()
No symbol table info available.
#8  0x000000000057bf88 in init_mod_child ()
No symbol table info available.
#9  0x000000000057bf88 in init_mod_child ()
No symbol table info available.
#10 0x000000000057bf88 in init_mod_child ()
No symbol table info available.
#11 0x000000000057bf88 in init_mod_child ()
No symbol table info available.
#12 0x000000000057bf88 in init_mod_child ()
No symbol table info available.
#13 0x000000000057cab2 in init_child ()
No symbol table info available.
#14 0x000000000042ab0d in main_loop ()
No symbol table info available.
#15 0x0000000000433a76 in main ()
No symbol table info available.
(gdb) quit
[root /]#
```


#### Log Messages

```
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53012]: ALERT: <core> [main.c:777]: handle_sigs(): child process 53050 exited by a signal 11
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53012]: ALERT: <core> [main.c:780]: handle_sigs(): core was not generated
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53012]: INFO: <core> [main.c:802]: handle_sigs(): terminating due to SIGCHLD
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53021]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53014]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53022]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53013]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53023]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53024]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53018]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53028]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53025]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53031]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53026]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53032]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53027]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53035]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53047]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53054]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53038]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53053]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53045]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53057]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53079]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53072]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53033]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53074]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53070]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53077]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53056]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53019]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53036]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53040]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53075]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53043]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53041]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53030]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53020]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53052]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53029]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53015]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53017]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received
Aug 20 11:18:08 10.0.210.58 kamailio-dispatcher[53016]: INFO: <core> [main.c:857]: sig_usr(): signal 15 received

```

#### SIP Traffic

### Possible Solutions

Sync through a shared DB.

### Additional Information

```
# kamailio  -v
version: kamailio 5.4.0 (x86_64/linux) 6c4fce
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: 6c4fce
compiled on 17:15:32 Jul 29 2020 with gcc 4.8.5

```

* **Operating System**:



```
CentOS Linux release 7.8.2003 (Core)
 3.10.0-1127.18.2.el7.x86_64 #1 SMP Sun Jul 26 15:27:06 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

```


-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/kamailio/kamailio/issues/2451