### Description
We had a cluster of two Kamailio nodes on Debian 10 running v5.5.2.
We then added a third node and upgraded to v5.6.1, so the cluster is now:
* 2x Debian 10 + v5.6.1
* 1x Debian 11 + v5.6.1
### Troubleshooting
I'm not sure what the issue is, but per the `dmesg` logs it seems related to the `dmq_usrloc` module. We have a lot of core dumps; the crashes have been happening constantly since the upgrade and the addition of the extra node.
#### Debugging Data
[core.kamailio.53605.1663462643.txt](https://github.com/kamailio/kamailio/files/9592884/core.kamailio.53605.16634...)
#### Log Messages
```
[155017.987497] kamailio[53605]: segfault at f8 ip 00007fd49cf02f72 sp 00007ffea6e37e70 error 4 in dmq_usrloc.so[7fd49cef0000+17000]
[155017.987527] Code: 40 38 01 d0 89 05 12 95 00 00 48 8b 05 cf 8f 00 00 8b 00 83 f8 01 0f 85 80 00 00 00 48 8b 85 48 ff ff ff 48 8b 80 a0 00 00 00 <8b> 90 f8 00 00 00 48 8b 85 48 ff ff ff 48 8b 80 a0 00 00 00 48 8b
```
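As a reading aid for the `dmesg` line above: the bracketed part `dmq_usrloc.so[7fd49cef0000+17000]` gives the base address and size of the module mapping, so subtracting the base from the `ip` value yields the offset of the faulting instruction inside the module. The sketch below only illustrates that arithmetic; the helper names `module_offset` and `ip_in_module` are made up for this example and are not part of Kamailio.

```c
#include <stdint.h>

/* Hypothetical helper: offset of an instruction pointer inside a
 * module that is mapped at `base`. */
static uint64_t module_offset(uint64_t ip, uint64_t base)
{
    return ip - base;
}

/* Hypothetical helper: true when `ip` falls inside the mapping
 * [base, base + size). */
static int ip_in_module(uint64_t ip, uint64_t base, uint64_t size)
{
    return ip >= base && ip - base < size;
}
```

With the values from the log (`ip` `0x7fd49cf02f72`, base `0x7fd49cef0000`, size `0x17000`), the offset is `0x12f72`, which lies inside the mapping. An offset like this can be looked up with a tool such as `addr2line` against the module's debug symbols when no core file is available.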
### Additional Information
* **Kamailio Version** - output of `kamailio -v`
```
version: kamailio 5.6.1 (x86_64/linux)
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLOCKLIST, HAVE_RESOLV_RES, TLS_PTHREAD_MUTEX_SHARED
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: unknown
compiled with gcc 10.2.1
```
* **Operating System**:
```
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 11 (bullseye)
Release:        11
Codename:       bullseye

Linux sip03.example.com 5.18.0-0.deb11.4-amd64 #1 SMP PREEMPT_DYNAMIC Debian 5.18.16-1~bpo11+1 (2022-08-12) x86_64 GNU/Linux
```
In gdb, run the following commands and post the output here:
```
frame 0
p *ptr
p ptr->sock
p *ptr->sock
```
Hi Daniel,
here is the output:
```
GNU gdb (Debian 10.1-1.7) 10.1.90.20210103-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
https://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/kamailio...
Reading symbols from /usr/lib/debug/.build-id/56/b8c9f5a3c31e6a1813f0c59adb02c922c60137.debug...

warning: Can't open file /dev/zero (deleted) during file-backed mapping note processing
[New LWP 53605]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/kamailio -P /run/kamailio/kamailio.pid -f /etc/kamailio/csbc.cfg -m 5'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fd49cf02f72 in usrloc_dmq_send_multi_contact (ptr=0x7fd49d8f6ea0, aor=..., action=1, node=0x7fd49d84f5f8) at usrloc_sync.c:685
685     usrloc_sync.c: No such file or directory.
(gdb) frame 0
#0  0x00007fd49cf02f72 in usrloc_dmq_send_multi_contact (ptr=0x7fd49d8f6ea0, aor=..., action=1, node=0x7fd49d84f5f8) at usrloc_sync.c:685
685     in usrloc_sync.c
(gdb) p *ptr
$1 = {domain = 0x7fd49d5b32d8, ruid = {s = 0x7fd49d8f7240 "uloc-3-6326536b-abf4-663", len = 24}, aor = 0x7fd49d8e86b8, c = {s = 0x7fd49d8f7000 "sip:1070500@192.168.1.201;transport=UDP;user=phone", len = 50}, received = {s = 0x7fd49d8f71c0 "sip:45.225.71.102:5060", len = 22}, path = {s = 0x0, len = 0}, expires = 1663464479, q = -1, callid = {s = 0x7fd49d8f70a0 "45776c982344d3e367a0325e7fafb494@192.168.1.201", len = 46}, cseq = 2146305210, state = CS_SYNC, flags = 0, cflags = 12, user_agent = {s = 0x7fd49d8f7138 "OXO032/043.001 GW_032/043.001", len = 29}, uniq = {s = 0x0, len = 0}, sock = 0x0, last_modified = 1663460879, last_keepalive = 1663462640, ka_roundtrip = 0, methods = 7823, instance = {s = 0x0, len = 0}, reg_id = 0, server_id = 3, tcpconn_id = -1, keepalive = 1, xavp = 0x0, next = 0x7fd49d96d2c0, prev = 0x0}
(gdb) p ptr->sock
$2 = (struct socket_info *) 0x0
(gdb) p *ptr->sock
Cannot access memory at address 0x0
(gdb)
```
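The gdb output above shows `ptr->sock` is `0x0` for this contact, so any read through that pointer faults. The sketch below illustrates the general kind of guard that avoids such a crash. It is an assumption-laden illustration, not the code from the actual fix: `str_t`, `struct mock_socket`, `struct mock_contact` and `contact_sock_str` are simplified, hypothetical stand-ins, not Kamailio's real definitions.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT Kamailio's real structures. */
typedef struct {
    const char *s;
    int len;
} str_t;

struct mock_socket {
    str_t sock_str;
};

struct mock_contact {
    struct mock_socket *sock; /* may be NULL, as in the core dump above */
};

/* Return the contact's socket string, or an empty string when the
 * contact (or its socket pointer) is NULL, instead of dereferencing. */
static str_t contact_sock_str(const struct mock_contact *c)
{
    str_t empty = { "", 0 };

    if (c == NULL || c->sock == NULL)
        return empty;
    return c->sock->sock_str;
}
```

A caller that serializes the contact can then treat a zero-length result as "no socket known" rather than crashing on the missing pointer.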
I pushed a commit earlier, based on looking at the code: 518296523db0c1735c3234d77d6af312f5c9babb. Can you try with it?
(I see I referenced a wrong issue id in the commit message)
Hi Daniel,
I have one of the three nodes with `v5.7.0~dev1+bpo10.20220920005412.2299` (installed today from `kamailiodev-nightly` repo) and I haven't seen any more crashes so far.
I will observe the node tomorrow and let you know how it goes.
Once confirmed OK, would it be possible to backport the fix to 5.6 branch? Thanks!
It will be backported if all is OK.
Hi @miconda, I wanted to confirm I haven't seen any more cores since applying the fix!
Thank you for looking into this.
Thanks for testing and for the feedback; the commit was backported.
Closed #3242 as completed.