[sr-dev] [kamailio/kamailio] Kamailio freezes with DMQ and dialog (#2535)

pwakano notifications at github.com
Wed Oct 28 23:10:20 CET 2020


<!--
Kamailio Project uses GitHub Issues only for bugs in the code or feature requests. Please use this template only for bug reports.

If you have questions about using Kamailio or related to its configuration file, ask on sr-users mailing list:

  * http://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users

If you have questions about developing extensions to Kamailio or its existing C code, ask on sr-dev mailing list:

  * http://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-dev

Please try to fill this template as much as possible for any issue. It helps the developers to troubleshoot the issue.

If there is no content to be filled in a section, the entire section can be removed.

You can delete the comments from the template sections when filling.

You can delete next line and everything above before submitting (it is a comment).
-->

### Description

When running a load test Kamailio eventually becames unresponsive and stops processing calls.
Kamilio is configured to use the DMQ replication for dialog and usrloc. Also the dialog keepalived is enabled.


### Troubleshooting

>From investigation, the problem happens faster and easier when there is some network degradation causing packet loss and/or retransmissions but even without any noticeable network issue the freeze eventually happens. 

#### Reproduction

Run a simple load test making calls at a rate of ~5 cps and keep around ~2000 calls connected all the time. A higher cps seems to make it easier to reproduce the problem.
Adding network degradation to the environment makes the problem happens, but when running a tool such as SIPp for the load test, the retransmission can be forced by simply killing the SIPp instance receiving calls which will then force Kamailio to retransmit.

#### Debugging Data

<!--
If you got a core dump, use gdb to extract troubleshooting data - full backtrace,
local variables and the list of the code at the issue location.

  gdb /path/to/kamailio /path/to/corefile
  bt full
  info locals
  list

If you are familiar with gdb, feel free to attach more of what you consider to
be relevant.
-->
Output of kamct trap:
[gdb_kamailio_20201028_213030.txt](https://github.com/kamailio/kamailio/files/5455020/gdb_kamailio_20201028_213030.txt)

#### Log Messages

Local generated requests shows up in the log, but are not sent in the network


### Possible Solutions

Not found so far

### Additional Information

  * **Kamailio Version** - output of `kamailio -v`

```
# kamailio -v
version: kamailio 5.4.2 (x86_64/linux) c3b91f
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: c3b91f 
compiled on 13:50:37 Oct 27 2020 with gcc 4.8.5
```

* **Operating System**:

<!--
Details about the operating system, the type: Linux (e.g.,: Debian 8.4, Ubuntu 16.04, CentOS 7.1, ...), MacOS, xBSD, Solaris, ...;
Kernel details (output of `uname -a`)
-->

```
# cat /etc/centos-release
CentOS Linux release 7.8.2003 (Core)
# uname -a
Linux hostname 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
```


-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/kamailio/kamailio/issues/2535
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.kamailio.org/pipermail/sr-dev/attachments/20201028/217c8158/attachment-0001.htm>


More information about the sr-dev mailing list