[sr-dev] [kamailio/kamailio] Random segfault or general protection crash on siptrace.so module (#2718)

Joshua Riffle notifications at github.com
Wed Apr 28 19:49:12 CEST 2021


### Description
After upgrading from Kamailio 5.2.x, a high-volume Kamailio 5.4.4 instance crashes at random with either a general protection fault or a segfault in `siptrace.so`. The crash happens while one of the child processes is executing the `sip_trace` function, and the failing child then brings down the parent process. It occurs roughly once every 36 hours on average, and so far we have not been able to correlate it with any particular event.

We are still collecting debug information and will update this ticket as more becomes available. In the meantime, we are reporting the issue as observed so far.

The `sip_trace` function is called as in this example snippet:
```
# ------- siptrace --------
modparam("siptrace", "hep_mode_on", 1)
modparam("siptrace", "hep_version", 3)
modparam("siptrace", "trace_to_database", 0)
modparam("siptrace", "trace_flag", 22)
modparam("siptrace", "trace_on", 1)

request_route {
    #....
    if ( is_method("INVITE") && !has_totag() ) {
        # Only start sip_trace on initial INVITE
        sip_trace("HEP_URL","$ci-MY_IP","d");
    }
    setflag(22);
    #...
}
```


### Troubleshooting
We tried collecting packets with both Homer v5 and Homer v7, and switched between HEP protocol v2 and v3.

#### Reproduction
We have not found a way to reproduce this issue other than letting the server run until a crash occurs. Four almost identical servers all experience the same random crashes, but not at the same time.

#### Debugging Data
Our next troubleshooting step will be to comment out the `sip_trace` call, but this effectively disables the `siptrace` module rather than addressing the underlying problem.

Core dumps are still being retrieved, and debug logs should be available soon. Expect some delay, since these are high-volume production servers.


#### Log Messages

All four servers have crashed at random with a log entry like the following, regardless of the troubleshooting steps tried to date:
```
kernel: traps: kamailio[7579] general protection ip:7fb1a64e2dbf sp:7ffc60f04180 error:0 in siptrace.so[7fb1a64b8000+4e000]
systemd: kamailio.service: main process exited, code=exited, status=1/FAILURE
systemd: Unit kamailio.service entered failed state.
systemd: kamailio.service failed.
```
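
Until a core dump is available, the kernel `traps:` line already narrows down where in `siptrace.so` the fault occurred: subtracting the mapping base from the instruction pointer gives an offset into the module. The following is a minimal Python sketch of that arithmetic. The regular expression and the `module_offset` helper are illustrative names, not part of Kamailio, and the sketch assumes the bracketed base address is where the start of the shared object is mapped.

```python
import re

# Matches the fields of a Linux "traps:" line like the one quoted above.
# This pattern is an assumption based on that single example.
TRAP_RE = re.compile(
    r"ip:(?P<ip>[0-9a-f]+)\s+sp:(?P<sp>[0-9a-f]+)\s+error:(?P<err>[0-9a-f]+)"
    r"\s+in\s+(?P<module>[^\s\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def module_offset(line):
    """Return (module name, offset of the faulting ip inside its mapping)."""
    m = TRAP_RE.search(line)
    if m is None:
        raise ValueError("not a recognised traps line")
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    offset = ip - base
    # Sanity check: the ip must fall inside the reported mapping.
    if not 0 <= offset < size:
        raise ValueError("instruction pointer is outside the mapping")
    return m.group("module"), offset


line = (
    "kernel: traps: kamailio[7579] general protection ip:7fb1a64e2dbf "
    "sp:7ffc60f04180 error:0 in siptrace.so[7fb1a64b8000+4e000]"
)
module, offset = module_offset(line)
print(module, hex(offset))  # prints: siptrace.so 0x2adbf
```

If debug symbols for the module are installed, an offset like this can typically be handed to a tool such as `addr2line -e /path/to/siptrace.so 0x2adbf` to get a source location. The path here is a placeholder, and the result is only meaningful for the exact build that crashed.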


#### SIP Traffic
To date, we have not identified any SIP traffic that corresponds with the crashes.

### Possible Solutions
So far, the only workaround appears to be disabling the `siptrace` module.

### Additional Information

  * **Kamailio Version** - output of `kamailio -v`

```
version: kamailio 5.4.4 (x86_64/linux) e16352
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: e16352 
compiled on 15:56:46 Feb 15 2021 with gcc 4.8.5
```

* **Operating System**:


```
Linux <hostname> 3.10.0-957.27.2.el7.x86_64 #1 SMP Mon Jul 29 17:46:05 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
```

