[sr-dev] [kamailio/kamailio] Continued OpenSSL 1.1 crashes with 5.3.0 (#2121)

Nathan Whitehorn notifications at github.com
Sat Nov 2 17:17:16 CET 2019


### Description

After upgrading to 5.3.0 from 5.2.2 (standard packages on FreeBSD 12.0), I am experiencing intermittent crashes related to handling of BYE messages.

### Troubleshooting

#### Reproduction

This happens roughly once a week, and I have not found a reliable way to reproduce it.

#### Debugging Data

An example backtrace from the most recently dumped core (the SIGSEGV one) is below; unfortunately, it overwrote the earlier core:
```
* thread #1, name = 'kamailio', stop reason = signal SIGSEGV
  * frame #0: 0x00000008009c5b79 libc.so.7`___lldb_unnamed_symbol403$$libc.so.7 + 41
    frame #1: 0x00000008009ed63e libc.so.7`__free + 990
    frame #2: 0x000000080271562b libthr.so.3`pthread_rwlock_destroy + 59
    frame #3: 0x0000000802bedbf6 libcrypto.so.111`CRYPTO_THREAD_lock_free + 22
    frame #4: 0x0000000802aef3c4 libcrypto.so.111`RSA_free + 100
    frame #5: 0x0000000802b10c32 libcrypto.so.111`EVP_PKEY_free + 66
    frame #6: 0x000000080296ed86 libssl.so.111`___lldb_unnamed_symbol646$$libssl.so.111 + 134
    frame #7: 0x000000080295f93c libssl.so.111`SSL_CTX_free + 236
    frame #8: 0x00000008028aee42 tls.so`tls_free_domain + 114
    frame #9: 0x00000008028af1d7 tls.so`tls_free_cfg + 199
    frame #10: 0x00000008028af2df tls.so`tls_destroy_cfg + 191
    frame #11: 0x00000008028ad1f1 tls.so`destroy_tls_h + 1185
    frame #12: 0x000000000041adea kamailio`destroy_tls + 26
    frame #13: 0x00000000002e36fd kamailio`cleanup + 269
    frame #14: 0x00000000002eb5b7 kamailio`___lldb_unnamed_symbol5$$kamailio + 1351
    frame #15: 0x00000000002ea5e5 kamailio`handle_sigs + 21669
    frame #16: 0x00000000002fb83e kamailio`main_loop + 40014
    frame #17: 0x0000000000307d2b kamailio`main + 50267
    frame #18: 0x00000000002e311b kamailio`_start + 283
```

This is with OpenSSL 1.1. With 5.2.2 and the LD_PRELOAD hack, things were completely stable; I am now trying to run kamailio without the LD_PRELOAD'ed mutex wrapper, which I believe is no longer required. The TLS-related crash in the backtrace above (which happened 5 minutes later!) looks unrelated to the initial problem and may just be an artifact of one of the kamailio processes crashing earlier.

#### Log Messages

```
Nov  2 08:11:31 home /usr/local/sbin/kamailio[94702]: CRITICAL: {1 527440 BYE 973470944-5061-16392 at BA.A.B.I} <core> [core/mem/q_malloc.c:149]: qm_debug_check_frag(): BUG: qm: prev. fragm. tail overwritten(c0c0c000, abcdefed)[0x801544c58:0x801544c90]! Memory allocator was called from core: core/action.c:754. Fragment marked by core: core/dset.c:733. Exec from core/mem/q_malloc.c:504.
Nov  2 08:13:41 home /usr/local/sbin/kamailio[94703]: CRITICAL: <core> [core/pass_fd.c:277]: receive_fd(): EOF on 22
Nov  2 08:13:41 home kernel: pid 94702 (kamailio), uid 0: exited on signal 6 (core dumped)
Nov  2 08:13:41 home /usr/local/sbin/kamailio[94692]: ALERT: <core> [main.c:767]: handle_sigs(): child process 94702 exited by a signal 6
Nov  2 08:13:41 home /usr/local/sbin/kamailio[94692]: ALERT: <core> [main.c:770]: handle_sigs(): core was generated
Nov  2 08:14:56 home login[8284]: ROOT LOGIN (root) ON ttyu0
Nov  2 08:16:26 home kernel: pid 94692 (kamailio), uid 0: exited on signal 11 (core dumped)
```

I had an identical problem a week ago, also with a crash on a BYE for an active call:

```
Oct 27 13:00:02 home /usr/local/sbin/kamailio[79819]: CRITICAL: {1 598425 BYE 649761149-5061-291 at BA.A.B.I} <core> [core/mem/q_malloc.c:149]: qm_debug_check_frag(): BUG: qm: prev. fragm. tail overwritten(c0c0c000, abcdefed)[0x801551808:0x801551840]! Memory allocator was called from core: core/action.c:754. Fragment marked by core: core/dset.c:733. Exec from core/mem/q_malloc.c:504.
Oct 27 13:02:09 home /usr/local/sbin/kamailio[79820]: CRITICAL: <core> [core/pass_fd.c:277]: receive_fd(): EOF on 22
Oct 27 13:02:09 home kernel: pid 79819 (kamailio), uid 0: exited on signal 6 (core dumped)
Oct 27 13:02:09 home /usr/local/sbin/kamailio[79809]: ALERT: <core> [main.c:767]: handle_sigs(): child process 79819 exited by a signal 6
Oct 27 13:02:09 home /usr/local/sbin/kamailio[79809]: ALERT: <core> [main.c:770]: handle_sigs(): core was generated
Oct 27 13:04:55 home kernel: pid 79809 (kamailio), uid 0: exited on signal 11 (core dumped)
```
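Both log excerpts report the same pair of check values, `(c0c0c000, abcdefed)`. The sketch below shows one way such a pair can arise: two guard words are placed after a buffer, and a single NUL byte is written one position past its end. It assumes the intact guard values are `0xc0c0c0c0` and `0xabcdefed`, as the log output suggests; the helper names (`guarded_alloc`, `guarded_check`, `demo_off_by_one`) are hypothetical and this is not Kamailio's allocator code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Guard words assumed from the log output, not copied from Kamailio. */
#define TAIL_GUARD1 0xc0c0c0c0u
#define TAIL_GUARD2 0xabcdefedu

/* Hypothetical helper: allocate `size` usable bytes followed by two guard words. */
unsigned char *guarded_alloc(size_t size)
{
    uint32_t g1 = TAIL_GUARD1, g2 = TAIL_GUARD2;
    unsigned char *p = malloc(size + sizeof g1 + sizeof g2);
    if (p == NULL)
        return NULL;
    memcpy(p + size, &g1, sizeof g1);
    memcpy(p + size + sizeof g1, &g2, sizeof g2);
    return p;
}

/* Read back the guard words; return 0 if both are intact, -1 otherwise. */
int guarded_check(const unsigned char *p, size_t size,
                  uint32_t *found1, uint32_t *found2)
{
    memcpy(found1, p + size, sizeof *found1);
    memcpy(found2, p + size + sizeof *found1, sizeof *found2);
    return (*found1 == TAIL_GUARD1 && *found2 == TAIL_GUARD2) ? 0 : -1;
}

/* Simulate an off-by-one write: fill the buffer exactly, then add a
 * terminator one byte past its end, which lands in the first guard word. */
int demo_off_by_one(uint32_t *found1, uint32_t *found2)
{
    const size_t size = 8;
    unsigned char *buf = guarded_alloc(size);
    int rc;

    assert(buf != NULL);           /* acceptable for a sketch only */
    memcpy(buf, "12345678", size); /* uses all 8 bytes, no room for '\0' */
    buf[size] = '\0';              /* the overflow: first byte of guard 1 */

    rc = guarded_check(buf, size, found1, found2);
    free(buf);
    return rc;
}
```

On a little-endian machine such as amd64, `demo_off_by_one` reports a corrupted tail with `c0c0c000` in the first word and `abcdefed` in the second, the same values as in both log lines. That would be consistent with a one-byte overrun (for example a missing byte for a string terminator) in whatever wrote to the fragment marked at core/dset.c:733, though the log alone does not prove it.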

### Possible Solutions


### Additional Information

  * **Kamailio Version** - output of `kamailio -v`

```
version: kamailio 5.3.0 (x86_64/freebsd) 4cc67a
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, select, kqueue.
id: 4cc67a 
compiled on 18:51:34 Oct 25 2019 with cc 6.0
```

  * **Operating System**:

FreeBSD 12.0

```
FreeBSD home.XXX 12.0-RELEASE-p10 FreeBSD 12.0-RELEASE-p10 GENERIC  amd64
```


-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/kamailio/kamailio/issues/2121