### Description
Segfault in Kamailio when using mysql (over ssl) and tls listeners.
We have a reproducible segfault with Kamailio on Ubuntu Xenial. The problem appears when two modules (db_mysql and tls) are loaded and both use the OpenSSL library. The mysql module uses OpenSSL indirectly, because the connection is encrypted by default when the server supports it.
### Troubleshooting
#### Reproduction
Install Kamailio with:
- tls listeners enabled
- dispatcher module enabled, with data loaded from a mysql db
Example configuration attached:
In this case, reproduce with:
- start kamailio
- let dispatcher reload, for example via jsonrpc
- make a connection on tls, for example with `openssl s_client`
Kamailio will crash.
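The reproduction steps can be sketched in Python as below. This is only an illustration under assumptions: it presumes the jsonrpcs module is reachable over HTTP at a URL you supply (the URL is hypothetical and not taken from the attached config) and that the TLS listener is on port 5061, as the backtrace suggests. The helper names are made up for this sketch; `dispatcher.reload` is the dispatcher RPC command.

```python
import json
import socket
import ssl
import urllib.request


def build_reload_request(request_id=1):
    """Return the JSON-RPC 2.0 body that asks Kamailio to reload dispatcher sets."""
    return json.dumps(
        {"jsonrpc": "2.0", "method": "dispatcher.reload", "id": request_id}
    ).encode("utf-8")


def reload_dispatcher(rpc_url):
    """POST the reload request. rpc_url is an assumption: wherever jsonrpcs is exposed."""
    req = urllib.request.Request(
        rpc_url,
        data=build_reload_request(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.read().decode("utf-8", errors="replace")


def open_tls_connection(host, port=5061):
    """Start a TLS handshake against the listener without verifying its certificate."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False  # must be disabled before setting CERT_NONE
    ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=5) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()
```

Calling `reload_dispatcher(...)` followed by `open_tls_connection(...)` against an affected instance mirrors the manual steps; on an affected Xenial host the handshake is expected to fail because the TCP worker crashes.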
#### Debugging Data
Stack trace, with `libssl1.0.0-dbg` installed:
```
#0 0x0000000000000000 in ?? ()
#1 0x00007ff862d07b0d in getrn (lh=lh@entry=0x7ff8641eb7e8, data=data@entry=0x7ffe1f36e750, rhash=rhash@entry=0x7ffe1f36e6f0) at lhash.c:396
#2 0x00007ff862d0817a in lh_retrieve (lh=0x7ff8641eb7e8, data=data@entry=0x7ffe1f36e750) at lhash.c:248
#3 0x00007ff862d0a651 in int_thread_get_item (d=0x7ffe1f36e750) at err.c:500
#4 0x00007ff862d0b024 in ERR_get_state () at err.c:1023
#5 0x00007ff862d0b25f in ERR_clear_error () at err.c:743
#6 0x00007ff86305c67e in ssl23_accept (s=0x7ff864a282d0) at s23_srvr.c:157
#7 0x00007ff860b70d86 in tls_accept (c=0x7ff864af8810, error=0x7ffe1f36eb30) at tls_server.c:422
#8 0x00007ff860b7a486 in tls_read_f (c=0x7ff864af8810, flags=0x7ffe1f38eedc) at tls_server.c:1116
#9 0x0000000000625ac2 in tcp_read_headers (c=0x7ff864af8810, read_flags=0x7ffe1f38eedc) at core/tcp_read.c:469
#10 0x000000000062d05d in tcp_read_req (con=0x7ff864af8810, bytes_read=0x7ffe1f38eed8, read_flags=0x7ffe1f38eedc) at core/tcp_read.c:1496
#11 0x0000000000631c42 in handle_io (fm=0x7ff885734520, events=1, idx=-1) at core/tcp_read.c:1804
#12 0x0000000000620500 in io_wait_loop_epoll (h=0xae0200 <io_w>, t=2, repeat=0) at core/io_wait.h:1065
#13 0x0000000000633adb in tcp_receive_loop (unix_sock=26) at core/tcp_read.c:1974
#14 0x000000000051a9a1 in tcp_init_children () at core/tcp_main.c:4853
#15 0x000000000042620e in main_loop () at main.c:1745
#16 0x000000000042ca76 in main (argc=7, argv=0x7ffe1f38f578) at main.c:2696
```
#### Log Messages
```
2020-04-05T01:27:37.965778+02:00 nathancmp01 kernel: [432825.787355] kamailio[6296]: segfault at 0 ip (null) sp 00007ffe4cdaf248 error 14 in kamailio[400000+47b000]
```
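To read the kernel line above, it can help to decode the `error` field. The following is a minimal sketch, assuming the usual x86 page-fault error code layout and that the kernel prints the code in hexadecimal; the function name and the returned dictionary shape are made up for this illustration.

```python
import re

# Matches the x86 Linux "segfault at ..." kernel log line.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>\(null\)|[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)

# (bit, meaning when set, meaning when clear or None) for the x86 page-fault error code.
ERROR_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set", None),
    (0x10, "instruction fetch", None),
]


def decode_segfault(line):
    """Parse a kernel segfault line and decode its error code; None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    code = int(m.group("err"), 16)  # the kernel prints the code in hex
    flags = []
    for bit, if_set, if_clear in ERROR_BITS:
        if code & bit:
            flags.append(if_set)
        elif if_clear is not None:
            flags.append(if_clear)
    ip = m.group("ip")
    return {
        "process": m.group("comm"),
        "pid": int(m.group("pid")),
        "address": int(m.group("addr"), 16),
        "ip": 0 if ip == "(null)" else int(ip, 16),
        "error_code": code,
        "flags": flags,
    }
```

For the log line above, this reports a user-mode instruction fetch from a page that is not present, at address 0. That is consistent with frame #0 of the backtrace, where execution jumped to address `0x0`.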
#### SIP Traffic
No SIP traffic needed, just a TLS connection.
### Possible Solutions
Could not reproduce with Kamailio 5.3.3 on Ubuntu Bionic nor Debian Buster. Both are using openssl 1.1.x, so I guess that fixes the problem. But Xenial is still on 1.0.2g...
### Additional Information
Tested with Kamailio 5.2 and 5.3.3.
* **Operating System**:

Repro on:
- Ubuntu Xenial

No repro on:
- Ubuntu Bionic
- Debian Buster
[kamailio.cfg.example.txt](https://github.com/kamailio/kamailio/files/4435235/kamailio.cfg.example.txt)
For which version did you get the corefile and the gdb backtrace you pasted? Get the `kamailio -v` output for it.
Also get the output of the following gdb commands:
```
bt full
frame 7
list
info locals
```
Kamailio version (latest stable xenial version from kamailio repo):
```
version: kamailio 5.3.3 (x86_64/linux)
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: unknown
compiled with gcc 5.3.1
```
GDB output:
```
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000000000000000 in ?? ()
(gdb) bt full #0 0x0000000000000000 in ?? () No symbol table info available. #1 0x00007fe9a1931b0d in getrn (lh=lh@entry=0x7fe9a2e167e8, data=data@entry=0x7ffebc880c70, rhash=rhash@entry=0x7ffebc880c10) at lhash.c:396 ret = <optimized out> n1 = <optimized out> hash = <optimized out> nn = <optimized out> cf = <optimized out> #2 0x00007fe9a193217a in lh_retrieve (lh=0x7fe9a2e167e8, data=data@entry=0x7ffebc880c70) at lhash.c:248 hash = 8298067 rn = <optimized out> ret = <optimized out> #3 0x00007fe9a1934651 in int_thread_get_item (d=0x7ffebc880c70) at err.c:500 p = <optimized out> hash = 0x7fe9a2e167e8 #4 0x00007fe9a1935024 in ERR_get_state () at err.c:1023 fallback = {tid = {ptr = 0x0, val = 0}, err_flags = {0 <repeats 16 times>}, err_buffer = {0 <repeats 16 times>}, err_data = {0x0 <repeats 16 times>}, err_data_flags = {0 <repeats 16 times>}, err_file = {0x0 <repeats 16 times>}, err_line = {0 <repeats 16 times>}, top = 0, bottom = 0} ret = <optimized out> tmp = {tid = {ptr = 0x0, val = 140642013755136}, err_flags = {511, 316, -1619302300, 32745, -1552805456, 32745, -1565683712, 32745, -2091823603, -883404243, 234052522, 1484894954, -2091823603, -883404243, 234052522, 1484894954}, err_buffer = {140641446269360, 140641433391104, 17179869256, 120267519968, 140641446272144, 140641433394304, 140732061453760, 6966688, 17774295194511430037, 16205183325810705970, 140641379772512, 1358401357362, 140641379784296, 140641379772516, 140641446269416, 140641433391104}, err_data = { 0x82c76a98fc8c8c89 <error: Cannot access memory at address 0x82c76a98fc8c8c89>, 0x5ce09cb9d15aa665 <error: Cannot access memory at address 0x5ce09cb9d15aa665>, 0x82c76a98fc8c8c89 <error: Cannot access memory at address 0x82c76a98fc8c8c89>, 0x5ce09cb9d15aa665 <error: Cannot access memory at address 0x5ce09cb9d15aa665>, 0x1 <error: Cannot access memory at address 0x1>, 0x7fe9a37211b0 <incomplete sequence \340>, 0xe0 <error: Cannot access memory at address 0xe0>, 0x7fe9a2ad9000 "", 0x1a3721130 <error: 
Cannot access memory at address 0x1a3721130>, 0x7fe9a2b126c8 "", 0x7ffebc880dc0 "", 0x69fcd8 <futex_release+29> "\211E\374\203}\374\002\017\224\300\017\266\300H\205\300t6H\213E\350H\203\354\bj", 0x48 <error: Cannot access memory at address 0x48>, 0x7fe9a2b126c8 "", 0x8 <error: Cannot access memory at address 0x8>, 0x100000049 <error: Cannot access memory at address 0x100000049>}, err_data_flags = {-1131934208, 32766, 7001680, 0, -1619302304, 32745, -1573631521, 32745, -1619290520, 32745, -1619302300, 32745, -1552805400, 32745, -1565683712, 32745}, err_file = {0x7ffebc880e20 "\300\016\210\274\376\177", 0x7fe99f76cc19 <ser_free+74> "\220\311\303UH\211\345AWAVAUATSH\203\354\070H\211}\270H\213E\270H\211\307\350Ӳ\376\377\211Eȃ", <incomplete sequence \310>, 0x7ffebc880e80 "", 0x49ea76c4c1721d00 <error: Cannot access memory at address 0x49ea76c4c1721d00>, 0x7ffebc880ec0 "\262\367\217\070\271\235;\245", 0x7ffebc880e80 "", 0x48 <error: Cannot access memory at address 0x48>, 0x7fe9a1c58660 <state> "i\t\264F\340\372ﳫc\266\001\375\230Q\234M\263\037\344=M%\310a\274\234\177.Q\364%\224|K\334\062\233\374\065k,\215\362\064\231\277PҒ\035\332\333\363\320MɌ\202\372\346\366\t\206\064B\370X\362B\355\366u\334\341\342S\275\063\236\213\255*ű\327\347\005\214\221\016\001'@\003<\361\257Lf\300\346x1\227\315\344hF\236\020\065>'~5\314s\327\313{\206\307\343\016\034\226\223\354\063\357\v\226Y\241\203\333\032\231\345P\f\017'y\313ytѽ\201i\376k\242\240\317\bz^\344\062k\202Y\320Q\213^J\033g\256\377Y\316\034+\321\375\341\211>\\205;\253\325\313:^\377\350\366ܻ\001"..., 0x7ffebc880ec0 "\262\367\217\070\271\235;\245", 0x7fe9a1932c0c <ssleay_rand_add+780> "H\213\204$\210", 0x14bc880ec0 <error: Cannot access memory at address 0x14bc880ec0>, 0x7ffebc880f28 "", 0x8 <error: Cannot access memory at address 0x8>, 0x7ffebc880eb0 "\276\002", 0x8 <error: Cannot access memory at address 0x8>, 0x0}, err_line = {0 <repeats 12 times>, 702, 0, 61, 0}, top = 948959154, bottom = -1522819655} tmpp = 0x0 i = <optimized out> 
tid = {ptr = 0x0, val = 140642013755136} #5 0x00007fe9a193525f in ERR_clear_error () at err.c:743 i = <optimized out> es = <optimized out> #6 0x00007fe9a1c8667e in ssl23_accept (s=0x7fe9a36532d0) at s23_srvr.c:157 buf = <optimized out> Time = 1586173596 cb = 0x0 ret = -1 new_state = <optimized out> state = <optimized out> #7 0x00007fe99f792bd4 in tls_accept (c=0x7fe9a3723810, error=0x7ffebc881050) at tls_server.c:422 ret = -1131933680 ssl = 0x7fe9a36532d0 cert = 0x7fe9a3720e58 tls_c = 0x7fe9a355a0d0 tls_log = -1619452682 __func__ = "tls_accept" pkey = 0x0 #8 0x00007fe99f79c2d4 in tls_read_f (c=0x7fe9a3723810, flags=0x7ffebc8a13fc) at tls_server.c:1116 r = 0x7fe9a3723890 bytes_free = 16383 bytes_read = 305 read_size = 16383 ssl_error = 0 ssl_read = 0 ssl = 0x7fe9a36532d0 rd_buf = "\026\003\001\001,\001\000\001(\003\003\272\063\350r\362p\215ԩ<ߨyD2\317T\323\022\221!\253l\231?ڿ\236yv\277\305\000\000\252\300\060\300,\300(\300$\300\024\300\n\000\245\000\243\000\241\000\237\000k\000j\000i\000h\000\071\000\070\000\067\000\066\000\210\000\207\000\206\000\205\300\062\300.\300*\300&\300\017\300\005\000\235\000=\000\065\000\204\300/\300+\300'\300#\300\023\300\t\000\244\000\242\000\240\000\236\000g\000@\000?\000>\000\063\000\062\000\061\000\060\000\232\000\231\000\230\000\227\000E\000D\000C\000B\300\061\300-\300)\300%\300\016\300\004\000\234\000<\000/\000\226\000A\300\021\300\a\300\f\300\002\000\005\000\004\300\022\300\b\000\026\000"...
wr_buf = '\000' <repeats 50352 times>... rd = {buf = 0x7ffebc881120 "\026\003\001\001,\001", pos = 0, used = 305, size = 65536} wr = {buf = 0x7ffebc891120 "", pos = 0, used = 0, size = 65536} tls_c = 0x7fe9a355a0d0 enc_rd_buf = 0x0 n = 0 flush_flags = 0 err_src = 0x7fe99f7c5046 "TLS read:" x = 0 tls_dbg = 0 __func__ = "tls_read_f" #9 0x000000000067420f in tcp_read_headers (c=0x7fe9a3723810, read_flags=0x7ffebc8a13fc) at core/tcp_read.c:469 bytes = 0 remaining = 0 p = 0x0 r = 0x7fe9a3723890 mc = 0 body_len = 0 mfline = 0x0 mtransid = {s = 0x0, len = 0} __func__ = "tcp_read_headers" #10 0x000000000067b7aa in tcp_read_req (con=0x7fe9a3723810, bytes_read=0x7ffebc8a13f8, read_flags=0x7ffebc8a13fc) at core/tcp_read.c:1496 bytes = -1 total_bytes = 0 resp = 1 size = 24 req = 0x7fe9a3723890 dst = {send_sock = 0x0, to = {s = {sa_family = 24989, sa_data = "f\000\000\000\000\000\070\024\212\274\376\177\000"}, sin = {sin_family = 24989, sin_port = 102, sin_addr = {s_addr = 0}, sin_zero = "8\024\212\274\376\177\000"}, sin6 = {sin6_family = 24989, sin6_port = 102, sin6_flowinfo = 0, sin6_addr = {__in6_u = { __u6_addr8 = "8\024\212\274\376\177\000\000\b\000\000\000\000\000\000", __u6_addr16 = {5176, 48266, 32766, 0, 8, 0, 0, 0}, __u6_addr32 = {3163165752, 32766, 8, 0}}}, sin6_scope_id = 20}}, id = 0, send_flags = {f = 1, blst_imask = 0}, proto = 1 '\001', proto_pad0 = 0 '\000', proto_pad1 = 0} c = 32 ' ' ret = -1131801808 __func__ = "tcp_read_req" #11 0x000000000068038f in handle_io (fm=0x7fe9c4368d48, events=1, idx=-1) at core/tcp_read.c:1804 ret = 8 n = 8 read_flags = 1 con = 0x7fe9a3723810 s = 7 resp = 0 t = 0 __func__ = "handle_io" #12 0x000000000066ec4d in io_wait_loop_epoll (h=0xb0b300 <io_w>, t=2, repeat=0) at core/io_wait.h:1062 n = 1 r = 0 fm = 0x7fe9c4368d48 revents = 1 __func__ = "io_wait_loop_epoll" #13 0x0000000000682228 in tcp_receive_loop (unix_sock=26) at core/tcp_read.c:1974 __func__ = "tcp_receive_loop" #14 0x0000000000559d89 in tcp_init_children () at 
core/tcp_main.c:5174 r = 0 i = -1 reader_fd_1 = 26 pid = 0 si_desc = "tcp receiver (tls:128.199.61.178:5061)\000\000\t\000\000\000\001", '\000' <repeats 11 times>, "\320o\265\304\351\177\000\000S\236~\000\000\000\000\000\000\000\000 \000\000\000\000\000\000\200\000\000\000\000\000\006\000\000\000\000\000\000\000\200\026\212\274\376\177\000\000[\314Y\000\000\000\000\000\200\026\212\274\376\177\000\000\213\033b\000\000\000\000" si = 0x0 __func__ = "tcp_init_children" #15 0x0000000000427282 in main_loop () at main.c:1761 i = 10 pid = 17994 si = 0x0 si_desc = "udp receiver child=9 sock=128.199.61.178:5061\000\000\000\350"{\000\000\000\000\000\000\035r\301\304v\352I\177 \000\020\000\000\000\000\320o\265\304\351\177\000\000S\236~\000\000\000\000\000\000\000\000 \000\000\000\000\000\000\200\000\000\000\000\000\006\000\000\000\000\000\000\000\320\027\212\274\376\177\000\000\210\020e\000\000\000\000" nrprocs = 10 woneinit = 1 __func__ = "main_loop" #16 0x000000000042eadb in main (argc=7, argv=0x7ffebc8a1cb8) at main.c:2802 cfg_stream = 0x1042010 c = -1 r = 0 tmp = 0x7ffebc8a2f40 "" tmp_len = -1 port = 0 proto = -985238336 ahost = 0x0 aport = 0 options = 0x780aa8 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:" ret = -1 seed = 3827584968 rfd = 4 debug_save = 0 debug_flag = 0 dont_fork_cnt = 0 n_lst = 0x7fe9c5257e9a <_dl_runtime_resolve_xsave+138> p = 0x7fe9c4a67410 "\332(" st = {st_dev = 19, st_ino = 493, st_nlink = 2, st_mode = 16877, st_uid = 110, st_gid = 2500, __pad0 = 0, st_rdev = 0, st_size = 40, st_blksize = 4096, st_blocks = 0, st_atim = { tv_sec = 1586173463, tv_nsec = 457366237}, st_mtim = {tv_sec = 1586173463, tv_nsec = 457366237}, st_ctim = {tv_sec = 1586173463, tv_nsec = 461366299}, __glibc_reserved = {0, 0, 0}} tbuf = 
"$\032\212\274\376\177\000\000\330\314$\305\351\177\000\000\000\000\000\000\000\000\000\000'6\276\304\351\177\000\000\300\033\212\274\376\177\000\000(\032\212\274\376\177\000\000&\260be\000\000\000\000\300\212\225\001\000\000\000\000&\000\000\000\000\000\000\000\000\033\212\274\376\177", '\000' <repeats 13 times>, "\377\000\000\000\000", '/' <repeats 16 times>, "\000\000\000\000\000\000\000\377\000\000\377", '\000' <repeats 12 times>, "\377", '\000' <repeats 14 times>, "\377\000\000\000\377\000\000\000\377", '\000' <repeats 177 times>... option_index = 0 long_options = {{name = 0x78363a "help", has_arg = 0, flag = 0x0, val = 104}, {name = 0x77d344 "version", has_arg = 0, flag = 0x0, val = 118}, {name = 0x78363f "alias", has_arg = 1, flag = 0x0, val = 1024}, {name = 0x783645 "subst", has_arg = 1, flag = 0x0, val = 1025}, {name = 0x78364b "substdef", has_arg = 1, flag = 0x0, val = 1026}, {name = 0x783654 "substdefs", has_arg = 1, flag = 0x0, val = 1027}, {name = 0x78365e "server-id", has_arg = 1, flag = 0x0, val = 1028}, {name = 0x0, has_arg = 0, flag = 0x0, val = 0}} __func__ = "main"
(gdb) frame 7
#7 0x00007fe99f792bd4 in tls_accept (c=0x7fe9a3723810, error=0x7ffebc881050) at tls_server.c:422
422 tls_server.c: No such file or directory.
(gdb) list
417 in tls_server.c
(gdb) info locals
ret = -1131933680
ssl = 0x7fe9a36532d0
cert = 0x7fe9a3720e58
tls_c = 0x7fe9a355a0d0
tls_log = -1619452682
__func__ = "tls_accept"
pkey = 0x0
```
I do not have access to Xenial to try to reproduce, but it may help to get all log messages from kamailio start to the crash. First set `debug=3` in kamailio.cfg, then start kamailio and reproduce the issue. Take all log messages printed by kamailio (usually to /var/log/syslog) and send them over. There should be a lot with `DEBUG: ...`.
Attached is the debug log from Kamailio: [xenial-debug3.log](https://github.com/kamailio/kamailio/files/4472261/xenial-debug3.log)
I have a docker-based reproduction for various OS versions, if you want to test.
The logs suggest that you set the parameter `tls_force_run` for tls module. It is not in the sample config attached above. Any reason for setting it? What is its value?
Yes, I was adding that to test for repro on Ubuntu Bionic.
It's not needed for Xenial. When removed, we still have the same segfault.
Can you send me the config you use to reproduce in the docker container? I got access to a Xenial box and tried with a minimal config that only has the tls module, and it starts fine. You said you can reproduce it when db_mysql is used as well, so having your config for docker will help.
Hi,
I'm attaching a zip containing all files needed for repro. Just use `start.sh` and it should reproduce the segfault. [kam-2274.zip](https://github.com/kamailio/kamailio/files/4558686/kam-2274.zip)
On a different report that I investigated with Xenial and the latest libssl security upgrade, it seemed to be some conflict with libmysqlclient using a tcp/tls connection over an IP socket. Is your kamailio connecting via IP socket to the mysql server (dbhost is 127.0.0.1 or another ip)? If yes, and the mysql server is on the same host, can you test with dbhost being `localhost`, which makes the mysql client lib use the unix socket file for connectivity?
Checking this will sort out if it is somehow a related issue or something different. For the other case, when using unix socket file for mysql connectivity, all runs fine on Xenial with latest libssl.
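The distinction between `localhost` and an IP address can be illustrated with a small check. This is a sketch only, assuming the default libmysqlclient behaviour on Unix (a missing host or the literal `localhost` selects the unix socket file, anything else goes over TCP) and a DB URL in the usual `mysql://user:password@host/database` form. The function name and the example URLs in the usage note are illustrative and not taken from the reporter's config.

```python
from urllib.parse import urlsplit


def mysql_transport(db_url):
    """Guess which transport libmysqlclient picks by default for this DB URL."""
    host = urlsplit(db_url).hostname
    # No host or the literal "localhost" means the unix socket file on Unix.
    if host is None or host == "localhost":
        return "unix socket"
    # Any other value, including 127.0.0.1, is a TCP connection (possibly with TLS).
    return "tcp"
```

With this sketch, `mysql://kamailio:kamailiorw@localhost/kamailio` maps to the unix socket, while `mysql://kamailio:kamailiorw@127.0.0.1/kamailio` maps to TCP, which is the path where the client library can negotiate TLS.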
Yes, same story here. Using 'localhost' for DB connection URL solves the problem.
But we can't use this for our production scenario, as our DB is not on localhost...
The conflict seems to be between libmysqlclient and libssl; there were no changes in our code and it works with the unix socket file. It could be an effect of how we initialize libssl, but discovering that would require looking at the libmysqlclient code.
Some ideas you can try for the moment:
* update libmysqlclient or libssl to newer versions
* use mysql-proxy (or another sql proxy) from localhost to the remote mysql server
Were you able to get it working with any of the suggestions above?
We settled on accepting this as a known issue for now. We just don't ds-reload this instance...
Updating to a non-Xenial libmysqlclient or libssl feels like it could introduce a whole range of new problems (also outside of Kamailio).
An option you can try is to link the tls module of kamailio to the libssl static libs. I pushed a commit to tls Makefile to guide on such process:
* https://github.com/kamailio/kamailio/commit/3e7278f28c43b830a197e2f7b212ec6f...
I tested a bit myself on Xenial and the libssl.a and libcrypto.a from the deb package are not compiled with -fPIC, so I had to download and compile libssl myself in a folder. I used the most recent 1.0.2u version (debs install 1.0.2g).
The process should be like:
- in the kamailio source code folder, after applying the commit referenced above, edit src/modules/tls/Makefile and set:
```
LIBSSL_STATIC = yes
LIBSSL_STATIC_SRCLIB = yes
```
- download the sources of openssl v1.0.2u and place them in:
```
/usr/local/src/openssl
```
- compile openssl with the next commands (do not install, only compile):
```
./config --shared
make
```
- go back to the kamailio sources and compile/install
- the ldd on tls.so should not show any libssl/libcrypto, because they are part of the file now, like:
```
ldd src/modules/tls/tls.so
linux-vdso.so.1 => (0x00007ffedfa9b000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc8b94d4000)
/lib64/ld-linux-x86-64.so.2 (0x00007fc8b9dbb000)
```
In this way, the system uses the default installed libssl and only Kamailio uses the statically linked 1.0.2u.
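The ldd check can be automated with a short script. This is a minimal sketch, assuming the `ldd` output is passed in as text; the function name is made up for this illustration.

```python
import re


def linked_openssl_libs(ldd_output):
    """Return the libssl/libcrypto shared objects named in ldd output."""
    libs = []
    for line in ldd_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The first field is the library name, possibly given as a full path.
        name = fields[0].rsplit("/", 1)[-1]
        if re.match(r"lib(ssl|crypto)\.so", name):
            libs.append(name)
    return libs
```

An empty result for `tls.so` suggests the static link worked as intended; any entry means the module still loads the system OpenSSL at runtime.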
To be honest, I would rather stay with the upstream Kamailio from Xenial and accept this bug for now.
In our case, the burden of maintaining a custom Kamailio build and package outweighs the complications caused by this issue.
I can understand that fixing Xenial bugs is not your priority... if there's no plan to fix this in Kamailio, we should just close this with "won't fix".
It does not seem to be a bug in Kamailio, but some conflict between libmysqlclient and libssl. In another case I investigated, the crash happened when running an embedded script in another language that used its own mysql connector library, independent of the db_mysql module.
I am closing it; if someone comes with further details, we can look further.
Closed #2274.