### Description
There's a regression after enabling Address Space Layout Randomization (ASLR): Kamailio crashes with SIGSEGV while loading `app_lua` (and probably other KEMI interpreters).
### Troubleshooting
#### Reproduction
The regression appeared after updating the FreeBSD kernel to https://github.com/freebsd/freebsd-src/commit/10192e77cfacd1f27601882af61883...
Kamailio crashes after `loadmodule "app_lua.so"` in `kamailio.cfg`.
#### Debugging Data
```
root@server:/usr/jails/containers/kamailio/var/coredump/986# jexec kamailio lldb -c /var/coredump/986/kamailio.13521.core -- /usr/local/sbin/kamailio
(lldb) target create "/usr/local/sbin/kamailio" --core "/var/coredump/986/kamailio.13521.core"
Core file '/var/coredump/986/kamailio.13521.core' (x86_64) was loaded.
(lldb) bt all
* thread #1, name = 'kamailio', stop reason = signal SIGSEGV
  * frame #0: 0x0000000825b61350 libc.so.7`strncmp(s1=<unavailable>, s2=<unavailable>, n=<unavailable>) at strncmp.c:47:7
    frame #1: 0x00000000003fc394 kamailio`sr_kemi_modules_add(klist=0x0000000861afb410) at kemi.c:3392:8
    frame #2: 0x0000000861ae6154 app_lua.so`mod_register(path="/usr/local/lib/kamailio/modules/app_lua.so", dlflags=0x0000000821805b48, p1=0x0000000000000000, p2=0x0000000000000000) at app_lua_mod.c:605:2
    frame #3: 0x00000000005e4b30 kamailio`load_module(mod_path="app_lua.so") at sr_module.c:592:7
    frame #4: 0x0000000000885936 kamailio`yyparse at cfg.y:1965:8
    frame #5: 0x00000000002ff384 kamailio`main(argc=7, argv=0x000000082180a848) at main.c:2506:6
    frame #6: 0x00000000002ddd90 kamailio`_start(ap=<unavailable>, cleanup=<unavailable>) at crt1_c.c:75:7
```
`_sr_kemi_modules[].mname.s` for `app_lua` points to invalid data after the second call to `mod_register()@app_lua_mod.c` from `load_module()@sr_module.c` (the module is reloaded in order to set the correct `dlflags`).
#### Log Messages
```
08:35:53.689647 DEBUG: <core> [core/cfg.y:1964]: yyparse(): loading module kemix.so
08:35:53.689684 DEBUG: <core> [core/sr_module.c:516]: ksr_locate_module(): found module to load </usr/local/lib/kamailio/modules/kemix.so>
08:35:53.689698 DEBUG: <core> [core/sr_module.c:566]: load_module(): trying to load </usr/local/lib/kamailio/modules/kemix.so>
08:35:53.689796 DEBUG: <core> [core/kemi.c:3398]: sr_kemi_modules_add(): adding module: kx
08:35:53.689847 DEBUG: <core> [core/cfg.lex:2039]: pp_define(): defining id: MOD_kemix
08:35:53.689895 DEBUG: <core> [core/cfg.y:1964]: yyparse(): loading module app_lua.so
08:35:53.689931 DEBUG: <core> [core/sr_module.c:516]: ksr_locate_module(): found module to load </usr/local/lib/kamailio/modules/app_lua.so>
08:35:53.689948 DEBUG: <core> [core/sr_module.c:566]: load_module(): trying to load </usr/local/lib/kamailio/modules/app_lua.so>
08:35:53.690418 DEBUG: <core> [core/kemi.c:3494]: sr_kemi_eng_register(): registered config routing enginge [lua]
08:35:53.690444 DEBUG: <core> [core/kemi.c:3398]: sr_kemi_modules_add(): adding module: app_lua
```
### Possible Solutions
A temporary workaround is to disable ASLR, e.g. on FreeBSD:
```
# sysctl kern.elf64.aslr.enable=0
# sysctl kern.elf64.aslr.pie_enable=0
```
### Additional Information
* **Kamailio Version** - output of `kamailio -v`
```
version: kamailio 5.6.1 (x86_64/freebsd) b36a13
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLOCKLIST, HAVE_RESOLV_RES, TLS_PTHREAD_MUTEX_SHARED
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, select, kqueue.
id: b36a13
compiled on 08:12:34 Jul 27 2022 with cc FreeBSD clang version 14.0.5 (https://github.com/llvm/llvm-project.git llvmorg-14.0.5-0-gc12386ae247c)
```
5.5.4 is affected too
* **Operating System**:
```
FreeBSD 13/stable
```
@henningw, the problem could appear not only on FreeBSD but in any ASLR-enabled environment.
I've written a test case for the regression that closely mirrors Kamailio's behavior:
_libtest.c_:
```c
#include "modules.h"

static char *str = "app_lua";

int mod_register()
{
	modules_add(str);

	return 0;
}
```
_modules.c_:
```c
#include <stdio.h>

void modules_add(char *msg)
{
	printf("modules_add(%p): %s\n", msg, msg);
}
```
_main.c_:
```c
#include <stdio.h>
#include <dlfcn.h>
#include "modules.h"

typedef int (*mod_register_function)();

int testlib(int num)
{
	mod_register_function mr;
	char* error;

	void* h = dlopen("libtest.so", RTLD_NOW);
	if (h == 0) {
		printf("Error loading\n");
		return 1;
	}
	dlerror();
	mr = (mod_register_function)dlsym(h, "mod_register");
	if ((error = (char*)dlerror()) != 0) {
		printf("dlsym error: %s\n", error);
		return 1;
	}
	printf("Call mod_register() #%d: ", num);
	mr();
	dlclose(h);

	return 0;
}

int main()
{
	int err;
	err = testlib(1);
	if (err != 0)
		return err;

	err = testlib(2);
	if (err != 0)
		return err;

	return 0;
}
```
I then ran it in a non-ASLR and an ASLR environment:

_non-ASLR_:
```
boris@boris:~/aslr_test% ./aslr_test
Call mod_register() #1: modules_add(0x800646528): app_lua
Call mod_register() #2: modules_add(0x800646528): app_lua
```
_ASLR_:
```
boris@boris:~/aslr_test% ./aslr_test
Call mod_register() #1: modules_add(0x825abc528): app_lua
Call mod_register() #2: modules_add(0x825bfe528): app_lua
```
As we can see, the address that `str` points to changes between loads in the ASLR environment, so a pointer obtained from the first load cannot be used after the library is reloaded.
I suppose that keeping a pointer to a module's static variable across a library reload is incorrect.
@drTr0jan thanks for the clarification and the test case. Can you show the exact gcc command line you used to compile the test case? You can use "make Q=0" if you do it in the Kamailio Makefile environment.
@henningw, I built it with clang:
```
clang -o aslr_test modules.c main.c
clang --shared -fPIC -o libtest.so modules.c libtest.c
```
Thanks. We could try to disable PIC mode, which would probably deactivate ASLR for the module. But it would probably also break the library, as it could no longer be relocated. Maybe @miconda has an idea.
I haven't had the time yet to read properly about ASLR to comment technically, but I don't think it was supported before and got broken over time, so this does not seem to be a regression, but a request for an enhancement (new feature).
Thanks Daniel. ASLR is a system-wide feature, and was apparently activated by default by the FreeBSD developers. @drTr0jan The strange thing is that ASLR seems to have been enabled for a rather long time (I just checked an Ubuntu 20.04 and a Debian Buster); both have it activated. I am wondering why we are not seeing more reports then. Did you execute the tests above on FreeBSD or on a Linux machine?
```
$ cat /proc/sys/kernel/randomize_va_space
2
```
My comment was about what a regression of an application is supposed to be: something worked fine, and in the same context a new version of the app no longer works. Because the description said that the problem showed up after enabling Address Space Layout Randomization (ASLR), it sounded like a new context, and there was no change to app_lua. Similarly, we do not say that it is a regression when a module does not work with a new library version, but rather *lib version N not supported yet*. But of course, this could be my own interpretation.
If ASLR is enabled on Debian/Ubuntu, then it is something specific to the FreeBSD implementation.
Can you try with master branch or the patch from the commit 43f764cae870b15a96b8ca88f1eb195d4ceb8455 ?
If still not solved, can you try with master branch and use:
```
loadmodule("app_lua.so", "g")
```
for loading `app_lua` module (instead of the classic `loadmodule "app_lua.so"`)?
@miconda, thx.
Kamailio still crashes with 43f764cae870b15a96b8ca88f1eb195d4ceb8455 after any `sr_kemi_route()` call.
But the `g` option solved the problem: Kamailio works. I need to run more tests, though.
Hmm, interesting, because 43f764cae870b15a96b8ca88f1eb195d4ceb8455 clones the module name in the core and it should no longer be influenced by the reload.
Anyhow, good to know that the variant with `g` option makes it work.
@miconda, with 43f764cae870b15a96b8ca88f1eb195d4ceb8455 Kamailio crashed not after loading `app_lua` (as before) but after calling `app_lua` methods. I tried to fix this myself earlier but ran into the same problem.
Closing this one, noting that `g` option makes it work.
Closed #3202 as completed.
@miconda, what about merging f5c98a49c98aedcf6e1afec3c42dd862d0eeb9a3 and 69ba64e26e3876ce84053a691dee2f2ad9bb6185 into 5.6?
Those are changes that introduce new syntax to the config file and do not qualify for backporting.
As I wrote in previous comments, this report is not about a regression, because there was no past commit that introduced such a limitation and that could have been reverted to get it working again. It is about adding support for a new use case.
Others can comment and we can see if there is a better decision that can be taken.
@drTr0jan I agree that it's not the optimal solution for the affected OS. But as you mentioned, the easiest workaround is to just deactivate ASLR on the affected systems until 5.7.0 next year. OSes that build their own packages (like the mentioned FreeBSD) could of course also add the two patches to their own version as a custom backport.