We have installed Kamailio 5.6.0 from http://deb.kamailio.org/kamailio56, but it is intermittently crashing with a segfault like this:
May 29 04:46:53 ss2 kernel: [26920.901572] kamailio[1416]: segfault at 43a ip 00007f78d1396ab5 sp 00007ffd2487ca10 error 4 in app_perl.so[7f78d1392000+21000]
May 29 04:46:53 ss2 kernel: [26920.901575] Code: 8b 05 8f 34 02 00 48 8b 00 48 89 c7 e8 7b ff ff ff e8 3f e5 ff ff 48 8b 15 78 34 02 00 48 89 02 48 8b 05 6e 34 02 00 48 8b 00 <0f> b6 90 3a 04 00 00 48 8b 05 5d 34 02 00 48 8b 00 83 ca 02 88 90
May 29 04:46:56 ss2 kernel: [26923.737834] kamailio[1409]: segfault at 43a ip 00007f78d1396ab5 sp 00007ffd2487ca10 error 4 in app_perl.so[7f78d1392000+21000]
May 29 04:46:56 ss2 kernel: [26923.737838] Code: 8b 05 8f 34 02 00 48 8b 00 48 89 c7 e8 7b ff ff ff e8 3f e5 ff ff 48 8b 15 78 34 02 00 48 89 02 48 8b 05 6e 34 02 00 48 8b 00 <0f> b6 90 3a 04 00 00 48 8b 05 5d 34 02 00 48 8b 00 83 ca 02 88 90

Perl is version 5.30.0, and the system is running Ubuntu 20.04. A Perl library is loaded using:

modparam( "app_perl", "filename", "/path/to/our/Kamailio.pm" )
We have not found a core file anywhere on the server. Please let us know what other information you need, thanks.
Hello, thanks for the report. It looks like the app_perl module is the cause here. You can find more information about how to get a core dump here, for example: http://www.kamailio.org/wiki/tutorials/troubleshooting/coredumpfile
If it does not crash with 5.5.x, the only change related to the module that I could spot is the commit 50557b8433e137a9095b4d48df8ac9b8c3fd8807. It is not in the C code but in the Perl module, so maybe you can try removing the line added by the commit and see if it makes any difference.
This is a new installation, and we're using the packages from deb.kamailio.org. The crash happened on version 5.4 and then we upgraded to 5.6 in the hope it would be fixed, but it's not. We will use the documentation to try and get a core file, thanks.
We had another occurrence and the log says:
Jun 8 00:50:56 ss2 /usr/sbin/kamailio[353181]: CRITICAL: <core> [core/pass_fd.c:277]: receive_fd(): EOF on 15
Jun 8 00:50:56 ss2 /usr/sbin/kamailio[353147]: ALERT: <core> [main.c:774]: handle_sigs(): child process 353151 exited by a signal 11
Jun 8 00:50:56 ss2 /usr/sbin/kamailio[353147]: ALERT: <core> [main.c:777]: handle_sigs(): core was generated
Jun 8 00:50:56 ss2 /usr/sbin/kamailio[353147]: INFO: <core> [main.c:799]: handle_sigs(): terminating due to SIGCHLD
So we see "core was generated", however I can't find the core file. Kamailio is running with "-w /tmp", and should have write permission on /tmp. According to /proc/<kamailio pid>/limits the "Max core file size" is unlimited.
Would anyone have a suggestion on where the core file could be? Thank you.
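For reference, the limit check described above can be scripted. This is only a minimal sketch: `core_soft_limit` is a hypothetical helper name, and it assumes the usual Linux layout of `/proc/<pid>/limits`, where the soft limit is the fifth whitespace-separated field of the "Max core file size" row.

```shell
#!/bin/sh
# Print the soft "Max core file size" limit from a /proc/<pid>/limits file.
# core_soft_limit is a hypothetical helper name; the argument is the path
# to the limits file, e.g. /proc/<kamailio pid>/limits.
core_soft_limit() {
    # Row layout: "Max core file size  <soft>  <hard>  <units>"
    awk '/^Max core file size/ { print $5 }' "$1"
}

# Example: show the limit of the process reading the file (Linux only).
if [ -r /proc/self/limits ]; then
    echo "soft core limit: $(core_soft_limit /proc/self/limits)"
fi
```

A value of `unlimited` only means the kernel is allowed to produce a dump; where the dump ends up is decided separately by the kernel's core pattern setting.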
The core dump is likely being *hijacked* by another application; this is more or less the default configuration in Ubuntu. Check it with:
```
cat /proc/sys/kernel/core_pattern
```
And see if it is piped to another application.
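A small helper can make that check explicit. This is a sketch under one assumption taken from the kernel's documented behavior: a `core_pattern` value whose first character is `|` is handed to a program instead of being written as a file. The function name `core_pattern_kind` is hypothetical.

```shell
#!/bin/sh
# Classify a core_pattern value: "piped" if the kernel hands the dump to a
# program (the value starts with '|'), otherwise "file".
# core_pattern_kind is a hypothetical helper name, not part of any tool.
core_pattern_kind() {
    case "$1" in
        '|'*) echo "piped" ;;
        *)    echo "file" ;;
    esac
}

# Inspect the live setting when the file is readable (Linux only).
if [ -r /proc/sys/kernel/core_pattern ]; then
    pattern=$(cat /proc/sys/kernel/core_pattern)
    echo "core_pattern: $pattern ($(core_pattern_kind "$pattern"))"
fi
```

If the result is `piped`, the core file will not appear in the Kamailio working directory, regardless of the `-w` option or the core size limit.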
Thanks, it was indeed piping the core file to apport.
We've got a core file after disabling apport, and here is the backtrace. Please let us know if anything else is needed.
(gdb) bt full
#0 0x00007f7b07791ab5 in perl_reload () from /lib/x86_64-linux-gnu/kamailio/modules/app_perl.so
No symbol table info available.
#1 0x00007f7b07793e7f in app_perl_reset_interpreter () from /lib/x86_64-linux-gnu/kamailio/modules/app_perl.so
No symbol table info available.
#2 0x00007f7b0778e2b1 in perl_exec2 () from /lib/x86_64-linux-gnu/kamailio/modules/app_perl.so
No symbol table info available.
#3 0x00007f7b0778e26e in perl_exec1 () from /lib/x86_64-linux-gnu/kamailio/modules/app_perl.so
No symbol table info available.
#4 0x000056222b5e3b2f in do_action ()
No symbol table info available.
#5 0x000056222b5f2a97 in run_actions ()
No symbol table info available.
#6 0x000056222b5f331a in run_top_route ()
No symbol table info available.
#7 0x000056222b7a931f in receive_msg ()
No symbol table info available.
#8 0x000056222b65886a in udp_rcv_loop ()
No symbol table info available.
#9 0x000056222b54fccc in main_loop ()
No symbol table info available.
#10 0x000056222b55ce41 in main ()
No symbol table info available.

Can you also install the debug package on the respective system and execute the backtrace again? It's missing the debug information.
Sure, we have done that and attached is a new backtrace.
[backtrace1.txt](https://github.com/kamailio/kamailio/files/8897727/backtrace1.txt)
The backtrace points to the line that has:
```
PL_exit_flags |= PERL_EXIT_DESTRUCT_END;
```
This suggests that the Perl library's global variable `PL_exit_flags` is corrupted. You can check it with gdb. Open the core file with:
```
gdb /path/to/kamailio /path/to/corefile
```
Then run:
```
p PL_exit_flags
```
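The same two steps can also be run non-interactively, which makes it easier to attach the output to a report. The sketch below only builds and prints the command line, using gdb's `-batch` and `-ex` options; `gdb_batch_command` is a hypothetical helper name and both paths are placeholders.

```shell
#!/bin/sh
# Print a non-interactive gdb command line that dumps a full backtrace and
# then evaluates PL_exit_flags against a core file.
# gdb_batch_command is a hypothetical helper; $1 is the kamailio binary,
# $2 is the core file.
gdb_batch_command() {
    printf 'gdb -batch -ex "bt full" -ex "p PL_exit_flags" %s %s\n' "$1" "$2"
}

# Show the command for the placeholder paths used in this thread.
gdb_batch_command /path/to/kamailio /path/to/corefile
```

Redirecting the output of the printed command to a file gives a text backtrace that can be attached here.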
The crash happens during the interpreter reload, which may run periodically depending on a modparam. You can try disabling it to see if the crash still happens. The periodic reload was meant to help with memory leaks in the Perl interpreter and libraries, so watch the system memory as well to be sure it is not increasing without an obvious reason.
Thank you. The output of that gdb command is below. We'll try adjusting the reset_cycles setting and see what effect that has.
(gdb) p PL_exit_flags
No symbol "PL_exit_flags" in current context.

PL_exit_flags seems to be a preprocessor define, so it is not found in the symbol table.
On a system I have, I searched for its definition and found it to be `vTHX->Iexit_flags`. Try:
```
p vTHX
p vTHX->Iexit_flags
p *p vTHX
```
Should that just be run on the gdb commandline? No luck so far:
(gdb) p vTHX
No symbol "vTHX" in current context.
(gdb) p vTHX->Iexit_flags
No symbol "vTHX" in current context.
(gdb) p *p vTHX
No symbol "p" in current context.
We tried compiling Kamailio 5.4.6 from source on the same server and it still segfaults when the perl interpreter is reset. Would you have any suggestions on ways to find out why the interpreter reset causes this segfault? Just disabling the reset doesn't seem like a great solution, as we'd be vulnerable to any memory leak which crept up unexpectedly. Thanks.
We have reproduced the same problem on two other servers running Ubuntu 20.04. The problem doesn't occur on Ubuntu 18.04 servers running the same perl code.
Have you any suggestions on how we can find the cause of the segfault in app_perl? Thank you.
Based on the output of `(gdb) p vTHX`, it seems that the Perl interpreter there uses different global variables from the ones I found on my system.
If you can reproduce on a test system, then you can try to run kamailio with valgrind or strace to see if it catches any buffer overflows.
How should we enable the debug information when Kamailio is compiled from source please?
At the moment running valgrind on a test system we get an output like the one below.
==3246448== HEAP SUMMARY:
==3246448== in use at exit: 9,225,711 bytes in 293 blocks
==3246448== total heap usage: 4,020 allocs, 3,727 frees, 9,833,603 bytes allocated
==3246448==
==3246448== LEAK SUMMARY:
==3246448== definitely lost: 0 bytes in 0 blocks
==3246448== indirectly lost: 0 bytes in 0 blocks
==3246448== possibly lost: 0 bytes in 0 blocks
==3246448== still reachable: 9,225,711 bytes in 293 blocks
==3246448== suppressed: 0 bytes in 0 blocks
==3246448== Rerun with --leak-check=full to see details of leaked memory
==3246448==
==3246448== For lists of detected and suppressed errors, rerun with: -s
==3246448== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==3246474== Invalid read of size 8
==3246474== at 0x5A89590: Perl__invlist_intersection_maybe_complement_2nd (in /usr/lib/x86_64-linux-gnu/libperl.so.5.30.0)
==3246474== by 0x5A89BD4: ??? (in /usr/lib/x86_64-linux-gnu/libperl.so.5.30.0)
==3246474== by 0x5A99C7E: ??? (in /usr/lib/x86_64-linux-gnu/libperl.so.5.30.0)
==3246474== by 0x5AA0267: ??? (in /usr/lib/x86_64-linux-gnu/libperl.so.5.30.0)
==3246474== by 0x5AA4D02: ??? (in /usr/lib/x86_64-linux-gnu/libperl.so.5.30.0)
==3246474== by 0x5AA52CE: ??? (in /usr/lib/x86_64-linux-gnu/libperl.so.5.30.0)
==3246474== by 0x5AA0086: ??? (in /usr/lib/x86_64-linux-gnu/libperl.so.5.30.0)
==3246474== by 0x5AA4D02: ??? (in /usr/lib/x86_64-linux-gnu/libperl.so.5.30.0)
==3246474== by 0x5AA52CE: ??? (in /usr/lib/x86_64-linux-gnu/libperl.so.5.30.0)
==3246474== by 0x5AAA36B: Perl_re_op_compile (in /usr/lib/x86_64-linux-gnu/libperl.so.5.30.0)
==3246474== by 0x5A3EC34: Perl_pmruntime (in /usr/lib/x86_64-linux-gnu/libperl.so.5.30.0)
==3246474== by 0x5A7B23A: Perl_yyparse (in /usr/lib/x86_64-linux-gnu/libperl.so.5.30.0)
...etc..

Closing this. It seems the specific Perl library version on Ubuntu 20.04 is the problem, and upgrading to Ubuntu 22.04 is a solution.
Closed #3134 as completed.