[sr-dev] Crash bug

Alex Balashov abalashov at evaristesys.com
Fri Mar 27 12:47:00 CET 2015


This was a rather peculiar crash:

According to the logs, Kamailio simply stopped processing messages at 
some point. There are about 8 minutes of zero log output during a period 
of constant incoming traffic.


The situation was resolved only when someone manually restarted 
Kamailio, at which point all Kamailio processes exited on a normal SIGTERM:

Mar 26 20:40:10 Proxy1 /usr/local/sbin/kamailio[27498]: NOTICE: <core>
[main.c:739]: handle_sigs(): Thank you for flying kamailio!!!
Mar 26 20:40:10 Proxy1 /usr/local/sbin/kamailio[27535]: INFO: <core>
[main.c:850]: sig_usr(): signal 15 received.
...

But there are a few things here that are difficult to explain from the log:

1. Why was there no response from the SIP stack and no logging activity 
for 8 minutes?

2. We have a script that checks every second whether Kamailio processes 
are running, and restarts Kamailio if they are not. It also sends an 
e-mail informing us when that happens.

It's a rather naive check:

    ps aux | grep kamailio | grep -v 'grep kamailio' | wc -l

But in this case, the script was not triggered, which would imply that 
some Kamailio processes--perhaps all--remained running.

There is no indication in the logs that any process died for any reason, 
except for the 'signal 15' received by all processes at the time of 
manual restart.
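
A process count cannot tell a hung process from a healthy one, which may 
be why the check in item 2 never fired. As a rough sketch (not the script 
actually in use), the check could also report processes in 
uninterruptible sleep (state D). The names PROC_NAME and summarize_procs 
are made up for illustration, and state D covers only one kind of hang; a 
process blocked on a lock would still look normal here.

```shell
# Sketch only. PROC_NAME and summarize_procs are illustrative names,
# not part of the original watchdog script.
PROC_NAME="${PROC_NAME:-kamailio}"

# Reads "STAT COMMAND" lines on stdin (as printed by `ps -eo stat=,comm=`)
# and prints "<total> <blocked>", where <blocked> counts processes named
# $PROC_NAME whose state starts with D (uninterruptible sleep).
summarize_procs() {
    awk -v name="$PROC_NAME" '
        $2 == name { total++; if ($1 ~ /^D/) blocked++ }
        END { print total + 0, blocked + 0 }'
}

# Usage:
#   ps -eo stat=,comm= | summarize_procs
```

Fed with the output of `ps -eo stat=,comm=`, this prints the number of 
kamailio processes followed by how many of them are in state D. Matching 
on the command name also avoids the need for the `grep -v` step.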

3. Why was a core dump generated at the time of the restart, if nothing 
crashed?

#3 is the most interesting to me: if the cause were some other problem, 
e.g. SIP worker processes blocking for some reason, I would not expect a 
core dump upon service shutdown.

There is no other indication of any child process dying with SIGSEGV or 
SIGABRT.
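
One way to tie the core file to a log line is to decode its name. The 
sketch below assumes the name has the form core.<exe>.<field>.<epoch>.<pid>, 
which is only inferred from the file name in the quoted gdb output; the 
real layout depends on kernel.core_pattern on that host. The function 
name parse_core_name is made up, and `date -d @...` is GNU date syntax.

```shell
# Sketch only: parse_core_name is an illustrative helper. It assumes a
# core file name of the form core.<exe>.<field>.<epoch>.<pid>.
parse_core_name() {
    base=$(basename "$1")
    # Fields are dot-separated; the 4th is taken as the epoch time and
    # the 5th as the PID of the dumping process.
    epoch=$(printf '%s\n' "$base" | cut -d. -f4)
    pid=$(printf '%s\n' "$base" | cut -d. -f5)
    # GNU date: convert the epoch seconds to a UTC timestamp.
    when=$(date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S')
    printf 'pid=%s time_utc=%s\n' "$pid" "$when"
}

# Usage:
#   parse_core_name /tmp/./core.kamailio.500.1427402410.27498
```

For the file name in the quoted message this gives pid 27498 and 
2015-03-26 20:40:10 UTC. If the server clock is set to UTC, that is the 
same second and the same PID as the handle_sigs() line logged at restart.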

-- Alex

On 03/27/2015 06:17 AM, Alex Balashov wrote:

> Hello,
>
> The system experienced another crash yesterday, but unfortunately the
> core dump is not very insightful, possibly due to being incomplete:
>
> BFD: Warning: /tmp/./core.kamailio.500.1427402410.27498 is truncated:
> expected core file size >= 8602058752, found: 1769852928.
> [New Thread 27498]
> Cannot access memory at address 0x7f52891e3168
> Cannot access memory at address 0x7f52891e3168
> Cannot access memory at address 0x7f52891e3168
> Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols
> found)...done.
> Loaded symbols for /lib64/ld-linux-x86-64.so.2
> Failed to read a valid object file image from memory.
> Core was generated by `/usr/local/sbin/kamailio -P /var/run/kamailio.pid
> -m 8192 -u evaristesys -g eva'.
> Program terminated with signal 11, Segmentation fault.
> #0  0x00007f5286d97e45 in ?? ()
> Missing separate debuginfos, use: debuginfo-install
> glibc-2.12-1.149.el6_6.5.x86_64
> (gdb) where
> #0  0x00007f5286d97e45 in ?? ()
> Cannot access memory at address 0x7fffbe32a210
>
>
> That's not much help at all, so I cannot possibly say it is for the same
> reasons as before.
>
>
>


-- 
Alex Balashov | Principal | Evariste Systems LLC
303 Perimeter Center North, Suite 300
Atlanta, GA 30346
United States

Tel: +1-800-250-5920 (toll-free) / +1-678-954-0671 (direct)
Web: http://www.evaristesys.com/, http://www.csrpswitch.com/


