[sr-dev] some sip router processes won't die

Andrei Pelinescu-Onciul andrei at iptel.org
Mon Sep 28 20:07:47 CEST 2009


On Sep 26, 2009 at 12:35, Juha Heinanen <jh at tutpro.com> wrote:
> Andrei Pelinescu-Onciul writes:
> 
>  > If it happens again, could you try to attach with gdb to the processes
>  > eating the cpu and send me some back traces and the output of print
>  > pt[process_no]? You could try using a larger exit_timeout (e.g.
>  > exit_timeout=1800), just to  be sure you'll catch them.
> 
> andrei,
> 
> it did happen again.  here is some gdb info.  i noticed that after a
> while the processes stopped consuming lots of cpu time, but still didn't
> die.

It's strange. fm_status() prints the pkg memory status (a memory dump
done at exit, for debugging).
One possibility is that there is a lot to log (e.g. memory leak?) and
the syslog daemon slows things down.
Another possibility is a nasty memory corruption bug that happens to
create some kind of loop in the list of free fragments (e.g. someone
writes more than was allocated, overwriting some malloc internal
information).
Did you change memlog in the .cfg? What was your debug level?
Do you have any lines in the log containing "fm_status"? If so, could you
send me the output of grep "f_malloc\.c" logfile ?

Does the same happen if you compile with -DDBG_QM_MALLOC and without
-DF_MALLOC (qm_malloc might catch a problem sooner)?
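
To illustrate the "loop in the list of free fragments" hypothesis, here
is a minimal sketch in C. It is not the real f_malloc.c code: struct frag,
free_list_has_loop() and demo_corrupted_list() are hypothetical names, and
the fragment layout is simplified to a size and a next pointer. A status
dump that walks such a list never terminates once a corrupted pointer
links back to an earlier fragment; a two-pointer (Floyd) walk detects
that instead of spinning.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified free-list fragment (NOT the f_malloc.c layout). */
struct frag {
    size_t size;             /* payload size in bytes */
    struct frag *next_free;  /* next fragment in the free list, or NULL */
};

/* Return 1 if the list starting at head contains a cycle, 0 otherwise.
 * Floyd's two-pointer walk: 'fast' advances two steps for every step of
 * 'slow', so the two can only meet again if the list loops back. */
static int free_list_has_loop(const struct frag *head)
{
    const struct frag *slow = head;
    const struct frag *fast = head;

    while (fast != NULL && fast->next_free != NULL) {
        slow = slow->next_free;
        fast = fast->next_free->next_free;
        if (slow == fast)
            return 1;
    }
    return 0;
}

/* Build a sane 4-fragment list, then simulate an overwrite of the last
 * fragment's next pointer so that it points back into the list.
 * Returns the result of the loop check on the corrupted list. */
static int demo_corrupted_list(void)
{
    struct frag f[4];
    int i;

    for (i = 0; i < 4; i++) {
        f[i].size = 16u * (unsigned)(i + 1);
        f[i].next_free = (i < 3) ? &f[i + 1] : NULL;
    }
    assert(!free_list_has_loop(&f[0]));  /* intact list terminates */

    f[3].next_free = &f[1];              /* simulated corruption */
    return free_list_has_loop(&f[0]);
}
```

A plain "while (f) f = f->next_free;" walk over the corrupted list above
would never finish, which would match a process that keeps running
inside a status dump after SIGTERM. Whether that is what happens here is
only a guess until the logs or a qm_malloc build say more.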

Andrei

> 
> (gdb) where
> #0  fm_status (qm=0x8230b20) at mem/f_malloc.c:614
> #1  0x08088a63 in sig_usr (signo=15) at main.c:747
> #2  <signal handler called>
> #3  0xb7f4d424 in __kernel_vsyscall ()
> #4  0xb7dee8ba in sigwaitinfo () from /lib/i686/cmov/libc.so.6
> #5  0x08113f37 in slow_timer_main () at timer.c:1108
> #6  0x08088475 in main_loop () at main.c:1435
> #7  0x0808aaf7 in main (argc=Cannot access memory at address 0x0
> ) at main.c:2178
> 
> (gdb) print  pt[22724]
> $1 = {pid = 0, unix_sock = 0, idx = 0, desc = '\0' <repeats 127 times>}
> (gdb) 
> 
> another process gave this:
> 
> (gdb) where
> #0  0x08121488 in fm_status (qm=0x8230b20) at mem/f_malloc.c:615
> #1  0x08088a63 in sig_usr (signo=15) at main.c:747
> #2  <signal handler called>
> #3  0xb7f4d422 in __kernel_vsyscall ()
> #4  0xb7ea3831 in recvfrom () from /lib/i686/cmov/libc.so.6
> #5  0x0811aab7 in udp_rcv_loop () at udp_server.c:446
> #6  0x08087e03 in main_loop () at main.c:1387
> #7  0x0808aaf7 in main (argc=6, argv=0x821d700) at main.c:2178


