[SR-Users] Best practices for troubleshooting deadlocks?

Alex Balashov abalashov at evaristesys.com
Mon Sep 14 23:47:02 CEST 2015


Hello,

Very occasionally, we encounter what appear to be deadlocks in all UDP 
receiver threads. All Kamailio processes are running, but no SIP 
messages are being processed.

On one of our high-volume installations, this happens extremely 
infrequently -- maybe once every month or two. On these occasions, the 
operator restarts the proxy before we get a chance to go in and figure 
out what's going on.

So, I'm trying to provide the operator with a procedure to execute prior 
to restarting the proxy on these occasions, so that we can see a 
snapshot of where the receiver threads are stuck. As far as I can tell, 
unless Kamailio itself segfaults, there's no specific PID that one can 
attach GDB to in order to get an overhead snapshot of all the child 
processes.

Here's what I came up with:

---------------------------------------------
#!/bin/bash

kamcmd -s /tmp/kamailio_ctl ps > thread_log.txt
echo >> thread_log.txt

# Attach to each UDP receiver in turn, print full backtraces and write a core.
while read -r PID;
do
	gdb --pid="$PID"<<EOF>>thread_log.txt
set print elements 0
thread apply all bt full
generate-core-file
detach
EOF
done < <(kamcmd -s /tmp/kamailio_ctl ps | grep 'udp receiver' | awk '{print $1}')
---------------------------------------------

As far as I can tell, this should give me the most ample visibility into 
the state of the threads, with further core dumps to inspect if 
follow-up is needed. Hopefully this will result in some fixes back to 
the project.
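
A non-interactive variant of the same idea could be sketched as below. It is only a sketch: the control socket path and the 'udp receiver' label come from the script above, while the function names, the CTL_SOCKET variable and the core.<PID> naming are hypothetical choices made for illustration. It assumes the PID is the first column of the 'kamcmd ps' output, as the script above does, and it relies on gdb detaching by itself when it exits in batch mode.

```shell
#!/bin/bash
# Sketch under stated assumptions; names below are illustrative, not part of Kamailio.
CTL_SOCKET="${CTL_SOCKET:-/tmp/kamailio_ctl}"

# Read a process listing on stdin and print the first column (assumed to be
# the PID) of every line that mentions a UDP receiver.
udp_receiver_pids() {
    awk '/udp receiver/ { print $1 }'
}

# Attach to one PID in batch mode, print full backtraces for all threads and
# write a core file named core.<PID> into the given directory.
dump_process() {
    local pid="$1" outdir="$2"
    gdb -batch -p "$pid" \
        -ex 'set pagination off' \
        -ex 'set print elements 0' \
        -ex 'thread apply all bt full' \
        -ex "generate-core-file ${outdir}/core.${pid}" \
        </dev/null 2>&1
}

# Save the process listing, then append one labelled gdb dump per UDP receiver.
snapshot_receivers() {
    local log="$1" outdir="${2:-.}" listing pid
    listing="$(kamcmd -s "$CTL_SOCKET" ps)"
    printf '%s\n\n' "$listing" > "$log"
    printf '%s\n' "$listing" | udp_receiver_pids | while read -r pid; do
        echo "=== PID $pid ===" >> "$log"
        dump_process "$pid" "$outdir" >> "$log"
    done
}
```

After sourcing the file, the operator would run something like 'snapshot_receivers thread_log.txt /var/tmp' before restarting the proxy. The listing is captured once, so the PIDs in the log header are the same ones that get dumped.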

However, if there are any other suggestions for information to grab in 
such a scenario, I'm all ears.

Thanks in advance!

-- Alex

-- 
Alex Balashov | Principal | Evariste Systems LLC
303 Perimeter Center North, Suite 300
Atlanta, GA 30346
United States

Tel: +1-800-250-5920 (toll-free) / +1-678-954-0671 (direct)
Web: http://www.evaristesys.com/, http://www.csrpswitch.com/


