[SR-Users] Best practices for troubleshooting deadlocks?
Ovidiu Sas
osas at voipembedded.com
Mon Sep 28 19:37:19 CEST 2015
There is 'kamctl trap', which takes a backtrace of all Kamailio
processes, similar to what your script does.
Use top to identify which processes are locked (100% CPU utilization),
and after that ... code inspection.
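
The "find the busy processes, then backtrace them" step might be sketched
roughly as below. This is only an illustration: THRESHOLD and the helper
names (filter_busy, busy_kamailio_pids, backtrace_pid) are made up for the
example and are not part of Kamailio or kamctl, and it assumes a Linux
procps 'ps' and a 'gdb' new enough to accept -batch and -ex.

```shell
#!/bin/bash
# Sketch: list Kamailio processes that appear to be spinning, then
# backtrace them. THRESHOLD and all function names are hypothetical.

THRESHOLD=90   # percent CPU treated as "spinning"; an assumption, tune it

# Read "PID %CPU" pairs on stdin; print the PIDs at or above the minimum.
filter_busy() {
    awk -v min="$1" '$2 + 0 >= min + 0 { print $1 }'
}

# PIDs of processes named "kamailio" that are at or above THRESHOLD.
busy_kamailio_pids() {
    ps -C kamailio -o pid=,pcpu= | filter_busy "$THRESHOLD"
}

# Append a full backtrace of one PID ($1) to a log file ($2).
backtrace_pid() {
    gdb -batch -p "$1" \
        -ex 'set print elements 0' \
        -ex 'thread apply all bt full' >> "$2" 2>&1
}

# Example use (as root, on the affected host):
#   for pid in $(busy_kamailio_pids); do
#       backtrace_pid "$pid" busy_bt.txt
#   done
```

Note that 'ps' reports CPU usage averaged over the lifetime of the
process, so a process that only recently started spinning may still be
under the threshold; watching top interactively is the more reliable check.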
-ovidiu
On Mon, Sep 28, 2015 at 1:26 PM, Alex Balashov
<abalashov at evaristesys.com> wrote:
> We just encountered another one of these famed deadlocks. Any suggestions
> for how to analyse them beyond what I've already trotted out here?
>
>
> On 09/14/2015 05:47 PM, Alex Balashov wrote:
>
>> Hello,
>>
>> Very occasionally, we encounter what appear to be deadlocks in all UDP
>> receiver threads. All Kamailio processes are running, but no SIP
>> messages are being processed.
>>
>> On one of our high-volume installations, this happens extremely
>> infrequently -- maybe once every month or two. On these occasions, the
>> operator restarts the proxy before we get a chance to go in and figure
>> out what's going on.
>>
>> So, I'm trying to provide the operator with a procedure to execute prior
>> to restarting the proxy on these occasions, so that we can see a
>> snapshot of where the receiver threads are stuck. As far as I can tell,
>> unless Kamailio itself segfaults, there's no specific PID that one can
>> attach GDB to in order to get an overhead snapshot of all the child
>> processes.
>>
>> Here's what I came up with:
>>
>> ---------------------------------------------
>> #!/bin/bash
>>
>> # Record the process list first so PIDs can be matched to roles later.
>> kamcmd -s /tmp/kamailio_ctl ps > thread_log.txt
>> echo >> thread_log.txt
>>
>> # Attach to each UDP receiver in turn: full backtrace, then a core file.
>> while read -r PID;
>> do
>> gdb --pid="$PID" <<EOF >>thread_log.txt 2>&1
>> set print elements 0
>> thread apply all bt full
>> generate-core-file
>> detach
>> EOF
>> done < <(kamcmd -s /tmp/kamailio_ctl ps | grep 'udp receiver' | awk '{print $1}')
>> ---------------------------------------------
>>
>> As far as I can tell, this should give me the most ample visibility into
>> the state of the threads, with further core dumps to inspect if
>> follow-up is needed. Hopefully this will result in some fixes back to
>> the project.
>>
>> However, if there are any other suggestions for information to grab in
>> such a scenario, I'm all ears.
>>
>> Thanks in advance!
>>
>> -- Alex
>>
>
>
> --
> Alex Balashov | Principal | Evariste Systems LLC
> 303 Perimeter Center North, Suite 300
> Atlanta, GA 30346
> United States
>
> Tel: +1-800-250-5920 (toll-free) / +1-678-954-0671 (direct)
> Web: http://www.evaristesys.com/, http://www.csrpswitch.com/
>
--
VoIP Embedded, Inc.
http://www.voipembedded.com