[sr-dev] SCSCF crashing during registration

Hugh Waite hugh.waite at crocodile-rcs.com
Thu Mar 13 22:47:06 CET 2014


Dan,
There are two cores because one process crashed, and then there was a 
second crash while the other processes were trying to shut down.

What's interesting is that the bt doesn't show useful pointers. If you 
have installed from RPMs, make sure the kamailio-debuginfo package is 
from the same build as the other RPMs.
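
A rough sketch of that check, assuming an RPM-based box. The helper name 
same_build is made up for illustration; it only compares the 
version-release strings it is given on stdin:

```shell
#!/bin/sh
# same_build: hypothetical helper. Reads "name version-release" pairs on
# stdin (one per line) and reports whether they all share one build.
# Intended input on an RPM-based system:
#   rpm -qa --qf '%{NAME} %{VERSION}-%{RELEASE}\n' | grep '^kamailio' | same_build
same_build() {
    awk '
        NF < 2 { next }                      # skip malformed lines
        { seen[$2] = seen[$2] " " $1 }       # group package names by build
        END {
            n = 0
            for (v in seen) n++
            if (n == 0) { print "NO PACKAGES"; exit 2 }
            if (n == 1) { print "OK"; exit 0 }
            print "MISMATCH"
            for (v in seen) print v ":" seen[v]
            exit 1
        }
    '
}
```

If this prints MISMATCH, gdb is probably loading symbols that do not 
belong to the binaries that dumped core, which would explain the "??" 
frames.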

Also, do the logs say anything? There should be a log entry from the 
kernel for the segfault/signal that says which module crashed (e.g. 
registrar.so) and possibly (hopefully) an error message just before that.
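
As a sketch of how to pull that out of the logs: the function name 
crashed_objects is made up, and the line layout in the comment is the 
usual x86 Linux segfault message, which I am assuming is what your 
kernel prints:

```shell
#!/bin/sh
# crashed_objects: hypothetical helper. Reads kernel log lines on stdin
# and prints the binary or module named in each segfault message.
# Assumed line layout (x86 Linux):
#   <comm>[<pid>]: segfault at <addr> ip <ip> sp <sp> error <code> in <object>[<base>+<size>]
# Intended use:
#   dmesg | crashed_objects
#   grep segfault /var/log/messages | crashed_objects
crashed_objects() {
    # Keep only the text between " in " and the "[" that follows it.
    sed -n 's/.*segfault at .* in \([^[]*\)\[.*/\1/p'
}
```

The kamailio log lines just before the timestamp of that kernel entry 
are the ones worth posting.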

Hugh


On 13/03/2014 19:53, Daniel Ciprus wrote:
> Jason,
>
> I've tried multiple combinations for pattern but I'm getting only 2 
> core files ...
>
> Details:
>
>  ~]# cat /proc/sys/kernel/core_pattern
> /tmp/core.%e.sig%s.%p
>
> ~]# lsb_release -a
> LSB Version: 
> :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
> Distributor ID: RedHatEnterpriseServer
> Description:    Red Hat Enterprise Linux Server release 6.5 (Santiago)
> Release:        6.5
> Codename:       Santiago
>
>
>
> (gdb) bt
> #0  0x00000000005350b0 in ?? ()
> #1  0x000000000053542a in ?? ()
> #2  0x00000000005356c7 in timer_main ()
> #3  0x000000000046d572 in main_loop ()
> #4  0x000000000047030b in main ()
> (gdb) bt full
> #0  0x00000000005350b0 in ?? ()
> No symbol table info available.
> #1  0x000000000053542a in ?? ()
> No symbol table info available.
> #2  0x00000000005356c7 in timer_main ()
> No symbol table info available.
> #3  0x000000000046d572 in main_loop ()
> No symbol table info available.
> #4  0x000000000047030b in main ()
> No symbol table info available.
> (gdb)
>
> (gdb) bt
> #0  0x00000031ba432925 in raise () from /lib64/libc.so.6
> #1  0x00000031ba434105 in abort () from /lib64/libc.so.6
> #2  0x0000000000546750 in ?? ()
> #3  0x000000000054853a in qm_free ()
> #4  0x00007f23d98f87de in free_local_ack_unsafe (lack=0x7f23d3319d70) at uac.c:600
> #5  0x00007f23d988ea57 in free_cell (dead_cell=0x7f23d3319a70) at h_table.c:217
> #6  0x00007f23d988f2ee in free_hash_table () at h_table.c:441
> #7  0x00007f23d98a2fca in tm_shutdown () at t_funcs.c:122
> #8  0x00000000004f7c7a in destroy_modules ()
> #9  0x0000000000466e63 in cleanup ()
> #10 0x0000000000467f65 in ?? ()
> #11 0x0000000000469679 in handle_sigs ()
> #12 0x000000000046db19 in main_loop ()
> #13 0x000000000047030b in main ()
> (gdb) bt full
> #0  0x00000031ba432925 in raise () from /lib64/libc.so.6
> No symbol table info available.
> #1  0x00000031ba434105 in abort () from /lib64/libc.so.6
> No symbol table info available.
> #2  0x0000000000546750 in ?? ()
> No symbol table info available.
> #3  0x000000000054853a in qm_free ()
> No symbol table info available.
> #4  0x00007f23d98f87de in free_local_ack_unsafe (lack=0x7f23d3319d70) at uac.c:600
>         __FUNCTION__ = "free_local_ack_unsafe"
> #5  0x00007f23d988ea57 in free_cell (dead_cell=0x7f23d3319a70) at h_table.c:217
>         b = 0x0
>         i = 0
>         rpl = 0x0
>         tt = 0x0
>         foo = 0x2fd3221000
>         cbs = 0x0
>         cbs_tmp = 0x7f23d35386b8
>         __FUNCTION__ = "free_cell"
> #6  0x00007f23d988f2ee in free_hash_table () at h_table.c:441
>         p_cell = 0x7f23d3319a70
>         tmp_cell = 0x7f23d353dca0
>         i = 580
>         __FUNCTION__ = "free_hash_table"
> #7  0x00007f23d98a2fca in tm_shutdown () at t_funcs.c:122
>         __FUNCTION__ = "tm_shutdown"
> #8  0x00000000004f7c7a in destroy_modules ()
> No symbol table info available.
> #9  0x0000000000466e63 in cleanup ()
> No symbol table info available.
> #10 0x0000000000467f65 in ?? ()
> No symbol table info available.
> #11 0x0000000000469679 in handle_sigs ()
> No symbol table info available.
> #12 0x000000000046db19 in main_loop ()
> No symbol table info available.
> #13 0x000000000047030b in main ()
> No symbol table info available.
> (gdb)
>
>
>
> On 03/13/2014 02:58 PM, Jason Penton wrote:
>> I don't think these cores indicate the real crash. I'd like to get 
>> some more detail on what actually happened. Daniel, can you 
>> re-create it? Keep in mind that if the core dump config on your box 
>> does not name cores by process ID or timestamp, one core will 
>> overwrite the other, and as a result you will never see the core 
>> that is the root cause.
>>
>> Which OS are you running?
>>
>> if Linux, I use the following in /etc/sysctl.conf:
>>
>> kernel.core_pattern=/tmp/core.%e.%p.%h.%t
>>
>>
>> On Thu, Mar 13, 2014 at 8:45 PM, Carsten Bock <carsten at ng-voice.com> wrote:
>>
>>     It looks a little bit like a "double free".
>>
>>     You could try to disable the call to "abort()" in case this happens:
>>     mem_safety=1
>>     See: http://www.kamailio.org/wiki/cookbooks/devel/core#mem_safety
>>
>>     Kind regards,
>>     Carsten
>>
>>     2014-03-13 19:44 GMT+01:00 Carsten Bock <carsten at ng-voice.com>:
>>     > It looks a little bit like a "double free".
>>     >
>>     > You could try to disable the call to "abort()" in case this happens:
>>     >
>>     >
>>     > 2014-03-13 17:22 GMT+01:00 Daniel Ciprus <daniel.ciprus at acision.com>:
>>     >> There are no more core files on the filesystem :-(
>>     >>
>>     >> On 03/13/2014 12:18 PM, Jason Penton wrote:
>>     >>
>>     >> I'm afraid this is also not the correct core. Can you check the timestamp on
>>     >> the cores? Can you re-create the crash and send me the correct core?
>>     >>
>>     >>
>>     >>
>>     >>
>>     >> On Thu, Mar 13, 2014 at 5:36 PM, Daniel Ciprus <daniel.ciprus at acision.com> wrote:
>>     >>>
>>     >>> So I cleaned up my junkyard and I got 2 core files:
>>     >>>
>>     >>> (gdb) bt
>>     >>> #0  0x00000000005350b0 in ?? ()
>>     >>> #1  0x000000000053542a in ?? ()
>>     >>> #2  0x00000000005356c7 in timer_main ()
>>     >>> #3  0x000000000046d572 in main_loop ()
>>     >>> #4  0x000000000047030b in main ()
>>     >>> (gdb) bt full
>>     >>> #0  0x00000000005350b0 in ?? ()
>>     >>>
>>     >>> No symbol table info available.
>>     >>> #1  0x000000000053542a in ?? ()
>>     >>>
>>     >>> No symbol table info available.
>>     >>> #2  0x00000000005356c7 in timer_main ()
>>     >>>
>>     >>> No symbol table info available.
>>     >>> #3  0x000000000046d572 in main_loop ()
>>     >>>
>>     >>> No symbol table info available.
>>     >>> #4  0x000000000047030b in main ()
>>     >>>
>>     >>> No symbol table info available.
>>     >>> (gdb)
>>     >>>
>>     >>>
>>     >>> (gdb) bt full
>>     >>> #0  0x00000031ba432925 in raise () from /lib64/libc.so.6
>>     >>> No symbol table info available.
>>     >>> #1  0x00000031ba434105 in abort () from /lib64/libc.so.6
>>     >>> No symbol table info available.
>>     >>> #2  0x0000000000546750 in ?? ()
>>     >>> No symbol table info available.
>>     >>> #3  0x000000000054853a in qm_free ()
>>     >>> No symbol table info available.
>>     >>> #4  0x00007f5bf7d5a7de in free_local_ack_unsafe (lack=0x7f5bf1894528) at uac.c:600
>>     >>>         __FUNCTION__ = "free_local_ack_unsafe"
>>     >>> #5  0x00007f5bf7cf0a57 in free_cell (dead_cell=0x7f5bf1894228) at h_table.c:217
>>     >>>
>>     >>>         b = 0x0
>>     >>>         i = 0
>>     >>>         rpl = 0x0
>>     >>>         tt = 0x0
>>     >>>         foo = 0x2ff1683000
>>     >>>         cbs = 0x0
>>     >>>         cbs_tmp = 0x7f5bf198e508
>>     >>>         __FUNCTION__ = "free_cell"
>>     >>> #6  0x00007f5bf7cf12ee in free_hash_table () at h_table.c:441
>>     >>>         p_cell = 0x7f5bf1894228
>>     >>>         tmp_cell = 0x7f5bf1894228
>>     >>>         i = 3533
>>     >>>         __FUNCTION__ = "free_hash_table"
>>     >>> #7  0x00007f5bf7d04fca in tm_shutdown () at t_funcs.c:122
>>     >>>
>>     >>>         __FUNCTION__ = "tm_shutdown"
>>     >>> #8  0x00000000004f7c7a in destroy_modules ()
>>     >>> No symbol table info available.
>>     >>> #9  0x0000000000466e63 in cleanup ()
>>     >>> No symbol table info available.
>>     >>> #10 0x0000000000467f65 in ?? ()
>>     >>> No symbol table info available.
>>     >>> #11 0x0000000000469679 in handle_sigs ()
>>     >>> No symbol table info available.
>>     >>> #12 0x000000000046db19 in main_loop ()
>>     >>> No symbol table info available.
>>     >>> #13 0x000000000047030b in main ()
>>     >>> No symbol table info available.
>>     >>> (gdb)
>>     >>>
>>     >>>
>>     >>> On 03/13/2014 11:18 AM, Jason Penton wrote:
>>     >>>
>>     >>> Hi Daniel,
>>     >>>
>>     >>> this is the wrong core file. This is the one created on shutdown of
>>     >>> kamailio. Can you do a bt on the other core file that you probably have...
>>     >>>
>>     >>> Cheers
>>     >>> Jason
>>     >>>
>>     >>>
>>     >>> On Thu, Mar 13, 2014 at 5:05 PM, Daniel Ciprus <daniel.ciprus at acision.com> wrote:
>>     >>>>
>>     >>>> Folks,
>>     >>>>
>>     >>>> This is happening during the registration on SCSCF.
>>     >>>>
>>     >>>> Server:: kamailio (4.2.0-dev2 (x86_64/linux))
>>     >>>> Build:: mi_core.c compiled on 10:01:09 Mar 13 2014 with gcc 4.4.6
>>     >>>> Flags:: STATS: Off, USE_TCP, USE_TLS, TLS_HOOKS, USE_RAW_SOCKS,
>>     >>>> DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC,
>>     >>>> DBG_QM_MALLOC, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE,
>>     >>>> USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
>>     >>>> GIT:: unknown
>>     >>>> Now:: Thu Mar 13 11:04:47 2014
>>     >>>> Up since:: Thu Mar 13 10:58:12 2014
>>     >>>> Up time:: 395 [sec]
>>     >>>>
>>     >>>> (gdb) bt
>>     >>>> #0  0x00000031ba432925 in raise () from /lib64/libc.so.6
>>     >>>> #1  0x00000031ba434105 in abort () from /lib64/libc.so.6
>>     >>>> #2  0x0000000000546750 in ?? ()
>>     >>>> #3  0x000000000054853a in qm_free ()
>>     >>>> #4  0x00007fb4def5b7de in free_local_ack_unsafe (lack=0x7fb4d8b31728) at uac.c:600
>>     >>>> #5  0x00007fb4deef1a57 in free_cell (dead_cell=0x7fb4d8b31428) at h_table.c:217
>>     >>>> #6  0x00007fb4deef22ee in free_hash_table () at h_table.c:441
>>     >>>> #7  0x00007fb4def05fca in tm_shutdown () at t_funcs.c:122
>>     >>>> #8  0x00000000004f7c7a in destroy_modules ()
>>     >>>> #9  0x0000000000466e63 in cleanup ()
>>     >>>> #10 0x0000000000467f65 in ?? ()
>>     >>>> #11 0x0000000000469679 in handle_sigs ()
>>     >>>> #12 0x000000000046db19 in main_loop ()
>>     >>>> #13 0x000000000047030b in main ()
>>     >>>> (gdb) bt full
>>     >>>> #0  0x00000031ba432925 in raise () from /lib64/libc.so.6
>>     >>>> No symbol table info available.
>>     >>>> #1  0x00000031ba434105 in abort () from /lib64/libc.so.6
>>     >>>> No symbol table info available.
>>     >>>> #2  0x0000000000546750 in ?? ()
>>     >>>> No symbol table info available.
>>     >>>> #3  0x000000000054853a in qm_free ()
>>     >>>> No symbol table info available.
>>     >>>> #4  0x00007fb4def5b7de in free_local_ack_unsafe (lack=0x7fb4d8b31728) at uac.c:600
>>     >>>>         __FUNCTION__ = "free_local_ack_unsafe"
>>     >>>> #5  0x00007fb4deef1a57 in free_cell (dead_cell=0x7fb4d8b31428) at h_table.c:217
>>     >>>>         b = 0x0
>>     >>>>         i = 0
>>     >>>>         rpl = 0x0
>>     >>>>         tt = 0x0
>>     >>>>         foo = 0x2fd8a8b000
>>     >>>>         cbs = 0x0
>>     >>>>         cbs_tmp = 0x7fb4d8d9c9e0
>>     >>>>         __FUNCTION__ = "free_cell"
>>     >>>> #6  0x00007fb4deef22ee in free_hash_table () at h_table.c:441
>>     >>>>         p_cell = 0x7fb4d8b31428
>>     >>>>         tmp_cell = 0x7fb4d8b31428
>>     >>>>         i = 11517
>>     >>>>         __FUNCTION__ = "free_hash_table"
>>     >>>> #7  0x00007fb4def05fca in tm_shutdown () at t_funcs.c:122
>>     >>>>         __FUNCTION__ = "tm_shutdown"
>>     >>>> #8  0x00000000004f7c7a in destroy_modules ()
>>     >>>> No symbol table info available.
>>     >>>> #9  0x0000000000466e63 in cleanup ()
>>     >>>> No symbol table info available.
>>     >>>> #10 0x0000000000467f65 in ?? ()
>>     >>>> No symbol table info available.
>>     >>>> #11 0x0000000000469679 in handle_sigs ()
>>     >>>> No symbol table info available.
>>     >>>> #12 0x000000000046db19 in main_loop ()
>>     >>>> No symbol table info available.
>>     >>>> #13 0x000000000047030b in main ()
>>     >>>> No symbol table info available.
>>     >>>> (gdb)
>>     >>>>
>>     >>>>
>>     >>>>
>>     >>>>
>>     >>>>
>>     >>>> --
>>     >>>> Daniel Ciprus
>>     >>>> Integration engineer
>>     >>>> http://www.acision.com
>>     >>>>
>>     >>>> 9954 Mayland Dr
>>     >>>> Suite 3100
>>     >>>> Richmond, VA 23233
>>     >>>> USA
>>     >>>> T: +1 804 762 5601
>>     >>>> E: daniel.ciprus at acision.com
>>     >>>>
>>     >>>> ________________________________
>>     >>>> This e-mail and any attachment is for authorised use by the intended
>>     >>>> recipient(s) only. It may contain proprietary material, confidential
>>     >>>> information and/or be subject to legal privilege. It should not be copied,
>>     >>>> disclosed to, retained or used by, any other party. If you are not an
>>     >>>> intended recipient then please promptly delete this e-mail and any
>>     >>>> attachment and all copies and inform the sender. Thank you for
>>     >>>> understanding.
>>     >>>>
>>     >>>>
>>     >>>> _______________________________________________
>>     >>>> sr-dev mailing list
>>     >>>> sr-dev at lists.sip-router.org
>>     >>>> http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
>>     >>>>
>>     >>>
>>     >>>
>>     >>>
>>     >>
>>     >>
>>     >>
>>     >
>>     >
>>     >
>>     > --
>>     > Carsten Bock
>>     > CEO (Geschäftsführer)
>>     >
>>     > ng-voice GmbH
>>     > Schomburgstr. 80
>>     > D-22767 Hamburg / Germany
>>     >
>>     > http://www.ng-voice.com
>>     > mailto:carsten at ng-voice.com
>>     >
>>     > Office +49 40 34927219
>>     > Fax +49 40 34927220
>>     >
>>     > Sitz der Gesellschaft: Hamburg
>>     > Registergericht: Amtsgericht Hamburg, HRB 120189
>>     > Geschäftsführer: Carsten Bock
>>     > Ust-ID: DE279344284
>>     >
>>     > Hier finden Sie unsere handelsrechtlichen Pflichtangaben:
>>     > http://www.ng-voice.com/imprint/
>>
>>
>>
>>
>>
>


-- 
Hugh Waite
Principal Design Engineer
Crocodile RCS Ltd.
