[sr-dev] SCSCF crashing during registration
Hugh Waite
hugh.waite at crocodile-rcs.com
Thu Mar 13 22:47:06 CET 2014
Dan,
There are two cores because one process crashed, and then a second
crash happened while the other processes were trying to shut down.
What's interesting is that the bt doesn't show useful pointers for the
core frames (several appear as ?? with no symbol table info). If you
have installed from RPMs, make sure the kamailio-debuginfo package is
from the same build as the other RPMs.
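A quick way to compare the packages is to look at the name-version-release.arch strings that `rpm -q kamailio kamailio-debuginfo` prints. The sketch below is only an illustration in Python: the package strings are hypothetical examples, not taken from Daniel's system, and a matching version and release is a necessary check rather than proof that the debuginfo came from the same build.

```python
def split_nvra(pkg):
    """Split 'name-version-release.arch' into its four parts.

    The name may itself contain dashes (e.g. kamailio-debuginfo), so the
    string is split from the right.
    """
    rest, arch = pkg.rsplit(".", 1)
    name, version, release = rest.rsplit("-", 2)
    return name, version, release, arch

def same_build(main_pkg, debug_pkg):
    """True if both packages share version, release and architecture."""
    return split_nvra(main_pkg)[1:] == split_nvra(debug_pkg)[1:]

# Hypothetical output lines of `rpm -q kamailio kamailio-debuginfo`.
main_pkg = "kamailio-4.2.0-dev2.el6.x86_64"
debug_pkg = "kamailio-debuginfo-4.2.0-dev2.el6.x86_64"
print(same_build(main_pkg, debug_pkg))
```

If the two differ, gdb is reading symbols for a different binary than the one that produced the core, which would explain the missing symbol information.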
Also, do the logs say anything? There should be a log entry from the
kernel for the segfault/signal that says which module crashed (e.g.
registrar.so) and possibly (hopefully) an error message just before that.
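For reference, the kernel line usually looks like "name[pid]: segfault at ADDR ip IP sp SP error N in object[BASE+SIZE]". The following Python sketch pulls the interesting fields out of such a line; the sample line, PID and addresses are invented for illustration and are not from Daniel's logs.

```python
import re

# Matches the x86 Linux kernel segfault message; the trailing
# "in object[base+size]" part is optional because the kernel omits it
# when the faulting address is not inside a mapped file.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<object>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)

def parse_segfault(line):
    """Return a dict describing the segfault, or None if the line is not one."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    info = {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "object": m.group("object"),  # e.g. registrar.so, or None
        "offset": None,
    }
    if m.group("base") is not None:
        # Offset of the faulting instruction inside the mapped object.
        info["offset"] = int(m.group("ip"), 16) - int(m.group("base"), 16)
    return info

# Hypothetical log line, e.g. from dmesg or /var/log/messages.
sample = ("Mar 13 11:04:40 scscf kernel: kamailio[4242]: segfault at 18 "
          "ip 00007f0a1c2187de sp 00007fffd0a1b2c0 error 4 "
          "in registrar.so[7f0a1c200000+2f000]")
print(parse_segfault(sample))
```

The "object" field is the module name asked about above, and the PID can be matched against the %p part of the core file name.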
Hugh
On 13/03/2014 19:53, Daniel Ciprus wrote:
> Jason,
>
> I've tried multiple combinations for pattern but I'm getting only 2
> core files ...
>
> Details:
>
> ~]# cat /proc/sys/kernel/core_pattern
> /tmp/core.%e.sig%s.%p
>
> ~]# lsb_release -a
> LSB Version: :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
> Distributor ID: RedHatEnterpriseServer
> Description: Red Hat Enterprise Linux Server release 6.5 (Santiago)
> Release: 6.5
> Codename: Santiago
>
>
>
> (gdb) bt
> #0 0x00000000005350b0 in ?? ()
> #1 0x000000000053542a in ?? ()
> #2 0x00000000005356c7 in timer_main ()
> #3 0x000000000046d572 in main_loop ()
> #4 0x000000000047030b in main ()
> (gdb) bt full
> #0 0x00000000005350b0 in ?? ()
> No symbol table info available.
> #1 0x000000000053542a in ?? ()
> No symbol table info available.
> #2 0x00000000005356c7 in timer_main ()
> No symbol table info available.
> #3 0x000000000046d572 in main_loop ()
> No symbol table info available.
> #4 0x000000000047030b in main ()
> No symbol table info available.
> (gdb)
>
> (gdb) bt
> #0 0x00000031ba432925 in raise () from /lib64/libc.so.6
> #1 0x00000031ba434105 in abort () from /lib64/libc.so.6
> #2 0x0000000000546750 in ?? ()
> #3 0x000000000054853a in qm_free ()
> #4 0x00007f23d98f87de in free_local_ack_unsafe (lack=0x7f23d3319d70) at uac.c:600
> #5 0x00007f23d988ea57 in free_cell (dead_cell=0x7f23d3319a70) at h_table.c:217
> #6 0x00007f23d988f2ee in free_hash_table () at h_table.c:441
> #7 0x00007f23d98a2fca in tm_shutdown () at t_funcs.c:122
> #8 0x00000000004f7c7a in destroy_modules ()
> #9 0x0000000000466e63 in cleanup ()
> #10 0x0000000000467f65 in ?? ()
> #11 0x0000000000469679 in handle_sigs ()
> #12 0x000000000046db19 in main_loop ()
> #13 0x000000000047030b in main ()
> (gdb) bt full
> #0 0x00000031ba432925 in raise () from /lib64/libc.so.6
> No symbol table info available.
> #1 0x00000031ba434105 in abort () from /lib64/libc.so.6
> No symbol table info available.
> #2 0x0000000000546750 in ?? ()
> No symbol table info available.
> #3 0x000000000054853a in qm_free ()
> No symbol table info available.
> #4 0x00007f23d98f87de in free_local_ack_unsafe (lack=0x7f23d3319d70) at uac.c:600
> __FUNCTION__ = "free_local_ack_unsafe"
> #5 0x00007f23d988ea57 in free_cell (dead_cell=0x7f23d3319a70) at h_table.c:217
> b = 0x0
> i = 0
> rpl = 0x0
> tt = 0x0
> foo = 0x2fd3221000
> cbs = 0x0
> cbs_tmp = 0x7f23d35386b8
> __FUNCTION__ = "free_cell"
> #6 0x00007f23d988f2ee in free_hash_table () at h_table.c:441
> p_cell = 0x7f23d3319a70
> tmp_cell = 0x7f23d353dca0
> i = 580
> __FUNCTION__ = "free_hash_table"
> #7 0x00007f23d98a2fca in tm_shutdown () at t_funcs.c:122
> __FUNCTION__ = "tm_shutdown"
> #8 0x00000000004f7c7a in destroy_modules ()
> No symbol table info available.
> #9 0x0000000000466e63 in cleanup ()
> No symbol table info available.
> #10 0x0000000000467f65 in ?? ()
> No symbol table info available.
> #11 0x0000000000469679 in handle_sigs ()
> No symbol table info available.
> #12 0x000000000046db19 in main_loop ()
> No symbol table info available.
> #13 0x000000000047030b in main ()
> No symbol table info available.
> (gdb)
>
>
>
> On 03/13/2014 02:58 PM, Jason Penton wrote:
>> I don't think these cores indicate the real crash... I'd like to get
>> some more detail on what actually happened? Daniel, can you
>> re-create? Keep in mind that if your core dump config on your box is
>> not configured to name your cores according to process id or
>> timestamp one core will overwrite the other..... as a result you will
>> never see the core that is the root cause.
>>
>> Which OS are you running?
>>
>> if Linux, I use the following in /etc/sysctl.conf:
>>
>> kernel.core_pattern=/tmp/core.%e.%p.%h.%t
>>
>>
>> On Thu, Mar 13, 2014 at 8:45 PM, Carsten Bock <carsten at ng-voice.com> wrote:
>>
>> It looks a little bit like a "double free".
>>
>> You could try to disable the call to "abort()" in case this happens:
>> mem_safety=1
>> See: http://www.kamailio.org/wiki/cookbooks/devel/core#mem_safety
>>
>> Kind regards,
>> Carsten
>>
>> 2014-03-13 19:44 GMT+01:00 Carsten Bock <carsten at ng-voice.com>:
>> > It looks a little bit like a "double free".
>> >
>> > You could try to disable the call to "abort()" in case this happens:
>> >
>> >
>> > 2014-03-13 17:22 GMT+01:00 Daniel Ciprus <daniel.ciprus at acision.com>:
>> >> There are no more core files on the filesystem :-(
>> >>
>> >> On 03/13/2014 12:18 PM, Jason Penton wrote:
>> >>
>> >> I'm afraid this is also not the correct core. Can you check the timestamp on
>> >> the cores? Can you re-create the crash and send me the correct core?
>> >>
>> >>
>> >>
>> >>
>> >> On Thu, Mar 13, 2014 at 5:36 PM, Daniel Ciprus <daniel.ciprus at acision.com> wrote:
>> >>>
>> >>> So I cleaned up my junkyard and I got 2 core files:
>> >>>
>> >>> (gdb) bt
>> >>> #0 0x00000000005350b0 in ?? ()
>> >>> #1 0x000000000053542a in ?? ()
>> >>> #2 0x00000000005356c7 in timer_main ()
>> >>> #3 0x000000000046d572 in main_loop ()
>> >>> #4 0x000000000047030b in main ()
>> >>> (gdb) bt full
>> >>> #0 0x00000000005350b0 in ?? ()
>> >>> No symbol table info available.
>> >>> #1 0x000000000053542a in ?? ()
>> >>> No symbol table info available.
>> >>> #2 0x00000000005356c7 in timer_main ()
>> >>> No symbol table info available.
>> >>> #3 0x000000000046d572 in main_loop ()
>> >>> No symbol table info available.
>> >>> #4 0x000000000047030b in main ()
>> >>> No symbol table info available.
>> >>> (gdb)
>> >>>
>> >>>
>> >>> (gdb) bt full
>> >>> #0 0x00000031ba432925 in raise () from /lib64/libc.so.6
>> >>> No symbol table info available.
>> >>> #1 0x00000031ba434105 in abort () from /lib64/libc.so.6
>> >>> No symbol table info available.
>> >>> #2 0x0000000000546750 in ?? ()
>> >>> No symbol table info available.
>> >>> #3 0x000000000054853a in qm_free ()
>> >>> No symbol table info available.
>> >>> #4 0x00007f5bf7d5a7de in free_local_ack_unsafe (lack=0x7f5bf1894528) at uac.c:600
>> >>> __FUNCTION__ = "free_local_ack_unsafe"
>> >>> #5 0x00007f5bf7cf0a57 in free_cell (dead_cell=0x7f5bf1894228) at h_table.c:217
>> >>> b = 0x0
>> >>> i = 0
>> >>> rpl = 0x0
>> >>> tt = 0x0
>> >>> foo = 0x2ff1683000
>> >>> cbs = 0x0
>> >>> cbs_tmp = 0x7f5bf198e508
>> >>> __FUNCTION__ = "free_cell"
>> >>> #6 0x00007f5bf7cf12ee in free_hash_table () at h_table.c:441
>> >>> p_cell = 0x7f5bf1894228
>> >>> tmp_cell = 0x7f5bf1894228
>> >>> i = 3533
>> >>> __FUNCTION__ = "free_hash_table"
>> >>> #7 0x00007f5bf7d04fca in tm_shutdown () at t_funcs.c:122
>> >>> __FUNCTION__ = "tm_shutdown"
>> >>> #8 0x00000000004f7c7a in destroy_modules ()
>> >>> No symbol table info available.
>> >>> #9 0x0000000000466e63 in cleanup ()
>> >>> No symbol table info available.
>> >>> #10 0x0000000000467f65 in ?? ()
>> >>> No symbol table info available.
>> >>> #11 0x0000000000469679 in handle_sigs ()
>> >>> No symbol table info available.
>> >>> #12 0x000000000046db19 in main_loop ()
>> >>> No symbol table info available.
>> >>> #13 0x000000000047030b in main ()
>> >>> No symbol table info available.
>> >>> (gdb)
>> >>>
>> >>>
>> >>> On 03/13/2014 11:18 AM, Jason Penton wrote:
>> >>>
>> >>> Hi Daniel,
>> >>>
>> >>> this is the wrong core file. This is the one created on shutdown of
>> >>> kamailio. Can you do a bt on the other core file that you probably have...
>> >>>
>> >>> Cheers
>> >>> Jason
>> >>>
>> >>>
>> >>> On Thu, Mar 13, 2014 at 5:05 PM, Daniel Ciprus <daniel.ciprus at acision.com> wrote:
>> >>>>
>> >>>> Folks,
>> >>>>
>> >>>> This is happening during the registration on SCSCF.
>> >>>>
>> >>>> Server:: kamailio (4.2.0-dev2 (x86_64/linux))
>> >>>> Build:: mi_core.c compiled on 10:01:09 Mar 13 2014 with gcc 4.4.6
>> >>>> Flags:: STATS: Off, USE_TCP, USE_TLS, TLS_HOOKS, USE_RAW_SOCKS,
>> >>>> DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC,
>> >>>> DBG_QM_MALLOC, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE,
>> >>>> USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
>> >>>> GIT:: unknown
>> >>>> Now:: Thu Mar 13 11:04:47 2014
>> >>>> Up since:: Thu Mar 13 10:58:12 2014
>> >>>> Up time:: 395 [sec]
>> >>>>
>> >>>> (gdb) bt
>> >>>> #0 0x00000031ba432925 in raise () from /lib64/libc.so.6
>> >>>> #1 0x00000031ba434105 in abort () from /lib64/libc.so.6
>> >>>> #2 0x0000000000546750 in ?? ()
>> >>>> #3 0x000000000054853a in qm_free ()
>> >>>> #4 0x00007fb4def5b7de in free_local_ack_unsafe (lack=0x7fb4d8b31728) at uac.c:600
>> >>>> #5 0x00007fb4deef1a57 in free_cell (dead_cell=0x7fb4d8b31428) at h_table.c:217
>> >>>> #6 0x00007fb4deef22ee in free_hash_table () at h_table.c:441
>> >>>> #7 0x00007fb4def05fca in tm_shutdown () at t_funcs.c:122
>> >>>> #8 0x00000000004f7c7a in destroy_modules ()
>> >>>> #9 0x0000000000466e63 in cleanup ()
>> >>>> #10 0x0000000000467f65 in ?? ()
>> >>>> #11 0x0000000000469679 in handle_sigs ()
>> >>>> #12 0x000000000046db19 in main_loop ()
>> >>>> #13 0x000000000047030b in main ()
>> >>>> (gdb) bt full
>> >>>> #0 0x00000031ba432925 in raise () from /lib64/libc.so.6
>> >>>> No symbol table info available.
>> >>>> #1 0x00000031ba434105 in abort () from /lib64/libc.so.6
>> >>>> No symbol table info available.
>> >>>> #2 0x0000000000546750 in ?? ()
>> >>>> No symbol table info available.
>> >>>> #3 0x000000000054853a in qm_free ()
>> >>>> No symbol table info available.
>> >>>> #4 0x00007fb4def5b7de in free_local_ack_unsafe (lack=0x7fb4d8b31728) at uac.c:600
>> >>>> __FUNCTION__ = "free_local_ack_unsafe"
>> >>>> #5 0x00007fb4deef1a57 in free_cell (dead_cell=0x7fb4d8b31428) at h_table.c:217
>> >>>> b = 0x0
>> >>>> i = 0
>> >>>> rpl = 0x0
>> >>>> tt = 0x0
>> >>>> foo = 0x2fd8a8b000
>> >>>> cbs = 0x0
>> >>>> cbs_tmp = 0x7fb4d8d9c9e0
>> >>>> __FUNCTION__ = "free_cell"
>> >>>> #6 0x00007fb4deef22ee in free_hash_table () at h_table.c:441
>> >>>> p_cell = 0x7fb4d8b31428
>> >>>> tmp_cell = 0x7fb4d8b31428
>> >>>> i = 11517
>> >>>> __FUNCTION__ = "free_hash_table"
>> >>>> #7 0x00007fb4def05fca in tm_shutdown () at t_funcs.c:122
>> >>>> __FUNCTION__ = "tm_shutdown"
>> >>>> #8 0x00000000004f7c7a in destroy_modules ()
>> >>>> No symbol table info available.
>> >>>> #9 0x0000000000466e63 in cleanup ()
>> >>>> No symbol table info available.
>> >>>> #10 0x0000000000467f65 in ?? ()
>> >>>> No symbol table info available.
>> >>>> #11 0x0000000000469679 in handle_sigs ()
>> >>>> No symbol table info available.
>> >>>> #12 0x000000000046db19 in main_loop ()
>> >>>> No symbol table info available.
>> >>>> #13 0x000000000047030b in main ()
>> >>>> No symbol table info available.
>> >>>> (gdb)
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Daniel Ciprus
>> >>>> Integration engineer
>> >>>> http://www.acision.com
>> >>>>
>> >>>> 9954 Mayland Dr
>> >>>> Suite 3100
>> >>>> Richmond, VA 23233
>> >>>> USA
>> >>>> T: +1 804 762 5601
>> >>>> E: daniel.ciprus at acision.com
>> >>>>
>> >>>> ________________________________
>> >>>> This e-mail and any attachment is for authorised use by the intended
>> >>>> recipient(s) only. It may contain proprietary material, confidential
>> >>>> information and/or be subject to legal privilege. It should not be copied,
>> >>>> disclosed to, retained or used by, any other party. If you are not an
>> >>>> intended recipient then please promptly delete this e-mail and any
>> >>>> attachment and all copies and inform the sender. Thank you for
>> >>>> understanding.
>> >>>>
>> >>>>
>> >>>> _______________________________________________
>> >>>> sr-dev mailing list
>> >>>> sr-dev at lists.sip-router.org
>> >>>> http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
>> >>>>
>> >>>
>> >>>
>> >>>
>> >>
>> >>
>> >>
>> >
>> >
>> >
>> > --
>> > Carsten Bock
>> > CEO (Geschäftsführer)
>> >
>> > ng-voice GmbH
>> > Schomburgstr. 80
>> > D-22767 Hamburg / Germany
>> >
>> > http://www.ng-voice.com
>> > mailto:carsten at ng-voice.com
>> >
>> > Office +49 40 34927219
>> > Fax +49 40 34927220
>> >
>> > Sitz der Gesellschaft: Hamburg
>> > Registergericht: Amtsgericht Hamburg, HRB 120189
>> > Geschäftsführer: Carsten Bock
>> > Ust-ID: DE279344284
>> >
>> > Hier finden Sie unsere handelsrechtlichen Pflichtangaben:
>> > http://www.ng-voice.com/imprint/
>>
>>
>>
>>
>>
>
--
Hugh Waite
Principal Design Engineer
Crocodile RCS Ltd.