[SR-Users] Kamailio 1.5.5 No TLS Segmentation Fault

Timo Reimann timo.reimann at 1und1.de
Thu Feb 17 19:21:36 CET 2011


Hey Stagg,


On 17.02.2011 17:41, Stagg Shelton wrote:
> I have not tried the latest SVN yet.  My system just did another core again today.  The backtrace seems to show it at the same location.  Below is the backtrace from today.  If grabbing the latest SVN is the only chance of stopping this behavior I can do that, but I have to be as careful as possible to try and ensure that new anomalies are not introduced as a result of a code change.  Is the information from the backtrace even helpful for identifying the root cause of the issue?

I am quite sure the segfault you are experiencing stems from a bug which
has been fixed in revision 6049.

Basically, your backtrace indicates that Kamailio is trying to shut
down. Along the way, unref_dlg_from_cb() is called from a tm callback
with the purpose of decreasing the dialog reference counter. Prior to
revision 6049, the function didn't verify that (a) the dialog is still
valid and (b) no shutdown is in progress, but went straight on to call
unref_dlg().

Since your dialog pointer looks fine, condition (b) must be affecting
you in dlg_hash.c:474 where things start to go wrong. There, the
dialog's hash table entry is used as the index into the dialog
hash table "d_table". However, the latter has been deallocated and
nullified before (during dialog module shutdown), thereby triggering the
segfault when d_table is touched.
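
To illustrate the two checks, here is a minimal sketch of such a guard.
It is not the code from revision 6049: the struct layouts are heavily
simplified, and init_table(), destroy_table() and unref_dlg_guarded()
are hypothetical names chosen for this example only. The bounds check on
h_entry is an extra precaution of the sketch, not something the
backtrace calls for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for the dialog module's structures (assumption:
 * the real definitions carry many more fields plus locking). */
struct dlg_cell  { unsigned int h_entry; int ref; };
struct dlg_entry { unsigned int unrefs; };
struct dlg_table { unsigned int size; struct dlg_entry *entries; };

/* Global dialog hash table; NULL before init and after shutdown. */
static struct dlg_table *d_table = NULL;

/* Hypothetical helper: allocate the table. Returns 0 on success. */
int init_table(unsigned int size)
{
    d_table = malloc(sizeof(*d_table));
    if (d_table == NULL)
        return -1;
    d_table->entries = calloc(size, sizeof(*d_table->entries));
    if (d_table->entries == NULL) {
        free(d_table);
        d_table = NULL;
        return -1;
    }
    d_table->size = size;
    return 0;
}

/* Hypothetical helper mirroring module shutdown: the table is freed
 * and the pointer nullified, as described above. */
void destroy_table(void)
{
    if (d_table == NULL)
        return;
    free(d_table->entries);
    free(d_table);
    d_table = NULL;
}

/* Guarded unref. Returns 0 if the reference was dropped, -1 if the
 * call was skipped because one of the checks failed. */
int unref_dlg_guarded(struct dlg_cell *dlg, int cnt)
{
    struct dlg_entry *d_entry;

    assert(cnt > 0);

    if (dlg == NULL)                      /* (a) dialog no longer valid */
        return -1;
    if (d_table == NULL)                  /* (b) shutdown in progress */
        return -1;
    if (dlg->h_entry >= d_table->size)    /* extra check in this sketch */
        return -1;

    /* Same expression as dlg_hash.c:474, now safe to evaluate. */
    d_entry = &(d_table->entries[dlg->h_entry]);
    d_entry->unrefs++;
    dlg->ref -= cnt;
    return 0;
}
```

Calling unref_dlg_guarded() after destroy_table() returns -1 and leaves
the dialog untouched, instead of reading through the NULL d_table as
frame #0 of the backtrace does.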

That pretty much sums up what most likely happened. I strongly
suggest you upgrade to the latest SVN version of the 1.5 branch and
don't worry too much about any anomalies possibly introduced. Between
the 1.5.5 release and now, only four commits were made to the 1.5
branch, every one of them a bugfix.

Hope this helps.


Cheers,

--Timo



> 
> Core was generated by `/sbin/kamailio -m 512'.
> Program terminated with signal 11, Segmentation fault.
> #0  0x00007f85596b0fa7 in unref_dlg (dlg=0x7f853f9d1588, cnt=1) at dlg_hash.c:474
> 474		d_entry = &(d_table->entries[dlg->h_entry]);
> Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.5-5.fc11.x86_64 db4-4.7.25-11.fc11.x86_64 e2fsprogs-libs-1.41.9-2.fc11.x86_64 elfutils-libelf-0.147-1.fc11.x86_64 glibc-2.10.2-1.x86_64 keyutils-libs-1.2-5.fc11.x86_64 krb5-libs-1.6.3-31.fc11.x86_64 libacl-2.2.49-3.fc11.x86_64 libattr-2.4.43-3.fc11.x86_64 libcap-2.16-4.fc11.1.x86_64 libconfuse-2.6-2.fc11.x86_64 libgcc-4.4.1-2.fc11.x86_64 libselinux-2.0.80-1.fc11.x86_64 lm_sensors-3.1.0-1.fc11.x86_64 lua-5.1.4-3.fc11.x86_64 mysql-libs-5.1.46-1.fc11.x86_64 net-snmp-libs-5.4.2.1-13.fc11.x86_64 nspr-devel-4.8.4-1.3.fc11.x86_64 nss-devel-3.12.6-1.2.fc11.x86_64 nss-softokn-freebl-3.12.6-1.2.fc11.x86_64 openssl-0.9.8n-1.fc11.x86_64 pcre-7.8-2.fc11.x86_64 perl-libs-5.10.0-82.fc11.x86_64 popt-1.13-5.fc11.x86_64 radiusclient-ng-0.5.6-4.fc11.x86_64 rpm-libs-4.7.2-1.fc11.x86_64 tcp_wrappers-libs-7.6-55.fc11.x86_64 xz-libs-4.999.9-0.1.beta.20091007git.fc11.x86_64 zlib-1.2.3-22.fc11.x86_64
> (gdb) bt full
> #0  0x00007f85596b0fa7 in unref_dlg (dlg=0x7f853f9d1588, cnt=1) at dlg_hash.c:474
>         d_entry = 0x0
>         __FUNCTION__ = "unref_dlg"
> #1  0x00007f85596ac80f in unref_dlg_from_cb (t=0x7f853f91d1b0, type=4096, param=0x7f855fcc56e0) at dlg_handlers.c:622
>         dlg = 0x7f853f9d1588
> #2  0x00007f855fa93ea3 in run_trans_callbacks (type=4096, trans=0x7f853f91d1b0, req=0x0, rpl=0x0, code=0) at t_hooks.c:240
>         cbp = 0x7f853f76e278
>         backup = 0x71a9d0
>         trans_backup = 0xffffffffffffffff
>         __FUNCTION__ = "run_trans_callbacks"
> #3  0x00007f855fa823cc in free_cell (dead_cell=0x7f853f91d1b0) at h_table.c:132
>         b = 0x0
>         i = 1
>         rpl = 0x0
>         tt = 0x0
>         foo = 0x0
>         p = 0x7f853f73bc30
> #4  0x00007f855fa82bb6 in free_hash_table () at h_table.c:345
>         p_cell = 0x7f853f91d1b0
>         tmp_cell = 0x0
>         i = 6172
> #5  0x00007f855fa8f2a4 in tm_shutdown () at t_funcs.c:109
>         __FUNCTION__ = "tm_shutdown"
> #6  0x00000000004529f6 in destroy_modules () at sr_module.c:321
>         t = 0x7349d0
>         foo = 0x734910
>         __FUNCTION__ = "destroy_modules"
> #7  0x000000000041f6b4 in cleanup (show_status=1) at main.c:331
> No locals.
> #8  0x0000000000420597 in handle_sigs () at main.c:517
>         chld = 0
>         chld_status = 134
>         i = 10
>         do_exit = 1
> ---Type <return> to continue, or q <return> to quit---
>         shutdown_time = 60
>         __FUNCTION__ = "handle_sigs"
> #9  0x00000000004217b5 in main_loop () at main.c:859
>         chd_rank = 12
>         i = 4
>         pid = 29418
>         si = 0x0
>         __FUNCTION__ = "main_loop"
> #10 0x0000000000423410 in main (argc=3, argv=0x7fff4b00bb18) at main.c:1321
>         cfg_log_stderr = 0
>         cfg_stream = 0x225a010
>         c = -1
>         r = 0
>         tmp_len = 0
>         port = 0
>         proto = 4910128
>         ret = -1
>         rfd = 4
>         tmp = 0x7fff4b00cf84 ""
>         options = 0x4b77e0 "f:cCm:b:l:n:N:rRvdDFETSVhw:t:u:g:P:G:W:"
>         rand_source = 0x4b7d9c "/dev/urandom"
>         seed = 3664355638
>         __FUNCTION__ = "main"
> 
> --Stagg
> 
> On Feb 15, 2011, at 4:39 AM, Timo Reimann wrote:
> 
>> Hi Stagg,
>>
>> with regards to the failing function, there was a bugfix in the dialog
>> module which, unfortunately, didn't make it into 1.5.5 in time (revision
>> 6049). Could you try the latest SVN of 1.5 and see if it solves the issue?
>>
>> Thanks.
>>
>>
>> Cheers,
>>
>> --Timo
>>
>>
>>
>> On 14.02.2011 21:07, Stagg Shelton wrote:
>>> Hello,
>>>
>>> We have been having a problem with Kamailio faulting and dumping core files on occasion.  I have not been able to reproduce the failure at will, but notice the backtrace seems to point toward actions with the dialog.  Below is a backtrace of a core file from just a few minutes ago.  Can anyone determine what may have caused the system to error and stop processing?
>>>
>>> Thanks
>>> Stagg
>>>
>>> Core was generated by `/sbin/kamailio -m 512'.
>>> Program terminated with signal 11, Segmentation fault.
>>> #0  0x00007f8a11d55fa7 in unref_dlg (dlg=0x7f89f7e07470, cnt=1) at dlg_hash.c:474
>>> 474		d_entry = &(d_table->entries[dlg->h_entry]);
>>> Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.5-5.fc11.x86_64 db4-4.7.25-11.fc11.x86_64 e2fsprogs-libs-1.41.9-2.fc11.x86_64 elfutils-libelf-0.147-1.fc11.x86_64 glibc-2.10.2-1.x86_64 keyutils-libs-1.2-5.fc11.x86_64 krb5-libs-1.6.3-31.fc11.x86_64 libacl-2.2.49-3.fc11.x86_64 libattr-2.4.43-3.fc11.x86_64 libcap-2.16-4.fc11.1.x86_64 libconfuse-2.6-2.fc11.x86_64 libgcc-4.4.1-2.fc11.x86_64 libselinux-2.0.80-1.fc11.x86_64 lm_sensors-3.1.0-1.fc11.x86_64 lua-5.1.4-3.fc11.x86_64 mysql-libs-5.1.46-1.fc11.x86_64 net-snmp-libs-5.4.2.1-13.fc11.x86_64 nspr-devel-4.8.4-1.3.fc11.x86_64 nss-devel-3.12.6-1.2.fc11.x86_64 nss-softokn-freebl-3.12.6-1.2.fc11.x86_64 openssl-0.9.8n-1.fc11.x86_64 pcre-7.8-2.fc11.x86_64 perl-libs-5.10.0-82.fc11.x86_64 popt-1.13-5.fc11.x86_64 radiusclient-ng-0.5.6-4.fc11.x86_64 rpm-libs-4.7.2-1.fc11.x86_64 tcp_wrappers-libs-7.6-55.fc11.x86_64 xz-libs-4.999.9-0.1.beta.20091007git.fc11.x86_64 zlib-1.2.3-22.fc11.x86_64
>>> (gdb) bt full
>>> #0  0x00007f8a11d55fa7 in unref_dlg (dlg=0x7f89f7e07470, cnt=1) at dlg_hash.c:474
>>>        d_entry = 0x0
>>>        __FUNCTION__ = "unref_dlg"
>>> #1  0x00007f8a11d5180f in unref_dlg_from_cb (t=0x7f89f7d9c660, type=4096, param=0x7f8a1836a6e0) at dlg_handlers.c:622
>>>        dlg = 0x7f89f7e07470
>>> #2  0x00007f8a18138ea3 in run_trans_callbacks (type=4096, trans=0x7f89f7d9c660, req=0x0, rpl=0x0, code=0) at t_hooks.c:240
>>>        cbp = 0x7f89f7dc30e8
>>>        backup = 0x71a9d0
>>>        trans_backup = 0xffffffffffffffff
>>>        __FUNCTION__ = "run_trans_callbacks"
>>> #3  0x00007f8a181273cc in free_cell (dead_cell=0x7f89f7d9c660) at h_table.c:132
>>>        b = 0x0
>>>        i = 1
>>>        rpl = 0x0
>>>        tt = 0x0
>>>        foo = 0x7fff4282f190
>>>        p = 0x7f89f7d3b068
>>> #4  0x00007f8a18127bb6 in free_hash_table () at h_table.c:345
>>>        p_cell = 0x7f89f7d9c660
>>>        tmp_cell = 0x0
>>>        i = 4075
>>> #5  0x00007f8a181342a4 in tm_shutdown () at t_funcs.c:109
>>>        __FUNCTION__ = "tm_shutdown"
>>> #6  0x00000000004529f6 in destroy_modules () at sr_module.c:321
>>>        t = 0x7349d0
>>>        foo = 0x734910
>>>        __FUNCTION__ = "destroy_modules"
>>> #7  0x000000000041f6b4 in cleanup (show_status=1) at main.c:331
>>> No locals.
>>> #8  0x0000000000420597 in handle_sigs () at main.c:517
>>>        chld = 0
>>>        chld_status = 134
>>>        i = 12
>>>        do_exit = 1
>>> ---Type <return> to continue, or q <return> to quit---
>>>        shutdown_time = 60
>>>        __FUNCTION__ = "handle_sigs"
>>> #9  0x00000000004217b5 in main_loop () at main.c:859
>>>        chd_rank = 12
>>>        i = 4
>>>        pid = 21442
>>>        si = 0x0
>>>        __FUNCTION__ = "main_loop"
>>> #10 0x0000000000423410 in main (argc=3, argv=0x7fff4282f498) at main.c:1321
>>>        cfg_log_stderr = 0
>>>        cfg_stream = 0x1fe1010
>>>        c = -1
>>>        r = 0
>>>        tmp_len = 0
>>>        port = 0
>>>        proto = 4910128
>>>        ret = -1
>>>        rfd = 4
>>>        tmp = 0x7fff4282ff8a ""
>>>        options = 0x4b77e0 "f:cCm:b:l:n:N:rRvdDFETSVhw:t:u:g:P:G:W:"
>>>        rand_source = 0x4b7d9c "/dev/urandom"
>>>        seed = 3628387751
>>>        __FUNCTION__ = "main"
>>> (gdb) 
>>> (gdb) quit


