Could be a buffer overflow somewhere.
First, do a 'bt full' for both cores and send the output, just to see if
something is strange inside the structures.
Then, can you compile with MEMDBG=1 in Makefile.defs, reinstall and test
again? Check the logs for memory-related error messages and see if you
get head/tail overwritten.
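For context, here is a minimal sketch of what a "head/tail overwritten" check generally means in a debugging allocator. It is not the q_malloc code: the names dbg_malloc, dbg_check and dbg_free, the guard values and the block layout are all made up for illustration, and the checks a MEMDBG build really performs may differ.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical guard patterns, chosen only for this sketch. */
#define GUARD_HEAD ((size_t)0xC0FFEE11u)
#define GUARD_TAIL ((size_t)0xDEADBEEFu)

/* Block layout: [size][head guard][payload ...][tail guard] */
typedef struct {
    size_t size;   /* payload size requested by the caller */
    size_t head;   /* guard word sitting right before the payload */
} dbg_hdr;

/* Allocate a payload surrounded by guard words. */
void *dbg_malloc(size_t size)
{
    unsigned char *raw = malloc(sizeof(dbg_hdr) + size + sizeof(size_t));
    dbg_hdr h;
    size_t tail = GUARD_TAIL;

    if (raw == NULL)
        return NULL;
    h.size = size;
    h.head = GUARD_HEAD;
    memcpy(raw, &h, sizeof h);
    /* The tail guard may be unaligned, so it is written with memcpy. */
    memcpy(raw + sizeof(dbg_hdr) + size, &tail, sizeof tail);
    return raw + sizeof(dbg_hdr);
}

/* Return 0 if both guards are intact, 1 if the head guard was
 * overwritten, 2 if the tail guard was overwritten. */
int dbg_check(const void *p)
{
    const unsigned char *payload = p;
    dbg_hdr h;
    size_t tail;

    memcpy(&h, payload - sizeof(dbg_hdr), sizeof h);
    if (h.head != GUARD_HEAD)
        return 1;
    memcpy(&tail, payload + h.size, sizeof tail);
    if (tail != GUARD_TAIL)
        return 2;
    return 0;
}

/* Verify the guards, report any damage, then release the block. */
void dbg_free(void *p)
{
    int rc;

    if (p == NULL)
        return;
    rc = dbg_check(p);
    if (rc != 0)
        fprintf(stderr, "block %p: %s guard overwritten\n", p,
                rc == 1 ? "head" : "tail");
    free((unsigned char *)p - sizeof(dbg_hdr));
}
```

The point of such a check is that a write past either end of a buffer is reported when the damaged block is next inspected, which is usually much closer to the real bug than a later crash inside the allocator.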
Cheers,
Daniel
On 8/28/13 9:11 AM, Alex Balashov wrote:
With the patch applied, I sometimes get this
crash, too:
(gdb) where
#0 0x0000000000539ed9 in qm_detach_free (qm=0x7f516c197010,
frag=0x7f516c3b6dc0) at mem/q_malloc.c:268
#1 0x000000000053a118 in qm_malloc (qm=0x7f516c197010, size=960)
at mem/q_malloc.c:386
#2 0x00000000004bc41d in rval_new_empty (extra_size=102) at
rvalue.c:236
#3 0x00000000004bc48f in rval_new_str (s=0x7ffff3194e70,
extra_size=80)
at rvalue.c:260
#4 0x00000000004beb87 in rval_convert (h=0x7ffff3196cb0,
msg=0x7f516b6e7920,
type=RV_STR, v=0x7f516c2e3728, c=0x7ffff3195030) at rvalue.c:1321
#5 0x00000000004c002b in rval_str_lop2 (h=0x7ffff3196cb0,
msg=0x7f516b6e7920,
res=0x7ffff31954d8, op=RVE_EQ_OP, l=0x7f516c2e3728,
c1=0x7ffff3195030,
r=0x7f516c2e3ea8, c2=0x0) at rvalue.c:1752
#6 0x00000000004c0c61 in rval_expr_eval_int (h=0x7ffff3196cb0,
msg=0x7f516b6e7920, res=0x7ffff31954d8, rve=0x7f516c2e4580)
at rvalue.c:2058
#7 0x0000000000418d5a in do_action (h=0x7ffff3196cb0,
a=0x7f516c2e59d0,
msg=0x7f516b6e7920) at action.c:1050
#8 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0,
a=0x7f516c2e3580,
msg=0x7f516b6e7920) at action.c:1573
#9 0x000000000042047f in do_action (h=0x7ffff3196cb0,
a=0x7f516c2e5d70,
msg=0x7f516b6e7920) at action.c:1374
#10 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0,
a=0x7f516c2daae0,
msg=0x7f516b6e7920) at action.c:1573
#11 0x0000000000418fa2 in do_action (h=0x7ffff3196cb0,
a=0x7f516c2eb450,
msg=0x7f516b6e7920) at action.c:1065
#12 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0,
a=0x7f516c2eb450,
msg=0x7f516b6e7920) at action.c:1573
#13 0x0000000000418ffb in do_action (h=0x7ffff3196cb0,
a=0x7f516c2eb550,
msg=0x7f516b6e7920) at action.c:1069
#14 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0,
a=0x7f516c2c4170,
msg=0x7f516b6e7920) at action.c:1573
#15 0x0000000000416f3a in do_action (h=0x7ffff3196cb0,
a=0x7f516c2f5800,
msg=0x7f516b6e7920) at action.c:690
#16 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0,
a=0x7f516c2ef330,
msg=0x7f516b6e7920) at action.c:1573
#17 0x0000000000422231 in run_top_route (a=0x7f516c2ef330,
msg=0x7f516b6e7920,
c=0x0) at action.c:1658
#18 0x00007f516b49b220 in run_failure_handlers (t=0x7f506769a3b0,
rpl=0x7f516c3b5df0, code=480, extra_flags=64) at t_reply.c:1024
#19 0x00007f516b49c39b in t_should_relay_response
(Trans=0x7f506769a3b0,
new_code=480, branch=0, should_store=0x7ffff3196f90,
should_relay=0x7ffff3196f94, cancel_data=0x7ffff31971a0,
reply=0x7f516c3b5df0) at t_reply.c:1300
#20 0x00007f516b49dec4 in relay_reply (t=0x7f506769a3b0,
p_msg=0x7f516c3b5df0,
branch=0, msg_status=480, cancel_data=0x7ffff31971a0,
do_put_on_wait=1)
at t_reply.c:1703
#21 0x00007f516b4a0f46 in reply_received (p_msg=0x7f516c3b5df0)
at t_reply.c:2370
#22 0x0000000000458861 in do_forward_reply (msg=0x7f516c3b5df0, mode=0)
at forward.c:799
#23 0x00000000004590d0 in forward_reply (msg=0x7f516c3b5df0) at
forward.c:882
#24 0x000000000049e276 in receive_msg (
buf=0x9065c0 "SIP/2.0 480 Temporarily Unavailable\r\nVia:
SIP/2.0/UDP 55.177.31.199;branch=z9hG4bK25c.fe28da07.0\r\nVia:
SIP/2.0/UDP
192.13.219.87:5060;branch=z9hG4bK-1cc0-521da21e-332440e0-d482ab\r\nRecord-Route:
<sip:6"..., len=862,
rcv_info=0x7ffff3197520) at receive.c:272
#25 0x000000000052ffa1 in udp_rcv_loop () at udp_server.c:557
#26 0x0000000000467de2 in main_loop () at main.c:1638
#27 0x000000000046ad8b in main (argc=13, argv=0x7ffff3197858) at
main.c:2566
I've seen it before in this scenario, but so infrequently that I
didn't think it was worth mentioning.
On 08/28/2013 03:01 AM, Alex Balashov wrote:
Hi Daniel,
With your patch applied (setting param list head to NULL), it now
crashes in a different place:
Program terminated with signal 11, Segmentation fault.
#0 0x000000000055e602 in free_to_params (tb=0x7f31fee421a0)
at parser/parse_to.c:827
827 foo = tp->next;
Missing separate debuginfos, use: debuginfo-install
cyrus-sasl-lib-2.1.23-13.el6_3.1.x86_64 glibc-2.12-1.107.el6.x86_64
keyutils-libs-1.4-4.el6.x86_64 krb5-libs-1.10.3-10.el6_4.2.x86_64
libcom_err-1.41.12-14.el6.x86_64 libselinux-2.0.94-5.3.el6_4.1.x86_64
nspr-4.9.2-1.el6.x86_64 nss-3.14.0.0-12.el6.x86_64
nss-softokn-freebl-3.12.9-11.el6.x86_64 nss-util-3.14.0.0-2.el6.x86_64
openldap-2.4.23-32.el6_4.1.x86_64 openssl-1.0.0-27.el6_4.2.x86_64
postgresql92-libs-9.2.4-1PGDG.rhel6.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) where
#0 0x000000000055e602 in free_to_params (tb=0x7f31fee421a0)
at parser/parse_to.c:827
#1 0x000000000055e658 in free_to (tb=0x7f31fee421a0) at
parser/parse_to.c:838
#2 0x000000000053e2a9 in clean_hdr_field (hf=0x7f31fee23bc0)
at parser/hf.c:113
#3 0x000000000053e51d in free_hdr_field_lst (hf=0x7f31fee20a60)
at parser/hf.c:223
#4 0x0000000000542d04 in free_sip_msg (msg=0x7f31fee40df0)
at parser/msg_parser.c:729
#5 0x000000000049e39d in receive_msg (
buf=0x9065c0 "SIP/2.0 480 Temporarily Unavailable\r\nVia:
SIP/2.0/UDP 55.177.31.199;branch=z9hG4bKbe3a.dab6345.0\r\nVia:
SIP/2.0/UDP
192.13.219.87:5060;branch=z9hG4bK-1a97-521d9f57-331967d3-3174bfdc\r\nRecord-Route:
<sip"..., len=866,
rcv_info=0x7fff34138bd0) at receive.c:296
#6 0x000000000052ffa1 in udp_rcv_loop () at udp_server.c:557
#7 0x0000000000467de2 in main_loop () at main.c:1638
#8 0x000000000046ad8b in main (argc=13, argv=0x7fff34138f08) at
main.c:2566
-- Alex
On 08/27/2013 08:49 AM, Alex Balashov wrote:
> Hi Daniel,
>
> On 08/27/2013 08:47 AM, Daniel-Constantin Mierla wrote:
>
>> Hello,
>>
>> can you try this patch?
>> - http://git.sip-router.org/cgi-bin/gitweb.cgi/sip-router/?a=commit;h=14835f8…
>>
>> One reason for such a crash could be a double free, which could
>> happen because the pointer to params was not reset after freeing the
>> list.
>
> I will certainly try it, thank you.
>
> However, it is curious that this crash occurs only in this exact
> situation, only when calling this PBX, only when it has two registrants
> to fork among, only when I use this combination of request
> routes/subroutines.
>
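
The double-free scenario described in the oldest message of this thread can be sketched roughly as follows. This is only an illustration under assumed names: struct param_sketch, struct to_body_sketch and both functions are hypothetical stand-ins, not the actual parser structures or the code in the patch.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for a parsed header body and its parameters. */
struct param_sketch {
    struct param_sketch *next;
};

struct to_body_sketch {
    struct param_sketch *param_lst;   /* head of the parameter list */
};

/* Push one empty parameter onto the list; returns 0 on success. */
int add_param_sketch(struct to_body_sketch *tb)
{
    struct param_sketch *p = calloc(1, sizeof *p);

    if (p == NULL)
        return -1;
    p->next = tb->param_lst;
    tb->param_lst = p;
    return 0;
}

/* Free every parameter and reset the head pointer.
 * Returns the number of nodes released. */
int free_params_sketch(struct to_body_sketch *tb)
{
    struct param_sketch *tp = tb->param_lst;
    int freed = 0;

    while (tp != NULL) {
        struct param_sketch *foo = tp->next;  /* read next before free */
        free(tp);
        tp = foo;
        freed++;
    }
    /* Without this reset the head keeps pointing at freed memory, and a
     * second cleanup pass would walk and free the same nodes again. */
    tb->param_lst = NULL;
    return freed;
}
```

With the reset in place, calling the cleanup a second time on the same body is a harmless no-op. Note that this only helps when both cleanup passes go through the same structure; a second copy of the pointer held elsewhere would still dangle.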