[SR-Users] kamailio with evapi crashing on high volume
Daniel-Constantin Mierla
miconda at gmail.com
Fri Sep 7 11:18:31 CEST 2018
Hello,
I am writing directly to you in case you can grant access to the system.
Otherwise, I will also need the output of 'p *dead_cell' for the core that
has the frame 0:
#0 0x00007f6af0b90b70 in free_cell_helper (dead_cell=0x7f6ab0a6baa8,
silent=0, fname=0x7f6af0c8f630 "timer.c", fline=654) at h_table.c:230
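For reference, the requested values can be collected non-interactively; a minimal gdb invocation (the core file path is a placeholder, adjust it to the actual file on the system) could look like:

```
# Run gdb in batch mode against the kamailio binary and the core file.
# 'bt full' prints the backtrace with local variables,
# 'frame 0' selects the crashing frame,
# 'p *dead_cell' dumps the tm cell being freed there.
gdb -batch -ex 'bt full' -ex 'frame 0' -ex 'p *dead_cell' \
    /usr/local/kamailio/sbin/kamailio /path/to/core
```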
Cheers,
Daniel
On 07.09.18 09:56, Jayesh Nambiar wrote:
> Hi Daniel,
> This happens randomly. It is not a testbed, and the issue has never
> been reproducible on the test server. I can still give you access if
> you wish to analyse the core files and check what is happening here
> exactly.
> It simply feels like the evapi socket gives up under heavy load.
> Earlier I was running 5.0.2, and with that version kamailio would get
> stuck, not sending any async events on the evapi socket; not even the
> evapi:connection-closed event was triggered. After I upgraded to the
> latest stable version (5.1.5), it at least started crashing. Here are
> two core dumps put on pastebin:
> https://pastebin.com/nn6gJapm
> https://pastebin.com/ph7b8vFH
>
> Thanks for all the support,
>
> - Jayesh
>
> On Fri, Sep 7, 2018 at 1:55 AM Daniel-Constantin Mierla
> <miconda at gmail.com> wrote:
>
> Hello,
>
> are you able to reproduce it somehow, or does it just happen randomly?
>
> Is it on a testbed where I could get access to investigate the
> corefiles? If not, I will ask for more details from the cores over
> email; the first would be 'bt full' for both cores and 'p
> *dead_cell' for the second one.
>
> Cheers,
> Daniel
>
> On 05.09.18 17:47, Jayesh Nambiar wrote:
>> Hi Daniel,
>> I have got these core dumps. Let me know if I should also do a 'bt
>> full'; I'll pastebin and send it. Thanks,
>>
>> Core was generated by `/usr/local/kamailio/sbin/kamailio -P
>> /var/run/siptrunk.pid -f /usr/local/carrie'.
>> Program terminated with signal SIGSEGV, Segmentation fault.
>> #0 0x0000000000505d30 in sip_msg_shm_clone
>> (org_msg=0x7f6ab0d9f618, sip_msg_len=0x7ffdddb2e8bc,
>> clone_lumps=1) at core/sip_msg_clone.c:491
>> 491     LUMP_LIST_LEN(len, org_msg->add_rm);
>> (gdb) bt
>> #0 0x0000000000505d30 in sip_msg_shm_clone
>> (org_msg=0x7f6ab0d9f618, sip_msg_len=0x7ffdddb2e8bc,
>> clone_lumps=1) at core/sip_msg_clone.c:491
>> #1 0x00007f6af0bdf68d in fake_req (shmem_msg=0x7f6ab0d9f618,
>> extra_flags=0, uac=0x7f6ab1738980, len=0x7ffdddb2e8bc) at
>> t_reply.c:854
>> #2 0x00007f6af0c3aa27 in t_continue_helper (hash_index=58039,
>> label=413633661, rtact=0x7f6af10500f0, cbname=0x0, cbparam=0x0)
>> at t_suspend.c:293
>> #3 0x00007f6af0c3eed4 in t_continue (hash_index=58039,
>> label=413633661, route=0x7f6af10500f0) at t_suspend.c:583
>> #4 0x00007f6aae4dd010 in w_t_continue (msg=0x7ffdddb2fa60,
>> idx=0x7f6af1098e90 "8\306\t\361j\177", lbl=0x7f6af1098ff0
>> "\240\264\t\361j\177", rtn=0x7f6af1099150 "0\275\t\361j\177") at
>> tmx_mod.c:760
>> #5 0x000000000045b477 in do_action (h=0x7ffdddb2f850,
>> a=0x7f6af109ab38, msg=0x7ffdddb2fa60) at core/action.c:1085
>> #6 0x0000000000467fd5 in run_actions (h=0x7ffdddb2f850,
>> a=0x7f6af1096630, msg=0x7ffdddb2fa60) at core/action.c:1565
>> #7 0x000000000045b234 in do_action (h=0x7ffdddb2f850,
>> a=0x7f6af10a0f80, msg=0x7ffdddb2fa60) at core/action.c:1058
>> #8 0x0000000000467fd5 in run_actions (h=0x7ffdddb2f850,
>> a=0x7f6af10a0f80, msg=0x7ffdddb2fa60) at core/action.c:1565
>> #9 0x0000000000468797 in run_top_route (a=0x7f6af10a0f80,
>> msg=0x7ffdddb2fa60, c=0x0) at core/action.c:1654
>> #10 0x00007f6aabe79370 in evapi_run_cfg_route
>> (evenv=0x7ffdddb30250, rt=3, rtname=0x7f6aac08cb18
>> <_evapi_rts+56>) at evapi_dispatch.c:161
>> #11 0x00007f6aabe7f271 in evapi_recv_client (loop=0x7f6aabe698e0,
>> watcher=0x27af5e0, revents=1) at evapi_dispatch.c:467
>> #12 0x00007f6aabc5fd73 in ev_invoke_pending () from
>> /usr/lib/x86_64-linux-gnu/libev.so.4
>> #13 0x00007f6aabc633de in ev_run () from
>> /usr/lib/x86_64-linux-gnu/libev.so.4
>> #14 0x00007f6aabe7867c in ev_loop (loop=0x7f6aabe698e0, flags=0)
>> at /usr/include/ev.h:835
>> #15 0x00007f6aabe83fc6 in evapi_run_dispatcher
>> (laddr=0x7f6af0f72300 "127.0.0.1", lport=8060) at
>> evapi_dispatch.c:705
>> #16 0x00007f6aabe6e262 in child_init (rank=0) at evapi_mod.c:213
>> #17 0x0000000000542cad in init_mod_child (m=0x7f6af0f71b70,
>> rank=0) at core/sr_module.c:943
>> #18 0x0000000000542971 in init_mod_child (m=0x7f6af0f72968,
>> rank=0) at core/sr_module.c:939
>> #19 0x0000000000542971 in init_mod_child (m=0x7f6af0f73d38,
>> rank=0) at core/sr_module.c:939
>> #20 0x0000000000542971 in init_mod_child (m=0x7f6af0f74670,
>> rank=0) at core/sr_module.c:939
>> #21 0x0000000000542971 in init_mod_child (m=0x7f6af0f76708,
>> rank=0) at core/sr_module.c:939
>> #22 0x0000000000542971 in init_mod_child (m=0x7f6af0f76c08,
>> rank=0) at core/sr_module.c:939
>> #23 0x0000000000542971 in init_mod_child (m=0x7f6af0f770d0,
>> rank=0) at core/sr_module.c:939
>> #24 0x0000000000542971 in init_mod_child (m=0x7f6af0f77cf0,
>> rank=0) at core/sr_module.c:939
>> #25 0x0000000000542971 in init_mod_child (m=0x7f6af0f78808,
>> rank=0) at core/sr_module.c:939
>> #26 0x0000000000542971 in init_mod_child (m=0x7f6af0f78bd8,
>> rank=0) at core/sr_module.c:939
>> #27 0x0000000000542971 in init_mod_child (m=0x7f6af0f794c8,
>> rank=0) at core/sr_module.c:939
>> #28 0x0000000000542971 in init_mod_child (m=0x7f6af0f79920,
>> rank=0) at core/sr_module.c:939
>> #29 0x0000000000542971 in init_mod_child (m=0x7f6af0f7a330,
>> rank=0) at core/sr_module.c:939
>> #30 0x0000000000542971 in init_mod_child (m=0x7f6af0f7afd0,
>> rank=0) at core/sr_module.c:939
>> #31 0x0000000000542971 in init_mod_child (m=0x7f6af0f7bc80,
>> rank=0) at core/sr_module.c:939
>> #32 0x000000000054303d in init_child (rank=0) at core/sr_module.c:970
>> #33 0x0000000000425399 in main_loop () at main.c:1701
>> #34 0x000000000042bd5c in main (argc=13, argv=0x7ffdddb31088) at
>> main.c:2638
>>
>> And this:
>> [New LWP 15804]
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library
>> "/lib/x86_64-linux-gnu/libthread_db.so.1".
>> Core was generated by `/usr/local/kamailio/sbin/kamailio -P
>> /var/run/siptrunk.pid -f /usr/local/carrie'.
>> Program terminated with signal SIGSEGV, Segmentation fault.
>> #0 0x00007f6af0b90b70 in free_cell_helper
>> (dead_cell=0x7f6ab0a6baa8, silent=0, fname=0x7f6af0c8f630
>> "timer.c", fline=654) at h_table.c:230
>> 230     foo = tt->next;
>> (gdb) bt
>> #0 0x00007f6af0b90b70 in free_cell_helper
>> (dead_cell=0x7f6ab0a6baa8, silent=0, fname=0x7f6af0c8f630
>> "timer.c", fline=654) at h_table.c:230
>> #1 0x00007f6af0c24409 in wait_handler (ti=932640643,
>> wait_tl=0x7f6ab0a6bb28, data=0x7f6ab0a6baa8) at timer.c:654
>> #2 0x00000000004bb445 in timer_list_expire (t=932640643,
>> h=0x7f6ab03ad158, slow_l=0x7f6ab03ae480, slow_mark=271) at
>> core/timer.c:874
>> #3 0x00000000004bb8ab in timer_handler () at core/timer.c:939
>> #4 0x00000000004bbd30 in timer_main () at core/timer.c:978
>> #5 0x00000000004250f9 in main_loop () at main.c:1691
>> #6 0x000000000042bd5c in main (argc=13, argv=0x7ffdddb31088) at
>> main.c:2638
>>
>> On Wed, Sep 5, 2018 at 3:13 PM Daniel-Constantin Mierla
>> <miconda at gmail.com> wrote:
>>
>> Hello,
>>
>> the backtrace doesn't show any hint about kamailio, only frames from
>> closelog() up.
>>
>> It may be the core generated by the shutdown procedure. Have you
>> enabled one core file per pid/process? If not, do that and reproduce
>> the issue again; you may get two core files, one from the runtime
>> issue and one from the shutdown procedure, the latter likely being
>> an effect of the former. The one from the runtime is more relevant.
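As a sketch, enabling per-process core files on Linux can be done by including the PID in the kernel core pattern (the /var/cores directory here is an assumption; any writable path works):

```
# Allow unlimited core dump size for the shell that starts kamailio
ulimit -c unlimited

# Name core files with the executable name (%e) and PID (%p) so each
# crashing process writes its own file instead of overwriting one name
sysctl -w kernel.core_pattern=/var/cores/core.%e.%p
```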
>>
>> Cheers,
>> Daniel
>>
>>
>> On 05.09.18 10:09, Jayesh Nambiar wrote:
>>> Hello,
>>> I'm using kamailio 5.1.5 with evapi. I have a node.js application
>>> connecting to kamailio's evapi, to which I send events and from
>>> which I also consume events, based on which I do the routing. I
>>> have 8 evapi workers defined in the config.
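For context, the evapi side of such a setup could look roughly like the following kamailio.cfg sketch; the 127.0.0.1:8060 address matches the evapi_run_dispatcher() frame in the backtrace earlier in this thread, the workers value matches the description above, and the xlog lines are illustrative assumptions:

```
loadmodule "evapi.so"

# local address the evapi dispatcher listens on
modparam("evapi", "bind_addr", "127.0.0.1:8060")
# number of evapi worker processes
modparam("evapi", "workers", 8)

event_route[evapi:connection-new] {
    xlog("new evapi connection from $evapi(srcaddr)\n");
}

event_route[evapi:message-received] {
    xlog("evapi message received: $evapi(msg)\n");
}
```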
>>> The problem is that kamailio randomly crashes under high load. I'm
>>> assuming it is related to the evapi module, as the rest of the
>>> config is pretty straightforward. I could get a core file, and
>>> here is the core dump:
>>> [New LWP 14042]
>>> [Thread debugging using libthread_db enabled]
>>> Using host libthread_db library
>>> "/lib/x86_64-linux-gnu/libthread_db.so.1".
>>> Core was generated by `/usr/local/kamailio/sbin/kamailio -P
>>> /var/run/siptrunk.pid -f /usr/local/carrie'.
>>> Program terminated with signal SIGABRT, Aborted.
>>> #0 0x00007f9995283428 in __GI_raise (sig=sig@entry=6) at
>>> ../sysdeps/unix/sysv/linux/raise.c:54
>>> 54 ../sysdeps/unix/sysv/linux/raise.c: No such file or
>>> directory.
>>> (gdb) bt
>>> #0 0x00007f9995283428 in __GI_raise (sig=sig@entry=6) at
>>> ../sysdeps/unix/sysv/linux/raise.c:54
>>> #1 0x00007f999528502a in __GI_abort () at abort.c:89
>>> #2 0x000000000041a029 in sig_alarm_abort (signo=14) at
>>> main.c:646
>>> #3 <signal handler called>
>>> #4 0x00007f999534f497 in __libc_cleanup_routine
>>> (f=<optimized out>) at ../sysdeps/nptl/libc-lockP.h:291
>>> #5 closelog () at ../misc/syslog.c:415
>>> #6 0x0000000000000000 in ?? ()
>>>
>>> Any help in this regard would allow me to identify the reason for
>>> the crash. Thanks for the support.
>>>
>>> - Jayesh
>>>
>>>
>>> _______________________________________________
>>> Kamailio (SER) - Users Mailing List
>>> sr-users at lists.kamailio.org
>>> https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
>>
>>
>
>
--
Daniel-Constantin Mierla -- www.asipto.com
www.twitter.com/miconda -- www.linkedin.com/in/miconda
Kamailio World Conference -- www.kamailioworld.com
Kamailio Advanced Training, Nov 12-14, 2018, in Berlin -- www.asipto.com