[SR-Users] kamailio with evapi crashing on high volume

Daniel-Constantin Mierla miconda at gmail.com
Fri Sep 7 11:18:31 CEST 2018


Hello,

I am writing to you directly, in case you can grant access to the system.

Otherwise, I will also need the output of 'p *dead_cell' for the core that
has this frame 0:

#0  0x00007f6af0b90b70 in free_cell_helper (dead_cell=0x7f6ab0a6baa8,
silent=0, fname=0x7f6af0c8f630 "timer.c", fline=654) at h_table.c:230
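
For reference, a typical gdb session to collect this from the core file
would look like the following (the binary and core file paths are
illustrative, adjust them to your setup):

```
gdb /usr/local/kamailio/sbin/kamailio /path/to/core
(gdb) bt full
(gdb) frame 0
(gdb) p *dead_cell
```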

Cheers,
Daniel


On 07.09.18 09:56, Jayesh Nambiar wrote:
> Hi Daniel,
> This happens randomly. It is not a testbed, and it has never been
> reproducible on the test server. I can still give you access if you
> wish to analyse the core files and check what's happening here exactly.
> It simply feels like the evapi socket gives up under heavy load.
> Earlier I was running 5.0.2, and with that kamailio would get stuck,
> not sending any async events on the evapi socket. Not even the
> evapi:connection-closed event was triggered. After I upgraded to the
> latest stable version (5.1.5), it at least started crashing. Here are
> two core dumps put on pastebin:
> https://pastebin.com/nn6gJapm
> https://pastebin.com/ph7b8vFH
>
> Thanks for all the support,
>
> - Jayesh
>
> On Fri, Sep 7, 2018 at 1:55 AM Daniel-Constantin Mierla
> <miconda at gmail.com> wrote:
>
>     Hello,
>
>     are you able to reproduce it somehow, or does it just happen randomly?
>
>     Is it on a testbed where I could get access to investigate the
>     core files? If not, I will ask for more details from the cores
>     over email: first 'bt full' for both cores, and 'p *dead_cell'
>     for the second one.
>
>     Cheers,
>     Daniel
>
>     On 05.09.18 17:47, Jayesh Nambiar wrote:
>>     Hi Daniel,
>>     I have got these core dumps. Let me know if I should be doing a
>>     'bt full' as well; I'll pastebin it and send. Thanks,
>>
>>     Core was generated by `/usr/local/kamailio/sbin/kamailio -P
>>     /var/run/siptrunk.pid -f /usr/local/carrie'.
>>     Program terminated with signal SIGSEGV, Segmentation fault.
>>     #0  0x0000000000505d30 in sip_msg_shm_clone
>>     (org_msg=0x7f6ab0d9f618, sip_msg_len=0x7ffdddb2e8bc,
>>     clone_lumps=1) at core/sip_msg_clone.c:491
>>     491    LUMP_LIST_LEN(len, org_msg->add_rm);
>>     (gdb) bt
>>     #0  0x0000000000505d30 in sip_msg_shm_clone
>>     (org_msg=0x7f6ab0d9f618, sip_msg_len=0x7ffdddb2e8bc,
>>     clone_lumps=1) at core/sip_msg_clone.c:491
>>     #1  0x00007f6af0bdf68d in fake_req (shmem_msg=0x7f6ab0d9f618,
>>     extra_flags=0, uac=0x7f6ab1738980, len=0x7ffdddb2e8bc) at
>>     t_reply.c:854
>>     #2  0x00007f6af0c3aa27 in t_continue_helper (hash_index=58039,
>>     label=413633661, rtact=0x7f6af10500f0, cbname=0x0, cbparam=0x0)
>>     at t_suspend.c:293
>>     #3  0x00007f6af0c3eed4 in t_continue (hash_index=58039,
>>     label=413633661, route=0x7f6af10500f0) at t_suspend.c:583
>>     #4  0x00007f6aae4dd010 in w_t_continue (msg=0x7ffdddb2fa60,
>>     idx=0x7f6af1098e90 "8\306\t\361j\177", lbl=0x7f6af1098ff0
>>     "\240\264\t\361j\177", rtn=0x7f6af1099150 "0\275\t\361j\177") at
>>     tmx_mod.c:760
>>     #5  0x000000000045b477 in do_action (h=0x7ffdddb2f850,
>>     a=0x7f6af109ab38, msg=0x7ffdddb2fa60) at core/action.c:1085
>>     #6  0x0000000000467fd5 in run_actions (h=0x7ffdddb2f850,
>>     a=0x7f6af1096630, msg=0x7ffdddb2fa60) at core/action.c:1565
>>     #7  0x000000000045b234 in do_action (h=0x7ffdddb2f850,
>>     a=0x7f6af10a0f80, msg=0x7ffdddb2fa60) at core/action.c:1058
>>     #8  0x0000000000467fd5 in run_actions (h=0x7ffdddb2f850,
>>     a=0x7f6af10a0f80, msg=0x7ffdddb2fa60) at core/action.c:1565
>>     #9  0x0000000000468797 in run_top_route (a=0x7f6af10a0f80,
>>     msg=0x7ffdddb2fa60, c=0x0) at core/action.c:1654
>>     #10 0x00007f6aabe79370 in evapi_run_cfg_route
>>     (evenv=0x7ffdddb30250, rt=3, rtname=0x7f6aac08cb18
>>     <_evapi_rts+56>) at evapi_dispatch.c:161
>>     #11 0x00007f6aabe7f271 in evapi_recv_client (loop=0x7f6aabe698e0,
>>     watcher=0x27af5e0, revents=1) at evapi_dispatch.c:467
>>     #12 0x00007f6aabc5fd73 in ev_invoke_pending () from
>>     /usr/lib/x86_64-linux-gnu/libev.so.4
>>     #13 0x00007f6aabc633de in ev_run () from
>>     /usr/lib/x86_64-linux-gnu/libev.so.4
>>     #14 0x00007f6aabe7867c in ev_loop (loop=0x7f6aabe698e0, flags=0)
>>     at /usr/include/ev.h:835
>>     #15 0x00007f6aabe83fc6 in evapi_run_dispatcher
>>     (laddr=0x7f6af0f72300 "127.0.0.1", lport=8060) at
>>     evapi_dispatch.c:705
>>     #16 0x00007f6aabe6e262 in child_init (rank=0) at evapi_mod.c:213
>>     #17 0x0000000000542cad in init_mod_child (m=0x7f6af0f71b70,
>>     rank=0) at core/sr_module.c:943
>>     #18 0x0000000000542971 in init_mod_child (m=0x7f6af0f72968,
>>     rank=0) at core/sr_module.c:939
>>     #19 0x0000000000542971 in init_mod_child (m=0x7f6af0f73d38,
>>     rank=0) at core/sr_module.c:939
>>     #20 0x0000000000542971 in init_mod_child (m=0x7f6af0f74670,
>>     rank=0) at core/sr_module.c:939
>>     #21 0x0000000000542971 in init_mod_child (m=0x7f6af0f76708,
>>     rank=0) at core/sr_module.c:939
>>     #22 0x0000000000542971 in init_mod_child (m=0x7f6af0f76c08,
>>     rank=0) at core/sr_module.c:939
>>     #23 0x0000000000542971 in init_mod_child (m=0x7f6af0f770d0,
>>     rank=0) at core/sr_module.c:939
>>     #24 0x0000000000542971 in init_mod_child (m=0x7f6af0f77cf0,
>>     rank=0) at core/sr_module.c:939
>>     #25 0x0000000000542971 in init_mod_child (m=0x7f6af0f78808,
>>     rank=0) at core/sr_module.c:939
>>     #26 0x0000000000542971 in init_mod_child (m=0x7f6af0f78bd8,
>>     rank=0) at core/sr_module.c:939
>>     #27 0x0000000000542971 in init_mod_child (m=0x7f6af0f794c8,
>>     rank=0) at core/sr_module.c:939
>>     #28 0x0000000000542971 in init_mod_child (m=0x7f6af0f79920,
>>     rank=0) at core/sr_module.c:939
>>     #29 0x0000000000542971 in init_mod_child (m=0x7f6af0f7a330,
>>     rank=0) at core/sr_module.c:939
>>     #30 0x0000000000542971 in init_mod_child (m=0x7f6af0f7afd0,
>>     rank=0) at core/sr_module.c:939
>>     #31 0x0000000000542971 in init_mod_child (m=0x7f6af0f7bc80,
>>     rank=0) at core/sr_module.c:939
>>     #32 0x000000000054303d in init_child (rank=0) at core/sr_module.c:970
>>     #33 0x0000000000425399 in main_loop () at main.c:1701
>>     #34 0x000000000042bd5c in main (argc=13, argv=0x7ffdddb31088) at
>>     main.c:2638
>>
>>     And this:
>>     [New LWP 15804]
>>     [Thread debugging using libthread_db enabled]
>>     Using host libthread_db library
>>     "/lib/x86_64-linux-gnu/libthread_db.so.1".
>>     Core was generated by `/usr/local/kamailio/sbin/kamailio -P
>>     /var/run/siptrunk.pid -f /usr/local/carrie'.
>>     Program terminated with signal SIGSEGV, Segmentation fault.
>>     #0  0x00007f6af0b90b70 in free_cell_helper
>>     (dead_cell=0x7f6ab0a6baa8, silent=0, fname=0x7f6af0c8f630
>>     "timer.c", fline=654) at h_table.c:230
>>     230    foo = tt->next;
>>     (gdb) bt
>>     #0  0x00007f6af0b90b70 in free_cell_helper
>>     (dead_cell=0x7f6ab0a6baa8, silent=0, fname=0x7f6af0c8f630
>>     "timer.c", fline=654) at h_table.c:230
>>     #1  0x00007f6af0c24409 in wait_handler (ti=932640643,
>>     wait_tl=0x7f6ab0a6bb28, data=0x7f6ab0a6baa8) at timer.c:654
>>     #2  0x00000000004bb445 in timer_list_expire (t=932640643,
>>     h=0x7f6ab03ad158, slow_l=0x7f6ab03ae480, slow_mark=271) at
>>     core/timer.c:874
>>     #3  0x00000000004bb8ab in timer_handler () at core/timer.c:939
>>     #4  0x00000000004bbd30 in timer_main () at core/timer.c:978
>>     #5  0x00000000004250f9 in main_loop () at main.c:1691
>>     #6  0x000000000042bd5c in main (argc=13, argv=0x7ffdddb31088) at
>>     main.c:2638
>>
>>     On Wed, Sep 5, 2018 at 3:13 PM Daniel-Constantin Mierla
>>     <miconda at gmail.com> wrote:
>>
>>         Hello,
>>
>>         the backtrace doesn't show any hint about kamailio itself,
>>         only the frames from closelog() up.
>>
>>         It may be the core generated by the shutdown procedure.
>>         Have you enabled one core file per pid/process? If not, do
>>         that and reproduce the issue again; you may get two core
>>         files, one for the runtime issue and one from the shutdown
>>         procedure, the latter likely being an effect of the former.
>>         The one from the runtime is more relevant.
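>>
>>         One way to do this on Linux (a sketch, to be run as root;
>>         the target path in the pattern is illustrative):
>>
>>         ```
>>         ulimit -c unlimited              # allow core dumps at all
>>         sysctl -w kernel.core_uses_pid=1 # append the pid to the name
>>         # or encode program name and pid in the pattern:
>>         sysctl -w kernel.core_pattern=/tmp/core.%e.%p
>>         ```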
>>
>>         Cheers,
>>         Daniel
>>
>>
>>         On 05.09.18 10:09, Jayesh Nambiar wrote:
>>>         Hello,
>>>         I'm using kamailio 5.1.5 with evapi. I have a node.js
>>>         application connecting to the kamailio evapi socket, to
>>>         which I send events and from which I also consume the
>>>         events that drive my routing. I have 8 evapi workers
>>>         defined in the config.
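>>>         For context, the evapi side of the config is along these
>>>         lines (a sketch; the bind address matches the
>>>         127.0.0.1:8060 that appears in the backtrace below, the
>>>         exact parameter values are from my setup):
>>>
>>>         ```
>>>         loadmodule "evapi.so"
>>>         modparam("evapi", "workers", 8)
>>>         modparam("evapi", "bind_addr", "127.0.0.1:8060")
>>>         ```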
>>>         The problem is that kamailio randomly crashes under high
>>>         load. I'm assuming it is related to the evapi module, as
>>>         the rest of the config is pretty straightforward. I could
>>>         get a core file, and here's the core dump:
>>>         [New LWP 14042]
>>>         [Thread debugging using libthread_db enabled]
>>>         Using host libthread_db library
>>>         "/lib/x86_64-linux-gnu/libthread_db.so.1".
>>>         Core was generated by `/usr/local/kamailio/sbin/kamailio -P
>>>         /var/run/siptrunk.pid -f /usr/local/carrie'.
>>>         Program terminated with signal SIGABRT, Aborted.
>>>         #0  0x00007f9995283428 in __GI_raise (sig=sig at entry=6) at
>>>         ../sysdeps/unix/sysv/linux/raise.c:54
>>>         54    ../sysdeps/unix/sysv/linux/raise.c: No such file or
>>>         directory.
>>>         (gdb) bt
>>>         #0  0x00007f9995283428 in __GI_raise (sig=sig at entry=6) at
>>>         ../sysdeps/unix/sysv/linux/raise.c:54
>>>         #1  0x00007f999528502a in __GI_abort () at abort.c:89
>>>         #2  0x000000000041a029 in sig_alarm_abort (signo=14) at
>>>         main.c:646
>>>         #3  <signal handler called>
>>>         #4  0x00007f999534f497 in __libc_cleanup_routine
>>>         (f=<optimized out>) at ../sysdeps/nptl/libc-lockP.h:291
>>>         #5  closelog () at ../misc/syslog.c:415
>>>         #6  0x0000000000000000 in ?? ()
>>>
>>>         Any help in this regard would allow me to identify the
>>>         reason for the crash. Thanks for the support.
>>>
>>>         - Jayesh
>>>
>>>
>>>         _______________________________________________
>>>         Kamailio (SER) - Users Mailing List
>>>         sr-users at lists.kamailio.org
>>>         https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
>>
>>
>
>

-- 
Daniel-Constantin Mierla -- www.asipto.com
www.twitter.com/miconda -- www.linkedin.com/in/miconda
Kamailio World Conference -- www.kamailioworld.com
Kamailio Advanced Training, Nov 12-14, 2018, in Berlin -- www.asipto.com


