[sr-dev] Kamailio gets stuck at __read_nocancel during stop

Daniel-Constantin Mierla miconda at gmail.com
Thu Apr 9 18:48:57 CEST 2020


Hello,

Due to the nature of parallel processing with locks/mutexes, a kill of
Kamailio can arrive while a process is holding a lock, which can result
in a deadlock while trying to sync data to the backend. It is rare, but
it can happen; for such cases you can tune the exit_timeout parameter
value.
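
As an illustration only, the backtrace quoted below shows sig_alarm_abort() being called with signo=14 (SIGALRM) while the shutdown code was still blocked in a read. The general pattern, an alarm armed before cleanup that interrupts a blocked call, can be sketched in Python. This is a simplified analogy and not Kamailio's code: the names run_cleanup_with_timeout and blocked_read are hypothetical, and where Kamailio calls abort() to produce a core file, the sketch just raises an exception.

```python
import os
import signal


class CleanupTimeout(Exception):
    """Raised by the alarm handler when cleanup overruns its time budget."""


def _on_alarm(signo, frame):
    # Raising here interrupts whatever blocking call the main thread is in.
    raise CleanupTimeout(f"cleanup still running when signal {signo} fired")


def run_cleanup_with_timeout(cleanup, timeout_s):
    """Run cleanup(); return True if it finished, False if the alarm fired.

    Hypothetical helper: Unix only, and must be called from the main thread.
    """
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(timeout_s)  # arm the watchdog (whole seconds)
    try:
        cleanup()
        return True
    except CleanupTimeout:
        return False
    finally:
        signal.alarm(0)  # disarm the watchdog
        signal.signal(signal.SIGALRM, previous)


def blocked_read():
    """Stand-in for a database read that never gets an answer."""
    read_fd, write_fd = os.pipe()
    try:
        os.read(read_fd, 1)  # blocks: nothing is ever written to the pipe
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

Calling run_cleanup_with_timeout(blocked_read, 1) returns after about one second instead of hanging forever. A longer timeout gives a slow backend more time to answer; a shorter one ends a truly deadlocked shutdown sooner.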

I assume you are using the ims usrloc modules, because I haven't seen
this case happening lately with the standard usrloc module. The latter
does a lock-free sync to the database at shutdown; I am not sure whether
the ims-usrloc module takes a similar approach.

Also, be sure you have enabled dumping of a core file per process/pid.
Another process may be the source of the blocking at shutdown, with its
core file having been overwritten.
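
One way to verify this on Linux is to look at the kernel.core_pattern and kernel.core_uses_pid sysctls described in the core(5) manual page. The sketch below is a minimal check under that assumption. The function names are hypothetical, and it does not cover anything configured in Kamailio itself or in a core dump helper program.

```python
from pathlib import Path


def core_files_are_per_pid(core_pattern, core_uses_pid):
    """Decide whether core file names include the PID.

    Returns True or False for file-based patterns, and None when cores are
    piped to a helper program (the helper then decides the naming).
    """
    pattern = core_pattern.strip()
    if pattern.startswith("|"):
        return None
    if "%p" in pattern or "%P" in pattern:
        return True
    # Without %p in the pattern, the kernel appends ".PID" to the name
    # only when core_uses_pid is nonzero.
    return core_uses_pid.strip() not in ("", "0")


def check_this_host(proc_root="/proc/sys/kernel"):
    """Read the live settings; assumes a Linux host with /proc mounted."""
    root = Path(proc_root)
    return core_files_are_per_pid(
        (root / "core_pattern").read_text(),
        (root / "core_uses_pid").read_text(),
    )
```

If check_this_host() returns False, every crashing process writes to the same file name, so a later core can overwrite the one from the process that actually caused the blocking.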

Cheers,
Daniel

On 09.04.20 12:58, Andrey Deykunov wrote:
>
> Hi,
>
> We're periodically getting core dumps when stopping Kamailio:
>
>
> GNU gdb (Debian 7.12-6) 7.12.0.20161007-git
> Copyright (C) 2016 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-linux-gnu".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>.
> Find the GDB manual and other documentation resources online at:
> <http://www.gnu.org/software/gdb/documentation/>.
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from /var/lib/ums/sbin/kamailio...done.
> [New LWP 9832]
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> Core was generated by `/var/lib/ums/sbin/kamailio -m 2048 -M 12 -P /var/run/kamailio/kamailio.pid -f /'.
> Program terminated with signal SIGABRT, Aborted.
> #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
> #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
> #1  0x00007f79c24e842a in __GI_abort () at abort.c:89
> #2  0x000000000043f92a in sig_alarm_abort (signo=14) at main.c:679
> #3  <signal handler called>
> #4  0x00007f79c258f910 in __read_nocancel () at ../sysdeps/unix/syscall-template.S:84
> #5  0x00007f793cbd7421 in vio_read_buff () from /var/lib/ums/lib/mysql/libmysqlclient.so.16
> #6  0x00007f793cbd8aa0 in my_real_read () from /var/lib/ums/lib/mysql/libmysqlclient.so.16
> #7  0x00007f793cbd87d6 in my_net_read () from /var/lib/ums/lib/mysql/libmysqlclient.so.16
> #8  0x00007f793cbd085d in cli_safe_read () from /var/lib/ums/lib/mysql/libmysqlclient.so.16
> #9  0x00007f793cbd539a in cli_read_query_result () from /var/lib/ums/lib/mysql/libmysqlclient.so.16
> #10 0x00007f793cbd436d in mysql_real_query () from /var/lib/ums/lib/mysql/libmysqlclient.so.16
> #11 0x00007f793ce80c13 in db_mysql_submit_query (_h=0x7f79c1786890, _s=0x7f793f379250 <sql_str>) at km_dbase.c:111
> #12 0x00007f793f16baba in db_do_submit_query (_h=0x7f79c1786890, _query=0x7f793f379250 <sql_str>, submit_query=0x7f793ce801a0 <db_mysql_submit_query>) at db_query.c:58
> #13 0x00007f793f16e114 in db_do_delete (_h=0x7f79c1786890, _k=0x7ffeb3ca8df8, _o=0x0, _v=0x7ffeb3ca8dd0, _n=1, val2str=0x7f793cea8990 <db_mysql_val2str>, submit_query=0x7f793ce801a0 <db_mysql_submit_query>) at db_query.c:300
> #14 0x00007f793ce86c17 in db_mysql_delete (_h=0x7f79c1786890, _k=0x7ffeb3ca8df8, _o=0x0, _v=0x7ffeb3ca8dd0, _n=1) at km_dbase.c:510
> #15 0x00007f793f3cb841 in db_delete_ucontact_ruid (_c=0x7f794186b3c0) at ucontact.c:1552
> #16 0x00007f793f3cd096 in db_delete_ucontact (_c=0x7f794186b3c0) at ucontact.c:1570
> #17 0x00007f793f3ad0c0 in wb_timer (_r=0x7f79411d7330) at urecord.c:401
> #18 0x00007f793f3ab451 in timer_urecord (_r=0x7f79411d7330) at urecord.c:463
> #19 0x00007f793f3978aa in mem_timer_udomain (_d=0x7f7940d9baa0, istart=0, istep=1) at udomain.c:1224
> #20 0x00007f793f3d4608 in synchronize_all_udomains (istart=0, istep=1) at dlist.c:756
> #21 0x00007f793f3a2768 in destroy () at usrloc_mod.c:464
> #22 0x0000000000637e2c in destroy_modules () at core/sr_module.c:746
> #23 0x000000000041c4d8 in cleanup (show_status=1) at main.c:555
> #24 0x0000000000423af7 in shutdown_children (sig=15, show_status=1) at main.c:696
> #25 0x000000000041f081 in handle_sigs () at main.c:727
> #26 0x0000000000432a01 in main_loop () at main.c:1806
> #27 0x000000000043df6f in main (argc=9, argv=0x7ffeb3cac348) at main.c:2802
>
>
> It looks like libmysqlclient gets stuck in the __read_nocancel()
> syscall. The exit_timeout parameter is left at its default (60 sec),
> and the Kamailio version is 5.3.1.
>
> Any ideas?
>
> Thanks,
> Andrey

-- 
Daniel-Constantin Mierla -- www.asipto.com
www.twitter.com/miconda -- www.linkedin.com/in/miconda
