Hi all,

I posted last month about losing MySQL connections: https://lists.kamailio.org/pipermail/sr-users/2020-December/111389.html

At the time I thought the connections were dying because of low activity and traffic on the proxy. Since then I've migrated a large number of devices to the proxy, with REGISTERs being refreshed every 10 minutes, and the problem still persists.

To try to fix this I've set the timeout_interval and ping_interval parameters of the db_mysql module. The MySQL client on the Kamailio machine is mysql-community-client 5.6.50-2.el7. Kamailio reads from and writes to a remote InnoDB database.
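
For reference, the two parameters go into kamailio.cfg roughly as below. This is only a sketch: the values are illustrative placeholders rather than my exact settings (as far as I know the module defaults are 300 and 2 seconds respectively).

```
loadmodule "db_mysql.so"

# seconds between keepalive pings on an idle DB connection (illustrative value)
modparam("db_mysql", "ping_interval", 60)
# seconds before a connect or query attempt is given up (illustrative value)
modparam("db_mysql", "timeout_interval", 5)
```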

These are the logs I get from Kamailio when the problem appears:

Jan  7 09:43:27 sbc_bbt01_active /usr/local/kamailio-5.4/sbin/kamailio[21735]: ERROR: {1 27880 REGISTER e5f8f7bc-cbb2-40b3-9037-edacd6276a2b} db_mysql [km_dbase.c:123]: db_mysql_submit_query(): driver error on query: Lock wait timeout exceeded; try restarting transaction (1205)
Jan  7 09:43:27 sbc_bbt01_active /usr/local/kamailio-5.4/sbin/kamailio[21735]: ERROR: {1 27880 REGISTER e5f8f7bc-cbb2-40b3-9037-edacd6276a2b} <core> [db_query.c:348]: db_do_update(): error while submitting query
Jan  7 09:43:27 sbc_bbt01_active /usr/local/kamailio-5.4/sbin/kamailio[21735]: ERROR: {1 27880 REGISTER e5f8f7bc-cbb2-40b3-9037-edacd6276a2b} usrloc [ucontact.c:1147]: db_update_ucontact_ruid(): updating database failed
Jan  7 09:43:27 sbc_bbt01_active /usr/local/kamailio-5.4/sbin/kamailio[21735]: ERROR: {1 27880 REGISTER e5f8f7bc-cbb2-40b3-9037-edacd6276a2b} usrloc [ucontact.c:1663]: update_contact_db(): failed to update database
Jan  7 09:43:27 sbc_bbt01_active /usr/local/kamailio-5.4/sbin/kamailio[21735]: ERROR: {1 27880 REGISTER e5f8f7bc-cbb2-40b3-9037-edacd6276a2b} registrar [save.c:784]: update_contacts(): failed to update contact
Jan  7 09:43:27 sbc_bbt01_active /usr/local/kamailio-5.4/sbin/kamailio[21735]: ERROR: {1 27880 REGISTER e5f8f7bc-cbb2-40b3-9037-edacd6276a2b} sl [sl_funcs.c:414]: sl_reply_error(): stateless error reply used: I'm terribly sorry, server error occurred (1/SL)


Originally I had usrloc db_mode set to 3 (DB-Only scheme). To try to mitigate the issue I changed it to 1 (Write-Through scheme), but even then I get the same log errors and a 500 error is still sent to the client. I chose this mode because, as far as I understand, it applies changes directly to the DB but also uses the cache. Please correct me if I'm wrong on that.
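
For clarity, the change amounts to a single usrloc parameter. A minimal sketch follows; the db_url is a made-up placeholder, not my real connection string.

```
loadmodule "usrloc.so"

# placeholder URL: user, password, host and database name are hypothetical
modparam("usrloc", "db_url", "mysql://kamailio:secret@db.example.com/kamailio")
# 1 = Write-Through (previously 3 = DB-Only)
modparam("usrloc", "db_mode", 1)
```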

Has anyone run into this issue before? Is there a way to mitigate it? My only constraint is that the database must always be up to date, since I have an HA setup, so I can't use cache-only modes.

Best Regards,