[SR-Users] Timer child process losing MySQL connections

Tobias Lindgren the_fx at hotmail.com
Sun Jun 17 21:13:41 CEST 2018


Hi Henning,

I understand that the timer process has fairly little to do, at least normally. I've been using strace to verify that the 10-second queries are actually being made on both of my connections in the timer process. However, I'm still seeing the same issue with dropped connections, and I'm not convinced the behaviour has changed at all.

What confuses me is that the MySQL connection code in Kamailio seems to support auto reconnect, and it is enabled. If this were a firewall issue, I would expect the other child processes to lose their connections as well. So my guess is that they actually do, and that they are able to reconnect properly (I will try to confirm this tomorrow). But the timer process, for some reason, is not, which is very strange.

There is a firewall involved, and I've been trying to find something timeout-related there, but that is also confusing, as the connections can drop after 10 minutes or after 10 hours. I'll revisit that thought, though. Also, your idea of starting a new Kamailio with a limited config is a very good one; I'll give that a try.
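
To see whether the drops cluster around one fixed interval (which would point at an idle timeout) or are spread out, the gaps between error bursts in syslog could be computed. This is only a rough sketch, assuming the syslog line format from my original report; the names drop_times and gaps, the 5-minute burst window and the year argument are my own choices for illustration, not anything Kamailio provides.

```python
import re
from datetime import datetime, timedelta

# Matches syslog lines like the ones in the original report, e.g.
# "Jun 15 09:39:12 /usr/sbin/kamailio[10439]: ERROR: db_mysql ... Can't connect to MySQL server on 'xxx' (4) (2003)"
ERROR_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s.*kamailio\[\d+\]:"
    r".*Can't connect to MySQL server"
)


def drop_times(lines, year, quiet=timedelta(minutes=5)):
    """Return the start time of each burst of connect errors.

    Syslog timestamps carry no year, so the caller supplies one.
    Errors closer together than `quiet` count as the same burst.
    """
    starts = []
    last_seen = None
    for line in lines:
        match = ERROR_RE.match(line)
        if not match:
            continue
        ts = datetime.strptime(
            "%d %s" % (year, match.group(1)), "%Y %b %d %H:%M:%S"
        )
        if last_seen is None or ts - last_seen >= quiet:
            starts.append(ts)
        last_seen = ts
    return starts


def gaps(starts):
    """Return the time between consecutive bursts."""
    return [later - earlier for earlier, later in zip(starts, starts[1:])]


# Usage (the log path is an assumption, adjust to the local syslog setup):
#   with open("/var/log/syslog") as fh:
#       for gap in gaps(drop_times(fh, 2018)):
#           print(gap)
```

If the printed gaps are all close to one value, a timeout somewhere on the path is the likely cause; widely scattered gaps would argue against a simple idle timeout.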

Thanks!
/Tobias
________________________________
From: Henning Westerholt <hw at kamailio.org>
Sent: Sunday, June 17, 2018 8:08 PM
To: sr-users at lists.kamailio.org
Cc: Tobias Lindgren
Subject: Re: [SR-Users] Timer child process losing MySQL connections


On Friday, 15 June 2018, 12:32:25 CEST, Tobias Lindgren wrote:

> Having an issue with MySQL DB connections being dropped in a system running
> 4.4.7.

>

> We're using the db_mysql and db_cluster modules to set up a cluster connecting
> two different DB servers. We have two cluster connections, one for acc and one
> for "other queries". One DB (A) is on the same network, the other DB (B) is
> on another network. The default DB connection is for the remote server B.
> Auto reconnect is enabled.

>

> The specific issue is that the "timer" child process loses/drops both
> connections, to DB A and to DB B. Looking at the output from lsof when this
> happens, the connections to A and B usually do not drop at the same time.
> Sometimes the connections stay up for ~24h, sometimes for 10 minutes, but
> normally the problem recurs every 6 hours or so. We're seeing this problem
> on two Kamailio servers, both handling a fairly high number of calls.

>

> None of the other Kamailio child processes seem to get their connections
> dropped, only the "timer" process. To recover from this we need to restart
> Kamailio.

>

> Recently I've added the timer.so module to make a simple query on each cluster
> connection every 10 seconds.

>

> This is an example of the output when the problem appears and connections are
> dropped:
> Jun 15 09:39:12 /usr/sbin/kamailio[10439]: ERROR: db_mysql [km_dbase.c:128]: db_mysql_submit_query(): driver error on query: Can't connect to MySQL server on 'xxx' (4) (2003)
> Jun 15 09:39:12 /usr/sbin/kamailio[10439]: ERROR: <core> [db_query.c:181]: db_do_raw_query(): error while submitting query
> Jun 15 09:39:12 /usr/sbin/kamailio[10439]: ERROR: db_mysql [km_dbase.c:128]:
> [looking for ideas..]



Hello Tobias,



The timer process is the one that does not do any "heavy work" during SIP message processing. It is mostly concerned with cleanup and maintenance tasks, e.g. usrloc user deletion, if you use this. Does the simple timer query every 10s change the behavior for you?



I recently observed a similar issue (with a different code base). After long debugging we found that the firewall at the network border had some issues and was terminating these long-running TCP sessions. Changing the respective timeout fixed the issue. As you see this issue roughly every 6h, it could be a similar problem.



If there is no firewall or other network element between the hosts, I would try to reproduce this with e.g. a simple Python script that connects to the DB and sleeps periodically. You could even just start a second Kamailio with a limited children count and attach to it with strace or a debugger.
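
Such a script could look roughly like the sketch below. It holds one connection open, runs a trivial query between sleeps and never reconnects, so a dropped session shows up as a failed round. The function name probe and its parameters are made up for illustration; it accepts any Python DB-API connection factory, and the MySQL usage in the trailing comment assumes the third-party pymysql driver with placeholder host and credentials.

```python
import time
from datetime import datetime


def probe(connect, interval_s, rounds, sleep=time.sleep):
    """Hold one DB-API connection open and run SELECT 1 between sleeps.

    `connect` is a callable returning a DB-API connection.
    Returns a list of (round_index, error_text) for every failed query.
    The connection is deliberately never re-opened.
    """
    conn = connect()
    failures = []
    for i in range(rounds):
        try:
            cur = conn.cursor()
            cur.execute("SELECT 1")
            cur.fetchall()
            cur.close()
            status = "ok"
        except Exception as exc:  # record any driver error and keep going
            failures.append((i, str(exc)))
            status = "FAILED: %s" % exc
        stamp = datetime.now().isoformat(timespec="seconds")
        print("%s round %d %s" % (stamp, i, status))
        sleep(interval_s)
    try:
        conn.close()
    except Exception:
        pass  # some drivers raise when closing an already dead connection
    return failures


# Usage against MySQL (assumes pymysql is installed; host, user, password
# and database are placeholders):
#   import pymysql
#   probe(lambda: pymysql.connect(host="db-b.example", user="probe",
#                                 password="secret", database="kamailio"),
#         interval_s=600, rounds=144)
```

Running it with different sleep intervals from the Kamailio host should show whether an idle session across the firewall survives, independent of Kamailio itself.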



Best regards,



Henning



--

If you like my work in the Kamailio project, it would be great if you could consider supporting me on Patreon: https://www.patreon.com/henningw