K reported during about 90 sec period that it is out of shared memory:
Feb 28 09:47:28 rox1 /usr/bin/sip-proxy[19725]: ERROR: tm [t_hooks.c:136]: insert_tmcb(): out of shm. mem Feb 28 09:47:28 rox1 /usr/bin/sip-proxy[19725]: ERROR: acc [acc_logic.c:394]: acc_onreq(): cannot register additional callbacks Feb 28 09:47:28 rox1 /usr/bin/sip-proxy[19725]: ERROR: <core> [core/sip_msg_clone.c:499]: sip_msg_shm_clone(): could not allocate shared memory from shm pool Feb 28 09:47:28 rox1 /usr/bin/sip-proxy[19725]: ERROR: tm [t_lookup.c:1293]: new_t(): out of mem: Feb 28 09:47:28 rox1 /usr/bin/sip-proxy[19725]: ERROR: tm [t_lookup.c:1439]: t_newtran(): new_t failed ... Feb 28 09:47:29 rox1 /usr/bin/sip-proxy[19725]: ERROR: <core> [core/mem/q_malloc.c:297]: qm_find_free(): qm_find_free(0x7f1f2a506000, 1232); Free fragment not found! Feb 28 09:47:29 rox1 /usr/bin/sip-proxy[19725]: ERROR: <core> [core/mem/q_malloc.c:434]: qm_malloc(): qm_malloc(0x7f1f2a506000, 1232); Free fragment not found! Feb 28 09:47:29 rox1 /usr/bin/sip-proxy[19725]: ERROR: tm [t_reply.c:1957]: relay_reply(): cannot alloc reply shmem Feb 28 09:47:29 rox1 /usr/bin/sip-proxy[19725]: ERROR: <core> [core/sip_msg_clone.c:499]: sip_msg_shm_clone(): could not allocate shared memory from shm pool Feb 28 09:47:29 rox1 /usr/bin/sip-proxy[19725]: ERROR: acc [acc_logic.c:562]: acc_onreply(): failed to clone the request - acc aborted ... Feb 28 09:48:51 rox1 /usr/bin/sip-proxy[19724]: ERROR: <core> [core/mem/q_malloc.c:297]: qm_find_free(): qm_find_free(0x7f1f2a506000, 5728); Free fragment not found! Feb 28 09:48:51 rox1 /usr/bin/sip-proxy[19724]: ERROR: <core> [core/mem/q_malloc.c:434]: qm_malloc(): qm_malloc(0x7f1f2a506000, 5728); Free fragment not found! Feb 28 09:48:51 rox1 /usr/bin/sip-proxy[19724]: ERROR: tm [t_lookup.c:1293]: new_t(): out of mem: Feb 28 09:48:51 rox1 /usr/bin/sip-proxy[19724]: ERROR: tm [t_lookup.c:1439]: t_newtran(): new_t failed
And after that period it started working normally again and core.shmmem showed:
{ total: 134217728 free: 124173808 used: 9530960 real_used: 10043920 max_used: 134217728 fragments: 349 }
As can be seen, shm had been full, but normally only about 10% of it is in use.
Syslog does not show any traffic spikes or other unusual activity before the memory got full.
Any ideas what could cause such a high memory usage or could there be a bug in shm management?
-- Juha
The times we've seen this is when transactions are waiting on something so they pile up consuming shared memory. Do you have any database lookups or calls out to external services or scripts?
Long timeouts can also contribute if something stops responding because transactions are waiting for a long timeout to expire.
It's typically a balance between setting reasonable timeouts and allocating enough shm. In addition we implemented some watcher scripts that monitor shm and will set gflags to disable non-critical external calls beyond a certain threshold as well as send us an alert.
John Petrini writes:
The times we've seen this is when transactions are waiting on something so they pile up consuming shared memory. Do you have any database lookups or calls out to external services or scripts?
John,
Thanks for your reply. Yes, there are MySQL operations both during request/reply processing and also accounting to db. I would assume that any problems with those should show up in /var/log/mysql/error.log.
Long timeouts can also contribute if something stops responding because transactions are waiting for a long timeout to expire.
There is quite long "fr_inv_timer" value, that could cause problems is lots of INVITEs are ringing at the same time.
It's typically a balance between setting reasonable timeouts and allocating enough shm. In addition we implemented some watcher scripts that monitor shm and will set gflags to disable non-critical external calls beyond a certain threshold as well as send us an alert.
I also created an external script to watch shm memory usage. I would have liked K to watch itself, but could not find pseudo variables corresponding to core.shmmem entries. Do they exist?
-- Juha
Maybe not the nicest solution, but you can use (from inside the routing script) the jsonrpc_exec() to get stats and grab the shmem usage, and build your logic there... if this do that...
On Sun, Mar 1, 2020 at 12:54 Juha Heinanen jh@tutpro.com wrote:
John Petrini writes:
The times we've seen this is when transactions are waiting on something
so
they pile up consuming shared memory. Do you have any database lookups or calls out to external services or scripts?
John,
Thanks for your reply. Yes, there are MySQL operations both during request/reply processing and also accounting to db. I would assume that any problems with those should show up in /var/log/mysql/error.log.
Long timeouts can also contribute if something stops responding because transactions are waiting for a long timeout to expire.
There is quite long "fr_inv_timer" value, that could cause problems is lots of INVITEs are ringing at the same time.
It's typically a balance between setting reasonable timeouts and
allocating
enough shm. In addition we implemented some watcher scripts that monitor shm and will set gflags to disable non-critical external calls beyond a certain threshold as well as send us an alert.
I also created an external script to watch shm memory usage. I would have liked K to watch itself, but could not find pseudo variables corresponding to core.shmmem entries. Do they exist?
-- Juha
Kamailio (SER) - Users Mailing List sr-users@lists.kamailio.org https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
On 01.03.20 21:53, Juha Heinanen wrote:
John Petrini writes:
The times we've seen this is when transactions are waiting on something so they pile up consuming shared memory. Do you have any database lookups or calls out to external services or scripts?
John,
Thanks for your reply. Yes, there are MySQL operations both during request/reply processing and also accounting to db. I would assume that any problems with those should show up in /var/log/mysql/error.log.
Failed queries should be reported to error log by kamailio as well.
On the other hand, Kamailio has some global parameters that enable logging messages for slow db queries. Maybe you can enable them to see if any query takes very long.
Long timeouts can also contribute if something stops responding because transactions are waiting for a long timeout to expire.
There is quite long "fr_inv_timer" value, that could cause problems is lots of INVITEs are ringing at the same time.
It's typically a balance between setting reasonable timeouts and allocating enough shm. In addition we implemented some watcher scripts that monitor shm and will set gflags to disable non-critical external calls beyond a certain threshold as well as send us an alert.
I also created an external script to watch shm memory usage. I would have liked K to watch itself, but could not find pseudo variables corresponding to core.shmmem entries. Do they exist?
The $stat(name) can be used for any internal statistic. I think it takes only the name of statistic, not the group, so something like $stat(free_size).
Check also statsc module, maybe it is is useful to have the evolution of some stats values in the recent past.
Cheers, Daniel
What kind of delay there is on cleaning of an expired transaction? I'm asking, since tm.clean released lots of shm.
-- Juha
Looks like something odd happened there. Is the timer process somehow stuck? Any jump of the system clock? Virtual machine frozen somehow? Or the transactions remained hooked in some callback, not releasing the reference counter.
The rpc command tm.clean is like a last resort option when dealing with an unexpected situation that messed up the timer process, otherwise the transactions should be cleaned as they are expired or terminated because of final response sent upstream, with a delay of 5 seconds (default value).
Cheers, Daniel
On 03.03.20 02:00, Juha Heinanen wrote:
What kind of delay there is on cleaning of an expired transaction? I'm asking, since tm.clean released lots of shm.
-- Juha
Daniel-Constantin Mierla writes:
The rpc command tm.clean is like a last resort option when dealing with an unexpected situation that messed up the timer process, otherwise the transactions should be cleaned as they are expired or terminated because of final response sent upstream, with a delay of 5 seconds (default value).
Before we find out why the transactions are not freed on time, I would like to run a test every minute that checks if the shm usage is too high and, if so, executes tm.clean. Something like this:
route [every_minute] { $var(shm_usage) = ($stat(real_used_size) * 100) / $stat(total_size); if ($var(shm_usage) > 90) t_clean(); }
Otherwise OK, but there is no t_clean() function. Could that be added?
-- Juha
On 03.03.20 21:16, Juha Heinanen wrote:
Daniel-Constantin Mierla writes:
The rpc command tm.clean is like a last resort option when dealing with an unexpected situation that messed up the timer process, otherwise the transactions should be cleaned as they are expired or terminated because of final response sent upstream, with a delay of 5 seconds (default value).
Before we find out why the transactions are not freed on time, I would like to run a test every minute that checks if the shm usage is too high and, if so, executes tm.clean. Something like this:
route [every_minute] { $var(shm_usage) = ($stat(real_used_size) * 100) / $stat(total_size); if ($var(shm_usage) > 90) t_clean(); }
Otherwise OK, but there is no t_clean() function. Could that be added?
The function can be added, should be easy to extract the code from the rpc command function. However even now you can execute rpc commands from kamailio.cfg using jsonrpcs module:
* https://www.kamailio.org/docs/modules/stable/modules/jsonrpcs.html#jsonrpcs....
Cheers, Daniel
Daniel-Constantin Mierla writes:
The function can be added, should be easy to extract the code from the rpc command function.
How about the tm.c patch below? Or should the function be placed in some other tm/.c file and if so, which one?
-- Juha
-----------------------------------
*** /usr/src/orig/kamailio/src/modules/tm/tm.c 2020-03-06 00:13:15.534078140 +0200 --- tm.c 2020-03-06 00:45:26.405625701 +0200 *************** *** 208,213 **** --- 207,213 ---- char* pnexthop, char* psock, char *phdrs, char* pbody); static int w_t_get_status_code(sip_msg_t* msg, char *p1, char *p2);
+ static int t_clean(struct sip_msg* msg, char* key, char* value);
/* by default the fr timers avps are not set, so that the avps won't be * searched for nothing each time a new transaction is created */ *************** *** 411,416 **** --- 411,417 ---- REQUEST_ROUTE | FAILURE_ROUTE}, {"t_next_contact_flow", t_next_contact_flow, 0, 0, 0, REQUEST_ROUTE }, + {"t_clean", t_clean, 0, 0, 0, ANY_ROUTE },
/* not applicable from the script */ {"load_tm", (cmd_function)load_tm, NO_SCRIPT, 0, 0, 0}, *************** *** 2952,2957 **** --- 2909,2921 ---- return ki_t_relay_to_proxy_flags(msg, NULL, rflags); }
+ /* script function to clean active but very old transactions */ + static int t_clean(struct sip_msg* msg, char* key, char* value) + { + tm_clean_lifetime(); + return 1; + } + /** * */
It is ok, just add the documentation for the new function as well.
Cheers, Daniel
On 05.03.20 23:48, Juha Heinanen wrote:
Daniel-Constantin Mierla writes:
The function can be added, should be easy to extract the code from the rpc command function.
How about the tm.c patch below? Or should the function be placed in some other tm/.c file and if so, which one?
-- Juha
*** /usr/src/orig/kamailio/src/modules/tm/tm.c 2020-03-06 00:13:15.534078140 +0200 --- tm.c 2020-03-06 00:45:26.405625701 +0200
*** 208,213 **** --- 207,213 ---- char* pnexthop, char* psock, char *phdrs, char* pbody); static int w_t_get_status_code(sip_msg_t* msg, char *p1, char *p2);
static int t_clean(struct sip_msg* msg, char* key, char* value);
/* by default the fr timers avps are not set, so that the avps won't be
- searched for nothing each time a new transaction is created */
*** 411,416 **** --- 411,417 ---- REQUEST_ROUTE | FAILURE_ROUTE}, {"t_next_contact_flow", t_next_contact_flow, 0, 0, 0, REQUEST_ROUTE },
{"t_clean", t_clean, 0, 0, 0, ANY_ROUTE },
/* not applicable from the script */ {"load_tm", (cmd_function)load_tm, NO_SCRIPT, 0, 0, 0},
*** 2952,2957 **** --- 2909,2921 ---- return ki_t_relay_to_proxy_flags(msg, NULL, rflags); }
- /* script function to clean active but very old transactions */
- static int t_clean(struct sip_msg* msg, char* key, char* value)
- {
- tm_clean_lifetime();
- return 1;
- }
- /**
*/
Daniel-Constantin Mierla writes:
The $stat(name) can be used for any internal statistic. I think it takes only the name of statistic, not the group, so something like $stat(free_size).
Thanks, free_size and the others in
core.shmmem { total: 67108864 free: 58274976 used: 7403616 real_used: 8833888 max_used: 8851696 fragments: 10 }
were found. For some reason, "_size" suffix needs to be appended to the attribute names.
-- Juha
On 03.03.20 02:24, Juha Heinanen wrote:
Daniel-Constantin Mierla writes:
The $stat(name) can be used for any internal statistic. I think it takes only the name of statistic, not the group, so something like $stat(free_size).
Thanks, free_size and the others in
core.shmmem { total: 67108864 free: 58274976 used: 7403616 real_used: 8833888 max_used: 8851696 fragments: 10 }
were found. For some reason, "_size" suffix needs to be appended to the attribute names.
I think you used a different rpc command, the one only for shared memory info, which is not giving the exact statistic names.
To see the name of statistics, use the stats command with kamctl:
kamctl stats
It returns:
group.name = value
like:
shmem.fragments: '41' shmem.free_size: '128218840' shmem.max_used_size: '8643872' shmem.real_used_size: '5998888' shmem.total_size: '134217728' shmem.used_size: '4309712'
Cheers, Daniel
Cheers, Daniel
Here is some more details about the shm usage. Usage was steady at about 10 MB until time 17:05. Then it usage started to steadily grow during 15 min period and 17:21 hit 80 MB at which point tm.clean was executed. After that usage dropped straight back to 10 MB level.
How is that possible, i.e,, why the expired transactions were not automatically cleared?
According to tm README:
An INVITE transaction will be kept in memory for maximum: max_inv_lifetime+fr_timer(from the ACK to the final reply wait)+wt_timer.
In this case the max memory time was 180 + 5 + 5 = 190 sec, which is much less than 16 minutes.
-- Juha
If you can reproduce it, watch what the timer processes do during that time frame. Get the list of processes with 'kamctl ps', then when the issue is exposed, grab the backtraces of all processes with:
kamctl trap
A file is created with the backtraces taken with gdb. Sent it over here to analyze, if you cannot spot an issue there.
Cheers, Daniel
On 03.03.20 05:39, Juha Heinanen wrote:
Here is some more details about the shm usage. Usage was steady at about 10 MB until time 17:05. Then it usage started to steadily grow during 15 min period and 17:21 hit 80 MB at which point tm.clean was executed. After that usage dropped straight back to 10 MB level.
How is that possible, i.e,, why the expired transactions were not automatically cleared?
According to tm README:
An INVITE transaction will be kept in memory for maximum: max_inv_lifetime+fr_timer(from the ACK to the final reply wait)+wt_timer.
In this case the max memory time was 180 + 5 + 5 = 190 sec, which is much less than 16 minutes.
-- Juha
Daniel-Constantin Mierla writes:
If you can reproduce it, watch what the timer processes do during that time frame. Get the list of processes with 'kamctl ps', then when the issue is exposed, grab the backtraces of all processes with:
kamctl trap
A file is created with the backtraces taken with gdb. Sent it over here to analyze, if you cannot spot an issue there.
Will do that. Now it happened that K crashed when tm.clean was given at the time when the number transactions was growing. Below is backtrace of that crash.
-- Juha
------------------------------- (gdb) bt full #0 0x00007f5151fa3cf8 in tm_clean_lifetime () at h_table.c:617 r = 53487 tcell = 0x0 texp = 1117017528 #1 0x00007f5151fe1afd in tm_rpc_clean (rpc=0x7f51513b0a00 <func_param>, c=0x7f51513b0920 <_jsonrpc_ctx_global>) at t_stats.c:299 No locals. #2 0x00007f515138fc9f in jsonrpc_dispatch (msg=0x7fff3d1dc210, s1=0x0, s2=0x0) at jsonrpcs_mod.c:1294 rpce = 0x7f5152dcf360 ctx = 0x7f51513b0920 <_jsonrpc_ctx_global> ret = 0 nj = 0x0 val = {s = 0x557bb08fe9d0 "tm.clean", len = 8} __func__ = "jsonrpc_dispatch" #3 0x0000557bae5c28b5 in do_action (h=0x7fff3d1dc130, a=0x7f51537ebd30, msg=0x7fff3d1dc210) at core/action.c:1073 ret = -5 v = 1390957541 dst = {send_sock = 0xc00000, to = {s = {sa_family = 6, sa_data = "\000\000\000\000\000\000\220\270\035=\377\177\000"}, sin = {sin_family = 6, sin_port = 0, sin_addr = {s_addr = 0}, sin_zero = "\220\270\035=\377\177\000"}, sin6 = {sin6_family = 6, sin6_port = 0, sin6_flowinfo = 0, sin6_addr = {__in6_u = {__u6_addr8 = "\220\270\035=\377\177\000\000\000\b\302\tv\223\062\213", __u6_addr16 = {47248, 15645, 32767, 0, 2048, 2498, 37750, 35634}, __u6_addr32 = {1025357968, 32767, 163710976, 2335347574}}}, sin6_scope_id = 1025358088}}, id = 32767, send_flags = {f = 50272, blst_imask = 21212}, proto = 81 'Q', proto_pad0 = 127 '\177', proto_pad1 = 0} tmp = 0x7f5152ab3c1e "H\205\300\177\345ƃ\224" new_uri = 0x7f51537e9da8 "" end = 0x7fff3d1dc210 "\255\004" crt = 0x557bb08fc931 "ip_proxy_cache" cmd = 0x7f5152dd27c0 len = 4 user = 0 uri = {user = {s = 0x7fff3d1db580 "\340\265\035=\377\177", len = 1390199904}, passwd = {s = 0x557bae7f69e3 "INFO", len = -1368709420}, host = {s = 0x7fff3d1db5a0 "", len = 1389801488}, port = {s = 0x7fff3d1db640 "\320\006", len = 1389801488}, params = { s = 0x7fff3d1db5e0 "\020\266\035=\377\177", len = 1356175867}, sip_params = {s = 0x7fff3d1db7f0 "", len = 0}, headers = {s = 0xc00000 <error: Cannot access memory at address 0xc00000>, len = 1260448}, port_no = 56128, proto = 159, type = ERROR_URI_T, flags = (unknown: 11322464), transport = {s = 0xb7fab0 <error: Cannot access memory at address 0xb7fab0>, len = 16}, ttl = {s = 0x11e <error: Cannot access memory at address 0x11e>, len = 163710976}, user_param = {s = 0x7fff3d1db610 "\220\266\035=\377\177", len = -1370752186}, maddr = {s = 0x0, len = 0}, method = {s = 0x7f5153158560 "\320\006", len = 0}, lr = {s = 0x7fff3d1db690 "\360\266\035=\377\177", len = -1368723422}, r2 = {s = 0x7f51531531f0 "\002", len = 1389801488}, gr = {s = 0x7f51537ea518 "\002", len = 1393897952}, transport_val = {s = 0x6d0 <error: Cannot access memory at address 0x6d0>, len = 1389801488}, ttl_val = {s = 0x7f51531530d0 <incomplete sequence \360>, len = 1393899728}, user_param_val = {s = 0x7f51531531f8 ">5\025SQ\177", len = 1390199904}, maddr_val = {s = 0x557bae7f69e3 "INFO", len = 536870912}, method_val = {s = 0xc00000 <error: Cannot access memory at address 0xc00000>, len = 6}, lr_val = {s = 0x7fff3d1db6f0 "P\267\035=\377\177", len = -1371351967}, r2_val = {s = 0x7fff3d1dc210 "\255\004", len = 1400808728}, gr_val = {s = 0x7fff3d1db730 "\220\267\035=\377\177", len = 163710976}} next_hop = {user = {s = 0x0, len = 1390199904}, passwd = {s = 0x557bae7f69e3 "INFO", len = 536870912}, host = {s = 0xc00000 <error: Cannot access memory at address 0xc00000>, len = 6}, port = {s = 0x7fff3d1db750 "", len = -1371324103}, params = { s = 0x7fff3d1db7f0 "", len = 1400806792}, sip_params = {s = 0x19000017ac <error: Cannot access memory at address 0x19000017ac>, len = 1400808728}, headers = {s = 0x7fff3d1dc210 "\255\004", len = -1369366375}, port_no = 46992, proto = 15645, type = 32767, flags = (unknown: 1390199904), transport = {s = 0x557bae7f69e3 "INFO", len = 536870912}, ttl = {s = 0xc00000 <error: Cannot access memory at address 0xc00000>, len = 6}, user_param = {s = 0x7fff3d1db8a0 "\340\271\035=\377\177", len = -1371320250}, maddr = { s = 0x557bae7f69e3 "INFO", len = 536870912}, method = {s = 0xc00000 <error: Cannot access memory at address 0xc00000>, len = 6}, lr = {s = 0x7fff3d1db7f0 "", len = 1387308127}, r2 = {s = 0x7f51537eac00 "\031", len = 1025358088}, gr = { s = 0x7fff3d1dc210 "\255\004", len = 1025360176}, transport_val = {s = 0x17ac3d1db7f0 <error: Cannot access memory at address 0x17ac3d1db7f0>, len = 6060}, ttl_val = {s = 0x557bae7f69e3 "INFO", len = 536870912}, user_param_val = {s = 0x7f51537e9d88 "\a", len = 1400808728}, maddr_val = {s = 0x0, len = 0}, method_val = {s = 0x0, len = 0}, lr_val = {s = 0x557bae95a842 <ut_buf_int2str+18> "976", len = 1398451376}, r2_val = {s = 0x7fff3d1db900 "\360\275\331R\001", len = 1390350000}, gr_val = {s = 0x0, len = 1390199904}} u = 0x7f5152abbc8d port = 21269 dst_host = 0x7fff3d1db7f8 i = 0 flags = 32593 avp = 0x7fff3d1db4f8 st = {flags = 1400806800, id = 32593, name = {n = 1025360400, s = {s = 0x7fff3d1dc210 "\255\004", len = 1025357344}, re = 0x7fff3d1dc210}, avp = 0x557b00000000} sct = 0x7f51511a819b <pv_get_huri+69> sjt = 0x7f5153164bb0 rve = 0x7fff3d1dc248 mct = 0x7fff3d1db7f8 rv = 0x7fff3d1db4d0 rv1 = 0x7f5152dcc4a0 c1 = {cache_type = 2927585763, val_type = 21883, c = {avp_val = {n = 536870912, s = {s = 0x20000000 <error: Cannot access memory at address 0x20000000>, len = 12582912}, re = 0x20000000}, pval = {rs = { s = 0x20000000 <error: Cannot access memory at address 0x20000000>, len = 12582912}, ri = 6, flags = 0}}, i2s = "\340\271\035=\377\177\000\000\242KC\256{U\000\000\000\000\300\000\000"} s = {s = 0x7f51537e9da8 "", len = 1025360400} srevp = {0x557bae7f69e3, 0x20000000} evp = {data = 0x0, rcv = 0x0, dst = 0x0} mod_f_params = {{type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, { type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = { number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}} __func__ = "do_action" #4 0x0000557bae5d0626 in run_actions (h=0x7fff3d1dc130, a=0x7f51537eba80, msg=0x7fff3d1dc210) at core/action.c:1578 t = 0x7f51537ebd30 ret = 1 tvb = {tv_sec = 0, tv_usec = 0} tve = {tv_sec = 0, tv_usec = 0} tz = {tz_minuteswest = 21, tz_dsttime = 0} tdiff = 2962221528 __func__ = "run_actions" #5 0x0000557bae5c2821 in do_action (h=0x7fff3d1dc130, a=0x7f51537ebf50, msg=0x7fff3d1dc210) at core/action.c:1064 --Type <RET> for more, q to quit, c to continue without paging-- ret = 1 v = 1 dst = {send_sock = 0x7f5153027c10, to = {s = {sa_family = 2048, sa_data = "\302\tv\223\062\213y\237\037QQ\177\000"}, sin = {sin_family = 2048, sin_port = 2498, sin_addr = {s_addr = 2335347574}, sin_zero = "y\237\037QQ\177\000"}, sin6 = {sin6_family = 2048, sin6_port = 2498, sin6_flowinfo = 2335347574, sin6_addr = {__in6_u = {__u6_addr8 = "y\237\037QQ\177\000\000\017\000\000\000Q\177\000", __u6_addr16 = {40825, 20767, 32593, 0, 15, 0, 32593, 0}, __u6_addr32 = {1361026937, 32593, 15, 32593}}}, sin6_scope_id = 2927585763}}, id = 4, send_flags = {f = 39616, blst_imask = 44675}, proto = 123 '{', proto_pad0 = 85 'U', proto_pad1 = 0} tmp = 0x500000000 <error: Cannot access memory at address 0x500000000> new_uri = 0x7fff3d1dc2b0 "" end = 0x7f5152d38eda "" crt = 0x0 cmd = 0x7f5153b25fa0 <_IO_strn_jumps> len = 21 user = 32593 uri = {user = {s = 0x4009e6720 <error: Cannot access memory at address 0x4009e6720>, len = -1367595136}, passwd = {s = 0x7fff3d1dbcc8 ""\027q\256{U", len = 163710976}, host = {s = 0x557bae95a830 <ut_buf_int2str> "", len = 0}, port = { s = 0xd68 <error: Cannot access memory at address 0xd68>, len = 17}, params = {s = 0x0, len = 1390199904}, sip_params = {s = 0x557bae7f69e3 "INFO", len = 536870912}, headers = {s = 0xc00000 <error: Cannot access memory at address 0xc00000>, len = 1025359116}, port_no = 48352, proto = 15645, type = URN_URI_T, flags = (URI_USER_NORMALIZE | URI_SIP_USER_PHONE | unknown: 1390957584), transport = {s = 0x43d1dbd3c <error: Cannot access memory at address 0x43d1dbd3c>, len = -1412931264}, ttl = { s = 0x7f5152e85418 "\r\nHost: x.x.x.x:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len = 1390957592}, user_param = {s = 0x7fff3d1dbe50 "p\277\035=\377\177", len = -1368320222}, maddr = {s = 0x557bae95a842 <ut_buf_int2str+18> "976", len = 163710976}, method = {s = 0x7fff3d1dbd00 "P\275\035=\377\177", len = 1393988016}, lr = {s = 0x7f5152e8549d "", len = 1390957561}, r2 = {s = 0x7fff3d1dbd50 "\260\275\035=\377\177", len = 84243696}, gr = {s = 0x37500000000 <error: Cannot access memory at address 0x37500000000>, len = 1025359520}, transport_val = {s = 0x7f51537eccf8 "", len = -1368709420}, ttl_val = {s = 0x7fff3d1dbd70 "", len = 1389801488}, user_param_val = { s = 0x557bae95a842 <ut_buf_int2str+18> "976", len = 1389801488}, maddr_val = {s = 0x7fff3d1dbdb0 "\340\275\035=\377\177", len = 1356175867}, method_val = {s = 0x4ae7f69e3 <error: Cannot access memory at address 0x4ae7f69e3>, len = 0}, lr_val = { s = 0xc00000 <error: Cannot access memory at address 0xc00000>, len = 1260448}, r2_val = {s = 0x9fdb40 <error: Cannot access memory at address 0x9fdb40>, len = 11322464}, gr_val = {s = 0xb7fab0 <error: Cannot access memory at address 0xb7fab0>, len = 16}} next_hop = {user = {s = 0x11e <error: Cannot access memory at address 0x11e>, len = 163710976}, passwd = {s = 0x7fff3d1dbde0 "P\276\035=\377\177", len = -1370752186}, host = {s = 0x0, len = 384}, port = {s = 0x7f51531692a0 <incomplete sequence \360>, len = 0}, params = {s = 0x7fff3d1dbe50 "p\277\035=\377\177", len = -1368726502}, sip_params = {s = 0xf0 <error: Cannot access memory at address 0xf0>, len = 1390957594}, headers = {s = 0x0, len = 1393988016}, port_no = 37280, proto = 21270, type = 32593, flags = (unknown: 163710976), transport = {s = 0xb7fab0 <error: Cannot access memory at address 0xb7fab0>, len = 1390199904}, ttl = {s = 0x557bae7f69e3 "INFO", len = 536870912}, user_param = {s = 0xc00000 <error: Cannot access memory at address 0xc00000>, len = 6}, maddr = {s = 0x7fff3d1dbf70 "\020\300\035=\377\177", len = -1368302309}, method = {s = 0x7f51531d2600 "\006", len = 1389801488}, lr = {s = 0x7fff3d1dbed0 "", len = 1356175867}, r2 = {s = 0x10 <error: Cannot access memory at address 0x10>, len = 0}, gr = {s = 0xc00000 <error: Cannot access memory at address 0xc00000>, len = 1392671776}, transport_val = {s = 0x7f5152e8549d "", len = 1390957556}, ttl_val = {s = 0xb7fab0 <error: Cannot access memory at address 0xb7fab0>, len = 16}, user_param_val = { s = 0x11e <error: Cannot access memory at address 0x11e>, len = 163710976}, maddr_val = {s = 0x7fff3d1dbf00 "p\277\035=\001", len = -1370752186}, method_val = {s = 0x0, len = 0}, lr_val = {s = 0x7f5152d6b0f0 "", len = 0}, r2_val = { s = 0x400000013d1dbf70 <error: Cannot access memory at address 0x400000013d1dbf70>, len = 1025360400}, gr_val = { s = 0x7f5152e8541a "Host: x.x.x.x:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len = 1393988016}} u = 0x7f512ecf128c port = 21250 dst_host = 0xe i = 1 flags = 0 avp = 0x557bae7f69e3 st = {flags = 0, id = 0, name = {n = 4, s = {s = 0x4 <error: Cannot access memory at address 0x4>, len = 0}, re = 0x4}, avp = 0x10} sct = 0x7f51539be112 <_IO_vfprintf_internal+1826> sjt = 0x557bae7f69e3 rve = 0x7f51537eb2f0 mct = 0x7f5152de7980 rv = 0x0 rv1 = 0x7fff3d1dc140 c1 = {cache_type = 12582912, val_type = RV_NONE, c = {avp_val = {n = 163710976, s = {s = 0x8b32937609c20800 <error: Cannot access memory at address 0x8b32937609c20800>, len = 1025359888}, re = 0x8b32937609c20800}, pval = {rs = { s = 0x8b32937609c20800 <error: Cannot access memory at address 0x8b32937609c20800>, len = 1025359888}, ri = -1368282998, flags = 21883}}, i2s = "\022\000\000\000\000\000\000\000\020\000\000\000\000\000\000\000\002\000\000\000\000"} s = {s = 0x1352 <error: Cannot access memory at address 0x1352>, len = 0} srevp = {0xab00000004, 0x0} evp = {data = 0x0, rcv = 0x0, dst = 0x0} mod_f_params = {{type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, { type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = { number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}} __func__ = "do_action" #6 0x0000557bae5d0626 in run_actions (h=0x7fff3d1dc130, a=0x7f51537e7400, msg=0x7fff3d1dc210) at core/action.c:1578 t = 0x7f51537ebf50 ret = 1 tvb = {tv_sec = 0, tv_usec = 0} tve = {tv_sec = 0, tv_usec = 0} tz = {tz_minuteswest = 1025360400, tz_dsttime = 32767} tdiff = 0 __func__ = "run_actions" #7 0x00007f51511a943e in xhttp_process_request (orig_msg=0x7f5153876780, new_buf=0x7f5152e853e0 "POST /RPC HTTP/1.1\r\nVia: SIP/2.0/TCP X.X.X.X:43976\r\nHost: x.x.x.x:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", new_len=189) at xhttp_mod.c:296 ret = 0 backup_rt = 1 tmp_msg = {id = 1197, pid = 0, tval = {tv_sec = 0, tv_usec = 0}, fwd_send_flags = {f = 0, blst_imask = 0}, rpl_send_flags = {f = 3, blst_imask = 0}, first_line = {type = 1, flags = 2, len = 20, u = {request = {method = { s = 0x7f5152e853e0 "POST /RPC HTTP/1.1\r\nVia: SIP/2.0/TCP X.X.X.X:43976\r\nHost: x.x.x.x:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len = 4}, uri = { s = 0x7f5152e853e5 "/RPC HTTP/1.1\r\nVia: SIP/2.0/TCP X.X.X.X:43976\r\nHost: x.x.x.x:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len = 4}, version = { s = 0x7f5152e853ea "HTTP/1.1\r\nVia: SIP/2.0/TCP X.X.X.X:43976\r\nHost: x.x.x.x:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len = 8}, method_value = 65536}, reply = {version = { s = 0x7f5152e853e0 "POST /RPC HTTP/1.1\r\nVia: SIP/2.0/TCP X.X.X.X:43976\r\nHost: x.x.x.x:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len = 4}, status = { s = 0x7f5152e853e5 "/RPC HTTP/1.1\r\nVia: SIP/2.0/TCP X.X.X.X:43976\r\nHost: x.x.x.x:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len = 4}, reason = { s = 0x7f5152e853ea "HTTP/1.1\r\nVia: SIP/2.0/TCP X.X.X.X:43976\r\nHost: x.x.x.x:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len = 8}, statuscode = 65536}}}, via1 = 0x7f51531691b0, via2 = 0x0, headers = 0x7f5153027c20, last_header = 0x7f51531ae720, parsed_flag = 18446744073709551615, h_via1 = 0x7f5153027c20, h_via2 = 0x0, callid = 0x0, to = 0x0, cseq = 0x0, from = 0x0, contact = 0x0, --Type <RET> for more, q to quit, c to continue without paging-- maxforwards = 0x0, route = 0x0, record_route = 0x0, content_type = 0x7f51531ae6c0, content_length = 0x7f51531ae720, authorization = 0x0, expires = 0x0, proxy_auth = 0x0, supported = 0x0, require = 0x0, proxy_require = 0x0, unsupported = 0x0, allow = 0x0, event = 0x0, accept = 0x7f51531bb9f0, accept_language = 0x0, organization = 0x0, priority = 0x0, subject = 0x0, user_agent = 0x0, server = 0x0, content_disposition = 0x0, diversion = 0x0, rpid = 0x0, refer_to = 0x0, session_expires = 0x0, min_se = 0x0, sipifmatch = 0x0, subscription_state = 0x0, date = 0x0, identity = 0x0, identity_info = 0x0, pai = 0x0, ppi = 0x0, path = 0x0, privacy = 0x0, min_expires = 0x0, body = 0x0, eoh = 0x7f5152e85475 "\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", unparsed = 0x7f5152e85475 "\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", rcv = {src_ip = {af = 2, len = 4, u = {addrl = {362853934, 0}, addr32 = {362853934, 0, 0, 0}, addr16 = {46638, 5536, 0, 0, 0, 0, 0, 0}, addr = ".\266\240\025", '\000' <repeats 11 times>}}, dst_ip = {af = 2, len = 4, u = {addrl = {346076718, 0}, addr32 = {346076718, 0, 0, 0}, addr16 = {46638, 5280, 0, 0, 0, 0, 0, 0}, addr = ".\266\240\024", '\000' <repeats 11 times>}}, src_port = 43976, dst_port = 6060, proto_reserved1 = 1262, proto_reserved2 = 0, src_su = {s = {sa_family = 2, sa_data = "\253\310.\266\240\025\000\000\000\000\000\000\000"}, sin = {sin_family = 2, sin_port = 51371, sin_addr = {s_addr = 362853934}, sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {sin6_family = 2, sin6_port = 51371, sin6_flowinfo = 362853934, sin6_addr = {__in6_u = {__u6_addr8 = "\000\000\000\000\000\000\000\000\060\317\035=\377\177\000", __u6_addr16 = {0, 0, 0, 0, 53040, 15645, 32767, 0}, __u6_addr32 = {0, 0, 1025363760, 32767}}}, sin6_scope_id = 0}}, bind_address = 0x7f5152dcc460, proto = 2 '\002', proto_pad0 = 0 '\000', proto_pad1 = 0}, buf = 0x7f5152e853e0 "POST /RPC HTTP/1.1\r\nVia: SIP/2.0/TCP X.X.X.X:43976\r\nHost: x.x.x.x:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len = 189, new_uri = { s = 0x0, len = 0}, dst_uri = {s = 0x0, len = 0}, parsed_uri_ok = 0, parsed_uri = {user = {s = 0x0, len = 0}, passwd = {s = 0x0, len = 0}, host = {s = 0x0, len = 0}, port = {s = 0x0, len = 0}, params = {s = 0x0, len = 0}, sip_params = {s = 0x0, len = 0}, headers = {s = 0x0, len = 0}, port_no = 0, proto = 0, type = ERROR_URI_T, flags = (unknown: 0), transport = {s = 0x0, len = 0}, ttl = {s = 0x0, len = 0}, user_param = {s = 0x0, len = 0}, maddr = {s = 0x0, len = 0}, method = {s = 0x0, len = 0}, lr = {s = 0x0, len = 0}, r2 = {s = 0x0, len = 0}, gr = {s = 0x0, len = 0}, transport_val = {s = 0x0, len = 0}, ttl_val = {s = 0x0, len = 0}, user_param_val = {s = 0x0, len = 0}, maddr_val = {s = 0x0, len = 0}, method_val = {s = 0x0, len = 0}, lr_val = {s = 0x0, len = 0}, r2_val = {s = 0x0, len = 0}, gr_val = {s = 0x0, len = 0}}, parsed_orig_ruri_ok = 0, parsed_orig_ruri = {user = {s = 0x0, len = 0}, passwd = {s = 0x0, len = 0}, host = {s = 0x0, len = 0}, port = {s = 0x0, len = 0}, params = {s = 0x0, len = 0}, sip_params = { s = 0x0, len = 0}, headers = {s = 0x0, len = 0}, port_no = 0, proto = 0, type = ERROR_URI_T, flags = (unknown: 0), transport = {s = 0x0, len = 0}, ttl = {s = 0x0, len = 0}, user_param = {s = 0x0, len = 0}, maddr = {s = 0x0, len = 0}, method = {s = 0x0, len = 0}, lr = {s = 0x0, len = 0}, r2 = {s = 0x0, len = 0}, gr = {s = 0x0, len = 0}, transport_val = {s = 0x0, len = 0}, ttl_val = {s = 0x0, len = 0}, user_param_val = {s = 0x0, len = 0}, maddr_val = {s = 0x0, len = 0}, method_val = {s = 0x0, len = 0}, lr_val = {s = 0x0, len = 0}, r2_val = {s = 0x0, len = 0}, gr_val = {s = 0x0, len = 0}}, add_rm = 0x0, body_lumps = 0x0, reply_lump = 0x0, add_to_branch_s = '\000' <repeats 57 times>, add_to_branch_len = 0, hash_index = 0, msg_flags = 0, flags = 0, xflags = {0, 0}, set_global_address = {s = 0x0, len = 0}, set_global_port = {s = 0x0, len = 0}, force_send_socket = 0x0, path_vec = {s = 0x0, len = 0}, instance = {s = 0x0, len = 0}, reg_id = 0, ruid = {s = 0x0, len = 0}, location_ua = {s = 0x0, len = 0}, otcpid = 0, ldv = { flow = {decoded = 0, rcv = {src_ip = {af = 0, len = 0, u = {addrl = {0, 0}, addr32 = {0, 0, 0, 0}, addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, addr = '\000' <repeats 15 times>}}, dst_ip = {af = 0, len = 0, u = {addrl = {0, 0}, addr32 = {0, 0, 0, 0}, addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, addr = '\000' <repeats 15 times>}}, src_port = 0, dst_port = 0, proto_reserved1 = 0, proto_reserved2 = 0, src_su = {s = {sa_family = 0, sa_data = '\000' <repeats 13 times>}, sin = {sin_family = 0, sin_port = 0, sin_addr = {s_addr = 0}, sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {sin6_family = 0, sin6_port = 0, sin6_flowinfo = 0, sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, sin6_scope_id = 0}}, bind_address = 0x0, proto = 0 '\000', proto_pad0 = 0 '\000', proto_pad1 = 0}}}} msg = 0x7fff3d1dc210 ra_ctx = {rec_lev = 2, run_flags = 0, last_retcode = 1, jmp_env = {{__jmpbuf = {139987259278432, -6960915947526443003, 93989696924131, 536870912, 12582912, 6, -6960915947698409467, -3860432251912212475}, __mask_was_saved = 0, __saved_mask = {__val = { 18446744073709551615, 0, 4, 93986769338368, 16, 12582944, 140734218749576, 3617854161208475648, 0, 0, 139987273275232, 139987269897424, 139987258674898, 11, 139987258674898, 139987227742213}}}}} keng = 0x0 evrtname = {s = 0x7f51511b1598 "xhttp:request", len = 13} __func__ = "xhttp_process_request" #8 0x00007f51511aac55 in xhttp_handler (msg=0x7f5153876780) at xhttp_mod.c:383 ret = 0 fake_msg = 0x7f5152e853e0 "POST /RPC HTTP/1.1\r\nVia: SIP/2.0/TCP X.X.X.X:43976\r\nHost: X.X.X.X:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}" fake_msg_len = 189 pmatch = {rm_so = 1025362464, rm_eo = 32767} c = 83 'S' __func__ = "xhttp_handler" #9 0x0000557bae559a24 in nonsip_msg_run_hooks (msg=0x7f5153876780) at core/nonsip_hooks.c:112 r = 0 ret = 0 #10 0x0000557bae5d6720 in receive_msg (buf=0x557bb08fee80 "POST /RPC HTTP/1.1\r\nHost: X.X.X.X:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len=151, rcv_info=0x7f512fe42828) at core/receive.c:348 msg = 0x7f5153876780 ctx = {rec_lev = 4, run_flags = 0, last_retcode = 803482386, jmp_env = {{__jmpbuf = {8, 65536, 27, 0, 140734218751644, 139986672560144, 7, 70364449210396}, __mask_was_saved = 121, __saved_mask = {__val = {10030241455286650880, 65535, 139986672560144, 93989696924131, 536870912, 12582912, 6, 140734218751232, 93989695295308, 519691042940, 2, 0, 65535, 140734218751644, 139986672560144, 18446744073709551536}}}}} bctx = 0x7fff3d1dcc10 ret = 0 tvb = {tv_sec = 140734218751644, tv_usec = 139986672560904} tve = {tv_sec = 139986672560144, tv_usec = 139986672560144} tz = {tz_minuteswest = -1369026089, tz_dsttime = 21883} diff = 0 inb = {s = 0x557bb08fee80 "POST /RPC HTTP/1.1\r\nHost: X.X.X.X:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len = 151} netinfo = {data = {s = 0x1400020001 <error: Cannot access memory at address 0x1400020001>, len = 803482376}, rcv = 0x4, dst = 0x7f512fe42b0d} keng = 0x0 evp = {data = 0x7fff3d1dcb30, rcv = 0x7f512fe42828, dst = 0x0} cidlockidx = 0 cidlockset = 0 errsipmsg = 0 exectime = 0 __func__ = "receive_msg" #11 0x0000557bae670357 in receive_tcp_msg (tcpbuf=0x7f512fe42b08 "POST /RPC HTTP/1.1\r\nHost: X.X.X.X:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len=151, rcv_info=0x7f512fe42828, con=0x7f512fe42810) at core/tcp_read.c:1448 buf = 0x557bb08fee80 "POST /RPC HTTP/1.1\r\nHost: X.X.X.X:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}" bsize = 65535 blen = 65535 __func__ = "receive_tcp_msg" #12 0x0000557bae6725f3 in tcp_read_req (con=0x7f512fe42810, bytes_read=0x7fff3d1dce98, read_flags=0x7fff3d1dce9c) at core/tcp_read.c:1631 bytes = 151 total_bytes = 151 resp = 1 size = 24 req = 0x7f512fe42890 dst = {send_sock = 0x0, to = {s = {sa_family = 61136, sa_data = "\342/Q\177\000\000\330\316\035=\377\177\000"}, sin = {sin_family = 61136, sin_port = 12258, sin_addr = {s_addr = 32593}, sin_zero = "\330\316\035=\377\177\000"}, sin6 = {sin6_family = 61136, sin6_port = 12258, sin6_flowinfo = 32593, sin6_addr = {__in6_u = {__u6_addr8 = "\330\316\035=\377\177\000\000\b\000\000\000\000\000\000", __u6_addr16 = {52952, 15645, 32767, 0, 8, 0, 0, 0}, __u6_addr32 = {1025363672, 32767, 8, 0}}}, sin6_scope_id = 20}}, id = 0, send_flags = {f = 1, blst_imask = 0}, proto = 1 '\001', proto_pad0 = 0 '\000', proto_pad1 = 0} --Type <RET> for more, q to quit, c to continue without paging-- c = 49 '1' ret = 1025363424 __func__ = "tcp_read_req" #13 0x0000557bae6755a2 in handle_io (fm=0x7f5152db1498, events=1, idx=-1) at core/tcp_read.c:1804 ret = 8 n = 8 read_flags = 1 con = 0x7f512fe42810 s = 28 resp = -2 t = 1117018954 __func__ = "handle_io" #14 0x0000557bae662ba9 in io_wait_loop_epoll (h=0x557bae90ad00 <io_w>, t=2, repeat=0) at core/io_wait.h:1062 n = 1 r = 0 fm = 0x7f5152db1498 revents = 1 __func__ = "io_wait_loop_epoll" #15 0x0000557bae6775f5 in tcp_receive_loop (unix_sock=59) at core/tcp_read.c:1974 __func__ = "tcp_receive_loop" #16 0x0000557bae51044a in tcp_init_children () at core/tcp_main.c:5174 r = 8 i = 7 reader_fd_1 = 59 pid = 0 si_desc = "tcp receiver (tcp:X.X.X.X:6060)\000\000\000\062\023\000\000\001", '\000' <repeats 11 times>, "\260\034\246SQ\177\000\000\343i\177\256{U\000\000\000\000\000 \000\000\000\000\000\000\300\000\000\000\000\000\006\000\000\000\000\000\000\000 \321\035=\377\177\000\000\8V\256{U\000\000 \321\035=\377\177\000\000\254\204a\256\000\000\000" si = 0x0 __func__ = "tcp_init_children" #17 0x0000557bae4228ce in main_loop () at main.c:1761 i = 8 pid = 4910 si = 0x0 si_desc = "udp receiver child=7 sock=127.0.0.1:5070\000\060\066\060\000\000\000\000\020\322\035=\377\177\000\000\000\b\302\tv\223\062\213\004\000\000\000\000\000\000\000\260\034\246SQ\177\000\000\343i\177\256{U\000\000\000\000\000 \000\000\000\000\000\000\300\000\000\000\000\000\006\000\000\000\000\000\000\000P\322\035=\377\177\000\000>0d\256{U\000" nrprocs = 8 woneinit = 1 __func__ = "main_loop" #18 0x0000557bae42abb9 in main (argc=17, argv=0x7fff3d1dd738) at main.c:2802 cfg_stream = 0x557bb0796260 c = -1 r = 0 tmp = 0x7fff3d1def3f "" tmp_len = 17 port = 0 proto = 1025365816 ahost = 0x0 aport = 0 options = 0x557bae7a0618 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:" ret = -1 seed = 1972091384 rfd = 4 debug_save = 0 debug_flag = 0 dont_fork_cnt = 0 n_lst = 0x0 p = 0x0 st = {st_dev = 21, st_ino = 1930, st_nlink = 2, st_mode = 16832, st_uid = 109, st_gid = 115, __pad0 = 0, st_rdev = 0, st_size = 40, st_blksize = 4096, st_blocks = 0, st_atim = {tv_sec = 1572122046, tv_nsec = 75984014}, st_mtim = {tv_sec = 1583307292, tv_nsec = 782512911}, st_ctim = {tv_sec = 1583307633, tv_nsec = 722908545}, __glibc_reserved = {0, 0, 0}} tbuf = "\000\000\000\000\000\000\000\000\b\000\000\000\000\000\000\000\060G\272SQ\177\000\000\000\000\000\000\000\000\000\000\240q\037=\377\177\000\000\364J\270SQ\177\000\000\b\000\000\000\000\000\000\000\230r\037=\377\177\000\000\060G\272SQ\177\000\000\330\324\035=\377\177\000\000\324\324\035=\377\177", '\000' <repeats 18 times>, "\230r\037=\377\177\000\000\250q\037=\377\177\000\000'\356\256SQ\177\000\000&\260be\000\000\000\000@\326\035=\377\177\000\000\300\212\225\001\000\000\000\000\240\325\035=\377\177\000\000\220\325\035=\377\177\000\000\330\324\035=\377\177\000\000XG\272SQ\177\000\000\001", '\000' <repeats 31 times>... option_index = 0 long_options = {{name = 0x557bae7a2fca "help", has_arg = 0, flag = 0x0, val = 104}, {name = 0x557bae79d438 "version", has_arg = 0, flag = 0x0, val = 118}, {name = 0x557bae7a2fcf "alias", has_arg = 1, flag = 0x0, val = 1024}, {name = 0x557bae7a2fd5 "subst", has_arg = 1, flag = 0x0, val = 1025}, {name = 0x557bae7a2fdb "substdef", has_arg = 1, flag = 0x0, val = 1026}, {name = 0x557bae7a2fe4 "substdefs", has_arg = 1, flag = 0x0, val = 1027}, {name = 0x557bae7a2fee "server-id", has_arg = 1, flag = 0x0, val = 1028}, { name = 0x0, has_arg = 0, flag = 0x0, val = 0}} __func__ = "main"
There was an unsafe list iteration - can you try with:
- https://github.com/kamailio/kamailio/commit/325a45e846faae3e5dfa333727d5fab2...
If all ok, you can backport.
Cheers, Daniel
On 04.03.20 22:18, Juha Heinanen wrote:
Daniel-Constantin Mierla writes:
If you can reproduce it, watch what the timer processes do during that time frame. Get the list of processes with 'kamctl ps', then when the issue is exposed, grab the backtraces of all processes with:
kamctl trap
A file is created with the backtraces taken with gdb. Sent it over here to analyze, if you cannot spot an issue there.
Will do that. Now it happened that K crashed when tm.clean was given at the time when the number transactions was growing. Below is backtrace of that crash.
-- Juha
(gdb) bt full #0 0x00007f5151fa3cf8 in tm_clean_lifetime () at h_table.c:617 r = 53487 tcell = 0x0 texp = 1117017528 #1 0x00007f5151fe1afd in tm_rpc_clean (rpc=0x7f51513b0a00 <func_param>, c=0x7f51513b0920 <_jsonrpc_ctx_global>) at t_stats.c:299 No locals. #2 0x00007f515138fc9f in jsonrpc_dispatch (msg=0x7fff3d1dc210, s1=0x0, s2=0x0) at jsonrpcs_mod.c:1294 rpce = 0x7f5152dcf360 ctx = 0x7f51513b0920 <_jsonrpc_ctx_global> ret = 0 nj = 0x0 val = {s = 0x557bb08fe9d0 "tm.clean", len = 8} __func__ = "jsonrpc_dispatch" #3 0x0000557bae5c28b5 in do_action (h=0x7fff3d1dc130, a=0x7f51537ebd30, msg=0x7fff3d1dc210) at core/action.c:1073 ret = -5 v = 1390957541 dst = {send_sock = 0xc00000, to = {s = {sa_family = 6, sa_data = "\000\000\000\000\000\000\220\270\035=\377\177\000"}, sin = {sin_family = 6, sin_port = 0, sin_addr = {s_addr = 0}, sin_zero = "\220\270\035=\377\177\000"}, sin6 = {sin6_family = 6, sin6_port = 0, sin6_flowinfo = 0, sin6_addr = {__in6_u = {__u6_addr8 = "\220\270\035=\377\177\000\000\000\b\302\tv\223\062\213", __u6_addr16 = {47248, 15645, 32767, 0, 2048, 2498, 37750, 35634}, __u6_addr32 = {1025357968, 32767, 163710976, 2335347574}}}, sin6_scope_id = 1025358088}}, id = 32767, send_flags = {f = 50272, blst_imask = 21212}, proto = 81 'Q', proto_pad0 = 127 '\177', proto_pad1 = 0} tmp = 0x7f5152ab3c1e "H\205\300\177\345ƃ\224" new_uri = 0x7f51537e9da8 "" end = 0x7fff3d1dc210 "\255\004" crt = 0x557bb08fc931 "ip_proxy_cache" cmd = 0x7f5152dd27c0 len = 4 user = 0 uri = {user = {s = 0x7fff3d1db580 "\340\265\035=\377\177", len = 1390199904}, passwd = {s = 0x557bae7f69e3 "INFO", len = -1368709420}, host = {s = 0x7fff3d1db5a0 "", len = 1389801488}, port = {s = 0x7fff3d1db640 "\320\006", len = 1389801488}, params = { s = 0x7fff3d1db5e0 "\020\266\035=\377\177", len = 1356175867}, sip_params = {s = 0x7fff3d1db7f0 "", len = 0}, headers = {s = 0xc00000 <error: Cannot access memory at address 0xc00000>, len = 1260448}, port_no = 56128, proto = 159, type = ERROR_URI_T, flags = (unknown: 11322464), transport = {s = 0xb7fab0 <error: Cannot access memory at address 0xb7fab0>, len = 16}, ttl = {s = 0x11e <error: Cannot access memory at address 0x11e>, len = 163710976}, user_param = {s = 0x7fff3d1db610 "\220\266\035=\377\177", len = -1370752186}, maddr = {s = 0x0, len = 0}, method = {s = 0x7f5153158560 "\320\006", len = 0}, lr = {s = 0x7fff3d1db690 "\360\266\035=\377\177", len = -1368723422}, r2 = {s = 0x7f51531531f0 "\002", len = 1389801488}, gr = {s = 0x7f51537ea518 "\002", len = 1393897952}, transport_val = {s = 0x6d0 <error: Cannot access memory at address 0x6d0>, len = 1389801488}, ttl_val = {s = 0x7f51531530d0 <incomplete sequence \360>, len = 1393899728}, user_param_val = {s = 0x7f51531531f8 ">5\025SQ\177", len = 1390199904}, maddr_val = {s = 0x557bae7f69e3 "INFO", len = 536870912}, method_val = {s = 0xc00000 <error: Cannot access memory at address 0xc00000>, len = 6}, lr_val = {s = 0x7fff3d1db6f0 "P\267\035=\377\177", len = -1371351967}, r2_val = {s = 0x7fff3d1dc210 "\255\004", len = 1400808728}, gr_val = {s = 0x7fff3d1db730 "\220\267\035=\377\177", len = 163710976}} next_hop = {user = {s = 0x0, len = 1390199904}, passwd = {s = 0x557bae7f69e3 "INFO", len = 536870912}, host = {s = 0xc00000 <error: Cannot access memory at address 0xc00000>, len = 6}, port = {s = 0x7fff3d1db750 "", len = -1371324103}, params = { s = 0x7fff3d1db7f0 "", len = 1400806792}, sip_params = {s = 0x19000017ac <error: Cannot access memory at address 0x19000017ac>, len = 1400808728}, headers = {s = 0x7fff3d1dc210 "\255\004", len = -1369366375}, port_no = 46992, proto = 15645, type = 32767, flags = (unknown: 1390199904), transport = {s = 0x557bae7f69e3 "INFO", len = 536870912}, ttl = {s = 0xc00000 <error: Cannot access memory at address 0xc00000>, len = 6}, user_param = {s = 0x7fff3d1db8a0 "\340\271\035=\377\177", len = -1371320250}, maddr = { s = 0x557bae7f69e3 "INFO", len = 536870912}, method = {s = 0xc00000 <error: Cannot access memory at address 0xc00000>, len = 6}, lr = {s = 0x7fff3d1db7f0 "", len = 1387308127}, r2 = {s = 0x7f51537eac00 "\031", len = 1025358088}, gr = { s = 0x7fff3d1dc210 "\255\004", len = 1025360176}, transport_val = {s = 0x17ac3d1db7f0 <error: Cannot access memory at address 0x17ac3d1db7f0>, len = 6060}, ttl_val = {s = 0x557bae7f69e3 "INFO", len = 536870912}, user_param_val = {s = 0x7f51537e9d88 "\a", len = 1400808728}, maddr_val = {s = 0x0, len = 0}, method_val = {s = 0x0, len = 0}, lr_val = {s = 0x557bae95a842 <ut_buf_int2str+18> "976", len = 1398451376}, r2_val = {s = 0x7fff3d1db900 "\360\275\331R\001", len = 1390350000}, gr_val = {s = 0x0, len = 1390199904}} u = 0x7f5152abbc8d port = 21269 dst_host = 0x7fff3d1db7f8 i = 0 flags = 32593 avp = 0x7fff3d1db4f8 st = {flags = 1400806800, id = 32593, name = {n = 1025360400, s = {s = 0x7fff3d1dc210 "\255\004", len = 1025357344}, re = 0x7fff3d1dc210}, avp = 0x557b00000000} sct = 0x7f51511a819b <pv_get_huri+69> sjt = 0x7f5153164bb0 rve = 0x7fff3d1dc248 mct = 0x7fff3d1db7f8 rv = 0x7fff3d1db4d0 rv1 = 0x7f5152dcc4a0 c1 = {cache_type = 2927585763, val_type = 21883, c = {avp_val = {n = 536870912, s = {s = 0x20000000 <error: Cannot access memory at address 0x20000000>, len = 12582912}, re = 0x20000000}, pval = {rs = { s = 0x20000000 <error: Cannot access memory at address 0x20000000>, len = 12582912}, ri = 6, flags = 0}}, i2s = "\340\271\035=\377\177\000\000\242KC\256{U\000\000\000\000\300\000\000"} s = {s = 0x7f51537e9da8 "", len = 1025360400} srevp = {0x557bae7f69e3, 0x20000000} evp = {data = 0x0, rcv = 0x0, dst = 0x0} mod_f_params = {{type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, { type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = { number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}} __func__ = "do_action" #4 0x0000557bae5d0626 in run_actions (h=0x7fff3d1dc130, a=0x7f51537eba80, msg=0x7fff3d1dc210) at core/action.c:1578 t = 0x7f51537ebd30 ret = 1 tvb = {tv_sec = 0, tv_usec = 0} tve = {tv_sec = 0, tv_usec = 0} tz = {tz_minuteswest = 21, tz_dsttime = 0} tdiff = 2962221528 __func__ = "run_actions" #5 0x0000557bae5c2821 in do_action (h=0x7fff3d1dc130, a=0x7f51537ebf50, msg=0x7fff3d1dc210) at core/action.c:1064 --Type <RET> for more, q to quit, c to continue without paging-- ret = 1 v = 1 dst = {send_sock = 0x7f5153027c10, to = {s = {sa_family = 2048, sa_data = "\302\tv\223\062\213y\237\037QQ\177\000"}, sin = {sin_family = 2048, sin_port = 2498, sin_addr = {s_addr = 2335347574}, sin_zero = "y\237\037QQ\177\000"}, sin6 = {sin6_family = 2048, sin6_port = 2498, sin6_flowinfo = 2335347574, sin6_addr = {__in6_u = {__u6_addr8 = "y\237\037QQ\177\000\000\017\000\000\000Q\177\000", __u6_addr16 = {40825, 20767, 32593, 0, 15, 0, 32593, 0}, __u6_addr32 = {1361026937, 32593, 15, 32593}}}, sin6_scope_id = 2927585763}}, id = 4, send_flags = {f = 39616, blst_imask = 44675}, proto = 123 '{', proto_pad0 = 85 'U', proto_pad1 = 0} tmp = 0x500000000 <error: Cannot access memory at address 0x500000000> new_uri = 0x7fff3d1dc2b0 "" end = 0x7f5152d38eda "" crt = 0x0 cmd = 0x7f5153b25fa0 <_IO_strn_jumps> len = 21 user = 32593 uri = {user = {s = 0x4009e6720 <error: Cannot access memory at address 0x4009e6720>, len = -1367595136}, passwd = {s = 0x7fff3d1dbcc8 ""\027q\256{U", len = 163710976}, host = {s = 0x557bae95a830 <ut_buf_int2str> "", len = 0}, port = { s = 0xd68 <error: Cannot access memory at address 0xd68>, len = 17}, params = {s = 0x0, len = 1390199904}, sip_params = {s = 0x557bae7f69e3 "INFO", len = 536870912}, headers = {s = 0xc00000 <error: Cannot access memory at address 0xc00000>, len = 1025359116}, port_no = 48352, proto = 15645, type = URN_URI_T, flags = (URI_USER_NORMALIZE | URI_SIP_USER_PHONE | unknown: 1390957584), transport = {s = 0x43d1dbd3c <error: Cannot access memory at address 0x43d1dbd3c>, len = -1412931264}, ttl = { s = 0x7f5152e85418 "\r\nHost: x.x.x.x:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len = 1390957592}, user_param = {s = 0x7fff3d1dbe50 "p\277\035=\377\177", len = -1368320222}, maddr = {s = 0x557bae95a842 <ut_buf_int2str+18> "976", len = 163710976}, method = {s = 0x7fff3d1dbd00 "P\275\035=\377\177", len = 1393988016}, lr = {s = 0x7f5152e8549d "", len = 1390957561}, r2 = {s = 0x7fff3d1dbd50 "\260\275\035=\377\177", len = 84243696}, gr = {s = 0x37500000000 <error: Cannot access memory at address 0x37500000000>, len = 1025359520}, transport_val = {s = 0x7f51537eccf8 "", len = -1368709420}, ttl_val = {s = 0x7fff3d1dbd70 "", len = 1389801488}, user_param_val = { s = 0x557bae95a842 <ut_buf_int2str+18> "976", len = 1389801488}, maddr_val = {s = 0x7fff3d1dbdb0 "\340\275\035=\377\177", len = 1356175867}, method_val = {s = 0x4ae7f69e3 <error: Cannot access memory at address 0x4ae7f69e3>, len = 0}, lr_val = { s = 0xc00000 <error: Cannot access memory at address 0xc00000>, len = 1260448}, r2_val = {s = 0x9fdb40 <error: Cannot access memory at address 0x9fdb40>, len = 11322464}, gr_val = {s = 0xb7fab0 <error: Cannot access memory at address 0xb7fab0>, len = 16}} next_hop = {user = {s = 0x11e <error: Cannot access memory at address 0x11e>, len = 163710976}, passwd = {s = 0x7fff3d1dbde0 "P\276\035=\377\177", len = -1370752186}, host = {s = 0x0, len = 384}, port = {s = 0x7f51531692a0 <incomplete sequence \360>, len = 0}, params = {s = 0x7fff3d1dbe50 "p\277\035=\377\177", len = -1368726502}, sip_params = {s = 0xf0 <error: Cannot access memory at address 0xf0>, len = 1390957594}, headers = {s = 0x0, len = 1393988016}, port_no = 37280, proto = 21270, type = 32593, flags = (unknown: 163710976), transport = {s = 0xb7fab0 <error: Cannot access memory at address 0xb7fab0>, len = 1390199904}, ttl = {s = 0x557bae7f69e3 "INFO", len = 536870912}, user_param = {s = 0xc00000 <error: Cannot access memory at address 0xc00000>, len = 6}, maddr = {s = 0x7fff3d1dbf70 "\020\300\035=\377\177", len = -1368302309}, method = {s = 0x7f51531d2600 "\006", len = 1389801488}, lr = {s = 0x7fff3d1dbed0 "", len = 1356175867}, r2 = {s = 0x10 <error: Cannot access memory at address 0x10>, len = 0}, gr = {s = 0xc00000 <error: Cannot access memory at address 0xc00000>, len = 1392671776}, transport_val = {s = 0x7f5152e8549d "", len = 1390957556}, ttl_val = {s = 0xb7fab0 <error: Cannot access memory at address 0xb7fab0>, len = 16}, user_param_val = { s = 0x11e <error: Cannot access memory at address 0x11e>, len = 163710976}, maddr_val = {s = 0x7fff3d1dbf00 "p\277\035=\001", len = -1370752186}, method_val = {s = 0x0, len = 0}, lr_val = {s = 0x7f5152d6b0f0 "", len = 0}, r2_val = { s = 0x400000013d1dbf70 <error: Cannot access memory at address 0x400000013d1dbf70>, len = 1025360400}, gr_val = { s = 0x7f5152e8541a "Host: x.x.x.x:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len = 1393988016}} u = 0x7f512ecf128c port = 21250 dst_host = 0xe i = 1 flags = 0 avp = 0x557bae7f69e3 st = {flags = 0, id = 0, name = {n = 4, s = {s = 0x4 <error: Cannot access memory at address 0x4>, len = 0}, re = 0x4}, avp = 0x10} sct = 0x7f51539be112 <_IO_vfprintf_internal+1826> sjt = 0x557bae7f69e3 rve = 0x7f51537eb2f0 mct = 0x7f5152de7980 rv = 0x0 rv1 = 0x7fff3d1dc140 c1 = {cache_type = 12582912, val_type = RV_NONE, c = {avp_val = {n = 163710976, s = {s = 0x8b32937609c20800 <error: Cannot access memory at address 0x8b32937609c20800>, len = 1025359888}, re = 0x8b32937609c20800}, pval = {rs = { s = 0x8b32937609c20800 <error: Cannot access memory at address 0x8b32937609c20800>, len = 1025359888}, ri = -1368282998, flags = 21883}}, i2s = "\022\000\000\000\000\000\000\000\020\000\000\000\000\000\000\000\002\000\000\000\000"} s = {s = 0x1352 <error: Cannot access memory at address 0x1352>, len = 0} srevp = {0xab00000004, 0x0} evp = {data = 0x0, rcv = 0x0, dst = 0x0} mod_f_params = {{type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, { type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = { number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}, {type = NOSUBTYPE, u = {number = 0, string = 0x0, str = {s = 0x0, len = 0}, data = 0x0, attr = 0x0, select = 0x0}}} __func__ = "do_action" #6 0x0000557bae5d0626 in run_actions (h=0x7fff3d1dc130, a=0x7f51537e7400, msg=0x7fff3d1dc210) at core/action.c:1578 t = 0x7f51537ebf50 ret = 1 tvb = {tv_sec = 0, tv_usec = 0} tve = {tv_sec = 0, tv_usec = 0} tz = {tz_minuteswest = 1025360400, tz_dsttime = 32767} tdiff = 0 __func__ = "run_actions" #7 0x00007f51511a943e in xhttp_process_request (orig_msg=0x7f5153876780, new_buf=0x7f5152e853e0 "POST /RPC HTTP/1.1\r\nVia: SIP/2.0/TCP X.X.X.X:43976\r\nHost: x.x.x.x:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", new_len=189) at xhttp_mod.c:296 ret = 0 backup_rt = 1 tmp_msg = {id = 1197, pid = 0, tval = {tv_sec = 0, tv_usec = 0}, fwd_send_flags = {f = 0, blst_imask = 0}, rpl_send_flags = {f = 3, blst_imask = 0}, first_line = {type = 1, flags = 2, len = 20, u = {request = {method = { s = 0x7f5152e853e0 "POST /RPC HTTP/1.1\r\nVia: SIP/2.0/TCP X.X.X.X:43976\r\nHost: x.x.x.x:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len = 4}, uri = { s = 0x7f5152e853e5 "/RPC HTTP/1.1\r\nVia: SIP/2.0/TCP X.X.X.X:43976\r\nHost: x.x.x.x:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len = 4}, version = { s = 0x7f5152e853ea "HTTP/1.1\r\nVia: SIP/2.0/TCP X.X.X.X:43976\r\nHost: x.x.x.x:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len = 8}, method_value = 65536}, reply = {version = { s = 0x7f5152e853e0 "POST /RPC HTTP/1.1\r\nVia: SIP/2.0/TCP X.X.X.X:43976\r\nHost: x.x.x.x:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len = 4}, status = { s = 0x7f5152e853e5 "/RPC HTTP/1.1\r\nVia: SIP/2.0/TCP X.X.X.X:43976\r\nHost: x.x.x.x:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len = 4}, reason = { s = 0x7f5152e853ea "HTTP/1.1\r\nVia: SIP/2.0/TCP X.X.X.X:43976\r\nHost: x.x.x.x:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len = 8}, statuscode = 65536}}}, via1 = 0x7f51531691b0, via2 = 0x0, headers = 0x7f5153027c20, last_header = 0x7f51531ae720, parsed_flag = 18446744073709551615, h_via1 = 0x7f5153027c20, h_via2 = 0x0, callid = 0x0, to = 0x0, cseq = 0x0, from = 0x0, contact = 0x0, --Type <RET> for more, q to quit, c to continue without paging-- maxforwards = 0x0, route = 0x0, record_route = 0x0, content_type = 0x7f51531ae6c0, content_length = 0x7f51531ae720, authorization = 0x0, expires = 0x0, proxy_auth = 0x0, supported = 0x0, require = 0x0, proxy_require = 0x0, unsupported = 0x0, allow = 0x0, event = 0x0, accept = 0x7f51531bb9f0, accept_language = 0x0, organization = 0x0, priority = 0x0, subject = 0x0, user_agent = 0x0, server = 0x0, content_disposition = 0x0, diversion = 0x0, rpid = 0x0, refer_to = 0x0, session_expires = 0x0, min_se = 0x0, sipifmatch = 0x0, subscription_state = 0x0, date = 0x0, identity = 0x0, identity_info = 0x0, pai = 0x0, ppi = 0x0, path = 0x0, privacy = 0x0, min_expires = 0x0, body = 0x0, eoh = 0x7f5152e85475 "\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", unparsed = 0x7f5152e85475 "\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", rcv = {src_ip = {af = 2, len = 4, u = {addrl = {362853934, 0}, addr32 = {362853934, 0, 0, 0}, addr16 = {46638, 5536, 0, 0, 0, 0, 0, 0}, addr = ".\266\240\025", '\000' <repeats 11 times>}}, dst_ip = {af = 2, len = 4, u = {addrl = {346076718, 0}, addr32 = {346076718, 0, 0, 0}, addr16 = {46638, 5280, 0, 0, 0, 0, 0, 0}, addr = ".\266\240\024", '\000' <repeats 11 times>}}, src_port = 43976, dst_port = 6060, proto_reserved1 = 1262, proto_reserved2 = 0, src_su = {s = {sa_family = 2, sa_data = "\253\310.\266\240\025\000\000\000\000\000\000\000"}, sin = {sin_family = 2, sin_port = 51371, sin_addr = {s_addr = 362853934}, sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {sin6_family = 2, sin6_port = 51371, sin6_flowinfo = 362853934, sin6_addr = {__in6_u = {__u6_addr8 = "\000\000\000\000\000\000\000\000\060\317\035=\377\177\000", __u6_addr16 = {0, 0, 0, 0, 53040, 15645, 32767, 0}, __u6_addr32 = {0, 0, 1025363760, 32767}}}, sin6_scope_id = 0}}, bind_address = 0x7f5152dcc460, proto = 2 '\002', proto_pad0 = 0 '\000', proto_pad1 = 0}, buf = 0x7f5152e853e0 "POST /RPC HTTP/1.1\r\nVia: SIP/2.0/TCP X.X.X.X:43976\r\nHost: x.x.x.x:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len = 189, new_uri = { s = 0x0, len = 0}, dst_uri = {s = 0x0, len = 0}, parsed_uri_ok = 0, parsed_uri = {user = {s = 0x0, len = 0}, passwd = {s = 0x0, len = 0}, host = {s = 0x0, len = 0}, port = {s = 0x0, len = 0}, params = {s = 0x0, len = 0}, sip_params = {s = 0x0, len = 0}, headers = {s = 0x0, len = 0}, port_no = 0, proto = 0, type = ERROR_URI_T, flags = (unknown: 0), transport = {s = 0x0, len = 0}, ttl = {s = 0x0, len = 0}, user_param = {s = 0x0, len = 0}, maddr = {s = 0x0, len = 0}, method = {s = 0x0, len = 0}, lr = {s = 0x0, len = 0}, r2 = {s = 0x0, len = 0}, gr = {s = 0x0, len = 0}, transport_val = {s = 0x0, len = 0}, ttl_val = {s = 0x0, len = 0}, user_param_val = {s = 0x0, len = 0}, maddr_val = {s = 0x0, len = 0}, method_val = {s = 0x0, len = 0}, lr_val = {s = 0x0, len = 0}, r2_val = {s = 0x0, len = 0}, gr_val = {s = 0x0, len = 0}}, parsed_orig_ruri_ok = 0, parsed_orig_ruri = {user = {s = 0x0, len = 0}, passwd = {s = 0x0, len = 0}, host = {s = 0x0, len = 0}, port = {s = 0x0, len = 0}, params = {s = 0x0, len = 0}, sip_params = { s = 0x0, len = 0}, headers = {s = 0x0, len = 0}, port_no = 0, proto = 0, type = ERROR_URI_T, flags = (unknown: 0), transport = {s = 0x0, len = 0}, ttl = {s = 0x0, len = 0}, user_param = {s = 0x0, len = 0}, maddr = {s = 0x0, len = 0}, method = {s = 0x0, len = 0}, lr = {s = 0x0, len = 0}, r2 = {s = 0x0, len = 0}, gr = {s = 0x0, len = 0}, transport_val = {s = 0x0, len = 0}, ttl_val = {s = 0x0, len = 0}, user_param_val = {s = 0x0, len = 0}, maddr_val = {s = 0x0, len = 0}, method_val = {s = 0x0, len = 0}, lr_val = {s = 0x0, len = 0}, r2_val = {s = 0x0, len = 0}, gr_val = {s = 0x0, len = 0}}, add_rm = 0x0, body_lumps = 0x0, reply_lump = 0x0, add_to_branch_s = '\000' <repeats 57 times>, add_to_branch_len = 0, hash_index = 0, msg_flags = 0, flags = 0, xflags = {0, 0}, set_global_address = {s = 0x0, len = 0}, set_global_port = {s = 0x0, len = 0}, force_send_socket = 0x0, path_vec = {s = 0x0, len = 0}, instance = {s = 0x0, len = 0}, reg_id = 0, ruid = {s = 0x0, len = 0}, location_ua = {s = 0x0, len = 0}, otcpid = 0, ldv = { flow = {decoded = 0, rcv = {src_ip = {af = 0, len = 0, u = {addrl = {0, 0}, addr32 = {0, 0, 0, 0}, addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, addr = '\000' <repeats 15 times>}}, dst_ip = {af = 0, len = 0, u = {addrl = {0, 0}, addr32 = {0, 0, 0, 0}, addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, addr = '\000' <repeats 15 times>}}, src_port = 0, dst_port = 0, proto_reserved1 = 0, proto_reserved2 = 0, src_su = {s = {sa_family = 0, sa_data = '\000' <repeats 13 times>}, sin = {sin_family = 0, sin_port = 0, sin_addr = {s_addr = 0}, sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {sin6_family = 0, sin6_port = 0, sin6_flowinfo = 0, sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, sin6_scope_id = 0}}, bind_address = 0x0, proto = 0 '\000', proto_pad0 = 0 '\000', proto_pad1 = 0}}}} msg = 0x7fff3d1dc210 ra_ctx = {rec_lev = 2, run_flags = 0, last_retcode = 1, jmp_env = {{__jmpbuf = {139987259278432, -6960915947526443003, 93989696924131, 536870912, 12582912, 6, -6960915947698409467, -3860432251912212475}, __mask_was_saved = 0, __saved_mask = {__val = { 18446744073709551615, 0, 4, 93986769338368, 16, 12582944, 140734218749576, 3617854161208475648, 0, 0, 139987273275232, 139987269897424, 139987258674898, 11, 139987258674898, 139987227742213}}}}} keng = 0x0 evrtname = {s = 0x7f51511b1598 "xhttp:request", len = 13} __func__ = "xhttp_process_request" #8 0x00007f51511aac55 in xhttp_handler (msg=0x7f5153876780) at xhttp_mod.c:383 ret = 0 fake_msg = 0x7f5152e853e0 "POST /RPC HTTP/1.1\r\nVia: SIP/2.0/TCP X.X.X.X:43976\r\nHost: X.X.X.X:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}" fake_msg_len = 189 pmatch = {rm_so = 1025362464, rm_eo = 32767} c = 83 'S' __func__ = "xhttp_handler" #9 0x0000557bae559a24 in nonsip_msg_run_hooks (msg=0x7f5153876780) at core/nonsip_hooks.c:112 r = 0 ret = 0 #10 0x0000557bae5d6720 in receive_msg (buf=0x557bb08fee80 "POST /RPC HTTP/1.1\r\nHost: X.X.X.X:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len=151, rcv_info=0x7f512fe42828) at core/receive.c:348 msg = 0x7f5153876780 ctx = {rec_lev = 4, run_flags = 0, last_retcode = 803482386, jmp_env = {{__jmpbuf = {8, 65536, 27, 0, 140734218751644, 139986672560144, 7, 70364449210396}, __mask_was_saved = 121, __saved_mask = {__val = {10030241455286650880, 65535, 139986672560144, 93989696924131, 536870912, 12582912, 6, 140734218751232, 93989695295308, 519691042940, 2, 0, 65535, 140734218751644, 139986672560144, 18446744073709551536}}}}} bctx = 0x7fff3d1dcc10 ret = 0 tvb = {tv_sec = 140734218751644, tv_usec = 139986672560904} tve = {tv_sec = 139986672560144, tv_usec = 139986672560144} tz = {tz_minuteswest = -1369026089, tz_dsttime = 21883} diff = 0 inb = {s = 0x557bb08fee80 "POST /RPC HTTP/1.1\r\nHost: X.X.X.X:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len = 151} netinfo = {data = {s = 0x1400020001 <error: Cannot access memory at address 0x1400020001>, len = 803482376}, rcv = 0x4, dst = 0x7f512fe42b0d} keng = 0x0 evp = {data = 0x7fff3d1dcb30, rcv = 0x7f512fe42828, dst = 0x0} cidlockidx = 0 cidlockset = 0 errsipmsg = 0 exectime = 0 __func__ = "receive_msg" #11 0x0000557bae670357 in receive_tcp_msg (tcpbuf=0x7f512fe42b08 "POST /RPC HTTP/1.1\r\nHost: X.X.X.X:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}", len=151, rcv_info=0x7f512fe42828, con=0x7f512fe42810) at core/tcp_read.c:1448 buf = 0x557bb08fee80 "POST /RPC HTTP/1.1\r\nHost: X.X.X.X:6060\r\nAccept: */*\r\nContent-type: application/json\r\nContent-length: 38\r\n\r\n{"jsonrpc":"2.0", "method":"tm.clean"}" bsize = 65535 blen = 65535 __func__ = "receive_tcp_msg" #12 0x0000557bae6725f3 in tcp_read_req (con=0x7f512fe42810, bytes_read=0x7fff3d1dce98, read_flags=0x7fff3d1dce9c) at core/tcp_read.c:1631 bytes = 151 total_bytes = 151 resp = 1 size = 24 req = 0x7f512fe42890 dst = {send_sock = 0x0, to = {s = {sa_family = 61136, sa_data = "\342/Q\177\000\000\330\316\035=\377\177\000"}, sin = {sin_family = 61136, sin_port = 12258, sin_addr = {s_addr = 32593}, sin_zero = "\330\316\035=\377\177\000"}, sin6 = {sin6_family = 61136, sin6_port = 12258, sin6_flowinfo = 32593, sin6_addr = {__in6_u = {__u6_addr8 = "\330\316\035=\377\177\000\000\b\000\000\000\000\000\000", __u6_addr16 = {52952, 15645, 32767, 0, 8, 0, 0, 0}, __u6_addr32 = {1025363672, 32767, 8, 0}}}, sin6_scope_id = 20}}, id = 0, send_flags = {f = 1, blst_imask = 0}, proto = 1 '\001', proto_pad0 = 0 '\000', proto_pad1 = 0} --Type <RET> for more, q to quit, c to continue without paging-- c = 49 '1' ret = 1025363424 __func__ = "tcp_read_req" #13 0x0000557bae6755a2 in handle_io (fm=0x7f5152db1498, events=1, idx=-1) at core/tcp_read.c:1804 ret = 8 n = 8 read_flags = 1 con = 0x7f512fe42810 s = 28 resp = -2 t = 1117018954 __func__ = "handle_io" #14 0x0000557bae662ba9 in io_wait_loop_epoll (h=0x557bae90ad00 <io_w>, t=2, repeat=0) at core/io_wait.h:1062 n = 1 r = 0 fm = 0x7f5152db1498 revents = 1 __func__ = "io_wait_loop_epoll" #15 0x0000557bae6775f5 in tcp_receive_loop (unix_sock=59) at core/tcp_read.c:1974 __func__ = "tcp_receive_loop" #16 0x0000557bae51044a in tcp_init_children () at core/tcp_main.c:5174 r = 8 i = 7 reader_fd_1 = 59 pid = 0 si_desc = "tcp receiver (tcp:X.X.X.X:6060)\000\000\000\062\023\000\000\001", '\000' <repeats 11 times>, "\260\034\246SQ\177\000\000\343i\177\256{U\000\000\000\000\000 \000\000\000\000\000\000\300\000\000\000\000\000\006\000\000\000\000\000\000\000 \321\035=\377\177\000\000\8V\256{U\000\000 \321\035=\377\177\000\000\254\204a\256\000\000\000" si = 0x0 __func__ = "tcp_init_children" #17 0x0000557bae4228ce in main_loop () at main.c:1761 i = 8 pid = 4910 si = 0x0 si_desc = "udp receiver child=7 sock=127.0.0.1:5070\000\060\066\060\000\000\000\000\020\322\035=\377\177\000\000\000\b\302\tv\223\062\213\004\000\000\000\000\000\000\000\260\034\246SQ\177\000\000\343i\177\256{U\000\000\000\000\000 \000\000\000\000\000\000\300\000\000\000\000\000\006\000\000\000\000\000\000\000P\322\035=\377\177\000\000>0d\256{U\000" nrprocs = 8 woneinit = 1 __func__ = "main_loop" #18 0x0000557bae42abb9 in main (argc=17, argv=0x7fff3d1dd738) at main.c:2802 cfg_stream = 0x557bb0796260 c = -1 r = 0 tmp = 0x7fff3d1def3f "" tmp_len = 17 port = 0 proto = 1025365816 ahost = 0x0 aport = 0 options = 0x557bae7a0618 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:" ret = -1 seed = 1972091384 rfd = 4 debug_save = 0 debug_flag = 0 dont_fork_cnt = 0 n_lst = 0x0 p = 0x0 st = {st_dev = 21, st_ino = 1930, st_nlink = 2, st_mode = 16832, st_uid = 109, st_gid = 115, __pad0 = 0, st_rdev = 0, st_size = 40, st_blksize = 4096, st_blocks = 0, st_atim = {tv_sec = 1572122046, tv_nsec = 75984014}, st_mtim = {tv_sec = 1583307292, tv_nsec = 782512911}, st_ctim = {tv_sec = 1583307633, tv_nsec = 722908545}, __glibc_reserved = {0, 0, 0}} tbuf = "\000\000\000\000\000\000\000\000\b\000\000\000\000\000\000\000\060G\272SQ\177\000\000\000\000\000\000\000\000\000\000\240q\037=\377\177\000\000\364J\270SQ\177\000\000\b\000\000\000\000\000\000\000\230r\037=\377\177\000\000\060G\272SQ\177\000\000\330\324\035=\377\177\000\000\324\324\035=\377\177", '\000' <repeats 18 times>, "\230r\037=\377\177\000\000\250q\037=\377\177\000\000'\356\256SQ\177\000\000&\260be\000\000\000\000@\326\035=\377\177\000\000\300\212\225\001\000\000\000\000\240\325\035=\377\177\000\000\220\325\035=\377\177\000\000\330\324\035=\377\177\000\000XG\272SQ\177\000\000\001", '\000' <repeats 31 times>... option_index = 0 long_options = {{name = 0x557bae7a2fca "help", has_arg = 0, flag = 0x0, val = 104}, {name = 0x557bae79d438 "version", has_arg = 0, flag = 0x0, val = 118}, {name = 0x557bae7a2fcf "alias", has_arg = 1, flag = 0x0, val = 1024}, {name = 0x557bae7a2fd5 "subst", has_arg = 1, flag = 0x0, val = 1025}, {name = 0x557bae7a2fdb "substdef", has_arg = 1, flag = 0x0, val = 1026}, {name = 0x557bae7a2fe4 "substdefs", has_arg = 1, flag = 0x0, val = 1027}, {name = 0x557bae7a2fee "server-id", has_arg = 1, flag = 0x0, val = 1028}, { name = 0x0, has_arg = 0, flag = 0x0, val = 0}} __func__ = "main"
Daniel-Constantin Mierla writes:
There was an unsafe list iteration - can you try with:
- https://github.com/kamailio/kamailio/commit/325a45e846faae3e5dfa333727d5fab2...
If all ok, you can backport.
Thanks for the patch. The crash has only happened once, so it is not easy to reproduce, and it is production a sip proxy running 5.3.
I'll backport to my own 5.3 copy.
-- Juha
Daniel-Constantin Mierla writes:
If you can reproduce it, watch what the timer processes do during that time frame. Get the list of processes with 'kamctl ps', then when the issue is exposed, grab the backtraces of all processes with:
kamctl trap
A file is created with the backtraces taken with gdb. Sent it over here to analyze, if you cannot spot an issue there.
Two traps were taken more than 10 sec apart when the number of transactions and shared memory usage started to grow. In both traps, process PID: 6049 DSC: timer was in the exactly same state (below).
Looks like it is stuck in an acc to db operation. How long is such an operation allowed to last before an error is generated and the timer process becomes free to do something else?
acc module db_insert_mode has value 2 and there is 2 async_workers (process numbers 6056 and 5057 below). db_mysql README does not mention if it supports async insert.
-- Juha
---start 6049 ----------------------------------------------------- [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f7dc30107e4 in __GI___poll (fds=0x7fff233831e8, nfds=1, timeout=2000) at ../sysdeps/unix/sysv/linux/poll.c:29 29 ../sysdeps/unix/sysv/linux/poll.c: No such file or directory. #0 0x00007f7dc30107e4 in __GI___poll (fds=0x7fff233831e8, nfds=1, timeout=2000) at ../sysdeps/unix/sysv/linux/poll.c:29 resultvar = 18446744073709551100 sc_ret = <optimized out> #1 0x00007f7dc2064efa in ?? () from /usr/lib/x86_64-linux-gnu/libmariadb.so.3 No symbol table info available. #2 0x00007f7dc20652da in ?? () from /usr/lib/x86_64-linux-gnu/libmariadb.so.3 No symbol table info available. #3 0x00007f7dc2071425 in mysql_get_parameters () from /usr/lib/x86_64-linux-gnu/libmariadb.so.3 No symbol table info available. #4 0x00007f7dc2071626 in mysql_get_parameters () from /usr/lib/x86_64-linux-gnu/libmariadb.so.3 No symbol table info available. #5 0x00007f7dc2067c1e in ?? () from /usr/lib/x86_64-linux-gnu/libmariadb.so.3 No symbol table info available. #6 0x00007f7dc2068654 in ?? () from /usr/lib/x86_64-linux-gnu/libmariadb.so.3 No symbol table info available. #7 0x00007f7dc206d4ed in ?? () from /usr/lib/x86_64-linux-gnu/libmariadb.so.3 No symbol table info available. #8 0x00007f7dc2070513 in ?? () from /usr/lib/x86_64-linux-gnu/libmariadb.so.3 No symbol table info available. #9 0x00007f7dc206f230 in mysql_ping () from /usr/lib/x86_64-linux-gnu/libmariadb.so.3 No symbol table info available. #10 0x00007f7dc20bbaba in db_mysql_submit_query (_h=0x7f7dc237f3a0, _s=0x7f7dc210e1e0 <sql_str>) at km_dbase.c:83 t = 1583739979 i = 0 code = 21923 __func__ = "db_mysql_submit_query" #11 0x00007f7dc20fb35e in db_do_submit_query (_h=0x7f7dc237f3a0, _query=0x7f7dc210e1e0 <sql_str>, submit_query=0x7f7dc20bb71d <db_mysql_submit_query>) at db_query.c:58 ret = -1039426148 tvb = {tv_sec = 94160937318932, tv_usec = 140178106086304} tve = {tv_sec = 200798118576, tv_usec = 1782411427864} tz = {tz_minuteswest = 24, tz_dsttime = 65147} tdiff = 32637 __func__ = "db_do_submit_query" #12 0x00007f7dc20fdd29 in db_do_insert_cmd (_h=0x7f7dc237f3a0, _k=0x7f7dc04f88a0 <db_keys>, _v=0x7f7dc04f8b80 <db_vals>, _n=24, val2str=0x7f7dc20b9d9c <db_mysql_val2str>, submit_query=0x7f7dc20bb71d <db_mysql_submit_query>, mode=0) at db_query.c:243 off = 804 ret = 415 __func__ = "db_do_insert_cmd" #13 0x00007f7dc20fe396 in db_do_insert (_h=0x7f7dc237f3a0, _k=0x7f7dc04f88a0 <db_keys>, _v=0x7f7dc04f8b80 <db_vals>, _n=24, val2str=0x7f7dc20b9d9c <db_mysql_val2str>, submit_query=0x7f7dc20bb71d <db_mysql_submit_query>) at db_query.c:258 No locals. #14 0x00007f7dc20c08a4 in db_mysql_insert (_h=0x7f7dc237f3a0, _k=0x7f7dc04f88a0 <db_keys>, _v=0x7f7dc04f8b80 <db_vals>, _n=24) at km_dbase.c:493 No locals. #15 0x00007f7dc07342a0 in db_cluster_insert (_h=0x7f7dc29bb650, _k=0x7f7dc04f88a0 <db_keys>, _v=0x7f7dc04f8b80 <db_vals>, _n=24) at dbcl_api.c:379 ret = -1 i = 5 k = 32637 sec = 36091306 rc = 0 rok = 0 j = 0 dbh = 0x7f7dc237f3a0 cls = 0x7f7dc239b980 __func__ = "db_cluster_insert" #16 0x00007f7dc04c18a0 in acc_db_request (rq=0x7f7d9e36d540) at acc.c:476 m = 24 n = -1640573632 i = 24 o = 17 t = 0x7f7d9e36e668 dtime = -4.6749039984044258e+154 __func__ = "acc_db_request" #17 0x00007f7dc04d4aca in acc_onreply (t=0x7f7d9e3b0650, req=0x7f7d9e357650, reply=0xffffffffffffffff, code=408) at acc_logic.c:604 new_uri_bk = {s = 0x7f7d9e36dc40 "sip:xxx@x.x.x.x:5060;transport=udp203.29INVITE sip:xxx@xxx.xxx SIP/2.0\r\nRecord-Route: sip:x.x.x.x;lr;did=851.8452\r\nVia: SIP/2.0/UDP x.x.x.x;branch"..., len = 50} br = 0 hdr = 0x7f7dc15f3c0f <msg_send_buffer+17006> cmsg = 0x7f7d9e36d540 cmsg_len = 6968 preq = 0x7f7d9e36d540 mstart = 0x7f7d9e36d540 mend = 0x7f7d9e36f078 __func__ = "acc_onreply" #18 0x00007f7dc04d5783 in tmcb_func (t=0x7f7d9e3b0650, type=512, ps=0x7fff23383a50) at acc_logic.c:684 __func__ = "tmcb_func" #19 0x00007f7dc15fff4d in run_trans_callbacks_internal (cb_lst=0x7f7d9e3b06c8, type=512, trans=0x7f7d9e3b0650, params=0x7fff23383a50) at t_hooks.c:254 cbp = 0x7f7d9e339720 backup_from = 0x55a38cc8dc90 <def_list+16> backup_to = 0x55a38cc8dc98 <def_list+24> backup_dom_from = 0x55a38cc8dca0 <def_list+32> backup_dom_to = 0x55a38cc8dca8 <def_list+40> backup_uri_from = 0x55a38cc8dc80 <def_list> backup_uri_to = 0x55a38cc8dc88 <def_list+8> backup_xavps = 0x55a38cce95c0 <_xavp_list_head> __func__ = "run_trans_callbacks_internal" #20 0x00007f7dc1600180 in run_trans_callbacks_with_buf (type=512, rbuf=0x7f7d9e3b0720, req=0x7f7d9e357650, repl=0xffffffffffffffff, flags=0) at t_hooks.c:297 params = {req = 0x7f7d9e357650, rpl = 0xffffffffffffffff, param = 0x7f7d9e339730, code = 408, flags = 0, branch = 0, t_rbuf = 0x7f7d9e3b0720, dst = 0x7f7d9e3b0770, send_buf = {s = 0x7f7d9e3cf530 "SIP/2.0 408 Request Timeout\r\nVia: SIP/2.0/UDP x.x.x.x;branch=z9hG4bKf819.22eaf74987b5f436d71e6e7ac8c4d607.0\r\nVia: SIP/2.0/UDP x.x.x.x:5060;rport=5060;branch=z9hG4bKac410218704\r\nFrom: "xxx"..., len = 489}} trans = 0x7f7d9e3b0650 #21 0x00007f7dc15b04f5 in relay_reply (t=0x7f7d9e3b0650, p_msg=0xffffffffffffffff, branch=0, msg_status=408, cancel_data=0x7fff23383c30, do_put_on_wait=0) at t_reply.c:2021 relay = 0 save_clone = 0 buf = 0x7f7dc271d010 "SIP/2.0 408 Request Timeout\r\nVia: SIP/2.0/UDP x.x.x.x;branch=z9hG4bKf819.22eaf74987b5f436d71e6e7ac8c4d607.0\r\nVia: SIP/2.0/UDP x.x.x.x:5060;rport=5060;branch=z9hG4bKac410218704\r\nFrom: "xxx"... res_len = 489 relayed_code = 408 relayed_msg = 0xffffffffffffffff reply_bak = 0x0 bm = {to_tag_val = {s = 0x7f7dc271d146 "f148edd50b878efc51a72f445b390303-e1110000\r\nCall-ID: 173781355593202084446@x.x.x.x\r\nCSeq: 1 INVITE\r\nServer: OpenSIPg SIP Proxy (5.3.2-6 (x86_64/linux))\r\nContent-Length: 0\r\n\r\n", len = 41}} totag_retr = 0 reply_status = RPS_COMPLETED uas_rb = 0x7f7d9e3b0720 to_tag = 0x7f7dc1677750 <tm_tag> reason = {s = 0x55a38cb96cc4 "Request Timeout", len = 15} onsend_params = {req = 0x0, rpl = 0x0, param = 0x0, code = 0, flags = 0, branch = 0, t_rbuf = 0x0, dst = 0x0, send_buf = {s = 0x7fff23383bc0 "\220<8#\377\177", len = -1}} ip = {af = 0, len = 0, u = {addrl = {xxx, xxx}, addr32 = {xxx, xxx, xxx, xxx}, addr16 = {xxx, xxx, xxx, xxx, xxx, xxx, xxx, 0}, addr = "xxx"}} __func__ = "relay_reply" #22 0x00007f7dc15ec23d in fake_reply (t=0x7f7d9e3b0650, branch=0, code=408) at timer.c:295 cancel_data = {cancel_bitmap = 0, reason = {cause = 0, u = {text = {s = 0x0, len = -1640298928}, e2e_cancel = 0x0, packed_hdrs = {s = 0x0, len = -1640298928}}}} do_cancel_branch = 1 reply_status = 32637 #23 0x00007f7dc15ec6dd in final_response_handler (r_buf=0x7f7d9e3b0878, t=0x7f7d9e3b0650) at timer.c:455 silent = 0 branch_ret = 12582912 prev_branch = 0 now = 0 #24 0x00007f7dc15ec78f in retr_buf_handler (ticks=577460907, tl=0x7f7d9e3b0898, p=0x3e8) at timer.c:511 rbuf = 0x7f7d9e3b0878 fr_remainder = 0 retr_remainder = 0 retr_interval = 0 new_retr_interval_ms = 0 crt_retr_interval_ms = 140177493637904 t = 0x7f7d9e3b0650 __func__ = "retr_buf_handler" #25 0x000055a38c85794f in timer_list_expire (t=577460907, h=0x7f7d9dbafb70, slow_l=0x7f7d9dbb0330, slow_mark=48237) at core/timer.c:857 tl = 0x7f7d9e3b0898 ret = 0 #26 0x000055a38c857df7 in timer_handler () at core/timer.c:922 saved_ticks = 577460907 run_slow_timer = 0 i = 109 __func__ = "timer_handler" #27 0x000055a38c8582c0 in timer_main () at core/timer.c:961 No locals. #28 0x000055a38c7bc306 in main_loop () at main.c:1743 i = 8 pid = 0 si = 0x0 si_desc = "udp receiver child=7 sock=xxx" nrprocs = 8 woneinit = 1 __func__ = "main_loop" #29 0x000055a38c7c4bb9 in main (argc=17, argv=0x7fff233843f8) at main.c:2802 cfg_stream = 0x55a38d35f260 c = -1 r = 0 tmp = 0x7fff23384f3f "" tmp_len = 17 port = 0 proto = 590889976 ahost = 0x0 aport = 0 options = 0x55a38cb3a618 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:" ret = -1 seed = 1148589326 rfd = 4 debug_save = 0 debug_flag = 0 dont_fork_cnt = 0 n_lst = 0x0 p = 0x0 st = {st_dev = 21, st_ino = 1930, st_nlink = 2, st_mode = 16832, st_uid = 109, st_gid = 115, __pad0 = 0, st_rdev = 0, st_size = 40, st_blksize = 4096, st_blocks = 0, st_atim = {tv_sec = 1572122046, tv_nsec = 75984014}, st_mtim = {tv_sec = 1583490964, tv_nsec = 420466802}, st_ctim = {tv_sec = 1583492513, tv_nsec = 260094288}, __glibc_reserved = {0, 0, 0}} tbuf = "\000\000\000\000\000\000\000\000\b\000\000\000\000\000\000\000\060\207\025\303}\177\000\000\000\000\000\000\000\000\000\000\240\221\070#\377\177\000\000\364\212\023\303}\177\000\000\b\000\000\000\000\000\000\000\230\222\070#\377\177\000\000\060\207\025\303}\177\000\000\230A8#\377\177\000\000\224A8#\377\177", '\000' <repeats 18 times>, "\230\222\070#\377\177\000\000\250\221\070#\377\177\000\000'.\n\303}\177\000\000&\260be\000\000\000\000\000C8#\377\177\000\000\300\212\225\001\000\000\000\000`B8#\377\177\000\000PB8#\377\177\000\000\230A8#\377\177\000\000X\207\025\303}\177\000\000\001", '\000' <repeats 31 times>... option_index = 0 long_options = {{name = 0x55a38cb3cfca "help", has_arg = 0, flag = 0x0, val = 104}, {name = 0x55a38cb37438 "version", has_arg = 0, flag = 0x0, val = 118}, {name = 0x55a38cb3cfcf "alias", has_arg = 1, flag = 0x0, val = 1024}, {name = 0x55a38cb3cfd5 "subst", has_arg = 1, flag = 0x0, val = 1025}, {name = 0x55a38cb3cfdb "substdef", has_arg = 1, flag = 0x0, val = 1026}, {name = 0x55a38cb3cfe4 "substdefs", has_arg = 1, flag = 0x0, val = 1027}, {name = 0x55a38cb3cfee "server-id", has_arg = 1, flag = 0x0, val = 1028}, {name = 0x0, has_arg = 0, flag = 0x0, val = 0}} __func__ = "main" [Inferior 1 (process 6049) detached] ---end 6049 -------------------------------------------------------
---start 6056 ----------------------------------------------------- [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f7dc30f4834 in __libc_recvfrom (fd=12, buf=0x7fff23383cd0, len=8, flags=0, addr=..., addrlen=0x0) at ../sysdeps/unix/sysv/linux/recvfrom.c:27 27 ../sysdeps/unix/sysv/linux/recvfrom.c: No such file or directory. #0 0x00007f7dc30f4834 in __libc_recvfrom (fd=12, buf=0x7fff23383cd0, len=8, flags=0, addr=..., addrlen=0x0) at ../sysdeps/unix/sysv/linux/recvfrom.c:27 resultvar = 18446744073709551104 sc_ret = <optimized out> #1 0x000055a38c986b74 in async_task_run (idx=1) at core/async_task.c:269 ptask = 0x2200000021 received = 1060615238 __func__ = "async_task_run" #2 0x000055a38c9859a8 in async_task_child_init (rank=0) at core/async_task.c:185 pid = 0 i = 0 __func__ = "async_task_child_init" #3 0x000055a38c8fd83d in init_child (rank=0) at core/sr_module.c:822 ret = 32 type = 0x55a38cb7d6cb "PROC_MAIN" __func__ = "init_child" #4 0x000055a38c7bc5e8 in main_loop () at main.c:1753 i = 8 pid = 6049 si = 0x0 si_desc = "udp receiver child=7 sock=xxx" nrprocs = 8 woneinit = 1 __func__ = "main_loop" #5 0x000055a38c7c4bb9 in main (argc=17, argv=0x7fff233843f8) at main.c:2802 cfg_stream = 0x55a38d35f260 c = -1 r = 0 tmp = 0x7fff23384f3f "" tmp_len = 17 port = 0 proto = 590889976 ahost = 0x0 aport = 0 options = 0x55a38cb3a618 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:" ret = -1 seed = 1148589326 rfd = 4 debug_save = 0 debug_flag = 0 dont_fork_cnt = 0 n_lst = 0x0 p = 0x0 st = {st_dev = 21, st_ino = 1930, st_nlink = 2, st_mode = 16832, st_uid = 109, st_gid = 115, __pad0 = 0, st_rdev = 0, st_size = 40, st_blksize = 4096, st_blocks = 0, st_atim = {tv_sec = 1572122046, tv_nsec = 75984014}, st_mtim = {tv_sec = 1583490964, tv_nsec = 420466802}, st_ctim = {tv_sec = 1583492513, tv_nsec = 260094288}, __glibc_reserved = {0, 0, 0}} tbuf = "\000\000\000\000\000\000\000\000\b\000\000\000\000\000\000\000\060\207\025\303}\177\000\000\000\000\000\000\000\000\000\000\240\221\070#\377\177\000\000\364\212\023\303}\177\000\000\b\000\000\000\000\000\000\000\230\222\070#\377\177\000\000\060\207\025\303}\177\000\000\230A8#\377\177\000\000\224A8#\377\177", '\000' <repeats 18 times>, "\230\222\070#\377\177\000\000\250\221\070#\377\177\000\000'.\n\303}\177\000\000&\260be\000\000\000\000\000C8#\377\177\000\000\300\212\225\001\000\000\000\000`B8#\377\177\000\000PB8#\377\177\000\000\230A8#\377\177\000\000X\207\025\303}\177\000\000\001", '\000' <repeats 31 times>... option_index = 0 long_options = {{name = 0x55a38cb3cfca "help", has_arg = 0, flag = 0x0, val = 104}, {name = 0x55a38cb37438 "version", has_arg = 0, flag = 0x0, val = 118}, {name = 0x55a38cb3cfcf "alias", has_arg = 1, flag = 0x0, val = 1024}, {name = 0x55a38cb3cfd5 "subst", has_arg = 1, flag = 0x0, val = 1025}, {name = 0x55a38cb3cfdb "substdef", has_arg = 1, flag = 0x0, val = 1026}, {name = 0x55a38cb3cfe4 "substdefs", has_arg = 1, flag = 0x0, val = 1027}, {name = 0x55a38cb3cfee "server-id", has_arg = 1, flag = 0x0, val = 1028}, {name = 0x0, has_arg = 0, flag = 0x0, val = 0}} __func__ = "main" [Inferior 1 (process 6056) detached] ---end 6056 -------------------------------------------------------
---start 6057 ----------------------------------------------------- [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f7dc30f4834 in __libc_recvfrom (fd=12, buf=0x7fff23383cd0, len=8, flags=0, addr=..., addrlen=0x0) at ../sysdeps/unix/sysv/linux/recvfrom.c:27 27 ../sysdeps/unix/sysv/linux/recvfrom.c: No such file or directory. #0 0x00007f7dc30f4834 in __libc_recvfrom (fd=12, buf=0x7fff23383cd0, len=8, flags=0, addr=..., addrlen=0x0) at ../sysdeps/unix/sysv/linux/recvfrom.c:27 resultvar = 18446744073709551104 sc_ret = <optimized out> #1 0x000055a38c986b74 in async_task_run (idx=2) at core/async_task.c:269 ptask = 0x2300000022 received = 1497148970 __func__ = "async_task_run" #2 0x000055a38c9859a8 in async_task_child_init (rank=0) at core/async_task.c:185 pid = 0 i = 1 __func__ = "async_task_child_init" #3 0x000055a38c8fd83d in init_child (rank=0) at core/sr_module.c:822 ret = 32 type = 0x55a38cb7d6cb "PROC_MAIN" __func__ = "init_child" #4 0x000055a38c7bc5e8 in main_loop () at main.c:1753 i = 8 pid = 6049 si = 0x0 si_desc = "udp receiver child=7 sock=xxx" nrprocs = 8 woneinit = 1 __func__ = "main_loop" #5 0x000055a38c7c4bb9 in main (argc=17, argv=0x7fff233843f8) at main.c:2802 cfg_stream = 0x55a38d35f260 c = -1 r = 0 tmp = 0x7fff23384f3f "" tmp_len = 17 port = 0 proto = 590889976 ahost = 0x0 aport = 0 options = 0x55a38cb3a618 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:" ret = -1 seed = 1148589326 rfd = 4 debug_save = 0 debug_flag = 0 dont_fork_cnt = 0 n_lst = 0x0 p = 0x0 st = {st_dev = 21, st_ino = 1930, st_nlink = 2, st_mode = 16832, st_uid = 109, st_gid = 115, __pad0 = 0, st_rdev = 0, st_size = 40, st_blksize = 4096, st_blocks = 0, st_atim = {tv_sec = 1572122046, tv_nsec = 75984014}, st_mtim = {tv_sec = 1583490964, tv_nsec = 420466802}, st_ctim = {tv_sec = 1583492513, tv_nsec = 260094288}, __glibc_reserved = {0, 0, 0}} tbuf = "\000\000\000\000\000\000\000\000\b\000\000\000\000\000\000\000\060\207\025\303}\177\000\000\000\000\000\000\000\000\000\000\240\221\070#\377\177\000\000\364\212\023\303}\177\000\000\b\000\000\000\000\000\000\000\230\222\070#\377\177\000\000\060\207\025\303}\177\000\000\230A8#\377\177\000\000\224A8#\377\177", '\000' <repeats 18 times>, "\230\222\070#\377\177\000\000\250\221\070#\377\177\000\000'.\n\303}\177\000\000&\260be\000\000\000\000\000C8#\377\177\000\000\300\212\225\001\000\000\000\000`B8#\377\177\000\000PB8#\377\177\000\000\230A8#\377\177\000\000X\207\025\303}\177\000\000\001", '\000' <repeats 31 times>... option_index = 0 long_options = {{name = 0x55a38cb3cfca "help", has_arg = 0, flag = 0x0, val = 104}, {name = 0x55a38cb37438 "version", has_arg = 0, flag = 0x0, val = 118}, {name = 0x55a38cb3cfcf "alias", has_arg = 1, flag = 0x0, val = 1024}, {name = 0x55a38cb3cfd5 "subst", has_arg = 1, flag = 0x0, val = 1025}, {name = 0x55a38cb3cfdb "substdef", has_arg = 1, flag = 0x0, val = 1026}, {name = 0x55a38cb3cfe4 "substdefs", has_arg = 1, flag = 0x0, val = 1027}, {name = 0x55a38cb3cfee "server-id", has_arg = 1, flag = 0x0, val = 1028}, {name = 0x0, has_arg = 0, flag = 0x0, val = 0}} __func__ = "main" [Inferior 1 (process 6057) detached] ---end 6057 -------------------------------------------------------
The process is stuck in mysql client library.
The async insert is not implemented in the db_cluster module, so at this moment, if you want it, you have to use acc directly with db_mysql module.
Cheers, Daniel
On 10.03.20 04:41, Juha Heinanen wrote:
Daniel-Constantin Mierla writes:
If you can reproduce it, watch what the timer processes do during that time frame. Get the list of processes with 'kamctl ps', then when the issue is exposed, grab the backtraces of all processes with:
kamctl trap
A file is created with the backtraces taken with gdb. Sent it over here to analyze, if you cannot spot an issue there.
Two traps were taken more than 10 sec apart when the number of transactions and shared memory usage started to grow. In both traps, process PID: 6049 DSC: timer was in the exactly same state (below).
Looks like it is stuck in an acc to db operation. How long is such an operation allowed to last before an error is generated and the timer process becomes free to do something else?
acc module db_insert_mode has value 2 and there is 2 async_workers (process numbers 6056 and 5057 below). db_mysql README does not mention if it supports async insert.
-- Juha
---start 6049 ----------------------------------------------------- [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f7dc30107e4 in __GI___poll (fds=0x7fff233831e8, nfds=1, timeout=2000) at ../sysdeps/unix/sysv/linux/poll.c:29 29 ../sysdeps/unix/sysv/linux/poll.c: No such file or directory. #0 0x00007f7dc30107e4 in __GI___poll (fds=0x7fff233831e8, nfds=1, timeout=2000) at ../sysdeps/unix/sysv/linux/poll.c:29 resultvar = 18446744073709551100 sc_ret = <optimized out> #1 0x00007f7dc2064efa in ?? () from /usr/lib/x86_64-linux-gnu/libmariadb.so.3 No symbol table info available. #2 0x00007f7dc20652da in ?? () from /usr/lib/x86_64-linux-gnu/libmariadb.so.3 No symbol table info available. #3 0x00007f7dc2071425 in mysql_get_parameters () from /usr/lib/x86_64-linux-gnu/libmariadb.so.3 No symbol table info available. #4 0x00007f7dc2071626 in mysql_get_parameters () from /usr/lib/x86_64-linux-gnu/libmariadb.so.3 No symbol table info available. #5 0x00007f7dc2067c1e in ?? () from /usr/lib/x86_64-linux-gnu/libmariadb.so.3 No symbol table info available. #6 0x00007f7dc2068654 in ?? () from /usr/lib/x86_64-linux-gnu/libmariadb.so.3 No symbol table info available. #7 0x00007f7dc206d4ed in ?? () from /usr/lib/x86_64-linux-gnu/libmariadb.so.3 No symbol table info available. #8 0x00007f7dc2070513 in ?? () from /usr/lib/x86_64-linux-gnu/libmariadb.so.3 No symbol table info available. #9 0x00007f7dc206f230 in mysql_ping () from /usr/lib/x86_64-linux-gnu/libmariadb.so.3 No symbol table info available. #10 0x00007f7dc20bbaba in db_mysql_submit_query (_h=0x7f7dc237f3a0, _s=0x7f7dc210e1e0 <sql_str>) at km_dbase.c:83 t = 1583739979 i = 0 code = 21923 __func__ = "db_mysql_submit_query" #11 0x00007f7dc20fb35e in db_do_submit_query (_h=0x7f7dc237f3a0, _query=0x7f7dc210e1e0 <sql_str>, submit_query=0x7f7dc20bb71d <db_mysql_submit_query>) at db_query.c:58 ret = -1039426148 tvb = {tv_sec = 94160937318932, tv_usec = 140178106086304} tve = {tv_sec = 200798118576, tv_usec = 1782411427864} tz = {tz_minuteswest = 24, tz_dsttime = 65147} tdiff = 32637 __func__ = "db_do_submit_query" #12 0x00007f7dc20fdd29 in db_do_insert_cmd (_h=0x7f7dc237f3a0, _k=0x7f7dc04f88a0 <db_keys>, _v=0x7f7dc04f8b80 <db_vals>, _n=24, val2str=0x7f7dc20b9d9c <db_mysql_val2str>, submit_query=0x7f7dc20bb71d <db_mysql_submit_query>, mode=0) at db_query.c:243 off = 804 ret = 415 __func__ = "db_do_insert_cmd" #13 0x00007f7dc20fe396 in db_do_insert (_h=0x7f7dc237f3a0, _k=0x7f7dc04f88a0 <db_keys>, _v=0x7f7dc04f8b80 <db_vals>, _n=24, val2str=0x7f7dc20b9d9c <db_mysql_val2str>, submit_query=0x7f7dc20bb71d <db_mysql_submit_query>) at db_query.c:258 No locals. #14 0x00007f7dc20c08a4 in db_mysql_insert (_h=0x7f7dc237f3a0, _k=0x7f7dc04f88a0 <db_keys>, _v=0x7f7dc04f8b80 <db_vals>, _n=24) at km_dbase.c:493 No locals. #15 0x00007f7dc07342a0 in db_cluster_insert (_h=0x7f7dc29bb650, _k=0x7f7dc04f88a0 <db_keys>, _v=0x7f7dc04f8b80 <db_vals>, _n=24) at dbcl_api.c:379 ret = -1 i = 5 k = 32637 sec = 36091306 rc = 0 rok = 0 j = 0 dbh = 0x7f7dc237f3a0 cls = 0x7f7dc239b980 __func__ = "db_cluster_insert" #16 0x00007f7dc04c18a0 in acc_db_request (rq=0x7f7d9e36d540) at acc.c:476 m = 24 n = -1640573632 i = 24 o = 17 t = 0x7f7d9e36e668 dtime = -4.6749039984044258e+154 __func__ = "acc_db_request" #17 0x00007f7dc04d4aca in acc_onreply (t=0x7f7d9e3b0650, req=0x7f7d9e357650, reply=0xffffffffffffffff, code=408) at acc_logic.c:604 new_uri_bk = {s = 0x7f7d9e36dc40 "sip:xxx@x.x.x.x:5060;transport=udp203.29INVITE sip:xxx@xxx.xxx SIP/2.0\r\nRecord-Route: sip:x.x.x.x;lr;did=851.8452\r\nVia: SIP/2.0/UDP x.x.x.x;branch"..., len = 50} br = 0 hdr = 0x7f7dc15f3c0f <msg_send_buffer+17006> cmsg = 0x7f7d9e36d540 cmsg_len = 6968 preq = 0x7f7d9e36d540 mstart = 0x7f7d9e36d540 mend = 0x7f7d9e36f078 __func__ = "acc_onreply" #18 0x00007f7dc04d5783 in tmcb_func (t=0x7f7d9e3b0650, type=512, ps=0x7fff23383a50) at acc_logic.c:684 __func__ = "tmcb_func" #19 0x00007f7dc15fff4d in run_trans_callbacks_internal (cb_lst=0x7f7d9e3b06c8, type=512, trans=0x7f7d9e3b0650, params=0x7fff23383a50) at t_hooks.c:254 cbp = 0x7f7d9e339720 backup_from = 0x55a38cc8dc90 <def_list+16> backup_to = 0x55a38cc8dc98 <def_list+24> backup_dom_from = 0x55a38cc8dca0 <def_list+32> backup_dom_to = 0x55a38cc8dca8 <def_list+40> backup_uri_from = 0x55a38cc8dc80 <def_list> backup_uri_to = 0x55a38cc8dc88 <def_list+8> backup_xavps = 0x55a38cce95c0 <_xavp_list_head> __func__ = "run_trans_callbacks_internal" #20 0x00007f7dc1600180 in run_trans_callbacks_with_buf (type=512, rbuf=0x7f7d9e3b0720, req=0x7f7d9e357650, repl=0xffffffffffffffff, flags=0) at t_hooks.c:297 params = {req = 0x7f7d9e357650, rpl = 0xffffffffffffffff, param = 0x7f7d9e339730, code = 408, flags = 0, branch = 0, t_rbuf = 0x7f7d9e3b0720, dst = 0x7f7d9e3b0770, send_buf = {s = 0x7f7d9e3cf530 "SIP/2.0 408 Request Timeout\r\nVia: SIP/2.0/UDP x.x.x.x;branch=z9hG4bKf819.22eaf74987b5f436d71e6e7ac8c4d607.0\r\nVia: SIP/2.0/UDP x.x.x.x:5060;rport=5060;branch=z9hG4bKac410218704\r\nFrom: "xxx"..., len = 489}} trans = 0x7f7d9e3b0650 #21 0x00007f7dc15b04f5 in relay_reply (t=0x7f7d9e3b0650, p_msg=0xffffffffffffffff, branch=0, msg_status=408, cancel_data=0x7fff23383c30, do_put_on_wait=0) at t_reply.c:2021 relay = 0 save_clone = 0 buf = 0x7f7dc271d010 "SIP/2.0 408 Request Timeout\r\nVia: SIP/2.0/UDP x.x.x.x;branch=z9hG4bKf819.22eaf74987b5f436d71e6e7ac8c4d607.0\r\nVia: SIP/2.0/UDP x.x.x.x:5060;rport=5060;branch=z9hG4bKac410218704\r\nFrom: "xxx"... res_len = 489 relayed_code = 408 relayed_msg = 0xffffffffffffffff reply_bak = 0x0 bm = {to_tag_val = {s = 0x7f7dc271d146 "f148edd50b878efc51a72f445b390303-e1110000\r\nCall-ID: 173781355593202084446@x.x.x.x\r\nCSeq: 1 INVITE\r\nServer: OpenSIPg SIP Proxy (5.3.2-6 (x86_64/linux))\r\nContent-Length: 0\r\n\r\n", len = 41}} totag_retr = 0 reply_status = RPS_COMPLETED uas_rb = 0x7f7d9e3b0720 to_tag = 0x7f7dc1677750 <tm_tag> reason = {s = 0x55a38cb96cc4 "Request Timeout", len = 15} onsend_params = {req = 0x0, rpl = 0x0, param = 0x0, code = 0, flags = 0, branch = 0, t_rbuf = 0x0, dst = 0x0, send_buf = {s = 0x7fff23383bc0 "\220<8#\377\177", len = -1}} ip = {af = 0, len = 0, u = {addrl = {xxx, xxx}, addr32 = {xxx, xxx, xxx, xxx}, addr16 = {xxx, xxx, xxx, xxx, xxx, xxx, xxx, 0}, addr = "xxx"}} __func__ = "relay_reply" #22 0x00007f7dc15ec23d in fake_reply (t=0x7f7d9e3b0650, branch=0, code=408) at timer.c:295 cancel_data = {cancel_bitmap = 0, reason = {cause = 0, u = {text = {s = 0x0, len = -1640298928}, e2e_cancel = 0x0, packed_hdrs = {s = 0x0, len = -1640298928}}}} do_cancel_branch = 1 reply_status = 32637 #23 0x00007f7dc15ec6dd in final_response_handler (r_buf=0x7f7d9e3b0878, t=0x7f7d9e3b0650) at timer.c:455 silent = 0 branch_ret = 12582912 prev_branch = 0 now = 0 #24 0x00007f7dc15ec78f in retr_buf_handler (ticks=577460907, tl=0x7f7d9e3b0898, p=0x3e8) at timer.c:511 rbuf = 0x7f7d9e3b0878 fr_remainder = 0 retr_remainder = 0 retr_interval = 0 new_retr_interval_ms = 0 crt_retr_interval_ms = 140177493637904 t = 0x7f7d9e3b0650 __func__ = "retr_buf_handler" #25 0x000055a38c85794f in timer_list_expire (t=577460907, h=0x7f7d9dbafb70, slow_l=0x7f7d9dbb0330, slow_mark=48237) at core/timer.c:857 tl = 0x7f7d9e3b0898 ret = 0 #26 0x000055a38c857df7 in timer_handler () at core/timer.c:922 saved_ticks = 577460907 run_slow_timer = 0 i = 109 __func__ = "timer_handler" #27 0x000055a38c8582c0 in timer_main () at core/timer.c:961 No locals. #28 0x000055a38c7bc306 in main_loop () at main.c:1743 i = 8 pid = 0 si = 0x0 si_desc = "udp receiver child=7 sock=xxx" nrprocs = 8 woneinit = 1 __func__ = "main_loop" #29 0x000055a38c7c4bb9 in main (argc=17, argv=0x7fff233843f8) at main.c:2802 cfg_stream = 0x55a38d35f260 c = -1 r = 0 tmp = 0x7fff23384f3f "" tmp_len = 17 port = 0 proto = 590889976 ahost = 0x0 aport = 0 options = 0x55a38cb3a618 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:" ret = -1 seed = 1148589326 rfd = 4 debug_save = 0 debug_flag = 0 dont_fork_cnt = 0 n_lst = 0x0 p = 0x0 st = {st_dev = 21, st_ino = 1930, st_nlink = 2, st_mode = 16832, st_uid = 109, st_gid = 115, __pad0 = 0, st_rdev = 0, st_size = 40, st_blksize = 4096, st_blocks = 0, st_atim = {tv_sec = 1572122046, tv_nsec = 75984014}, st_mtim = {tv_sec = 1583490964, tv_nsec = 420466802}, st_ctim = {tv_sec = 1583492513, tv_nsec = 260094288}, __glibc_reserved = {0, 0, 0}} tbuf = "\000\000\000\000\000\000\000\000\b\000\000\000\000\000\000\000\060\207\025\303}\177\000\000\000\000\000\000\000\000\000\000\240\221\070#\377\177\000\000\364\212\023\303}\177\000\000\b\000\000\000\000\000\000\000\230\222\070#\377\177\000\000\060\207\025\303}\177\000\000\230A8#\377\177\000\000\224A8#\377\177", '\000' <repeats 18 times>, "\230\222\070#\377\177\000\000\250\221\070#\377\177\000\000'.\n\303}\177\000\000&\260be\000\000\000\000\000C8#\377\177\000\000\300\212\225\001\000\000\000\000`B8#\377\177\000\000PB8#\377\177\000\000\230A8#\377\177\000\000X\207\025\303}\177\000\000\001", '\000' <repeats 31 times>... option_index = 0 long_options = {{name = 0x55a38cb3cfca "help", has_arg = 0, flag = 0x0, val = 104}, {name = 0x55a38cb37438 "version", has_arg = 0, flag = 0x0, val = 118}, {name = 0x55a38cb3cfcf "alias", has_arg = 1, flag = 0x0, val = 1024}, {name = 0x55a38cb3cfd5 "subst", has_arg = 1, flag = 0x0, val = 1025}, {name = 0x55a38cb3cfdb "substdef", has_arg = 1, flag = 0x0, val = 1026}, {name = 0x55a38cb3cfe4 "substdefs", has_arg = 1, flag = 0x0, val = 1027}, {name = 0x55a38cb3cfee "server-id", has_arg = 1, flag = 0x0, val = 1028}, {name = 0x0, has_arg = 0, flag = 0x0, val = 0}} __func__ = "main" [Inferior 1 (process 6049) detached] ---end 6049 -------------------------------------------------------
---start 6056 ----------------------------------------------------- [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f7dc30f4834 in __libc_recvfrom (fd=12, buf=0x7fff23383cd0, len=8, flags=0, addr=..., addrlen=0x0) at ../sysdeps/unix/sysv/linux/recvfrom.c:27 27 ../sysdeps/unix/sysv/linux/recvfrom.c: No such file or directory. #0 0x00007f7dc30f4834 in __libc_recvfrom (fd=12, buf=0x7fff23383cd0, len=8, flags=0, addr=..., addrlen=0x0) at ../sysdeps/unix/sysv/linux/recvfrom.c:27 resultvar = 18446744073709551104 sc_ret = <optimized out> #1 0x000055a38c986b74 in async_task_run (idx=1) at core/async_task.c:269 ptask = 0x2200000021 received = 1060615238 __func__ = "async_task_run" #2 0x000055a38c9859a8 in async_task_child_init (rank=0) at core/async_task.c:185 pid = 0 i = 0 __func__ = "async_task_child_init" #3 0x000055a38c8fd83d in init_child (rank=0) at core/sr_module.c:822 ret = 32 type = 0x55a38cb7d6cb "PROC_MAIN" __func__ = "init_child" #4 0x000055a38c7bc5e8 in main_loop () at main.c:1753 i = 8 pid = 6049 si = 0x0 si_desc = "udp receiver child=7 sock=xxx" nrprocs = 8 woneinit = 1 __func__ = "main_loop" #5 0x000055a38c7c4bb9 in main (argc=17, argv=0x7fff233843f8) at main.c:2802 cfg_stream = 0x55a38d35f260 c = -1 r = 0 tmp = 0x7fff23384f3f "" tmp_len = 17 port = 0 proto = 590889976 ahost = 0x0 aport = 0 options = 0x55a38cb3a618 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:" ret = -1 seed = 1148589326 rfd = 4 debug_save = 0 debug_flag = 0 dont_fork_cnt = 0 n_lst = 0x0 p = 0x0 st = {st_dev = 21, st_ino = 1930, st_nlink = 2, st_mode = 16832, st_uid = 109, st_gid = 115, __pad0 = 0, st_rdev = 0, st_size = 40, st_blksize = 4096, st_blocks = 0, st_atim = {tv_sec = 1572122046, tv_nsec = 75984014}, st_mtim = {tv_sec = 1583490964, tv_nsec = 420466802}, st_ctim = {tv_sec = 1583492513, tv_nsec = 260094288}, __glibc_reserved = {0, 0, 0}} tbuf = "\000\000\000\000\000\000\000\000\b\000\000\000\000\000\000\000\060\207\025\303}\177\000\000\000\000\000\000\000\000\000\000\240\221\070#\377\177\000\000\364\212\023\303}\177\000\000\b\000\000\000\000\000\000\000\230\222\070#\377\177\000\000\060\207\025\303}\177\000\000\230A8#\377\177\000\000\224A8#\377\177", '\000' <repeats 18 times>, "\230\222\070#\377\177\000\000\250\221\070#\377\177\000\000'.\n\303}\177\000\000&\260be\000\000\000\000\000C8#\377\177\000\000\300\212\225\001\000\000\000\000`B8#\377\177\000\000PB8#\377\177\000\000\230A8#\377\177\000\000X\207\025\303}\177\000\000\001", '\000' <repeats 31 times>... option_index = 0 long_options = {{name = 0x55a38cb3cfca "help", has_arg = 0, flag = 0x0, val = 104}, {name = 0x55a38cb37438 "version", has_arg = 0, flag = 0x0, val = 118}, {name = 0x55a38cb3cfcf "alias", has_arg = 1, flag = 0x0, val = 1024}, {name = 0x55a38cb3cfd5 "subst", has_arg = 1, flag = 0x0, val = 1025}, {name = 0x55a38cb3cfdb "substdef", has_arg = 1, flag = 0x0, val = 1026}, {name = 0x55a38cb3cfe4 "substdefs", has_arg = 1, flag = 0x0, val = 1027}, {name = 0x55a38cb3cfee "server-id", has_arg = 1, flag = 0x0, val = 1028}, {name = 0x0, has_arg = 0, flag = 0x0, val = 0}} __func__ = "main" [Inferior 1 (process 6056) detached] ---end 6056 -------------------------------------------------------
---start 6057 ----------------------------------------------------- [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f7dc30f4834 in __libc_recvfrom (fd=12, buf=0x7fff23383cd0, len=8, flags=0, addr=..., addrlen=0x0) at ../sysdeps/unix/sysv/linux/recvfrom.c:27 27 ../sysdeps/unix/sysv/linux/recvfrom.c: No such file or directory. #0 0x00007f7dc30f4834 in __libc_recvfrom (fd=12, buf=0x7fff23383cd0, len=8, flags=0, addr=..., addrlen=0x0) at ../sysdeps/unix/sysv/linux/recvfrom.c:27 resultvar = 18446744073709551104 sc_ret = <optimized out> #1 0x000055a38c986b74 in async_task_run (idx=2) at core/async_task.c:269 ptask = 0x2300000022 received = 1497148970 __func__ = "async_task_run" #2 0x000055a38c9859a8 in async_task_child_init (rank=0) at core/async_task.c:185 pid = 0 i = 1 __func__ = "async_task_child_init" #3 0x000055a38c8fd83d in init_child (rank=0) at core/sr_module.c:822 ret = 32 type = 0x55a38cb7d6cb "PROC_MAIN" __func__ = "init_child" #4 0x000055a38c7bc5e8 in main_loop () at main.c:1753 i = 8 pid = 6049 si = 0x0 si_desc = "udp receiver child=7 sock=xxx" nrprocs = 8 woneinit = 1 __func__ = "main_loop" #5 0x000055a38c7c4bb9 in main (argc=17, argv=0x7fff233843f8) at main.c:2802 cfg_stream = 0x55a38d35f260 c = -1 r = 0 tmp = 0x7fff23384f3f "" tmp_len = 17 port = 0 proto = 590889976 ahost = 0x0 aport = 0 options = 0x55a38cb3a618 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:" ret = -1 seed = 1148589326 rfd = 4 debug_save = 0 debug_flag = 0 dont_fork_cnt = 0 n_lst = 0x0 p = 0x0 st = {st_dev = 21, st_ino = 1930, st_nlink = 2, st_mode = 16832, st_uid = 109, st_gid = 115, __pad0 = 0, st_rdev = 0, st_size = 40, st_blksize = 4096, st_blocks = 0, st_atim = {tv_sec = 1572122046, tv_nsec = 75984014}, st_mtim = {tv_sec = 1583490964, tv_nsec = 420466802}, st_ctim = {tv_sec = 1583492513, tv_nsec = 260094288}, __glibc_reserved = {0, 0, 0}} tbuf = "\000\000\000\000\000\000\000\000\b\000\000\000\000\000\000\000\060\207\025\303}\177\000\000\000\000\000\000\000\000\000\000\240\221\070#\377\177\000\000\364\212\023\303}\177\000\000\b\000\000\000\000\000\000\000\230\222\070#\377\177\000\000\060\207\025\303}\177\000\000\230A8#\377\177\000\000\224A8#\377\177", '\000' <repeats 18 times>, "\230\222\070#\377\177\000\000\250\221\070#\377\177\000\000'.\n\303}\177\000\000&\260be\000\000\000\000\000C8#\377\177\000\000\300\212\225\001\000\000\000\000`B8#\377\177\000\000PB8#\377\177\000\000\230A8#\377\177\000\000X\207\025\303}\177\000\000\001", '\000' <repeats 31 times>... option_index = 0 long_options = {{name = 0x55a38cb3cfca "help", has_arg = 0, flag = 0x0, val = 104}, {name = 0x55a38cb37438 "version", has_arg = 0, flag = 0x0, val = 118}, {name = 0x55a38cb3cfcf "alias", has_arg = 1, flag = 0x0, val = 1024}, {name = 0x55a38cb3cfd5 "subst", has_arg = 1, flag = 0x0, val = 1025}, {name = 0x55a38cb3cfdb "substdef", has_arg = 1, flag = 0x0, val = 1026}, {name = 0x55a38cb3cfe4 "substdefs", has_arg = 1, flag = 0x0, val = 1027}, {name = 0x55a38cb3cfee "server-id", has_arg = 1, flag = 0x0, val = 1028}, {name = 0x0, has_arg = 0, flag = 0x0, val = 0}} __func__ = "main" [Inferior 1 (process 6057) detached] ---end 6057 -------------------------------------------------------
Daniel-Constantin Mierla writes:
The process is stuck in mysql client library.
The async insert is not implemented in the db_cluster module, so at this moment, if you want it, you have to use acc directly with db_mysql module.
Then how is it possible that most of the time accounting using async inserts works and get stuck only once in awhile?
And would it be possible to detect that async inserts are trying to be used when K is started and make the start to fail?
Maybe also some text about that in db_cluster module README would be useful.
-- Juha
On 10.03.20 20:07, Juha Heinanen wrote:
Daniel-Constantin Mierla writes:
The process is stuck in mysql client library.
The async insert is not implemented in the db_cluster module, so at this moment, if you want it, you have to use acc directly with db_mysql module.
Then how is it possible that most of the time accounting using async inserts works and get stuck only once in awhile?
There is no async-insert done by acc with db_cluster, it is always standard (sync) insert. You would have to track why the mysql server (or the client library) is blocking from time to time, I don't think it is something that kamailio can do.
And would it be possible to detect that async inserts are trying to be used when K is started and make the start to fail?
The fallback from async-insert to standard-insert is an acc option:
* https://www.kamailio.org/docs/modules/stable/modules/acc.html#acc.p.db_inser...
You can add a new option to db_insert_mode to fail if async-insert is not supported by the db connector.
Maybe also some text about that in db_cluster module README would be useful.
The db_cluster readme is listing the operations it can do for write-to and read-from database:
* https://www.kamailio.org/docs/modules/stable/modules/db_cluster.html#idm20
You can enhance the docs as you can consider more useful and clear.
Otherwise, I just pushed to master branch a commit implementing async-insert for db_cluster module, with fallback to standard insert. Testing and reporting back the results would be appreciated if you plan to use it in the near future. Also, if you need, you can add a mod param to have an option not to fallback to standard insert/check async-insert option at start up.
Cheers, Daniel
Daniel-Constantin Mierla writes:
There is no async-insert done by acc with db_cluster, it is always standard (sync) insert. You would have to track why the mysql server (or the client library) is blocking from time to time, I don't think it is something that kamailio can do.
OK, thanks for the explanation.
One more question: why is the timer process involved with this acc insert? Isn't accounting done by the normal worker processes that handle requests/responses?
-- Juha
Regarding db_mysql timeout_interval, it has its default value 2, which means 6 seconds. The insert was hanging in the timer process much longer and no error messages related to abort appear in syslog. How is that possible?
-- Juha
On 10.03.20 22:54, Juha Heinanen wrote:
Regarding db_mysql timeout_interval, it has its default value 2, which means 6 seconds. The insert was hanging in the timer process much longer and no error messages related to abort appear in syslog. How is that possible?
I can't say why it happens, but, iirc, in some versions of mysql client library docs is written that the timeout may fail in various cases.
Cheers, Daniel
Hello,
this could be one cause, and another - the timeout is in the end for every request. So, if you have a loaded sever with several requests waiting to be written to the server it will wait for each of of them, I think. (It has been some time that I personally tested this functionality.)
Cheers,
Henning
On 10.03.20 20:30, Juha Heinanen wrote:
Daniel-Constantin Mierla writes:
There is no async-insert done by acc with db_cluster, it is always standard (sync) insert. You would have to track why the mysql server (or the client library) is blocking from time to time, I don't think it is something that kamailio can do.
OK, thanks for the explanation.
One more question: why is the timer process involved with this acc insert? Isn't accounting done by the normal worker processes that handle requests/responses?
It seems to be the case of a retransmission timeout:
#17 0x00007f7dc04d4aca in acc_onreply (t=0x7f7d9e3b0650, req=0x7f7d9e357650, reply=0xffffffffffffffff, code=408) at acc_logic.c:604
Code is 408 and the reply is faked value. This case is happening in timer process.
Cheers, Daniel
Daniel-Constantin Mierla writes:
It seems to be the case of a retransmission timeout:
#17 0x00007f7dc04d4aca in acc_onreply (t=0x7f7d9e3b0650, req=0x7f7d9e357650, reply=0xffffffffffffffff, code=408) at acc_logic.c:604
Code is 408 and the reply is faked value. This case is happening in timer process.
That explains it. But isn't it risky that in this kind of situation the timer process (the only one) handles the reply and accounting?
The problem is related to db_cluster/mariadb/debian. If db_cluster is not used, everything works fine. With db_cluster, accounting hangs the timer process at regular (about 2 hour) intervals.
-- Juha
On 11.03.20 09:04, Juha Heinanen wrote:
Daniel-Constantin Mierla writes:
It seems to be the case of a retransmission timeout:
#17 0x00007f7dc04d4aca in acc_onreply (t=0x7f7d9e3b0650, req=0x7f7d9e357650, reply=0xffffffffffffffff, code=408) at acc_logic.c:604
Code is 408 and the reply is faked value. This case is happening in timer process.
That explains it. But isn't it risky that in this kind of situation the timer process (the only one) handles the reply and accounting?
There are many cases when delays can increase the risk of malfunctioning, no matter is in timer module or a sip routing worker. If that a process is blocked, slots on internal hash tables (e.g., user location) can be locked and no other process can continue processing until that process unlocks. Interaction with external systems such as database, api servers, dns service ... are the typical candidates for adding significant delay. For specific deployments, there are some solutions to do as less as possible blocking operations, but it would be probably impossible to do it everywhere when dealing with external systems. Such example is even the async-insert added to db_mysql quite some time ago, or mqueue+rtimer or async modules.
The problem is related to db_cluster/mariadb/debian. If db_cluster is not used, everything works fine. With db_cluster, accounting hangs the timer process at regular (about 2 hour) intervals.
If it happens periodically, maybe you can track why: try to identify apps accessing the database for back up, cdr generation, etc ... as well as infrastructure maintenance operations (vm backup snapshot).
Cheers, Daniel
Daniel-Constantin Mierla writes:
If it happens periodically, maybe you can track why: try to identify apps accessing the database for back up, cdr generation, etc ... as well as infrastructure maintenance operations (vm backup snapshot).
DB access works fine from K worker processes during the time when insert is stuck in timer process.
-- Juha
On 11.03.20 20:12, Juha Heinanen wrote:
Daniel-Constantin Mierla writes:
If it happens periodically, maybe you can track why: try to identify apps accessing the database for back up, cdr generation, etc ... as well as infrastructure maintenance operations (vm backup snapshot).
DB access works fine from K worker processes during the time when insert is stuck in timer process.
Are new records inserted in acc table? I think there are some tools/commands for mysql to inspect the state a database table, check acc table and missed_calls, if you also use this one.
Cheers, Daniel
Daniel-Constantin Mierla writes:
DB access works fine from K worker processes during the time when insert is stuck in timer process.
Are new records inserted in acc table? I think there are some tools/commands for mysql to inspect the state a database table, check acc table and missed_calls, if you also use this one.
Accounting records are inserted without problems from other calls to the same accounting table where the timer process tries to insert its record but hangs.
-- Juha
On 12.03.20 09:09, Juha Heinanen wrote:
Daniel-Constantin Mierla writes:
DB access works fine from K worker processes during the time when insert is stuck in timer process.
Are new records inserted in acc table? I think there are some tools/commands for mysql to inspect the state a database table, check acc table and missed_calls, if you also use this one.
Accounting records are inserted without problems from other calls to the same accounting table where the timer process tries to insert its record but hangs.
Try to install debugging symbols for libmariadb and maybe there will be more hints in the backtace of what it does internally, which can improve troubleshooting by searching on the web for similar cases.
And, as I said in previous email, try to investigate the status of mysql server with specific commands and tools, if you can find any out there.
From mysql client API, we set the query timeout options.
Otherwise, I haven't encountered such a situation yet to have any other hint to share specific for it.
Maybe you can try to use IP sockets for connecting to mysql server, if the connection is done over unixsocket file. In some older versions, it was mentioned that timeout options may not work if not on an IP connection.
What you can also try, is to change the database table type. What do you have now, MyISAM, InnoDB or something else?
To know better the ecosystem, what is the operation system and mysq/mariadb versions?
Cheers, Daniel
Daniel-Constantin Mierla writes:
Try to install debugging symbols for libmariadb and maybe there will be more hints in the backtace of what it does internally, which can improve troubleshooting by searching on the web for similar cases.
I did that already earlier when I reported about similar issue, which made K totally unresponsive for a long period of time. Nothing was found from the backtrace.
What you can also try, is to change the database table type. What do you have now, MyISAM, InnoDB or something else?
DB type is InnoDB.
To know better the ecosystem, what is the operation system and mysq/mariadb versions?
Latest stable Debian 10 that comes with mariadb 10.3.
On the positive side, so far the issue has not appeared after I backported your db_cluster async insert patch to 5.3. Perhaps there was some issue in falling back to normal insert.
-- Juha
Hi Juha,
about the timeout question regarding DB Operation - in case of MySQL it can be configured (but still not that much helpful on loaded systems due to the seconds granularity) https://kamailio.org/docs/modules/devel/modules/db_mysql.html#idm61
Cheers,
Henning