Hi Daniel,
I wanted to go back to this thread for a minute:
http://lists.sip-router.org/pipermail/sr-users/2012-June/073663.html
Sorry that I didn't reply to the thread; I no longer have the last message.
In your reply, you said:
there is no timeout on the execution of that task. Of course, you can start many rtimer processes to do same task, up to you, via configuration parameters. A matter of how rtimer executed route is built, a rtimer process may not rest (e.g., sleep) and keep consuming mqueue items as long as the queue is not empty."
However, I am seeing some behaviour that is rather different.
Roughly, I have a single rtimer task that runs every 3 seconds and consumes an mqueue full of accounting events. The rtimer request route looks like something this:
route[ACCT_WRITER] { while(mq_fetch("acct_write")) { # Some stuff to write $mqv(...) values to PostgreSQL.
...
sql_query("ca", "SELECT * FROM acct_event_proc(...)"); ...
mq_pv_free("account_write"); } }
And the rtimer task is parameterised thusly:
modparam("rtimer", "timer", "name=acct_write;interval=3;mode=1") modparam("rtimer", "exec", "timer=acct_write;route=ACCT_WRITER")
And the mqueue:
modparam("mqueue", "mqueue", "name=acct_write")
I have a stored procedure - let's call it acct_event_proc() - that is invoked from within the rtimer task. When I looked at the live query activity from Kamailio to Postgres by doing a packet capture on loopback, like this:
tcpdump -i any -A -s 0 -tttt -n 'tcp port 5432' | grep acct_event_proc
What I saw was some events being written for about ~3 seconds and then the process would stop. Then, after about half a second, it would burst again for 3 seconds, then stop again. And so on and so forth.
When I lowered the interval to 1 second, it got faster and more responsive, and would push more events through in one interval. But still, it appeared to be a limited "burst" of SELECT calls punctuated by 1 second of silence.
When I checked out branch 3.3 and lowered the interval to 50000u, it got even faster, almost real-time enough to keep up. Still, I did not really know what was happening, so I added an MI command to HEAD to check mqueue size live:
http://git.sip-router.org/cgi-bin/gitweb.cgi?p=sip-router;a=commit;h=41f7715...
(I think it would be quite useful anyhow, as there is no runtime mqueue info available right now.)
Using it, I confirmed my impression: the bigger the rtimer interval, the larger the queue grows, and the slower the events seem to be written overall. The shorter the rtimer interval, the faster they are dequeued, and the smaller the mqueue is overall, in the course of normal call processing.
Thus, my impression is that the execution time of the rtimer task is bounded by the interval, and that the core kills the forked rtimer task currently running prior to starting another one.
Can you shed some light?
Thanks!
-- Alex
I am wondering if the execution time of a route is somehow bounded in action.c/run_actions(). Unfortunately, I don't enough about how it works to figure it out.
On 07/23/2012 08:38 PM, Alex Balashov wrote:
Hi Daniel,
I wanted to go back to this thread for a minute:
http://lists.sip-router.org/pipermail/sr-users/2012-June/073663.html
Sorry that I didn't reply to the thread; I no longer have the last message.
In your reply, you said:
there is no timeout on the execution of that task. Of course, you can start many rtimer processes to do same task, up to you, via configuration parameters. A matter of how rtimer executed route is built, a rtimer process may not rest (e.g., sleep) and keep consuming mqueue items as long as the queue is not empty."
However, I am seeing some behaviour that is rather different.
Roughly, I have a single rtimer task that runs every 3 seconds and consumes an mqueue full of accounting events. The rtimer request route looks like something this:
route[ACCT_WRITER] { while(mq_fetch("acct_write")) { # Some stuff to write $mqv(...) values to PostgreSQL. ... sql_query("ca", "SELECT * FROM acct_event_proc(...)"); ... mq_pv_free("account_write"); } }
And the rtimer task is parameterised thusly:
modparam("rtimer", "timer", "name=acct_write;interval=3;mode=1") modparam("rtimer", "exec", "timer=acct_write;route=ACCT_WRITER")
And the mqueue:
modparam("mqueue", "mqueue", "name=acct_write")
I have a stored procedure - let's call it acct_event_proc() - that is invoked from within the rtimer task. When I looked at the live query activity from Kamailio to Postgres by doing a packet capture on loopback, like this:
tcpdump -i any -A -s 0 -tttt -n 'tcp port 5432' | grep acct_event_proc
What I saw was some events being written for about ~3 seconds and then the process would stop. Then, after about half a second, it would burst again for 3 seconds, then stop again. And so on and so forth.
When I lowered the interval to 1 second, it got faster and more responsive, and would push more events through in one interval. But still, it appeared to be a limited "burst" of SELECT calls punctuated by 1 second of silence.
When I checked out branch 3.3 and lowered the interval to 50000u, it got even faster, almost real-time enough to keep up. Still, I did not really know what was happening, so I added an MI command to HEAD to check mqueue size live:
http://git.sip-router.org/cgi-bin/gitweb.cgi?p=sip-router;a=commit;h=41f7715...
(I think it would be quite useful anyhow, as there is no runtime mqueue info available right now.)
Using it, I confirmed my impression: the bigger the rtimer interval, the larger the queue grows, and the slower the events seem to be written overall. The shorter the rtimer interval, the faster they are dequeued, and the smaller the mqueue is overall, in the course of normal call processing.
Thus, my impression is that the execution time of the rtimer task is bounded by the interval, and that the core kills the forked rtimer task currently running prior to starting another one.
Can you shed some light?
Thanks!
-- Alex
On 07/23/2012 08:38 PM, Alex Balashov wrote:
mq_pv_free("account_write");
This is a typo, btw.
I did an experiment to try to simplify the problem, and rule out mqueue concurrency/locking issues:
modparam("pv", "shvset", "lasttm=i:-1")
modparam("rtimer", "timer", "name=test;interval=1;mode=1") modparam("rtimer", "exec", "timer=test;route=TEST_ROUTE")
route[TEST_ROUTE] { while(1) { if($shv(lasttm) < 0) $shv(lasttm) = $TV(sn); else { $var(gap) = $TV(sn) - $shv(lasttm);
if($var(gap) >= 1) { xlog("L_INFO", "Presented time gap: $var(gap)\n"); } }
$shv(lasttm) = $TV(sn); } }
And here's what I saw in the log:
Jul 23 21:10:07 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:12 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:13 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:16 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:18 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:21 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:23 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:29 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:30 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:34 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:36 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:38 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:39 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:40 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:43 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1
When I repeated the experiment with interval=3:
Jul 23 21:11:51 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:11:52 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:11:54 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:11:55 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:11:56 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:11:59 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:12:03 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:12:04 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1
So, something is definitely interrupting the possibility of continuous execution of these rtimer tasks.
I think some sort of execution time governor in the actions infrastructure is probably the right angle here.
I added a log statement to modules_k/rtimer/rtimer_mod.c:stm_timer_exec():
for(rt=it->rt; rt; rt=rt->next) { fmsg = faked_msg_next(); if (exec_pre_script_cb(fmsg, REQUEST_CB_TYPE)==0 ) continue; /* drop the request */ set_route_type(REQUEST_ROUTE); run_top_route(main_rt.rlist[rt->route], fmsg, 0); LM_ERR("Fell out of route execution: %u\n", time(NULL)); exec_post_script_cb(fmsg, REQUEST_CB_TYPE); }
And here's what I saw, with the previous test task (quoted in the message above which I am posting):
Jul 23 21:20:15 diminuendo-1 /usr/local/sbin/kamailio[26255]: ERROR: rtimer [rtimer_mod.c:192]: Fell out of route execution: 1343092815 Jul 23 21:20:19 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:20 diminuendo-1 /usr/local/sbin/kamailio[26255]: ERROR: rtimer [rtimer_mod.c:192]: Fell out of route execution: 1343092820 Jul 23 21:20:22 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:24 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:25 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:25 diminuendo-1 /usr/local/sbin/kamailio[26255]: ERROR: rtimer [rtimer_mod.c:192]: Fell out of route execution: 1343092825 Jul 23 21:20:28 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:29 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:30 diminuendo-1 /usr/local/sbin/kamailio[26255]: ERROR: rtimer [rtimer_mod.c:192]: Fell out of route execution: 1343092830 Jul 23 21:20:31 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:35 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:35 diminuendo-1 /usr/local/sbin/kamailio[26255]: ERROR: rtimer [rtimer_mod.c:192]: Fell out of route execution: 1343092835 Jul 23 21:20:36 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:37 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:38 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1
Shouldn't that while() loop in the script run to infinity? (max_while_loops is set to 0)
On 07/23/2012 09:12 PM, Alex Balashov wrote:
I did an experiment to try to simplify the problem, and rule out mqueue concurrency/locking issues:
modparam("pv", "shvset", "lasttm=i:-1") modparam("rtimer", "timer", "name=test;interval=1;mode=1") modparam("rtimer", "exec", "timer=test;route=TEST_ROUTE") route[TEST_ROUTE] { while(1) { if($shv(lasttm) < 0) $shv(lasttm) = $TV(sn); else { $var(gap) = $TV(sn) - $shv(lasttm); if($var(gap) >= 1) { xlog("L_INFO", "Presented time gap:
$var(gap)\n"); } }
$shv(lasttm) = $TV(sn); } }
And here's what I saw in the log:
Jul 23 21:10:07 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:12 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:13 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:16 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:18 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:21 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:23 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:29 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:30 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:34 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:36 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:38 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:39 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:40 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:43 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1
When I repeated the experiment with interval=3:
Jul 23 21:11:51 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:11:52 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:11:54 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:11:55 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:11:56 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:11:59 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:12:03 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:12:04 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1
So, something is definitely interrupting the possibility of continuous execution of these rtimer tasks.
Yeah, something's going on interrupting the execution of the route. I set the interval to 1:
modparam("rtimer", "timer", "name=test;interval=1;mode=1")
And then added this to the infinite for() loop in timer_proc.c:fork_basic_timer(), after the call to the timer_function:
LM_ERR("Bottomed out of timer loop\n"); if(tm_last > 0) { tm = time(NULL); LM_ERR("Execution gap: %d\n", tm - tm_last); }
tm_last = time(0); }
And here's what I got:
Jul 23 21:59:19 diminuendo-1 /usr/local/sbin/kamailio[2686]: INFO: Presented time gap: 1 Jul 23 21:59:20 diminuendo-1 /usr/local/sbin/kamailio[2686]: INFO: Presented time gap: 1 Jul 23 21:59:22 diminuendo-1 /usr/local/sbin/kamailio[2686]: INFO: Presented time gap: 1 Jul 23 21:59:22 diminuendo-1 /usr/local/sbin/kamailio[2687]: ERROR: rtimer [rtimer_mod.c:194]: Bottomed out of top route execution Jul 23 21:59:22 diminuendo-1 /usr/local/sbin/kamailio[2687]: ERROR: <core> [timer_proc.c:94]: Bottomed out of timer loop Jul 23 21:59:23 diminuendo-1 /usr/local/sbin/kamailio[2686]: INFO: Presented time gap: 1 Jul 23 21:59:25 diminuendo-1 /usr/local/sbin/kamailio[2686]: INFO: Presented time gap: 1 Jul 23 21:59:26 diminuendo-1 /usr/local/sbin/kamailio[2686]: INFO: Presented time gap: 1 Jul 23 21:59:27 diminuendo-1 /usr/local/sbin/kamailio[2687]: ERROR: rtimer [rtimer_mod.c:194]: Bottomed out of top route execution Jul 23 21:59:27 diminuendo-1 /usr/local/sbin/kamailio[2687]: ERROR: <core> [timer_proc.c:94]: Bottomed out of timer loop Jul 23 21:59:27 diminuendo-1 /usr/local/sbin/kamailio[2687]: ERROR: <core> [timer_proc.c:99]: Execution gap: 5 Jul 23 21:59:32 diminuendo-1 /usr/local/sbin/kamailio[2687]: ERROR: rtimer [rtimer_mod.c:194]: Bottomed out of top route execution Jul 23 21:59:32 diminuendo-1 /usr/local/sbin/kamailio[2687]: ERROR: <core> [timer_proc.c:94]: Bottomed out of timer loop Jul 23 21:59:32 diminuendo-1 /usr/local/sbin/kamailio[2687]: ERROR: <core> [timer_proc.c:99]: Execution gap: 5
There is definitely something dependably interrupting the execution once in a while (every 5 sec).
On 07/23/2012 09:23 PM, Alex Balashov wrote:
I think some sort of execution time governor in the actions infrastructure is probably the right angle here.
I added a log statement to modules_k/rtimer/rtimer_mod.c:stm_timer_exec():
for(rt=it->rt; rt; rt=rt->next) { fmsg = faked_msg_next(); if (exec_pre_script_cb(fmsg, REQUEST_CB_TYPE)==0 ) continue; /* drop the request */ set_route_type(REQUEST_ROUTE); run_top_route(main_rt.rlist[rt->route], fmsg, 0); LM_ERR("Fell out of route execution: %u\n", time(NULL)); exec_post_script_cb(fmsg, REQUEST_CB_TYPE); }
And here's what I saw, with the previous test task (quoted in the message above which I am posting):
Jul 23 21:20:15 diminuendo-1 /usr/local/sbin/kamailio[26255]: ERROR: rtimer [rtimer_mod.c:192]: Fell out of route execution: 1343092815 Jul 23 21:20:19 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:20 diminuendo-1 /usr/local/sbin/kamailio[26255]: ERROR: rtimer [rtimer_mod.c:192]: Fell out of route execution: 1343092820 Jul 23 21:20:22 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:24 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:25 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:25 diminuendo-1 /usr/local/sbin/kamailio[26255]: ERROR: rtimer [rtimer_mod.c:192]: Fell out of route execution: 1343092825 Jul 23 21:20:28 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:29 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:30 diminuendo-1 /usr/local/sbin/kamailio[26255]: ERROR: rtimer [rtimer_mod.c:192]: Fell out of route execution: 1343092830 Jul 23 21:20:31 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:35 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:35 diminuendo-1 /usr/local/sbin/kamailio[26255]: ERROR: rtimer [rtimer_mod.c:192]: Fell out of route execution: 1343092835 Jul 23 21:20:36 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:37 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:38 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1
Shouldn't that while() loop in the script run to infinity? (max_while_loops is set to 0)
On 07/23/2012 09:12 PM, Alex Balashov wrote:
I did an experiment to try to simplify the problem, and rule out mqueue concurrency/locking issues:
modparam("pv", "shvset", "lasttm=i:-1") modparam("rtimer", "timer", "name=test;interval=1;mode=1") modparam("rtimer", "exec", "timer=test;route=TEST_ROUTE") route[TEST_ROUTE] { while(1) { if($shv(lasttm) < 0) $shv(lasttm) = $TV(sn); else { $var(gap) = $TV(sn) - $shv(lasttm); if($var(gap) >= 1) { xlog("L_INFO", "Presented time gap:
$var(gap)\n"); } }
$shv(lasttm) = $TV(sn); } }
And here's what I saw in the log:
Jul 23 21:10:07 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:12 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:13 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:16 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:18 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:21 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:23 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:29 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:30 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:34 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:36 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:38 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:39 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:40 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:43 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1
When I repeated the experiment with interval=3:
Jul 23 21:11:51 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:11:52 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:11:54 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:11:55 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:11:56 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:11:59 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:12:03 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:12:04 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1
So, something is definitely interrupting the possibility of continuous execution of these rtimer tasks.
Okay, ignore this bit. I had a conflicting timer that I forgot about that was engaged at a 5 sec interval, thus polluting the results. When I run my pure while() loop test routine and only it, it does not demonstrate this loop-breaking behaviour.
It still doesn't explain what is going on with my mqueue-consuming DB loop, though. One easy explanation would be that mq_fetch() returns 0 at some point, and the while() loop bottoms out, and doesn't run again for 'interval' (u)sec. But then, why would I be showing the mqueue as growing and growing, not at 0 elements?
On 07/23/2012 10:02 PM, Alex Balashov wrote:
Yeah, something's going on interrupting the execution of the route. I set the interval to 1:
modparam("rtimer", "timer", "name=test;interval=1;mode=1")
And then added this to the infinite for() loop in timer_proc.c:fork_basic_timer(), after the call to the timer_function:
LM_ERR("Bottomed out of timer loop\n"); if(tm_last > 0) { tm = time(NULL); LM_ERR("Execution gap: %d\n", tm - tm_last); } tm_last = time(0); }
And here's what I got:
Jul 23 21:59:19 diminuendo-1 /usr/local/sbin/kamailio[2686]: INFO: Presented time gap: 1 Jul 23 21:59:20 diminuendo-1 /usr/local/sbin/kamailio[2686]: INFO: Presented time gap: 1 Jul 23 21:59:22 diminuendo-1 /usr/local/sbin/kamailio[2686]: INFO: Presented time gap: 1 Jul 23 21:59:22 diminuendo-1 /usr/local/sbin/kamailio[2687]: ERROR: rtimer [rtimer_mod.c:194]: Bottomed out of top route execution Jul 23 21:59:22 diminuendo-1 /usr/local/sbin/kamailio[2687]: ERROR: <core> [timer_proc.c:94]: Bottomed out of timer loop Jul 23 21:59:23 diminuendo-1 /usr/local/sbin/kamailio[2686]: INFO: Presented time gap: 1 Jul 23 21:59:25 diminuendo-1 /usr/local/sbin/kamailio[2686]: INFO: Presented time gap: 1 Jul 23 21:59:26 diminuendo-1 /usr/local/sbin/kamailio[2686]: INFO: Presented time gap: 1 Jul 23 21:59:27 diminuendo-1 /usr/local/sbin/kamailio[2687]: ERROR: rtimer [rtimer_mod.c:194]: Bottomed out of top route execution Jul 23 21:59:27 diminuendo-1 /usr/local/sbin/kamailio[2687]: ERROR: <core> [timer_proc.c:94]: Bottomed out of timer loop Jul 23 21:59:27 diminuendo-1 /usr/local/sbin/kamailio[2687]: ERROR: <core> [timer_proc.c:99]: Execution gap: 5 Jul 23 21:59:32 diminuendo-1 /usr/local/sbin/kamailio[2687]: ERROR: rtimer [rtimer_mod.c:194]: Bottomed out of top route execution Jul 23 21:59:32 diminuendo-1 /usr/local/sbin/kamailio[2687]: ERROR: <core> [timer_proc.c:94]: Bottomed out of timer loop Jul 23 21:59:32 diminuendo-1 /usr/local/sbin/kamailio[2687]: ERROR: <core> [timer_proc.c:99]: Execution gap: 5
There is definitely something dependably interrupting the execution once in a while (every 5 sec).
On 07/23/2012 09:23 PM, Alex Balashov wrote:
I think some sort of execution time governor in the actions infrastructure is probably the right angle here.
I added a log statement to modules_k/rtimer/rtimer_mod.c:stm_timer_exec():
for(rt=it->rt; rt; rt=rt->next) { fmsg = faked_msg_next(); if (exec_pre_script_cb(fmsg, REQUEST_CB_TYPE)==0 ) continue; /* drop the request */ set_route_type(REQUEST_ROUTE); run_top_route(main_rt.rlist[rt->route], fmsg, 0); LM_ERR("Fell out of route execution: %u\n", time(NULL)); exec_post_script_cb(fmsg, REQUEST_CB_TYPE); }
And here's what I saw, with the previous test task (quoted in the message above which I am posting):
Jul 23 21:20:15 diminuendo-1 /usr/local/sbin/kamailio[26255]: ERROR: rtimer [rtimer_mod.c:192]: Fell out of route execution: 1343092815 Jul 23 21:20:19 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:20 diminuendo-1 /usr/local/sbin/kamailio[26255]: ERROR: rtimer [rtimer_mod.c:192]: Fell out of route execution: 1343092820 Jul 23 21:20:22 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:24 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:25 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:25 diminuendo-1 /usr/local/sbin/kamailio[26255]: ERROR: rtimer [rtimer_mod.c:192]: Fell out of route execution: 1343092825 Jul 23 21:20:28 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:29 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:30 diminuendo-1 /usr/local/sbin/kamailio[26255]: ERROR: rtimer [rtimer_mod.c:192]: Fell out of route execution: 1343092830 Jul 23 21:20:31 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:35 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:35 diminuendo-1 /usr/local/sbin/kamailio[26255]: ERROR: rtimer [rtimer_mod.c:192]: Fell out of route execution: 1343092835 Jul 23 21:20:36 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:37 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1 Jul 23 21:20:38 diminuendo-1 /usr/local/sbin/kamailio[26254]: INFO: Presented time gap: 1
Shouldn't that while() loop in the script run to infinity? (max_while_loops is set to 0)
On 07/23/2012 09:12 PM, Alex Balashov wrote:
I did an experiment to try to simplify the problem, and rule out mqueue concurrency/locking issues:
modparam("pv", "shvset", "lasttm=i:-1") modparam("rtimer", "timer", "name=test;interval=1;mode=1") modparam("rtimer", "exec", "timer=test;route=TEST_ROUTE") route[TEST_ROUTE] { while(1) { if($shv(lasttm) < 0) $shv(lasttm) = $TV(sn); else { $var(gap) = $TV(sn) - $shv(lasttm); if($var(gap) >= 1) { xlog("L_INFO", "Presented time gap:
$var(gap)\n"); } }
$shv(lasttm) = $TV(sn); } }
And here's what I saw in the log:
Jul 23 21:10:07 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:12 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:13 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:16 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:18 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:21 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:23 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:29 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:30 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:34 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:36 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:38 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:39 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:40 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1 Jul 23 21:10:43 diminuendo-1 /usr/local/sbin/kamailio[26073]: INFO: Presented time gap: 1
When I repeated the experiment with interval=3:
Jul 23 21:11:51 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:11:52 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:11:54 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:11:55 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:11:56 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:11:59 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:12:03 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1 Jul 23 21:12:04 diminuendo-1 /usr/local/sbin/kamailio[26132]: INFO: Presented time gap: 1
So, something is definitely interrupting the possibility of continuous execution of these rtimer tasks.
On 07/23/2012 08:38 PM, Alex Balashov wrote:
route[ACCT_WRITER] { while(mq_fetch("acct_write")) { # Some stuff to write $mqv(...) values to PostgreSQL. ... sql_query("ca", "SELECT * FROM acct_event_proc(...)"); ... mq_pv_free("acct_write"); } }
I added the mq_size() function in script:
http://git.sip-router.org/cgi-bin/gitweb.cgi?p=sip-router;a=commit;h=f492b41...
Then, added this to the bottom of this route:
$var(sz) = mq_size("acct_write");
if($var(sz) > 0) xlog("L_INFO", "Exiting request route with mq size: $var(sz)\n");
And, I'm seeing this in the log, running at a 1-second interval:
Jul 24 22:41:59 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 229 Jul 24 22:42:00 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 82 Jul 24 22:42:05 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 328 Jul 24 22:42:11 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 689 Jul 24 22:42:15 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 946 Jul 24 22:42:22 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1586 Jul 24 22:42:27 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1783 Jul 24 22:42:32 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1964 Jul 24 22:42:37 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2346 Jul 24 22:42:39 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2323 Jul 24 22:42:41 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2210 Jul 24 22:42:45 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2224 Jul 24 22:42:46 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2056 Jul 24 22:42:47 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1904 Jul 24 22:42:48 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1798 Jul 24 22:42:49 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1619 Jul 24 22:42:53 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1651 Jul 24 22:42:54 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1467 Jul 24 22:42:55 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1378 Jul 24 22:42:56 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1206 Jul 24 22:42:59 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1220 Jul 24 22:43:04 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1542 Jul 24 22:43:08 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1733 Jul 24 22:43:13 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1846 Jul 24 22:43:21 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2437 Jul 24 22:43:26 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2751 Jul 24 22:43:29 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2734 Jul 24 22:43:33 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2778 Jul 24 22:43:35 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2742 Jul 24 22:43:37 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2582 Jul 24 22:43:42 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2828 Jul 24 22:43:44 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2663 Jul 24 22:43:45 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2490 Jul 24 22:43:46 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2339 Jul 24 22:43:47 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2153 Jul 24 22:43:49 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2022 Jul 24 22:43:53 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2182 Jul 24 22:43:54 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2002 Jul 24 22:43:55 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1813 Jul 24 22:43:57 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1638 Jul 24 22:43:58 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1439 Jul 24 22:44:02 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1452 Jul 24 22:44:06 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1640 Jul 24 22:44:11 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1836 Jul 24 22:44:17 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2120 Jul 24 22:44:22 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2156 Jul 24 22:44:27 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2495
So, perhaps my issue comes down to the while() loop breaking before the queue is truly empty. That is, unless those thousands of records are somehow showing up between the while() condition evaluation and my log statement, which seems quite unlikely. Any way to fix that?
Hello,
On 7/24/12 4:46 AM, Alex Balashov wrote:
On 07/23/2012 08:38 PM, Alex Balashov wrote:
route[ACCT_WRITER] { while(mq_fetch("acct_write")) { # Some stuff to write $mqv(...) values to PostgreSQL. ... sql_query("ca", "SELECT * FROM acct_event_proc(...)"); ... mq_pv_free("acct_write"); } }
I added the mq_size() function in script:
http://git.sip-router.org/cgi-bin/gitweb.cgi?p=sip-router;a=commit;h=f492b41...
Then, added this to the bottom of this route:
$var(sz) = mq_size("acct_write"); if($var(sz) > 0) xlog("L_INFO", "Exiting request route with mq size:
$var(sz)\n");
I assume the xlog message above is starting ' Executing ...' as showed in next logs.
Have you tried it with the log message in the sources to see if it goes after the C code line:
run_top_route(main_rt.rlist[rt->route], fmsg, 0);
?
The way mqueue is consuming items from queue does not ensure queue locking until all items are consumed. So the first item there is removed from queue and cloned in private memory where is accessible via PV, then the queue is unlocked so others can consume or add to it.
Not being in office for few days, I cannot run tests myself ... do you have heavy writers? Can you add few more timer processes to consume from queue and see what happens?
Cheers, Daniel
And, I'm seeing this in the log, running at a 1-second interval:
Jul 24 22:41:59 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 229 Jul 24 22:42:00 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 82 Jul 24 22:42:05 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 328 Jul 24 22:42:11 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 689 Jul 24 22:42:15 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 946 Jul 24 22:42:22 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1586 Jul 24 22:42:27 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1783 Jul 24 22:42:32 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1964 Jul 24 22:42:37 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2346 Jul 24 22:42:39 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2323 Jul 24 22:42:41 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2210 Jul 24 22:42:45 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2224 Jul 24 22:42:46 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2056 Jul 24 22:42:47 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1904 Jul 24 22:42:48 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1798 Jul 24 22:42:49 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1619 Jul 24 22:42:53 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1651 Jul 24 22:42:54 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1467 Jul 24 22:42:55 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1378 Jul 24 22:42:56 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1206 Jul 24 22:42:59 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1220 Jul 24 22:43:04 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1542 Jul 24 22:43:08 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1733 Jul 24 22:43:13 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1846 Jul 24 22:43:21 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2437 Jul 24 22:43:26 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2751 Jul 24 22:43:29 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2734 Jul 24 22:43:33 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2778 Jul 24 22:43:35 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2742 Jul 24 22:43:37 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2582 Jul 24 22:43:42 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2828 Jul 24 22:43:44 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2663 Jul 24 22:43:45 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2490 Jul 24 22:43:46 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2339 Jul 24 22:43:47 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2153 Jul 24 22:43:49 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2022 Jul 24 22:43:53 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2182 Jul 24 22:43:54 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2002 Jul 24 22:43:55 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1813 Jul 24 22:43:57 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1638 Jul 24 22:43:58 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1439 Jul 24 22:44:02 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1452 Jul 24 22:44:06 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1640 Jul 24 22:44:11 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 1836 Jul 24 22:44:17 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2120 Jul 24 22:44:22 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2156 Jul 24 22:44:27 diminuendo-1 /usr/local/sbin/kamailio[31118]: INFO: Executing request route with mq size: 2495
So, perhaps my issue comes down to the while() loop breaking before the queue is truly empty. That is, unless those thousands of records are somehow showing up between the while() condition evaluation and my log statement, which seems quite unlikely. Any way to fix that?
On 07/24/2012 02:59 AM, Daniel-Constantin Mierla wrote:
The way mqueue is consuming items from queue does not ensure queue locking until all items are consumed. So the first item there is removed from queue and cloned in private memory where is accessible via PV, then the queue is unlocked so others can consume or add to it.
Not being in office for few days, I cannot run tests myself ... do you have heavy writers? Can you add few more timer processes to consume from queue and see what happens?
I figured it out; Alex Hermann nailed it. The number of accounting events I was generating was tripping max_while_loops. I thought I had it set high enough, but I guess I didn't. I didn't catch the log message because it was being lost in thousands of lines per second of logging.
Sorry for the noise and the idiocy on my part!
-- Alex
On 07/25/2012 03:32 AM, Alex Balashov wrote:
On 07/24/2012 02:59 AM, Daniel-Constantin Mierla wrote:
The way mqueue is consuming items from queue does not ensure queue locking until all items are consumed. So the first item there is removed from queue and cloned in private memory where is accessible via PV, then the queue is unlocked so others can consume or add to it.
Not being in office for few days, I cannot run tests myself ... do you have heavy writers? Can you add few more timer processes to consume from queue and see what happens?
I figured it out; Alex Hermann nailed it. The number of accounting events I was generating was tripping max_while_loops. I thought I had it set high enough, but I guess I didn't. I didn't catch the log message because it was being lost in thousands of lines per second of logging.
There is an occasional situation where the queue empties out, while(mq_fetch(...)) returns false, and an instant later, something is added back to the queue, but now we must wait 1 rtimer task interval for event writing to resume.
However, the ability to do microsecond-based intervals in Kamailio 3.3 has made this problem largely irrelevant. :-)
On Tuesday 24 July 2012 04:46:43 Alex Balashov wrote:
So, perhaps my issue comes down to the while() loop breaking before the queue is truly empty. That is, unless those thousands of records are somehow showing up between the while() condition evaluation and my log statement, which seems quite unlikely. Any way to fix that?
Just to be sure, you do have set max_while_loops high enough to handle your message rate?
On 07/24/2012 04:20 AM, Alex Hermann wrote:
On Tuesday 24 July 2012 04:46:43 Alex Balashov wrote:
So, perhaps my issue comes down to the while() loop breaking before the queue is truly empty. That is, unless those thousands of records are somehow showing up between the while() condition evaluation and my log statement, which seems quite unlikely. Any way to fix that?
Just to be sure, you do have set max_while_loops high enough to handle your message rate?
I have it set to 2000, and am not exceeding 80-90 CPS, which each generate ~4-5 events/sec. I would think that would be high enough, but I suppose it's worth testing that theory just in case. Good call.