Hey Øyvind,
On 17.01.2012 at 13:41, Øyvind Kolbu wrote:
We use the dialog module to keep track of concurrent calls, and then set
a threshold for simultaneous calls. After we upgraded from 1.5 to 3.2, we've
been having a lot of dialogs which are stuck in both the database and
memory.
kamailio -V
version: kamailio 3.2.1 (i386/linux) 918035-dirty
flags: STATS: Off, USE_IPV6, USE_TCP, USE_TLS, TLS_HOOKS, USE_RAW_SOCKS,
DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC,
DBG_QM_MALLOC, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE,
USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16,
MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 4MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: 918035 -dirty
compiled on 10:00:10 Dec 20 2011 with gcc 4.1.2
Our configuration:
modparam("dialog", "dlg_flag", 3)
modparam("dialog", "default_timeout", 21600)
modparam("dialog", "dlg_match_mode", 1)
modparam("dialog", "db_url", VOIP_DATA4)
modparam("dialog", "db_mode", 1)
#!ifdef VOIP1
modparam("dialog", "table_name", "dialog1")
modparam("dialog", "vars_table_name", "dialog_vars1")
#!endif
#!ifdef VOIP2
modparam("dialog", "table_name", "dialog2")
modparam("dialog", "vars_table_name", "dialog_vars2")
#!endif
modparam("dialog", "profiles_with_value", "busy")
Two Kamailio instances, voip1 and voip2, share the same config, but define VOIP1
and VOIP2, respectively.
Then in the main route:
if !allow_trusted() {
    route(AUTH);
}
if (is_method("INVITE")) {
    setflag(3);
    dlg_manage();
    if ($avp(s:f_uid) != $null) {
        set_dlg_profile("busy","$avp(s:f_uid)");
        get_profile_size("busy", "$avp(s:f_uid)", "$var(dlg_busy)");
        xlog("L_INFO", "BUSY: f_uid: $avp(s:f_uid), dlg_busy: $var(dlg_busy)\n");
    }
}
$avp(s:f_uid) is set in route(AUTH), so it will catch calls originating from one of
our users.
Then later in the callforward route:
# Testing dialog
$var(dlg_busy) = 0;
$var(dlg_busy1) = 0;
$var(dlg_busy2) = 0;
#!ifdef VOIP1
get_profile_size("busy", "$rU", "$var(dlg_busy1)");
sql_query("data4","select id from dialog2 where caller_contact like 'sip:$rU@%' or callee_contact like 'sip:$rU@%'", "dialog");
$var(dlg_busy2) = $dbr(dialog=>rows);
#!endif
#!ifdef VOIP2
get_profile_size("busy", "$rU", "$var(dlg_busy2)");
sql_query("data4","select id from dialog1 where caller_contact like 'sip:$rU@%' or callee_contact like 'sip:$rU@%'", "dialog");
$var(dlg_busy1) = $dbr(dialog=>rows);
#!endif
sql_result_free("dialog");
$var(dlg_busy) = $var(dlg_busy1) + $var(dlg_busy2);
We then check whether $var(dlg_busy) is above the user's threshold.
Finally when routing to a local user we mark it as busy:
set_dlg_profile("busy","$rU");
As of writing I have 5 entries in the dialog1-table, all valid, but:
# kamctl fifo dlg_list | grep -c "^dialog"
18
# kamctl fifo dlg_list | grep state | sort | uniq -c
13 state:: 1
5 state:: 4
On the other hand, in the busy profile, which is the one we use:
# kamctl fifo profile_list_dlgs busy | grep -c "^dialog"
11
# kamctl fifo profile_list_dlgs busy | grep state | sort | uniq -c
6 state:: 1
5 state:: 4
It seems that this happens when, for some reason, Kamailio is trying to parse duplicate
INVITEs.
From the logs:
# grep "^Jan 17" /var/log/messages | grep BUSY | grep -v "dlg_busy: 1" | cut -d' ' -f8- | sort | uniq -c
1 BUSY: f_uid: 2367449, dlg_busy: 2
3 BUSY: f_uid: 2367453, dlg_busy: 2
1 BUSY: f_uid: 2367453, dlg_busy: 3
1 BUSY: f_uid: 2367453, dlg_busy: 4
1 BUSY: f_uid: 2574912, dlg_busy: 2
1 BUSY: f_uid: 2582270, dlg_busy: 2
1 BUSY: f_uid: ahmedma, dlg_busy: 2
1 BUSY: f_uid: josephim, dlg_busy: 2
1 BUSY: f_uid: perla, dlg_busy: 2
1 BUSY: f_uid: rosalinm, dlg_busy: 2
1 BUSY: f_uid: suresha, dlg_busy: 2
1 BUSY: f_uid: sveinb, dlg_busy: 2
I have full debugging logs available, if needed.
Do you possibly start to track dialogs before they are authenticated and reply to
unauthenticated INVITEs with 407 statelessly? IIRC, that causes dialogs to dangle in state
1 ("unconfirmed") because the module requires statefulness to operate and
conclude dialog tracking properly.
If that's the case, you have two options: either return 407s statefully, or defer
dialog tracking until dialogs are authenticated. The former may pose a security risk;
the latter may (depending on your situation) not cover all of your dialog tracking needs.
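For the second option, a rough and untested sketch could look like the one below. Note that I am guessing at what your route(AUTH) does: the proxy_authorize()/proxy_challenge() calls, the "$fd" realm and the "subscriber" table are assumptions on my part, not taken from your config. The point is only that the challenge path ends with exit, so an unauthenticated INVITE never reaches dlg_manage().

```
# Hypothetical AUTH route; the real one is not shown in this thread.
route[AUTH] {
    if (!proxy_authorize("$fd", "subscriber")) {
        # Challenge with a 407 and stop processing right here, so the
        # dialog code in the main route is never run for this request.
        proxy_challenge("$fd", "0");
        exit;
    }
    # ... set $avp(s:f_uid) for the authenticated user as before ...
}

# Main route, same order as in the quoted config:
if !allow_trusted() {
    route(AUTH);
}
# Only trusted or authenticated requests get this far.
if (is_method("INVITE")) {
    setflag(3);
    dlg_manage();
}
```

If your route(AUTH) already exits after sending the challenge, then this is not the cause and the debug logs would be the next thing to look at.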
Cheers,
--Timo