The dialog module will expire dialogs that don't actually belong to the particular proxy.
Scenario: two proxies on 5.2.4 with a load balancer in front. The call was routed over proxy1; the dialog timeout on proxy2 was shorter (to make the problem easier to reproduce). The results:
* Dialog synchronized with DMQ: proxy2 would time out the dialog
* Dialog entry synchronized to proxy2 DB and proxy2 restarted: proxy2 would time out the dialog
* Dialog entry synchronized to proxy2 DB and proxy2 not restarted: proxy2 would _not_ time out the dialog (I’ve tried both db_mode 1 and 2)
So proxy2 loads the dialog into memory and later expires it. As a result, timeout routes and send_bye on timeout would not work correctly.
One idea to solve this would be to check e.g. over the existing socket data if the dialog belongs to this proxy and skip loading it in this case. This could be also made configurable e.g. with a module parameter.
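The ownership check suggested above could look roughly like the following sketch. It is a Python model of the idea, not Kamailio code: `DialogRecord`, `load_dialogs`, and the `skip_remote` switch are hypothetical names used only for illustration, and the socket strings are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DialogRecord:
    """Hypothetical stand-in for a replicated dialog entry."""
    callid: str
    socket: str  # listen socket the dialog was created on


def load_dialogs(records, local_sockets, skip_remote=False):
    """Return the records this proxy should load into memory.

    With skip_remote disabled (the assumed default) everything is loaded,
    which matches the behaviour described in this issue. With it enabled,
    records bound to a socket this proxy does not listen on are skipped,
    so this proxy never runs a timeout timer for them.
    """
    if not skip_remote:
        return list(records)
    local = set(local_sockets)
    return [r for r in records if r.socket in local]


records = [
    DialogRecord("call-1", "udp:10.0.0.1:5060"),  # created on proxy1
    DialogRecord("call-2", "udp:10.0.0.2:5060"),  # created on proxy2
]
proxy2_sockets = ["udp:10.0.0.2:5060"]

loaded_default = load_dialogs(records, proxy2_sockets)
loaded_filtered = load_dialogs(records, proxy2_sockets, skip_remote=True)
print([r.callid for r in loaded_default])   # both dialogs
print([r.callid for r in loaded_filtered])  # only the dialog created on proxy2
```

Keeping the switch off by default would leave the current redundancy behaviour untouched for existing setups.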
Any comments? Better ideas how to solve it?
When records are replicated, the nodes are expected to run the same kind of config, so both proxies should have the same dialog lifetime/expire value; otherwise they do not behave like a single node. Replication is mainly for redundancy, not scalability.
The dialog module does not work with the database only; it keeps the records in memory and works with those. There are functions to load dialog records from the database if they are not found in memory when processing a specific SIP message.
Now, given the purpose of redundancy/high-availability, the proxy 2 should expire the dialog, because it doesn't know if the proxy 1 is still alive.
If the replication is for another purpose, like active calls limit, then it might be good to set a flag for it to say do not send BYEs on expiration or other operations that can be a conflict.
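A minimal sketch of such a flag, written as a Python model rather than Kamailio code. `Dialog`, `no_bye_on_expire`, and `on_timeout` are hypothetical names; nothing here claims the dialog module exposes them.

```python
from dataclasses import dataclass


@dataclass
class Dialog:
    callid: str
    no_bye_on_expire: bool = False  # hypothetical flag, set on replicated copies


def on_timeout(dlg, send_bye=True):
    """Return the actions a node takes when the dialog timer fires."""
    actions = ["drop_from_memory"]
    # Only a node that is allowed to tear the call down sends the BYE.
    if send_bye and not dlg.no_bye_on_expire:
        actions.append("send_bye")
    return actions


owned = Dialog("call-1")
replica = Dialog("call-1", no_bye_on_expire=True)

print(on_timeout(owned))    # drops the dialog and sends BYE
print(on_timeout(replica))  # drops the dialog only
```

With a flag like this, a node that holds the dialog only for call counting would still clean up its own memory without interfering with the call.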
There are also options to replicate only profiles, not all dialogs, so by that one can have distributed calls limit.
Thanks for the reply Daniel. You are right about the redundancy/high-availability remarks.
But still I can see scenarios where the current behaviour can cause problems. Imagine a small dialog timeout combined with SIP Session Timer keep-alive re-INVITEs. Proxy1 would not expire the dialog, but proxy2 (which did not get the re-INVITE) would expire it. Of course one could use DMQ or synchronize the dialog data by other means.
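The mismatch can be illustrated with a small Python model. It assumes that an in-dialog request seen by a node resets that node's dialog timer to the full timeout; the 60 s timeout and 45 s refresh interval are made-up values, and `expiry_time` is a hypothetical helper, not a Kamailio function.

```python
DEFAULT_TIMEOUT = 60  # seconds, same dialog timeout on both proxies (assumed)


def expiry_time(refreshes, timeout=DEFAULT_TIMEOUT):
    """Time at which a node expires the dialog, given the refreshes it saw.

    The dialog is confirmed at t=0. A refresh that arrives before the
    current deadline moves the deadline to refresh time + timeout; a
    refresh that arrives after the deadline is too late and is ignored.
    """
    deadline = timeout
    for t in sorted(refreshes):
        if t >= deadline:
            break
        deadline = t + timeout
    return deadline


reinvites = [45, 90, 135]                 # keep-alive re-INVITEs, seen by proxy1 only
proxy1_deadline = expiry_time(reinvites)  # timer is reset on every re-INVITE
proxy2_deadline = expiry_time([])         # proxy2 never sees a re-INVITE

print(proxy1_deadline)  # 195
print(proxy2_deadline)  # 60
```

In this model proxy2 expires the dialog at t=60 s while proxy1 still considers the call alive until t=195 s.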
What do you think about something like the skip_remote_socket option that already exists in usrloc: https://kamailio.org/docs/modules/devel/modules/usrloc.html#usrloc.p.skip_re...
Maybe this is irrelevant, but I'd like to add another scenario (not sure if you have contemplated this one @henningw): replicating dialogs exclusively via DMQ (with no database at all).
* Proxy1 creates a dialog (let's call it **dialogA**) and replicates it via DMQ to Proxy2.
* Proxy1 is restarted/crashes/something.
* Proxy1 comes back up before **dialogA** has expired.
* Proxy1 receives all dialogs from Proxy2 via DMQ (including **dialogA**).
What would happen when expiration for the **dialogA** is due?
1. Would Proxy1 send the BYE and then "tell" via DMQ to Proxy2 to remove it from its memory?
2. What if Proxy1 never came back up in time, would Proxy2 know that **it** should take care of sending out a BYE for that expired dialog?
3. Would both Proxy1 and Proxy2 send a BYE and then tell the other proxy via DMQ to remove it? (this would be a race-condition I believe, and one would end up logging a *non-existent dialog* on syslog?)
I know the subject of this issue clearly talks about scenarios **with** a database, so if you believe this database-less scenario shouldn't be discussed here let me know and I can create a new issue.
Hello Joel,
these two topics (dialog expiration and send BYE) are handled individually.
Two proxies with this cfg:
    modparam("dialog", "db_mode", 0)
    modparam("dialog", "db_update_period", 10)
    modparam("dialog", "enable_dmq", 1)
    modparam("dialog", "default_timeout", 60);
    modparam("dialog", "send_bye", 1)
* proxy1 restarted after call established - it will get the dialog data from proxy2, with a different timeout value set internally:
    root@proxy-1:/etc/kamailio# kamcmd dlg.list |egrep "_ts|timeout"
    start_ts: 1570005344
    init_ts: 1570005344
    end_ts: 0
    timeout: 1570005404
    root@proxy-1:/etc/kamailio# /etc/init.d/kamailio restart
    [ ok ] Restarting kamailio (via systemctl): kamailio.service.
    root@proxy-1:/etc/kamailio# kamcmd dlg.list |egrep "_ts|timeout"
    start_ts: 1570005344
    init_ts: 1570005344
    end_ts: 0
    timeout: 1570005415
* but call will be expired on time (from proxy2):
    10:35:44.137 pjsua_app.c .....Call 1 state changed to CONFIRMED
    root@proxy-1:/etc/kamailio# date
    Wed Oct 2 10:36:46 CEST 2019
    root@proxy-1:/etc/kamailio# kamcmd dlg.list |egrep "_ts|timeout"
    root@proxy-1:/etc/kamailio#
* if proxy2 is stopped _after_ proxy1 was restarted, call will be expired later (from proxy1):
    10:43:09.950 pjsua_app.c .....Call 2 state changed to CONFIRMED
    root@proxy-1:/etc/kamailio# date; kamcmd dlg.list |egrep "_ts|timeout"
    Wed Oct 2 10:44:12 CEST 2019
    start_ts: 1570005789
    init_ts: 1570005789
    end_ts: 0
    timeout: 1570005860
    root@proxy-1:/etc/kamailio# date; kamcmd dlg.list |egrep "_ts|timeout"
    Wed Oct 2 10:44:30 CEST 2019
* proxy2 will not send a timeout for a dialog that does not belong to it
* proxy1 (after a restart) will not send a BYE at timeout, this might be related to the socket:
    root@proxy-1:/etc/kamailio# date; kamcmd dlg.list | grep socket
    Wed Oct 2 10:54:49 CEST 2019
    socket: udp:116.203.XXX.XX:5060
    socket: udp:116.203.XXX.XX:5060
    root@proxy-1:/etc/kamailio# /etc/init.d/kamailio restart
    [ ok ] Restarting kamailio (via systemctl): kamailio.service.
    root@proxy-1:/etc/kamailio# date; kamcmd dlg.list | grep socket
    Wed Oct 2 10:54:56 CEST 2019
    socket:
    socket:
There should be a new issue created for the last topic.
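The timestamps in the first `dlg.list` outputs above can be checked with simple arithmetic. The following Python sketch only reuses the values from that output; the fixed +02:00 offset is an assumption taken from the CEST `date` lines, and `local` is a helper introduced here for illustration.

```python
from datetime import datetime, timedelta, timezone

CEST = timezone(timedelta(hours=2))  # offset of the "date" output in the logs

# Values copied from the dlg.list output for call 1.
start_ts = 1570005344
timeout_before_restart = 1570005404
timeout_after_restart = 1570005415


def local(ts):
    """Format an epoch timestamp as HH:MM:SS in CEST."""
    return datetime.fromtimestamp(ts, CEST).strftime("%H:%M:%S")


lifetime_before = timeout_before_restart - start_ts  # 60, the default_timeout
lifetime_after = timeout_after_restart - start_ts    # 71
shift = timeout_after_restart - timeout_before_restart  # 11

print(local(start_ts), lifetime_before, lifetime_after, shift)
print(local(timeout_before_restart), local(timeout_after_restart))
```

The original deadline is 10:36:44, which fits the dialog being gone at 10:36:46, while the value proxy1 holds after the restart is 11 seconds later (10:36:55).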
This issue is stale because it has been open 6 weeks with no activity. Remove stale label or comment or this will be closed in 2 weeks.
We will investigate adding a skip_remote_socket option for this scenario; the default behavior will of course stay the same.
> But still I can see scenarios where the current behaviour can cause problems. Imagine a small dialog timeout combined with SIP Session Timer keep-alive re-INVITEs. The proxy 1 would not expire the dialog, but proxy2 (which did not get the re-INVITE) would expire it.
I also came across that issue, but the other way round. I started with a short session timer, intending to extend it on every re-INVITE according to the timer values in the SIP message and reply. This turned out not to work: the peer nodes do not get an update on re-INVITEs, so they fire the timeout, expire the dialog from memory, and sync state 5 (delete) back to the node handling the dialog. The next in-dialog SIP message then hits a 481 dialog not found.
Closed #2080 as not planned.