### Description
When a TLS connection is closed and `handle_lost_tcp=1` is set, the entry is deleted from usrloc. This deletion is not synced via DMQ, even though `usrloc_delete=1`.
There is also an issue on the same server: if a re-registration arrives right after the connection is closed, before the timeout was supposed to expire, then `$var(sv_res) = 2` even though the usrloc table is empty. So it seems that the way `handle_lost_tcp` deletes the entry does not fully remove it in the registrar.
usrloc `db_mode` is 0.
### Troubleshooting
#### Reproduction
Set up two servers with DMQ and close the TCP connection.
### Additional Information
* **Kamailio Version** - output of `kamailio -v`
```
version: kamailio 5.6.1 (x86_64/linux) d8f98b
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLOCKLIST, HAVE_RESOLV_RES, TLS_PTHREAD_MUTEX_SHARED
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: d8f98b
compiled on 19:05:04 Aug 16 2022 with gcc 8.3.0
```
* **Operating System**:
```
Debian 10
```
Thanks for the report, will have a look.
We have managed to reproduce it. It's actually quite a tricky bug: under certain conditions, DMQ synchronises the data back to itself.
Could you please verify the output of `kamcmd dmq.list_nodes`? It should list only the actual nodes. In our broken scenario it listed the nodes twice. We will now investigate why this happens.
@henningw I don't think we are talking about the same issue here. I don't see any data being synchronized anywhere. I checked `kamcmd dmq.list_nodes`, but I don't see the issue you are describing.
Our issue is that when the `handle_lost_tcp` logic is run, it does not delete the entry; instead it labels it as "deleted" and lets a timer remove it after 10 seconds or so. Neither the labeling nor the later removal is synced via DMQ. It also causes issues with `save("location")`: when a re-registration arrives during this window of roughly 10 seconds, while the entry is labeled "deleted" but not yet removed, the registrar treats it as an update instead of a new registration.
I propose changing the `handle_lost_tcp` logic so that it uses the same logic as `unregister("location")`, since the unregister command correctly removes the entry from the registrar table and correctly syncs via DMQ.
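To make the window described in this comment concrete, here is a toy Python model of the behaviour. It is not Kamailio code: `ToyLocation`, its methods, and the in-memory dict are hypothetical, and the value 1 for "inserted" is an assumption chosen to sit next to the 2 ("updated") reported above.

```python
SAVE_INSERTED = 1  # assumption: result reported for a brand-new contact
SAVE_UPDATED = 2   # matches the $var(sv_res) = 2 seen in the report


class ToyLocation:
    """Hypothetical in-memory stand-in for a usrloc table (not Kamailio code)."""

    def __init__(self):
        self.expires_at = {}  # AoR -> absolute expiry time in seconds
        self.replicated = []  # actions that would be sent to DMQ peers

    def save(self, aor, now, expires=3600):
        # An expired but not yet purged record still counts as existing.
        existed = aor in self.expires_at
        self.expires_at[aor] = now + expires
        self.replicated.append(("update" if existed else "insert", aor))
        return SAVE_UPDATED if existed else SAVE_INSERTED

    def on_tcp_lost(self, aor, now):
        # Only mark the contact as expired; nothing is replicated.
        if aor in self.expires_at:
            self.expires_at[aor] = now

    def run_timer(self, now):
        # Purge expired contacts; again nothing is replicated.
        for aor in [a for a, t in self.expires_at.items() if t <= now]:
            del self.expires_at[aor]

    def lookup(self, now):
        # Expired contacts are hidden from lookups even before the purge.
        return [a for a, t in self.expires_at.items() if t > now]


loc = ToyLocation()
first = loc.save("sip:alice@example.com", now=0)
loc.on_tcp_lost("sip:alice@example.com", now=5)
visible = loc.lookup(now=6)                        # table looks empty
second = loc.save("sip:alice@example.com", now=6)  # timer has not run yet
print(first, visible, second)  # prints: 1 [] 2
```

In this model the second `save` reports an update although the lookup shows nothing, and `loc.replicated` never contains a delete action, which is the pair of symptoms from the report.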
You are right, there are probably three issues here:
1. If DMQ is somehow synchronizing with itself, the deletion on TCP connection loss will not work at all.
2. The deletion on TCP connection loss is not synchronized via DMQ.
3. If an entry was marked expired because of a lost TCP connection and a new REGISTER is received before the timer has run to delete it, the REGISTER updates the existing registration.
Regarding 1: this is another topic, which we are looking into right now. About 2: yes, the logic could be changed to actually remove/unregister the entry immediately. This would also fix issue 3. We will look into that as well.
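A rough sketch of what removing the entry immediately could look like, again as a toy Python model rather than Kamailio code. `ToyNode`, its methods, and the `replicate` flag are hypothetical names; the return values 1 (inserted) and 2 (updated) are assumed as in the report.

```python
class ToyNode:
    """Hypothetical registrar node with naive peer replication (not Kamailio code)."""

    def __init__(self, name):
        self.name = name
        self.contacts = {}  # AoR -> contact URI
        self.peers = []

    def save(self, aor, contact, replicate=True):
        existed = aor in self.contacts
        self.contacts[aor] = contact
        if replicate:
            for peer in self.peers:
                # Received actions are applied without being sent on again.
                peer.save(aor, contact, replicate=False)
        return 2 if existed else 1

    def delete(self, aor, replicate=True):
        self.contacts.pop(aor, None)
        if replicate:
            for peer in self.peers:
                peer.delete(aor, replicate=False)

    def on_tcp_lost(self, aor):
        # Assumed fix: take the same path as an explicit unregister.
        self.delete(aor)


a, b = ToyNode("a"), ToyNode("b")
a.peers, b.peers = [b], [a]
a.save("sip:alice@example.com", "sip:alice@192.0.2.10;transport=tls")
a.on_tcp_lost("sip:alice@example.com")
after_loss = (dict(a.contacts), dict(b.contacts))
result = a.save("sip:alice@example.com", "sip:alice@192.0.2.10;transport=tls")
print(after_loss, result)  # prints: ({}, {}) 1
```

Because the record is gone on both nodes as soon as the connection is lost, the next REGISTER is treated as a new registration. The `replicate=False` flag on received actions keeps a node from echoing them back, which is the kind of loop issue 1 is about.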