### Description
After running some load tests against the `websocket` module, I noticed that the `websocket:closed` event is sometimes not triggered.
The problem is most noticeable under high, simultaneous load, for example 1000 simultaneous registrations.
In my tests the number of missed events is small, on the order of 1 to 10 per 1000 connections.
`ws.dump` shows no active connections, and `netstat -ano | grep 8061` confirms that all of them were effectively cleaned up. However, the event is not triggered for some connections. I can see this in the logs, and also in the content of a hash table whose keys I delete inside the `websocket:closed` event route: some keys are left behind.
### Troubleshooting
The troubleshooting was simple: I ran the same test n times and each time checked whether the event was triggered for every connection, the number of `ws` connections listed by `Kamailio`, and the open connections.
#### Reproduction
To reproduce, we can use any mechanism that emulates the use of the `Websocket` module.
For example, I used the `sipexer` tool to create a `wss` connection with `Kamailio`, forcing simultaneous registrations.
Example command: `sipexer -cb -mt register -ex 60 -au <user> -ap <pass> -fuser example -fd mydomain -wso https://10.0.0.12:10000 -ruri sip:kamailio -su wss://kamailio:8061`
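The simultaneous load can be approximated by starting many copies of the command above in the background and waiting for all of them. This is only a minimal sketch, assuming a POSIX shell; `run_parallel` is a hypothetical helper name (not part of `sipexer` or Kamailio), and the count of 1000 simply mirrors the test size mentioned in the description.

```shell
#!/bin/sh
# run_parallel N CMD [ARGS...]
# Hypothetical helper: starts N copies of CMD in the background,
# then blocks until every copy has exited.
run_parallel() {
    n=$1
    shift
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" &
        i=$((i + 1))
    done
    wait
}

# Usage (same sipexer arguments as in the example command above):
# run_parallel 1000 sipexer -cb -mt register -ex 60 -au <user> -ap <pass> \
#     -fuser example -fd mydomain -wso https://10.0.0.12:10000 \
#     -ruri sip:kamailio -su wss://kamailio:8061
```

Starting all processes before calling `wait` is what makes the connections overlap; running them one after another would not exercise the high-concurrency case.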
Kamailio TLS listener:
`listen=tls:10.0.0.12:8061 advertise PUBLIC_IP`
Websocket module:
```
loadmodule "websocket.so"
tcp_accept_no_cl=yes
```
```
event_route[websocket:closed] {
    xlog("L_NOTICE", "WebSocket connection closed $proto:$si:$sp\n");
    # <here I also delete a htable key>
}
```
#### Debugging Data
```
[root@ ~]$ prcmd ws.dump
{
    connections: {
    }
    info: {
        wscounter: 0
        truncated: no
    }
}
[root@ ~]$ netstat -ano | grep 8061
tcp        0      0 10.0.0.12:8061      0.0.0.0:*       LISTEN      off (0.00/0/0)
[root ~]$ prcmd htable.dump wsauth | grep "name" | wc -l
3
[root ~]$ cat /var/log/kamailio.log | grep "WebSocket connection closed" | wc -l
997
[root ~]$ cat /var/log/proxy-registrar/proxy-registrar.log | grep "ERR" | grep "WAR" | wc -l
0
```
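The counts above can be reduced to a single number. Assuming 1000 connections were opened in that run, 997 logged closures means 3 events are missing, which matches the 3 entries still present in `wsauth`. A small sketch of that check, using the log path and message from this report; `missing_closed_events` is a hypothetical helper name.

```shell
#!/bin/sh
# missing_closed_events EXPECTED LOGFILE
# Hypothetical helper: prints how many of the EXPECTED connections have no
# "WebSocket connection closed" line in LOGFILE.
missing_closed_events() {
    expected=$1
    logfile=$2
    # grep -c prints the number of matching lines (0 if none match).
    seen=$(grep -c "WebSocket connection closed" "$logfile")
    echo $((expected - seen))
}

# Usage: missing_closed_events 1000 /var/log/kamailio.log
```

The log should be rotated or truncated before each run, otherwise closures from earlier runs are counted as well.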
#### Log Messages
Nothing relevant.
#### SIP Traffic
Nothing relevant here, the SIP Flow is OK.
### Additional Information
```
version: kamailio 5.6.4 (x86_64/linux) a004cf-dirty
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLOCKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: a004cf -dirty
compiled on 23:17:27 Jun 7 2024 with gcc 4.8.5
```
* **Operating System**:
```
CentOS Linux release 7.9.2009 (Core)
```
I can confirm this behavior in 5.8.2
sergey-safarov left a comment (kamailio/kamailio#3950)
Please try reverting commit abe60832de46796a1395a75a67753c1a12a1ec0a. I have the same issue, and in my local build I reverted this commit to work around it.
henningw left a comment (kamailio/kamailio#3950)
The mentioned commit was about addressing a memory leak, so reverting it might not be the best cure for this problem. The problem was discussed in #3236.
miconda left a comment (kamailio/kamailio#3950)
I pushed a commit that might fix the issue with the event route not being executed. I doubt that all these cases were covered by event route execution even before the earlier commit that fixed the leak, since the structures were kept forever, but it might be good overall now. Tests have to be run with the latest git master branch or the backported commit referenced above.
Closed #3950 as completed.
miconda left a comment (kamailio/kamailio#3950)
The commit was backported, so I am closing this issue; it is rather old, and the code base is different from what it was at the time of the report. If the issue is still there, open a new one.