We had Kamailio 5.1.4 with websocket module. Unfortunately, our clients don't support websocket keepalive mechanism at all, so I used TCP keepalive instead with the following parameters:
tcp_keepalive=yes
tcp_keepcnt=6
tcp_keepidle=60
tcp_keepintvl=10
and set up KEEPALIVE_MECHANISM_NONE:
modparam("websocket", "keepalive_mechanism", 0)
During load testing and debugging, when 8k clients sent registrations, it was found out that shared memory was not freed after closing connections (ws_connection_list_t *wsconn_used_list variable in ws_conn.c ).
I've decided to add new keepalive mechanism that periodically checks TCP connection related to websocket:
enum
{
KEEPALIVE_MECHANISM_NONE = 0,
KEEPALIVE_MECHANISM_PING = 1,
KEEPALIVE_MECHANISM_PONG = 2,
KEEPALIVE_MECHANISM_TCP_CONN_CHECK = 3
};
and added the line to config:
# Enable custom tcp-connection-health based keepalive mechanism (3)
# KEEPALIVE_MECHANISM_NONE = 0,
# KEEPALIVE_MECHANISM_PING = 1,
# KEEPALIVE_MECHANISM_PONG = 2
# KEEPALIVE_MECHANISM_TCP_CONN_CHECK = 3
modparam("websocket", "keepalive_mechanism", 3)
Also, I've implemented the mechanism in ws_keepalive function:
void ws_keepalive(unsigned int ticks, void *param)
{
int check_time =
(int)time(NULL) - cfg_get(websocket, ws_cfg, keepalive_timeout);
ws_connection_t **list = NULL, **list_head = NULL;
ws_connection_t *wsc = NULL;
/* get an array of pointer to all ws connection */
list_head = wsconn_get_list();
if(!list_head)
return;
list = list_head;
wsc = *list_head;
while(wsc && wsc->last_used < check_time) {
if (ws_keepalive_mechanism == KEEPALIVE_MECHANISM_TCP_CONN_CHECK) {
struct tcp_connection *con = tcpconn_get(wsc->id, 0, 0, 0, 0);
if(!con) {
LM_INFO("tcp connection has been lost\n");
wsc->state = WS_S_CLOSING;
}
}
if(wsc->state == WS_S_CLOSING || wsc->awaiting_pong) {
LM_INFO("forcibly closing connection\n");
wsconn_close_now(wsc);
} else {
int opcode = (ws_keepalive_mechanism == KEEPALIVE_MECHANISM_PING)
? OPCODE_PING
: OPCODE_PONG;
ping_pong(wsc, opcode);
}
wsc = *(++list);
}
wsconn_put_list(list_head);
}
and changed memory allocation method in wsconn_get_list and wsconn_put_list methods from pkg to shm, because, as it turned out during load testing, using pkg_malloc (the C malloc) in this functions may cousing fails under serious loads.
These modifications solved the problem. But about a week ago we've started switching to ver. 5.2.1 and found a lot of changes in the websocket module. So, I've added my changes in this commit korizza@b3e03d0 . Please take a look.
Adding ws_conn_put_id in this commit a975bca#diff-59c50f19ab1ccf4afe10617cdc346bc2 did not solve problem with ref counter increasing.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or mute the thread.