[SR-Users] Questions on http_async_query() in event_route[xhttp:request] and async best practices for Kamailio

James Aylett jay.01.sub at gmail.com
Fri Jun 4 10:53:17 CEST 2021


Hi,

I am using Kamailio v5.2 as a WebRTC proxy for SIP clients. As part of
my setup, I call a REST API from event_route[xhttp:request], using
http_client_query() before ws_handle_handshake(), to authenticate the
user connecting via WebSockets and to retrieve some user details
(routing etc.).
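
A minimal sketch of the kind of setup described above might look like
the following. The API URL, the $var name and the reply codes are
illustrative assumptions, not taken from the actual config, and the
xhttp, websocket and http_client modules are assumed to be loaded.

```cfg
event_route[xhttp:request] {
    set_reply_close();
    set_reply_no_connect();

    if ($hdr(Upgrade) =~ "websocket" && $hdr(Connection) =~ "Upgrade" && $rm =~ "GET") {
        # Blocking call: this worker waits here until the API answers
        # or the http_client timeout expires. The URL is a placeholder.
        http_client_query("https://api.example.invalid/auth", "$var(api_body)");
        if ($rc != 200) {
            xhttp_reply("403", "Forbidden", "", "");
            exit;
        }

        # Only complete the WebSocket handshake once the API has accepted the user.
        if (ws_handle_handshake()) {
            exit;
        }
    }

    xhttp_reply("404", "Not Found", "", "");
}
```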

This worked fine until increased load on the system made API
responses slow, which had a knock-on effect on users who were already
connected. Specifically, we saw mass WebSocket disconnects for
existing connected users, which we believe were due to overly
aggressive proxy timeout settings on our reverse proxy in front of
Kamailio.

My understanding is that event_route[xhttp:request] runs in the shared
SIP TCP worker processes, so could slow API requests over the network
potentially block all the handlers? Could this also affect the
keepalive processes that the WebSocket module uses to check existing
WebSocket connections? We were using ping keepalives, again with
pretty aggressive timers...

To resolve this issue, I am looking at whether I can make the API
requests asynchronously so that TCP workers are not blocked. My first
thought was to use http_async_query() in event_route[xhttp:request]
and call ws_handle_handshake() in the HTTP_REPLY route once the API
call had completed, but I get "ERROR: websocket [ws_handshake.c:143]:
ws_handle_handshake(): retrieving connection". I presume this approach
is a dead end. Looking at newer Kamailio versions (5.5), there doesn't
seem to be any way to do this either. Is that correct?
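
A minimal sketch of the attempt described above, for reference. The
URL is a placeholder, and the http_async_client module (with its own
worker processes) is assumed to be loaded. This is only an
illustration of the call flow, not the actual config.

```cfg
event_route[xhttp:request] {
    if ($hdr(Upgrade) =~ "websocket" && $hdr(Connection) =~ "Upgrade" && $rm =~ "GET") {
        # Non-blocking: hand the query to the http_async_client workers and
        # continue in route[HTTP_REPLY] once the API has answered.
        http_async_query("https://api.example.invalid/auth", "HTTP_REPLY");
        exit;
    }

    xhttp_reply("404", "Not Found", "", "");
}

route[HTTP_REPLY] {
    if ($http_ok && $http_rs == 200) {
        # This is the call that logs
        # "ws_handle_handshake(): retrieving connection".
        ws_handle_handshake();
    } else {
        xlog("L_ERR", "API query failed: $http_err\n");
    }
}
```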

Instead of using http_async_query() in event_route[xhttp:request], I
presume I could set a short timeout for http_client_query(), but my
concern is that new WebSocket connections could still block existing
ones in this case...
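
As a sketch of the short-timeout option, the http_client module has a
connection_timeout parameter (in seconds). The value below is an
arbitrary placeholder, not a recommendation.

```cfg
loadmodule "http_client.so"

# Upper bound, in seconds, on how long http_client_query() waits for the
# API server. The value 2 is only a placeholder.
modparam("http_client", "connection_timeout", 2)
```

Note that this only bounds the wait: each new handshake can still hold
a TCP worker for up to that many seconds.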

My other thought was to move the API call out of
event_route[xhttp:request] and into my route handler for REGISTER
requests, so that new REGISTER requests are offloaded to async
workers and the API call made there does not block existing
connections. Does this seem like a reasonable approach to the problem?
Given that my current script does not use any async workers, how would
one go about tuning the ratio of plain SIP TCP workers to async
workers? Are there any guides for this?
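
A minimal sketch of the REGISTER idea, assuming the tm, sl, textops,
usrloc, registrar, async and http_client modules are loaded. The
worker counts, the route name REGISTER_API and the API URL are
hypothetical placeholders.

```cfg
# Core parameters; the counts are placeholders, not tuning advice.
tcp_children=8     # TCP workers, which also serve WebSocket traffic
async_workers=4    # workers that execute routes queued by async_task_route()

request_route {
    if (is_method("REGISTER")) {
        # Suspend the transaction and continue in an async worker, so the
        # TCP worker is released straight away.
        if (!async_task_route("REGISTER_API")) {
            send_reply("500", "Server Error");
        }
        exit;
    }
}

route[REGISTER_API] {
    # Still a blocking call, but it now blocks an async worker only.
    http_client_query("https://api.example.invalid/users/$fU", "$var(api_body)");
    if ($rc != 200) {
        send_reply("403", "Forbidden");
        exit;
    }

    save("location");
    exit;
}
```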

In general, for any potentially blocking "APIs" or calls over the
network, is it best practice to offload them to async workers? For
example, we run RTPEngine on separate hosts from Kamailio, so should I
be wrapping rtpengine_offer/answer calls in async workers in case of a
slow network?
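
A sketch of what wrapping the offer side could look like, assuming the
tm, sl, textops, async and rtpengine modules are loaded and
async_workers is set. The route name RTP_OFFER is hypothetical, the
rtpengine flags are omitted, and the answer side (replies) is not
covered here.

```cfg
request_route {
    if (is_method("INVITE")) {
        # Move the rtpengine round trip off the receiving worker.
        if (!async_task_route("RTP_OFFER")) {
            send_reply("500", "Server Error");
        }
        exit;
    }
}

route[RTP_OFFER] {
    # Runs in an async worker; a slow RTPEngine host delays only this request.
    rtpengine_offer();
    if (!t_relay()) {
        sl_reply_error();
    }
    exit;
}
```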

Apologies for the long text, but I would really appreciate any help to
understand these problems.

Thank you


