In the scope of what you're looking at, 200 ms is an EXTREMELY long time if you're
looking to handle 750 requests per second. Looking at this with simple math and
discounting everything else that can add small amounts of latency, let's assume that ONLY
200 ms of latency occurs. This means that any single process can only handle a maximum of
5 requests per second (this even discounts handling replies). If you have 32 child
processes, then the number of requests/second (this assumes that requests are consistent
and evenly spaced) becomes:
5 * 32 = 160
Now, this doesn't mean 160 requests per second is an absolute point at which problems
occur, but it should be the point at which your UDP buffer comes into play. Let's say your
Kamailio is idle and you receive 32 requests at the same time (or as close as is physically
possible). You'd have all processes tied up for 200 ms. 100 ms after receiving those 32
requests, you receive a single request. That request should be buffered, but won't
get handled for 100 ms. Basically the 5*32 calculation is not "the point at which
Kamailio starts to fail" but the point at which it "starts to fall behind".
Send 161 requests per second and eventually it will fall over, because the backlog
outgrows the UDP receive buffer and the kernel starts dropping packets. Send more than
that and it will fall over faster.
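To make the "falls behind" point concrete, here is a minimal Python sketch that models the worker pool as a simple queue. It assumes every request ties up a worker for exactly 200 ms and that requests are served in arrival order; the function name `simulate` and its parameters are made up for illustration and are not part of Kamailio.

```python
import heapq


def simulate(arrivals_ms, workers=32, service_ms=200):
    """Return how long each request waits for a free worker, in ms.

    arrivals_ms: arrival times in milliseconds.
    workers:     number of child processes (assumed value from this thread).
    service_ms:  time one request blocks a worker (assumed constant).
    """
    # Min-heap of the times at which each worker next becomes idle.
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    waits = []
    for t in sorted(arrivals_ms):
        # The request starts when it has arrived AND a worker is idle.
        start = max(t, heapq.heappop(free_at))
        waits.append(start - t)
        heapq.heappush(free_at, start + service_ms)
    return waits


# The burst described above: 32 requests at t=0, one more at t=100 ms.
burst = simulate([0] * 32 + [100])
print("wait of the 33rd request (ms):", burst[-1])

# One minute of evenly spaced traffic at 161 requests/second.
overload = simulate([i * 1000 / 161 for i in range(161 * 60)])
print("wait of the last request (ms):", overload[-1])
```

In this model the 33rd request of the burst waits 100 ms, evenly spaced traffic at 160 requests/second never waits, and at 161 requests/second the wait keeps growing for as long as the load lasts. A real system has jitter and a finite buffer, so treat this as an illustration of the trend rather than a prediction.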
Using http_async_client, you will still have whatever latency exists from the receipt of
the SIP request to when the http request is sent, but given your sample code this is
minimal - the biggest latency is probably the time to write to your cdr file. Let's
suppose the total time then is 4 ms (a relatively arbitrary number here - it's
probably less, but this will give us easy math). Once the request is sent to the http
server, the request is stored in memory, and the SIP worker process is freed up to handle
a new request. When an http response is received, it is handled in a new process.
Assuming all you're doing is copying a value from an http reply header and sending a
SIP reply header with that response, the time is very minimal (should be sub-millisecond,
so let's say 1 ms). Your request processing time then goes down to 5 ms per request. Now
each process can handle 200 requests per second. If you have 32 children, then you'd
expect your maximum total before blocking to be:
200 * 32 = 6400
Of course there are still some other things that can cause latency, etc., so don't take
this as a guarantee of handling 6,400 requests/second, but it should be viewed as how
using http_async_client rather than just http_client will address blocking caused by a
200 ms wait on your web service.
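The two calculations can be put side by side in a few lines of Python. This is only a sketch of the arithmetic in this message, under the same simplifications (constant per-request time, evenly spaced requests); the helper names `max_rate` and `workers_needed` are hypothetical.

```python
import math


def max_rate(workers, busy_ms):
    """Requests/second the pool sustains if each request holds a worker for busy_ms."""
    return workers * 1000 / busy_ms


def workers_needed(target_rate, busy_ms):
    """Smallest worker count that keeps up with target_rate requests/second."""
    return math.ceil(target_rate * busy_ms / 1000)


# Blocking http_client: the worker waits the full 200 ms.
print("blocking:", max_rate(32, 200), "req/s")
# Async: assumed 4 ms before the HTTP request plus 1 ms for the reply.
print("async:   ", max_rate(32, 5), "req/s")

# Workers required for the 750 requests/second target in each case.
print("workers needed, blocking:", workers_needed(750, 200))
print("workers needed, async:   ", workers_needed(750, 5))
```

Under these assumptions the blocking setup would need 150 workers to keep up with 750 requests/second, while the async one needs 4. That is a way of seeing the size of the gap, not a suggestion to run 150 children.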
Kaufman
Senior Voice Engineer
E: bkaufman(a)bcmone.com
SIP.US Client Support: 800.566.9810 | SIPTRUNK Client Support: 800.250.6510 |
Flowroute Client Support: 855.356.9768
________________________________
From: Sergio Charrua <sergio.charrua(a)voip.pt>
Sent: Thursday, December 19, 2024 10:21 AM
To: Ben Kaufman <bkaufman(a)bcmone.com>
Cc: Kamailio (SER) - Users Mailing List <sr-users(a)lists.kamailio.org>
Subject: Re: [SR-Users] Kamailio not receiving packets on high CPS
CAUTION: This email originated from outside the organization. Do not click links or open
attachments unless you recognize the sender and know the content is safe.
Thanks Ben!
Replies from the HTTP server are below 200 ms. Not sure if this is enough to be a
bottleneck...
Regarding async_http: since Kamailio is running in stateless mode, my understanding (at
least) is that the HTTP request needs to be synchronous. But I will give async_http a
try.
Atenciosamente / Kind Regards / Cordialement / Un saludo,
Sérgio Charrua
On Thu, Dec 19, 2024 at 3:49 PM Ben Kaufman <bkaufman@bcmone.com> wrote:
What are your http query times like? You're using http_client, which functions in a
blocking manner: the http request is sent, and that process does nothing else until a
response is received, and then it continues. If your request time is even moderately
high, try http_async_client.
________________________________
From: Sergio Charrua via sr-users <sr-users@lists.kamailio.org>
Sent: Thursday, December 19, 2024 9:08 AM
To: Kamailio (SER) - Users Mailing List <sr-users@lists.kamailio.org>
Cc: Sergio Charrua <sergio.charrua@voip.pt>
Subject: [SR-Users] Kamailio not receiving packets on high CPS
Hello!
For this ST/SH project, we are using Kamailio 5.8.4 in stateless mode, making HTTP
requests to a REST API (Java Spring) that replies with a JSON object; Kamailio then
replaces the Contact header and sends a SIP 300 Multiple Choices reply to the SBC that
sent the initial INVITE.
Kamailio is running on a VM with 4 vCPUs, 4 GB RAM and 2 NICs.
On NIC ens224 there is a Virtual IP managed by Keepalived.
Kamailio is listening on port 5060 on ens224 and is set to 32 child processes (I tried
8, 16, 24 and also 64, but with 64 it was a lot worse).
We are doing load performance tests on our PreProd environment. With the SBC we are
sending 450 to 600 CAPS to Kamailio, and what we noticed is that above 450 CAPS, after
less than 1 minute, Kamailio only replies SIP 100 to some INVITEs. We could also
determine that Kamailio is not receiving all the INVITEs, even though we have verified
that ALL INVITEs are sent to the server.
It seems to me that, for some reason, the OS is not able to deliver *all* SIP
packets to Kamailio, because:
- sngrep does capture all the SIP packets and shows the flows
- though the flow shows the SIP INVITE coming from the SBC to the Kamailio server, no
SIP 100 is sent back to the SBC, even though this is the first thing Kamailio does; so,
to me, Kamailio did not receive the SIP INVITE
- picking a call with re-INVITEs, which the SBC cancels after some seconds, if I copy
the Call-ID value and grep for it in Kamailio's logs, I find nothing at all!
I have read some "network tuning" articles, including Linux Tune Network Stack
(Buffers Size) To Increase Networking Performance -
nixCraft<https://www.cyberciti.biz/faq/linux-tcp-tuning/> and Tuning Kamailio for
high throughput and performance | Evariste Systems
Blog<https://blog.evaristesys.com/2016/02/15/tuning-kamailio-for-high-th…
and when running the following command I get:
$ ss -4 -n -l | grep 5060
udp UNCONN 0 0 10.242.17.125:5060 0.0.0.0:*
udp UNCONN 324224 0 10.242.17.146:5060 0.0.0.0:*
According to the article, the 3rd column (Recv-Q) should be as near 0 as possible, ideally 0.
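As a rough way to watch that column during a load test, a small Python helper could pick out the listeners whose receive queue is not empty. This is only a sketch: it assumes the usual `ss` column order (Netid, State, Recv-Q, Send-Q, local address, peer address) with each socket on a single line, and the function name `backed_up_sockets` is made up for illustration.

```python
def backed_up_sockets(ss_output, port=5060):
    """Return (local_address, recv_q) for sockets on `port` with a non-empty Recv-Q."""
    found = []
    for line in ss_output.splitlines():
        fields = line.split()
        # Expected layout: Netid State Recv-Q Send-Q Local:Port Peer:Port
        # Header lines and short lines are skipped.
        if len(fields) < 5 or not fields[2].isdigit():
            continue
        local = fields[4]
        recv_q = int(fields[2])
        if local.rsplit(":", 1)[-1] == str(port) and recv_q > 0:
            found.append((local, recv_q))
    return found


# The output quoted above, one socket per line.
sample = (
    "udp UNCONN 0 0 10.242.17.125:5060 0.0.0.0:*\n"
    "udp UNCONN 324224 0 10.242.17.146:5060 0.0.0.0:*\n"
)
for address, queued in backed_up_sockets(sample):
    print(address, "has", queued, "bytes waiting in Recv-Q")
```

For the output above this reports only 10.242.17.146:5060, with 324224 bytes queued.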
While I am waiting for a sysadmin with root permissions to be available to modify some
of the network parameters on this VM, I wonder if anyone has some tips to share on how to
solve this behaviour. I'm not sure, either, which parameters to change or whether they
should be changed at all! The goal is to have a minimum of 750 CAPS.
Lastly, some parts of the code:
Kamailio.cfg:
### LOG Levels: 3=DBG, 2=INFO, 1=NOTICE, 0=WARN, -1=ERR
debug=1
log_stderror=no
rundir="/tmp"
memdbg=5
memlog=5
log_facility=LOG_LOCAL0
# configure the prefix for all log messages
log_prefix_mode = 1
log_prefix="{$mt $hdr(CSeq) $ci} "
/* number of SIP routing processes */
children=32 # ------------- ALREADY TRIED WITH 8,16,24... no results.....
/* uncomment the next line to disable TCP (default on) */
disable_tcp=no
/* uncomment the next line to disable the auto discovery of local aliases
* based on reverse DNS on IPs (default on) */
# auto_aliases=no
/* add local domain aliases */
# alias="sip.mydomain.com"
####### Custom Parameters #########
/* These parameters can be modified runtime via RPC interface
* - see the documentation of 'cfg_rpc' module.
*
* Format: group.id = value 'desc' description
* Access: $sel(cfg_get.group.id) or @cfg_get.group.id */
####### Modules Section ########
/* set paths to location of modules */
loadmodule "db_mysql.so"
loadmodule "db_cluster.so"
loadmodule "http_client.so"
loadmodule "jsonrpcs.so"
loadmodule "kex.so"
loadmodule "corex.so"
loadmodule "sl.so"
loadmodule "pv.so"
loadmodule "maxfwd.so"
loadmodule "textops.so"
loadmodule "xlog.so"
loadmodule "sanity.so"
loadmodule "jansson.so"
loadmodule "snmpstats.so"
loadmodule "file_out.so"
loadmodule "ctl.so"
loadmodule "permissions.so"
loadmodule "xhttp.so"
loadmodule "xhttp_rpc.so"
####### Other Args and Env Vars #########
/* listen addresses */
include_file "listen.cfg"
include_file "db_conn.cfg"
# ----------------- setting module-specific parameters --------------
# --- DB Cluster params ---
# minimum requirement is to have DBURL1 defined
#!ifdef DBURL1
modparam("db_cluster", "connection" , DBURL1)
#!endif
#!ifdef DBURL2
modparam("db_cluster", "connection" , DBURL2)
#!endif
#!ifdef DBURL3
modparam("db_cluster", "connection" , DBURL3)
#!endif
#!ifdef DBURL4
modparam("db_cluster", "connection" , DBURL4)
#!endif
#!ifdef DBURL5
modparam("db_cluster", "connection" , DBURL5)
#!endif
modparam("db_cluster", "cluster", DBCLUSTER)
# --- Permissions params ---
modparam("permissions", "db_url", "cluster://k1")
modparam("permissions", "db_mode", 1)
#modparam("permissions", "reload_delta", 30)
# ----- jsonrpcs params -----
modparam("jsonrpcs", "pretty_format", 1)
/* set the path to RPC fifo control file */
modparam("jsonrpcs", "fifo_name", "/tmp/kamailio_rpc.fifo")
/* set the path to RPC unix socket control file */
modparam("jsonrpcs", "dgram_socket", "/tmp/kamailio_rpc.sock")
modparam("jsonrpcs", "transport", 0)
# ----- ctl params -----
/* set the path to RPC unix socket control file */
modparam("ctl", "binrpc", "unix:/tmp/kamailio_ctl")
# ----- http_client params -----
modparam("http_client", "query_result", 0)
modparam("http_client", "keep_connections", 0)
modparam("http_client", "connection_timeout", 2)
modparam("http_client", "timeout_mode", 2)
modparam("http_client", "config_file", "/usr/local/etc/kamailio/http_client.cfg")
# ---- SNMP Stats params ----
modparam("snmpstats", "sipEntityType", "proxyServer")
modparam("snmpstats", "sipEntityType", "redirectServer")
modparam("snmpstats", "sipEntityType", "other")
modparam("snmpstats", "snmpgetPath", "/usr/bin/")
modparam("snmpstats", "MsgQueueMinorThreshold", 20)
modparam("snmpstats", "MsgQueueMajorThreshold", 100)
modparam("snmpstats", "dlg_minor_threshold", 20)
modparam("snmpstats", "dlg_major_threshold", 100)
modparam("snmpstats", "snmpCommunity", "kamailio")
# ---- File_Out params ----
modparam("file_out", "base_folder", "/data/sabire002/log/kamailio/")
modparam("file_out", "file", "name=stsh_requests;interval=1440;extension=.log")
modparam("file_out", "file", "name=cdr;interval=1440;extension=.log")
modparam("file_out", "file", "name=http;interval=1440;extension=.log")
####### Routing Logic ########
include_file "includes/reqinit.cfg"
include_file "includes/handle_options.cfg"
include_file "includes/handle_cancel.cfg"
include_file "includes/handle_stir_shaken_stateless.cfg"
include_file "includes/handle_http_rpc.cfg"
request_route {
    $avp(START_TIME)=$utimef(%Y-%m-%d %H:%M:%S);
    $avp(GROUPID) = allow_address_group($si, $sp);
    if ( $avp(GROUPID) == 100 || !allow_address_group($si, $sp) ) {
        xlog("L_INFO", "INIT - $si:$sp is not in the allowed ACL Group ID !\n");
        #sl_reply("401", "Address not allowed");
        exit;
    };
    if (is_method("ACK") ) { #&& t_check_trans() ){
        exit;
    }
    if (is_method("INVITE")) {
        file_out("cdr","$rm|$ft|$tt|$ci|$rs|$rr|$Ts|$avp(START_TIME)|$fU|$fd|$si|$tU|$rU|$rd|$utimef(%Y-%m-%d %H:%M:%S)");
        send_reply("100","Trying");
    }
    route(HANDLE_OPTIONS);
    route(REQINIT);
    xlog("L_INFO"," ********** Route START ***********");
    # log the basic info regarding this call
    xlog("L_INFO","=================================================== \n");
    xlog("L_INFO","New SIP message $rm with call-ID $ci received $pr request $rm $ou source $si:$sp from $fu to $tu \n");
    xlog("L_INFO","=================================================== \n");
    route(HANDLE_STIRSHAKEN);
    route(HANDLE_CANCEL);
    if (method == "INVITE"){
        route(RELAY);
    }
}
route[RELAY] {
    # Sends a 300 Multiple Choices back to the proxy that requested the routing lookup
    xlog("L_INFO","RELAY - send reply \n");
    file_out("cdr","RELAY|$ft|$tt|$ci|$rs|$rr|$Ts|$avp(START_TIME)|$fU|$fd|$si|$tU|$rU|$rd|$utimef(%Y-%m-%d %H:%M:%S)");
    send_reply("300", "Multiple Choices");
    exit;
}
As for the HANDLE_STIRSHAKEN route, where the main processing happens:
route[HANDLE_STIRSHAKEN]
{
    if (!is_method("INVITE")) {
        return;
    }
    $xavp(requestTime) = $utimef(%s);
    $var(post) = $null; // resets and makes sure the $var(post) variable is null before usage
    [.... get some header values ...]
    $var(res) = http_connect("api1", "/stsh", "application/json", "$var(post)", "$var(result)");
    jansson_xdecode($var(result), "json");
    if ( $xavp(json=>sip-response-code) == 300 )
    {
        remove_hf("Contact");
        append_to_reply("Contact: $xavp(json=>contact)\r\n");
    }
    else
    {
        # add error message to reply
        sl_reply("$xavp(json=>sip-response-code)","$xavp(json=>sip-response-text)");
        exit;
    }
}
Atenciosamente / Kind Regards / Cordialement / Un saludo,
Sérgio Charrua