Hi,
Some percentage of requests processed with async_route("REQ_PROCESS", "5") seem to end up with a resumed transaction, though most do. Not sure what the exact percentage is. The ones that don't
Is this due to excessive requests? Is there a limit on internal IPC queue depth? Is it conceptually similar to a generic shared blocking queue internally, along the lines of 'mqueue'? Is there any reasonable way to troubleshoot this?
Thanks!
-- Alex
--
Alex Balashov
Principal Consultant
Evariste Systems LLC
Web: https://evaristesys.com
Tel: +1-706-510-6800
Hello,
when I use
dns_try_naptr=on
corelog=-1
debug=-1
enable_tls=yes
use_dns_cache=off
dns_cache_init=off
modparam("topoh", "mask_key", "TEAI32l)- eauiDEUIA!?()")
and run Kamailio under valgrind, Kamailio logs:
20(21) ERROR: kemix [kemix_mod.c:229]: ki_kx_get_ruri_attr(): failed to parse the R-URI
20(21) ERROR: rr [loose.c:1011]: loose_route_mode(): failed to parse Request URI
The workflow is: I start Kamailio. A UDP-client registers. Then a websocket client calls the UDP-client. Finally the websocket client hangs up.
Valgrind does not report anything suspicious.
When I use instead
modparam("topoh", "mask_key", "TEAI32l")
for the same workflow, with the same configuration, valgrind logs:
==14== Invalid read of size 4
==14== at 0x4867E5: atomic_cmpxchg_int (atomic_x86.h:224)
==14== by 0x486830: futex_get (futexlock.h:99)
==14== by 0x490CE4: dns_hash_get (dns_cache.c:673)
==14== by 0x4972A6: dns_get_entry (dns_cache.c:2001)
==14== by 0x499288: dns_srv_get_he (dns_cache.c:2455)
==14== by 0x597AAD: no_naptr_srv_sip_resolvehost (resolve.c:1599)
==14== by 0x598332: naptr_sip_resolvehost (resolve.c:1675)
==14== by 0x5983C5: _sip_resolvehost (resolve.c:1707)
==14== by 0x49943B: dns_srv_sip_resolvehost (dns_cache.c:2516)
==14== by 0x49B3DD: dns_sip_resolvehost (dns_cache.c:2738)
==14== by 0x59846A: sip_hostport2su (resolve.c:1727)
==14== by 0x4CB951: forward_request (forward.c:515)
==14== by 0x99993CD: t_relay_to (t_funcs.c:300)
==14== by 0x99EBA32: _w_t_relay_to (tm.c:1764)
==14== by 0x99F4DEB: ki_t_relay (tm.c:2917)
==14== by 0xA80A94B: sr_kemi_lua_exec_func_ex (app_lua_api.c:1022)
==14== by 0xA81237D: sr_kemi_lua_exec_func (app_lua_api.c:1706)
==14== by 0xA81B93F: sr_kemi_lua_exec_func_209 (app_lua_kemi_export.c:1717)
==14== by 0xA8383C0: luaD_precall (in /lib64/kamailio/modules/app_lua.so)
==14== by 0xA84C4CA: luaV_execute (in /lib64/kamailio/modules/app_lua.so)
==14== by 0xA838C90: luaD_callnoyield (in /lib64/kamailio/modules/app_lua.so)
==14== by 0xA837029: luaD_rawrunprotected (in /lib64/kamailio/modules/app_lua.so)
==14== by 0xA8391DF: luaD_pcall (in /lib64/kamailio/modules/app_lua.so)
==14== by 0xA8333DE: lua_pcallk (in /lib64/kamailio/modules/app_lua.so)
==14== by 0xA806F0C: app_lua_run_ex (app_lua_api.c:773)
==14== by 0xA825F06: sr_kemi_config_engine_lua (app_lua_mod.c:119)
==14== by 0x5048BE: sr_kemi_route (kemi.c:3784)
==14== by 0x588909: receive_msg (receive.c:502)
==14== by 0xA609ADF: ws_frame_receive (ws_frame.c:644)
==14== by 0x4BCAAB: sr_event_exec (events.c:299)
==14== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==14==
==14==
==14== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==14== Access not within mapped region at address 0x0
==14== at 0x4867E5: atomic_cmpxchg_int (atomic_x86.h:224)
==14== by 0x486830: futex_get (futexlock.h:99)
==14== by 0x490CE4: dns_hash_get (dns_cache.c:673)
==14== by 0x4972A6: dns_get_entry (dns_cache.c:2001)
==14== by 0x499288: dns_srv_get_he (dns_cache.c:2455)
==14== by 0x597AAD: no_naptr_srv_sip_resolvehost (resolve.c:1599)
==14== by 0x598332: naptr_sip_resolvehost (resolve.c:1675)
==14== by 0x5983C5: _sip_resolvehost (resolve.c:1707)
==14== by 0x49943B: dns_srv_sip_resolvehost (dns_cache.c:2516)
==14== by 0x49B3DD: dns_sip_resolvehost (dns_cache.c:2738)
==14== by 0x59846A: sip_hostport2su (resolve.c:1727)
==14== by 0x4CB951: forward_request (forward.c:515)
==14== by 0x99993CD: t_relay_to (t_funcs.c:300)
==14== by 0x99EBA32: _w_t_relay_to (tm.c:1764)
==14== by 0x99F4DEB: ki_t_relay (tm.c:2917)
==14== by 0xA80A94B: sr_kemi_lua_exec_func_ex (app_lua_api.c:1022)
==14== by 0xA81237D: sr_kemi_lua_exec_func (app_lua_api.c:1706)
==14== by 0xA81B93F: sr_kemi_lua_exec_func_209 (app_lua_kemi_export.c:1717)
==14== by 0xA8383C0: luaD_precall (in /lib64/kamailio/modules/app_lua.so)
==14== by 0xA84C4CA: luaV_execute (in /lib64/kamailio/modules/app_lua.so)
==14== by 0xA838C90: luaD_callnoyield (in /lib64/kamailio/modules/app_lua.so)
==14== by 0xA837029: luaD_rawrunprotected (in /lib64/kamailio/modules/app_lua.so)
==14== by 0xA8391DF: luaD_pcall (in /lib64/kamailio/modules/app_lua.so)
==14== by 0xA8333DE: lua_pcallk (in /lib64/kamailio/modules/app_lua.so)
==14== by 0xA806F0C: app_lua_run_ex (app_lua_api.c:773)
==14== by 0xA825F06: sr_kemi_config_engine_lua (app_lua_mod.c:119)
==14== by 0x5048BE: sr_kemi_route (kemi.c:3784)
==14== by 0x588909: receive_msg (receive.c:502)
==14== by 0xA609ADF: ws_frame_receive (ws_frame.c:644)
==14== by 0x4BCAAB: sr_event_exec (events.c:299)
The stacktrace with the values of the variables is available at https://github.com/kamailio/kamailio/issues/3350 . Or at least these things seem very
similar to me.
Any idea?
I can share the OCI-image and the full configuration.
Greetings
Дилян
I try to workout if - currently it would work, or - where and how to debug more:
I face - 2 interfacec - public internet (so, TLS + sRTP) is desired
and private - old infrastructure - i mus only use plain RTP
172.23.9.70 - private ip - from this endpoint of kamailio and rtpengine should send only basic RTP
172.23.210.75:5060 private - target for kamailio
1.2.3.24 obfuscated public IP (TLS + sRTP required)
kamailio 5.4.4 (x86_64/linux)
rtpengine -v Version: 11.1.1.4-1~bpo11+1
all i do is:
if (proto==TLS) {
rtpengine_manage("RTP/AVP ICE=remove replace-session-connection replace-origin pad-crypto ptime=20 codec-transcode-PCMA record-call=on allow-transcoding direction=external direction=internal record-call=on");
} else if ($ru =~ "transport=tls") {
rtpengine_manage("DTLS=on SRTP AVPF ICE=remove replace-session-connection replace-origin pad-crypto ptime=20 codec-transcode-PCMA record-call=on allow-transcoding direction=internal direction=external record-call=on media-address=1.2.3.24");
}
# 1.2.3.24 obfuscated public IP
172.23.210.75:5060 is in dispatch.cfg, as '11'
route[SBC_CORE] {
append_hf("X-My-SRTP: removed31337\r\n");
### i see this text in invtes from kamailio 172.23.9.70 towards 172.23.210.75:5060
### i see only RTP, so as expected
if (!ds_select_dst("11", "0")) {
xwarn("I:$var(i) DROP(DOWN!) FWD:$rm [$fU->$tU] [SBCVIP] to $du\n");
sl_send_reply("503", "Destination down");
exit;
}
what i did:
certificate is a paid one (the public party needs it)
TLS works
i deleted - entries in (not kamailo) cryptosuite that caused this:
13:08:05 localhost rtpengine[15140]: ERR: [51ad8758-b64d-4d2f-9fd0-41d03a38f74d]: [core] Failed to parse a=crypto attribute, ignoring: unknown crypto suite
Tried to search for any "ready" examples for this - only found old threads and - that this should be possible, but - no example for woking config.
what i see:
Jan 19 19:00:57 localhost rtpengine[17301]: DEBUG: [core] timer run time = 0.000038 sec
Jan 19 19:00:58 localhost rtpengine[17301]: DEBUG: [core] timer run time = 0.000036 sec
Jan 19 19:00:59 localhost rtpengine[17301]: INFO: [c17bab16-5eea-492e-b1c4-ac9387f3e265]: [core] Closing call due to timeout
Jan 19 19:00:59 localhost rtpengine[17301]: DEBUG: [c17bab16-5eea-492e-b1c4-ac9387f3e265]: [core] Redis delete_async=0
Jan 19 19:00:59 localhost rtpengine[17301]: INFO: [c17bab16-5eea-492e-b1c4-ac9387f3e265]: [core] Final packet stats:
Jan 19 19:00:59 localhost rtpengine[17301]: INFO: [c17bab16-5eea-492e-b1c4-ac9387f3e265]: [core] --- Tag 'JVR5LTs', created 60:00 ago for branch ''
Jan 19 19:00:59 localhost rtpengine[17301]: INFO: [c17bab16-5eea-492e-b1c4-ac9387f3e265]: [core] --- subscribed to ''
Jan 19 19:00:59 localhost rtpengine[17301]: INFO: [c17bab16-5eea-492e-b1c4-ac9387f3e265]: [core] --- subscription for ''
Jan 19 19:00:59 localhost rtpengine[17301]: INFO: [c17bab16-5eea-492e-b1c4-ac9387f3e265]: [core] ------ Media #1 (audio over RTP/SAVP) using unknown codec
Jan 19 19:00:59 localhost rtpengine[17301]: INFO: [c17bab16-5eea-492e-b1c4-ac9387f3e265]: [core] --------- Port 1.2.3.24:30136 <> 52.129.106.28:17030, SSRC 0, in 0 p, 0 b, 0 e, 3600 ts, out 0 p, 0 b, 0 e
Jan 19 19:00:59 localhost rtpengine[17301]: INFO: [c17bab16-5eea-492e-b1c4-ac9387f3e265]: [core] --------- Port 1.2.3.24:30137 <> 52.129.106.28:17031 (RTCP), SSRC 0, in 0 p, 0 b, 0 e, 3600 ts, out 0 p, 0 b, 0 e
Jan 19 19:00:59 localhost rtpengine[17301]: INFO: [c17bab16-5eea-492e-b1c4-ac9387f3e265]: [core] --- Tag '', created 60:00 ago for branch ''
Jan 19 19:00:59 localhost rtpengine[17301]: INFO: [c17bab16-5eea-492e-b1c4-ac9387f3e265]: [core] --- subscribed to 'JVR5LTs'
Jan 19 19:00:59 localhost rtpengine[17301]: INFO: [c17bab16-5eea-492e-b1c4-ac9387f3e265]: [core] --- subscription for 'JVR5LTs'
Jan 19 19:00:59 localhost rtpengine[17301]: INFO: [c17bab16-5eea-492e-b1c4-ac9387f3e265]: [core] ------ Media #1 (audio over RTP/AVP) using unknown codec
Jan 19 19:00:59 localhost rtpengine[17301]: INFO: [c17bab16-5eea-492e-b1c4-ac9387f3e265]: [core] --------- Port 172.23.9.70:30014 <> :0 , SSRC 0, in 0 p, 0 b, 0 e, 3600 ts, out 0 p, 0 b, 0 e
Jan 19 19:00:59 localhost rtpengine[17301]: INFO: [c17bab16-5eea-492e-b1c4-ac9387f3e265]: [core] --------- Port 172.23.9.70:30015 <> :0 (RTCP), SSRC 0, in 0 p, 0 b, 0 e, 3600 ts, out 0 p, 0 b, 0 e
Jan 19 19:00:59 localhost rtpengine[17301]: INFO: [c17bab16-5eea-492e-b1c4-ac9387f3e265]: [core] Moved metadata file "/var/spool/rtpengine/tmp/rtpengine-meta-c17bab16-5eea-492e-b1c4-ac9387f3e265-7003946f152c6c8d.tmp" to "/var/spool/rtpengine/metadata"
Jan 19 19:00:59 localhost rtpengine[17301]: DEBUG: [core] timer run time = 0.000828 sec
Jan 19 19:01:00 localhost rtpengine[17301]: DEBUG: [core] timer run time = 0.000053 sec
route(SBC_CORE);
maybe any hint or - someone has working exmaple of kamailio config + rtpengine settings ?
i use only userspace daemon rtp forwarding (this is a test, dont need any performance here)
Thanks,
Hi all,
We're hitting an issue while integrating secure websockets in our existing SIP infrastructure using Kamailio.
When the registration comes in, it causes an entry in our AOR table, with ";transport=ws" appended.
When we want to send a message to this client (using t_relay), kamailio seems to take 'ws' as being *unsecure* websockets. In turn, this makes Kamailio try to send out the message using a TCP listener - while it should have picked the TLS listener.
There are some remarks in the sources about ws vs. wss, so i'm struggling to figure out where things go wrong. I've also created github issue #3340 with more details.
Any help would be appreciated. If this turns out to be a Kamailio bug, i'm happy to provide a patch.
Hello,
How can I setup Kamailio for failover on LCR ?
Currently my Kamailio 4.x works with lcr but doesn’t care about gateway status. I mean that if a gateway match rule , it sends the call through without check if the gateway is available an never try the next one if the first fail.
I’ve tried to implement inactivate_gw()
But I received error that lcr_id_avp is not defined. I’ve added the parameters to lcr module for Ping and enable defunc but I’m unable to get it working.
By the way I ‘ve trad that defunct is not a good practice as hacker sending malformed sip header can throw down the gateway.
So what is the good and working way to setup outbound call failover ?
Regards
Sebastien
Hello
We are currently using EVAPI to push messages into kamailio from a go app.
For the most part ift works without problems, messages get delivered into
the event route and we use it to update presence status.
Recently we started noticing delays on the evapi processing. After 500
requests x min the pid corresponding to the EVAPI dispatcher gets pegged at
100% CPU and processing of requests starts slowing down until in some cases
the connection just drops and we have to reconnect.
After that pid drops, the dispatcher goes back to working.
Unfortunately the error we get back is very generic and is not telling us
much
logger.go:39: ERR
Handler returned error (write tcp 127.0.0.1:48448->127.0.0.1:8228:
write: broken pipe)
This only happens on prod servers and with high load which makes it hard to
debug. There is no other slowness or increase of traffic or drop SIP
traffic, only evapi dispatcher at 100% and the EVAPI workers on idle.
Curious if anyone has run into this issue. We've tried different versions
(5.4 , 5.5 and 5.6) and it happens on all of them.
Any feedback appreciated !
Thanks !
Hello,
https://www.kamailio.org/docs/modules/devel/modules/tls says:
For OpenSSL (libssl) v1.1.x, it is required to preload 'openssl_mutex_shared' library shipped by Kamailio. For more details see
'src/modules/tls/utils/openssl_mutex_shared/README.md'.
'src/modules/tls/utils/openssl_mutex_shared/README.md says:
**IMPORTANT: the workaround of using this preloaded shared library is no longer
needed starting with Kamailio v5.3.0-pre1 (git master branch after September 14, 2019).
The code of this shared library has been included in the core of Kamailio and the
same behaviour is now achieved by default.**
This is a shared library required as a short term workaround for using Kamailio
with OpenSSL (libssl) v1.1. It has to be pre-loaded before starting Kamailio.
In v1.1, libssl does not allow setting custom locking functions, using internally
pthread mutexes and rwlocks, but they are not initialized with process shared
option (PTHREAD_PROCESS_SHARED), which can result in blocking Kamailio worker
processes.
Is preloading openssl_mutex_shared necessary for Kamailio 5.6 or is it not necessary?
If it is not necessary, why isn’t it removed from the source code?
Greetings
Дилян
Hello,
when I connect to Kamailio over websocket, according to the websocket protocol K. sends “HTTP/1.1 101 Switching Protocols” and K. logs:
17(18) DEBUG: websocket [ws_handshake.c:179]: ws_handle_handshake(): found Upgrade: websocket
17(18) DEBUG: websocket [ws_conn.c:195]: wsconn_add(): connection id [14]
17(18) DEBUG: websocket [ws_conn.c:213]: wsconn_add(): new wsc => [0x7fcad6e10160], ref => [0]
17(18) DEBUG: websocket [ws_conn.c:233]: wsconn_add(): added to conn_table wsc => [0x7fcad6e10160], ref => [1]
17(18) DEBUG: sl [sl.c:310]: send_reply(): reply in stateless mode (sl)
17(18) DEBUG: <core> [core/msg_translator.c:162]: check_via_address(): (22.222.222.222, 22.222.222.222, 0)
17(18) DEBUG: <core> [core/tcp_main.c:1644]: _tcpconn_find(): found connection by id: 14
17(18) DEBUG: <core> [core/tcp_main.c:2528]: tcpconn_send_put(): send from reader (18 (17)), reusing fd
17(18) DEBUG: <core> [core/tcp_main.c:2763]: tcpconn_do_send(): sending...
17(18) DEBUG: <core> [core/tcp_main.c:2796]: tcpconn_do_send(): after real write: c= 0x7fcad6da7b38 n=238 fd=9
17(18) DEBUG: <core> [core/tcp_main.c:2797]: tcpconn_do_send(): buf=
17(18) DEBUG: <core> [core/parser/parse_fline.c:255]: parse_first_line(): bad request first line
17(18) DEBUG: <core> [core/parser/parse_fline.c:258]: parse_first_line(): at line 0 char 22:
17(18) DEBUG: <core> [core/parser/parse_fline.c:264]: parse_first_line(): parsed so far: HTTP/1.1 101 Switching
17(18) ERROR: <core> [core/parser/parse_fline.c:271]: parse_first_line(): parse_first_line: bad message (offset: 22)
17(18) DEBUG: <core> [core/parser/msg_parser.c:675]: parse_msg(): invalid message
17(18) ERROR: <core> [core/parser/msg_parser.c:749]: parse_msg(): ERROR: parse_msg: message=<HTTP/1.1 101 Switching Protocols
Sia: SIP/2.0/TLS 22.222.222.222:33776
Sec-WebSocket-Protocol: sip
Upgrade: websocket
Connection: upgrade
Sec-WebSocket-Accept: NeswoJX5ZQp7ER8hTKM5B3a3HDQ=
Content-Length: 0
>
17(18) ERROR: <core> [core/msg_translator.c:3256]: build_sip_msg_from_buf(): parsing failed
17(18) DEBUG: app_lua [app_lua_api.c:1924]: sr_kemi_lua_exit(): script exit call
17(18) DEBUG: app_lua [app_lua_api.c:789]: app_lua_run_ex(): ksr error call from Lua: ~~ksr~exit~~
17(18) DEBUG: app_lua [app_lua_mod.c:164]: sr_kemi_config_engine_lua(): execution of route type 513 with name [ksr_xhttp_event] returned
1
22.222.222.222 is my public address (the address of my router). It is not the IP-address of Kamailio.
While the returned address is not parseable in terms of SIP, it is a valid Websocket-answer message. This valid condition triggers logging an error,
but does not represent an error condition.
How can I avoid that the above non-error is logged as an error?
Thanks for your feedback
Дилян