[SR-Users] R: Kamailio connection issue
M S
shaheryarkh at gmail.com
Wed Mar 4 20:09:33 CET 2020
You have many problems,
1. You do not seem to have TCP keepalives setup. So idle TCP connections
may be disrupted, closed or reset by any router on the internet, forcing
SIP client to re-register. As a result many more tcp connections are
created by a single SIP end point then expected. You said you have 5K
active users with 8K connection limit, which is way too low. The limit
should be at least more then twice the active users (12.5K in your case). I
usually just set it to maximum i.e. 64K.
2. TCP connection life time of 3605 seconds is way too high. There will be
many many hack attempts on your server, some users will enter wrong
password, some users change their network (and thus need to re-register),
so allowing 3605 seconds life time for ALL of them is a HUGE mistake, and
you will run out of connection limit pretty quick. So keep the life time as
small as possible 5 or 10 seconds (approx. max time the server takes to
process sip register x 20 e.g. 250ms x 20 = 5000 ms = 5 seconds). Once the
SIP client successfully registers, you can increase the TCP life time using,
https://kamailio.org/docs/modules/4.4.x/modules/tcpops.html#tcpops.f.tcp_set_connection_lifetime
3. Similarly setup TCP keepalives according to SIP Register success,
https://kamailio.org/docs/modules/4.4.x/modules/tcpops.html#tcpops.f.tcp_keepalive_enable
https://kamailio.org/docs/modules/4.4.x/modules/tcpops.html#tcpops.f.tcp_keepalive_disable
4. If average SIP packet processing time is too high e.g. due to slow db
connection for authentication etc. push notification setup, web queries
etc. then use ASYNC for all time consuming tasks. So TCP listener processes
get freed quickly and can process more new connections.
5. It's good to monitor and analyze TCP connection states and events,
especially when TCP connections closes and why? See the TCPOPS module for
more help.
Hope this helps.
On Wed, Mar 4, 2020 at 5:30 PM Tomas Zanet <tzanet at came.com> wrote:
> As you suggested I increased the debug level from -1 to 3 and I got this
> Kamailio log:
>
> Mar 4 15:40:21 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [io_wait.h:602]: io_watch_del(): DBG: io_watch_del (0xa20dc0, 3433,
> -1, 0x0) fd_no=2922 called
>
> Mar 4 15:40:21 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:4139]: handle_tcpconn_ev(): sending to child, events 1
>
> *Mar 4 15:40:21 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:3816]: send2child(): WARNING: no free tcp receiver,
> connection passed to the least busy one (30)*
>
> Mar 4 15:40:21 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:3821]: send2child(): selected tcp worker 8 36(29409) for
> activity on [tls:X.X.X.X:5061], 0x7f482c2c2bc0
>
> Mar 4 15:40:21 server-xip-01 /usr/local/sbin/kamailio[29409]: DEBUG:
> <core> [tcp_read.c:1576]: handle_io(): received n=8 con=0x7f482c2c2bc0,
> fd=13
>
> Mar 4 15:40:21 server-xip-01 /usr/local/sbin/kamailio[29409]: DEBUG: tls
> [tls_server.c:197]: tls_complete_init(): completing tls connection
> initialization
>
> Mar 4 15:40:21 server-xip-01 /usr/local/sbin/kamailio[29409]: DEBUG: tls
> [tls_server.c:226]: tls_complete_init(): Using initial TLS domain
> TLSs<default> (dom 0x7f481b22ba50 ctx 0x7f481b6fbf20 sn [])
>
> Mar 4 15:40:21 server-xip-01 /usr/local/sbin/kamailio[29409]: DEBUG: tls
> [tls_domain.c:715]: sr_ssl_ctx_info_callback(): SSL handshake started
>
>
>
> Looking for this warning into the log, I realized that there are a lot of
> these messages:
>
>
>
> sudo tail -f /var/log/kamailio.log | grep "no free tcp receiver"
>
> Mar 4 15:40:23 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:3816]: send2child(): WARNING: no free tcp receiver,
> connection passed to the least busy one (20)
>
> Mar 4 15:40:23 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:3816]: send2child(): WARNING: no free tcp receiver,
> connection passed to the least busy one (20)
>
> Mar 4 15:40:23 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:3816]: send2child(): WARNING: no free tcp receiver,
> connection passed to the least busy one (21)
>
> Mar 4 15:40:23 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:3816]: send2child(): WARNING: no free tcp receiver,
> connection passed to the least busy one (20)
>
> Mar 4 15:40:23 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:3816]: send2child(): WARNING: no free tcp receiver,
> connection passed to the least busy one (21)
>
> Mar 4 15:40:24 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:3816]: send2child(): WARNING: no free tcp receiver,
> connection passed to the least busy one (14)
>
> Mar 4 15:40:24 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:3816]: send2child(): WARNING: no free tcp receiver,
> connection passed to the least busy one (15)
>
> Mar 4 15:40:24 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:3816]: send2child(): WARNING: no free tcp receiver,
> connection passed to the least busy one (16)
>
> Mar 4 15:40:24 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:3816]: send2child(): WARNING: no free tcp receiver,
> connection passed to the least busy one (16)
>
> Mar 4 15:40:24 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:3816]: send2child(): WARNING: no free tcp receiver,
> connection passed to the least busy one (16)
>
> Mar 4 15:40:24 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:3816]: send2child(): WARNING: no free tcp receiver,
> connection passed to the least busy one (17)
>
> Mar 4 15:40:24 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:3816]: send2child(): WARNING: no free tcp receiver,
> connection passed to the least busy one (17)
>
> Mar 4 15:40:24 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:3816]: send2child(): WARNING: no free tcp receiver,
> connection passed to the least busy one (17)
>
> Mar 4 15:40:24 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:3816]: send2child(): WARNING: no free tcp receiver,
> connection passed to the least busy one (16)
>
> Mar 4 15:40:24 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:3816]: send2child(): WARNING: no free tcp receiver,
> connection passed to the least busy one (17)
>
> Mar 4 15:40:24 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:3816]: send2child(): WARNING: no free tcp receiver,
> connection passed to the least busy one (18)
>
> Mar 4 15:40:24 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:3816]: send2child(): WARNING: no free tcp receiver,
> connection passed to the least busy one (18)
>
> Mar 4 15:40:24 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:3816]: send2child(): WARNING: no free tcp receiver,
> connection passed to the least busy one (18)
>
> Mar 4 15:40:24 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:3816]: send2child(): WARNING: no free tcp receiver,
> connection passed to the least busy one (18)
>
> Mar 4 15:40:24 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:3816]: send2child(): WARNING: no free tcp receiver,
> connection passed to the least busy one (18)
>
> Mar 4 15:40:24 server-xip-01 /usr/local/sbin/kamailio[29417]: DEBUG:
> <core> [tcp_main.c:3816]: send2child(): WARNING: no free tcp receiver,
> connection passed to the least busy one (18)
>
> …..
>
> ….
>
>
>
> As far as I know, I should increase the tcp_children parameter, right?.
>
> Now, it’s set to 16 but it’s pretty clear which is not enough. I increased
> it step by step till 48 but that’s still not enough.
>
> If I set it to 64 then Kamailio failed to start.
>
>
>
> What would you recommend for a server which has more or less 5K users?
>
>
>
> Many thanks.
>
> T.
>
>
>
> *Tomas Zanet*
> R&D Software Designer
> tzanet at came.com
>
> <http://www.came.com>
>
> *CAME S.p.A.*
> *Sede legale e operativa*
> Via Martiri della Libertà 15
> 31030 Dosson di Casier (TV) / Italy
> *T* +39 0422 4940 *F* +39 0422 4941
> *P.IVA* 03481280265
> Cod. dest. fatturazione elettronica: UHWULCH
>
> *Sede operativa*
> Via Cornia 1/b,c
> 33079 Sesto al Reghena (PN) / Italy
> www.came.com
>
> Follow us on: Facebook
> <https://www.facebook.com/CAME-Italia-1604586846431438/> / Twitter
> <https://twitter.com/CameGlobal> / Youtube
> <https://www.youtube.com/user/CameChannel> / Linkedin
> <https://www.linkedin.com/company/came>
>
> Le informazioni contenute in questa e-mail e negli eventuali allegati sono
> riservate e confidenziali e ne è vietata la diffusione e distribuzione in
> qualunque modo. Qualora non ne fosse il legittimo destinatario, è invitato
> a non diffonderle e ad eliminarle, dandone gentilmente informazione al
> mittente.
>
> The information included in this e-mail and any attachments are
> confidential and can not be disclosed, duplicated or redistributed by any
> means. If you are not the correct recipient, you are kindly requested to
> cancel it and notify the sender immediately
>
> *Da:* sr-users [mailto:sr-users-bounces at lists.kamailio.org] *Per conto di
> *Denys Pozniak
> *Inviato:* mercoledì 4 marzo 2020 15:00
> *A:* Kamailio (SER) - Users Mailing List <sr-users at lists.kamailio.org>
> *Oggetto:* Re: [SR-Users] Kamailio connection issue
>
>
>
> Hello!
>
> You can try to increase the debug printing level (like debug=2 or even
> debug=3) and check syslog when Kamailio gets stuck.
>
>
>
>
>
> ср, 4 мар. 2020 г. в 13:12, Tomas Zanet <tzanet at came.com>:
>
> Hi All,
>
> In these days I'm facing an issue or a misconfiguration, I don't know,
> with our remote apps (SIP clients) and our cloud server in which there's a
> Kamailio version 4.4.4.
> Sometimes our app takes some seconds (about 10-12sec) to connect to the
> Kamailio socket (sips on 5061) and our app is stuck on connect sytem call.
>
> To debug this behaviour I put a rule in the iptable like this:
> sudo iptables -I INPUT -p tcp -m tcp --dport 5061 -m state --state NEW -j
> LOG --log-level 1 --log-prefix "New Connection "
>
> when everything goes well, there is one row which describes the new
> incoming connection Mar 4 07:08:40 server-xip-01 kernel: New Connection
> IN=eth0 OUT= MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X
> DST=Y.Y.Y.Y LEN=64 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=15819
> DPT=5061 WINDOW=65535 RES=0x00 CWR ECE SYN URGP=0
>
> when our app gets stuck in the connect, there are many attempts and seems
> that Kamailio is not ready to accept this connection:
> Mar 4 07:11:54 server-xip-01 kernel: New Connection IN=eth0 OUT=
> MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y
> LEN=64 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061
> WINDOW=65535 RES=0x00 CWR ECE SYN URGP=0 Mar 4 07:11:54 server-xip-01
> kernel: New Connection IN=eth0 OUT=
> MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y
> LEN=64 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061
> WINDOW=65535 RES=0x00 SYN URGP=0 Mar 4 07:11:55 server-xip-01 kernel: New
> Connection IN=eth0 OUT= MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00
> SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP
> SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 Mar 4 07:11:55
> server-xip-01 kernel: New Connection IN=eth0 OUT=
> MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y
> LEN=64 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061
> WINDOW=65535 RES=0x00 SYN URGP=0 Mar 4 07:11:55 server-xip-01 kernel: New
> Connection IN=eth0 OUT= MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00
> SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP
> SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 Mar 4 07:11:56
> server-xip-01 kernel: New Connection IN=eth0 OUT=
> MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y
> LEN=64 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061
> WINDOW=65535 RES=0x00 SYN URGP=0 Mar 4 07:11:56 server-xip-01 kernel: New
> Connection IN=eth0 OUT= MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00
> SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP
> SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 Mar 4 07:11:56
> server-xip-01 kernel: New Connection IN=eth0 OUT=
> MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y
> LEN=64 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061
> WINDOW=65535 RES=0x00 SYN URGP=0 Mar 4 07:11:58 server-xip-01 kernel: New
> Connection IN=eth0 OUT= MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00
> SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP
> SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 Mar 4 07:12:00
> server-xip-01 kernel: New Connection IN=eth0 OUT=
> MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y
> LEN=64 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061
> WINDOW=65535 RES=0x00 SYN URGP=0 Mar 4 07:12:05 server-xip-01 kernel: New
> Connection IN=eth0 OUT= MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00
> SRC=X.X.X.X DST=Y.Y.Y.Y LEN=48 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP
> SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 When it happens our app
> takes more than 10 sec to connect to the Kamailio socket.
>
> I tried to increase the children configuration parameter in the cfg file
> because I thought there were not enough tcp receiver but with no luck.
> Here are some configuration parameters:
>
> children = 16 // was 8 before found this issue
>
> # life time of TCP connection when there is no traffic # - a bit higher
> than registration expires to cope with UA behind NAT
> tcp_connection_lifetime=3605
>
> open_files_limit=8192
>
> tcp_max_connections=8192
> tls_max_connections=8192
>
> Here is the output of the ulimit command:
> ulimit -a
> core file size (blocks, -c) 0
> data seg size (kbytes, -d) unlimited
> scheduling priority (-e) 0
> file size (blocks, -f) unlimited
> pending signals (-i) 31393
> max locked memory (kbytes, -l) 64
> max memory size (kbytes, -m) unlimited
> open files (-n) 65535
> pipe size (512 bytes, -p) 8
> POSIX message queues (bytes, -q) 819200
> real-time priority (-r) 0
> stack size (kbytes, -s) 8192
> cpu time (seconds, -t) unlimited
> max user processes (-u) 31393
> virtual memory (kbytes, -v) unlimited
> file locks (-x) unlimited
> In the kamailio log file I don't see anything wrong, I don't see messages
> like "[tcp_main.c:3913]: handle_new_connect(): maximum number of tls
> connections exceeded".
>
> Here is the output of kamctl ps command
> Process:: ID=0 PID=11725 Type=main process - attendant
> Process:: ID=1 PID=11727 Type=udp receiver child=0 sock=
> 10.208.194.26:5061
> Process:: ID=2 PID=11728 Type=udp receiver child=1 sock=
> 10.208.194.26:5061
> Process:: ID=3 PID=11729 Type=udp receiver child=2 sock=
> 10.208.194.26:5061
> Process:: ID=4 PID=11730 Type=udp receiver child=3 sock=
> 10.208.194.26:5061
> Process:: ID=5 PID=11731 Type=udp receiver child=4 sock=
> 10.208.194.26:5061
> Process:: ID=6 PID=11732 Type=udp receiver child=5 sock=
> 10.208.194.26:5061
> Process:: ID=7 PID=11733 Type=udp receiver child=6 sock=
> 10.208.194.26:5061
> Process:: ID=8 PID=11734 Type=udp receiver child=7 sock=
> 10.208.194.26:5061
> Process:: ID=9 PID=11735 Type=udp receiver child=8 sock=
> 10.208.194.26:5061
> Process:: ID=10 PID=11736 Type=udp receiver child=9 sock=
> 10.208.194.26:5061
> Process:: ID=11 PID=11737 Type=udp receiver child=10 sock=
> 10.208.194.26:5061
> Process:: ID=12 PID=11738 Type=udp receiver child=11 sock=
> 10.208.194.26:5061
> Process:: ID=13 PID=11739 Type=udp receiver child=12 sock=
> 10.208.194.26:5061
> Process:: ID=14 PID=11740 Type=udp receiver child=13 sock=
> 10.208.194.26:5061
> Process:: ID=15 PID=11741 Type=udp receiver child=14 sock=
> 10.208.194.26:5061
> Process:: ID=16 PID=11742 Type=udp receiver child=15 sock=
> 10.208.194.26:5061
> Process:: ID=17 PID=11743 Type=slow timer
> Process:: ID=18 PID=11744 Type=timer
> Process:: ID=19 PID=11745 Type=secondary timer
> Process:: ID=20 PID=11746 Type=MI FIFO
> Process:: ID=21 PID=11747 Type=USRLOC Timer
> Process:: ID=22 PID=11748 Type=USRLOC Timer
> Process:: ID=23 PID=11749 Type=USRLOC Timer
> Process:: ID=24 PID=11750 Type=USRLOC Timer
> Process:: ID=25 PID=11751 Type=ctl handler
> Process:: ID=26 PID=11752 Type=RTIMER SEC EXEC
> Process:: ID=27 PID=11753 Type=TIMER NH
> Process:: ID=28 PID=11754 Type=tcp receiver (generic) child=0
> Process:: ID=29 PID=11755 Type=tcp receiver (generic) child=1
> Process:: ID=30 PID=11756 Type=tcp receiver (generic) child=2
> Process:: ID=31 PID=11757 Type=tcp receiver (generic) child=3
> Process:: ID=32 PID=11758 Type=tcp receiver (generic) child=4
> Process:: ID=33 PID=11759 Type=tcp receiver (generic) child=5
> Process:: ID=34 PID=11760 Type=tcp receiver (generic) child=6
> Process:: ID=35 PID=11761 Type=tcp receiver (generic) child=7
> Process:: ID=36 PID=11762 Type=tcp receiver (generic) child=8
> Process:: ID=37 PID=11763 Type=tcp receiver (generic) child=9
> Process:: ID=38 PID=11764 Type=tcp receiver (generic) child=10
> Process:: ID=39 PID=11765 Type=tcp receiver (generic) child=11
> Process:: ID=40 PID=11766 Type=tcp receiver (generic) child=12
> Process:: ID=41 PID=11767 Type=tcp receiver (generic) child=13
> Process:: ID=42 PID=11768 Type=tcp receiver (generic) child=14
> Process:: ID=43 PID=11769 Type=tcp receiver (generic) child=15
> Process:: ID=44 PID=11770 Type=tcp main process
>
>
> Can anyone address me to find out this happens, please?
> Please let me know if you need further information...
>
> Thanks in advance,
> Best Regards
> Tomas
>
>
>
> Tomas Zanet
> R&D Software Designer
> tzanet at came.com
> CAME S.p.A.
>
> _______________________________________________
> Kamailio (SER) - Users Mailing List
> sr-users at lists.kamailio.org
> https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
>
>
>
>
> --
>
>
>
> BR,
>
> Denys Pozniak
>
>
>
>
> _______________________________________________
> Kamailio (SER) - Users Mailing List
> sr-users at lists.kamailio.org
> https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.kamailio.org/pipermail/sr-users/attachments/20200304/d91d4bd5/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: LogoNew_f0706ef0-76e6-499c-97ed-10496392d3f5.png
Type: image/png
Size: 1742 bytes
Desc: not available
URL: <http://lists.kamailio.org/pipermail/sr-users/attachments/20200304/d91d4bd5/attachment.png>
More information about the sr-users
mailing list