Hi All,
In these days I'm facing an issue or a misconfiguration, I don't know, with our remote apps (SIP clients) and our cloud server in which there's a Kamailio version 4.4.4. Sometimes our app takes some seconds (about 10-12sec) to connect to the Kamailio socket (sips on 5061) and our app is stuck on connect sytem call.
To debug this behaviour I put a rule in the iptable like this: sudo iptables -I INPUT -p tcp -m tcp --dport 5061 -m state --state NEW -j LOG --log-level 1 --log-prefix "New Connection "
when everything goes well, there is one row which describes the new incoming connection Mar 4 07:08:40 server-xip-01 kernel: New Connection IN=eth0 OUT= MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=15819 DPT=5061 WINDOW=65535 RES=0x00 CWR ECE SYN URGP=0
when our app gets stuck in the connect, there are many attempts and seems that Kamailio is not ready to accept this connection: Mar 4 07:11:54 server-xip-01 kernel: New Connection IN=eth0 OUT= MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 CWR ECE SYN URGP=0 Mar 4 07:11:54 server-xip-01 kernel: New Connection IN=eth0 OUT= MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 Mar 4 07:11:55 server-xip-01 kernel: New Connection IN=eth0 OUT= MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 Mar 4 07:11:55 server-xip-01 kernel: New Connection IN=eth0 OUT= MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 Mar 4 07:11:55 server-xip-01 kernel: New Connection IN=eth0 OUT= MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 Mar 4 07:11:56 server-xip-01 kernel: New Connection IN=eth0 OUT= MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 Mar 4 07:11:56 server-xip-01 kernel: New Connection IN=eth0 OUT= MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 Mar 4 07:11:56 server-xip-01 kernel: New Connection IN=eth0 OUT= MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 Mar 4 07:11:58 server-xip-01 kernel: New Connection IN=eth0 OUT= MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 Mar 4 07:12:00 server-xip-01 kernel: New Connection IN=eth0 OUT= MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 Mar 4 07:12:05 server-xip-01 kernel: New Connection IN=eth0 OUT= MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=48 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 When it happens our app takes more than 10 sec to connect to the Kamailio socket.
I tried to increase the children configuration parameter in the cfg file because I thought there were not enough tcp receiver but with no luck. Here are some configuration parameters:
children = 16 // was 8 before found this issue
# life time of TCP connection when there is no traffic # - a bit higher than registration expires to cope with UA behind NAT tcp_connection_lifetime=3605
open_files_limit=8192
tcp_max_connections=8192 tls_max_connections=8192
Here is the output of the ulimit command: ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 31393 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 65535 pipe size (512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 31393 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited In the kamailio log file I don't see anything wrong, I don't see messages like "[tcp_main.c:3913]: handle_new_connect(): maximum number of tls connections exceeded".
Here is the output of kamctl ps command Process:: ID=0 PID=11725 Type=main process - attendant Process:: ID=1 PID=11727 Type=udp receiver child=0 sock=10.208.194.26:5061 Process:: ID=2 PID=11728 Type=udp receiver child=1 sock=10.208.194.26:5061 Process:: ID=3 PID=11729 Type=udp receiver child=2 sock=10.208.194.26:5061 Process:: ID=4 PID=11730 Type=udp receiver child=3 sock=10.208.194.26:5061 Process:: ID=5 PID=11731 Type=udp receiver child=4 sock=10.208.194.26:5061 Process:: ID=6 PID=11732 Type=udp receiver child=5 sock=10.208.194.26:5061 Process:: ID=7 PID=11733 Type=udp receiver child=6 sock=10.208.194.26:5061 Process:: ID=8 PID=11734 Type=udp receiver child=7 sock=10.208.194.26:5061 Process:: ID=9 PID=11735 Type=udp receiver child=8 sock=10.208.194.26:5061 Process:: ID=10 PID=11736 Type=udp receiver child=9 sock=10.208.194.26:5061 Process:: ID=11 PID=11737 Type=udp receiver child=10 sock=10.208.194.26:5061 Process:: ID=12 PID=11738 Type=udp receiver child=11 sock=10.208.194.26:5061 Process:: ID=13 PID=11739 Type=udp receiver child=12 sock=10.208.194.26:5061 Process:: ID=14 PID=11740 Type=udp receiver child=13 sock=10.208.194.26:5061 Process:: ID=15 PID=11741 Type=udp receiver child=14 sock=10.208.194.26:5061 Process:: ID=16 PID=11742 Type=udp receiver child=15 sock=10.208.194.26:5061 Process:: ID=17 PID=11743 Type=slow timer Process:: ID=18 PID=11744 Type=timer Process:: ID=19 PID=11745 Type=secondary timer Process:: ID=20 PID=11746 Type=MI FIFO Process:: ID=21 PID=11747 Type=USRLOC Timer Process:: ID=22 PID=11748 Type=USRLOC Timer Process:: ID=23 PID=11749 Type=USRLOC Timer Process:: ID=24 PID=11750 Type=USRLOC Timer Process:: ID=25 PID=11751 Type=ctl handler Process:: ID=26 PID=11752 Type=RTIMER SEC EXEC Process:: ID=27 PID=11753 Type=TIMER NH Process:: ID=28 PID=11754 Type=tcp receiver (generic) child=0 Process:: ID=29 PID=11755 Type=tcp receiver (generic) child=1 Process:: ID=30 PID=11756 Type=tcp receiver (generic) child=2 Process:: ID=31 PID=11757 Type=tcp receiver (generic) child=3 Process:: ID=32 PID=11758 Type=tcp receiver (generic) child=4 Process:: ID=33 PID=11759 Type=tcp receiver (generic) child=5 Process:: ID=34 PID=11760 Type=tcp receiver (generic) child=6 Process:: ID=35 PID=11761 Type=tcp receiver (generic) child=7 Process:: ID=36 PID=11762 Type=tcp receiver (generic) child=8 Process:: ID=37 PID=11763 Type=tcp receiver (generic) child=9 Process:: ID=38 PID=11764 Type=tcp receiver (generic) child=10 Process:: ID=39 PID=11765 Type=tcp receiver (generic) child=11 Process:: ID=40 PID=11766 Type=tcp receiver (generic) child=12 Process:: ID=41 PID=11767 Type=tcp receiver (generic) child=13 Process:: ID=42 PID=11768 Type=tcp receiver (generic) child=14 Process:: ID=43 PID=11769 Type=tcp receiver (generic) child=15 Process:: ID=44 PID=11770 Type=tcp main process
Can anyone address me to find out this happens, please? Please let me know if you need further information...
Thanks in advance, Best Regards Tomas
Tomas Zanet R&D Software Designer tzanet@came.com CAME S.p.A.