[SR-Users] Kamailio stop to process incoming SIP traffic via TCP.

Daniel-Constantin Mierla miconda at gmail.com
Thu May 2 10:03:12 CEST 2019


Hello,

I think one possibility to reproduce the issue would be to create a
scenario when same connection is wanted at the same time, when the first
process that gets the lock on it needs a bit more time to execute. Not
sure how to create the case, maybe something like:

  - enable onsend_route for replies and when getting a 200ok reply to
and INVITE there, do a sleep()
  - because the ACK is not coming fast enough, 200ok should be
retransmitted by the callee, another K process will get it and will try
to send it over the same connection
  - run with not many tcp workers, maybe like tcp_children=4
  - do several calls and see how it goes

Again, not sure it covers the case properly, but it is something to
test, because the backtraces I got showed attempts to use same connection.

Otherwise, just running it with traffic for long time, eventually with
two kamailio connected via tls, so a single connection is used for all
traffic between them and makes it likely to have many processes trying
to use it.

Cheers,
Daniel

On 01.05.19 22:26, Aymeric Moizard wrote:
> HI Daniel,
>
> I have received your request and have added it to my TODO list...
>
> Unfortunatly, no much time currently. I will certainly do it later, but
> cannot give any delay for it.
>
> Also, I would really like to understand how to "generate" the issue.
> (I think I had the issue only once or twice this year...)
>
> Otherwise, I will have no way to make sure the workaround would
> work...
>
> Tks
> Aymeric
>
>
>
> Le lun. 15 avr. 2019 à 09:06, Daniel-Constantin Mierla
> <miconda at gmail.com <mailto:miconda at gmail.com>> a écrit :
>
>     Hello Aymeric,
>
>     would you be able to test with tls module compiled against libssl
>     1.1 and using the pre-loaded shared object workaround?
>
>       *
>     https://github.com/kamailio/kamailio/tree/master/src/modules/tls/utils/openssl_mutex_shared
>
>     You should be able to use it with any version, no need to test
>     with kamailio master branch.
>
>     Just clone the master branch, then:
>
>     cd src/modules/tls/utils/openssl_mutex_shared
>
>     make
>
>     Either from there or copy openssl_mutex_shared.so to a location
>     you want, then pre-load it before starting your version of Kamailio.
>
>     The README.md in the folder has some more details.
>
>     I would like to have some validation that it works fine before
>     approaching this topic with libssl project to allow to init the
>     locks with shared process option.
>
>     Thanks,
>     Daniel
>
>     On 26.03.19 16:18, Daniel-Constantin Mierla wrote:
>>
>>     Hello,
>>
>>     yep, locking there is expected, as listing the tls connections
>>     wait for no other processes to change the content of internal tls
>>     connection structures. So it is a side effect of libssl/libcrypto
>>     getting stuck and the other processing waiting for it to move
>>     one. I have the Kamailio training in USA these days, so the trip
>>     and schedule of the day didn't allow me to look more at the
>>     libsll/libcrypto code in order to find a solution here. It is a
>>     high priority in my list, as I get time during the next days.
>>
>>     Cheers,
>>     Daniel
>>
>>     On 26.03.19 15:55, Aymeric Moizard wrote:
>>>     Hi All,
>>>
>>>     I was debugging a TCP issue (most probably, I may start a thread
>>>     for this question).
>>>
>>>     I was trying to get some info for TCP and TLS.
>>>
>>>     I typed:
>>>     $> sudo kamctl rpc tls.list
>>>
>>>     And waited for a while.... until... I realized that my
>>>     User-Agent, connected with TCP was not able to register any
>>>     more. I think the rpc command has introduced something wrong.
>>>
>>>     The device can successfully "connect", send the REGISTER over
>>>     the established TCP connection. The REGISTER do not appear in
>>>     the logs any more, I don't see any traffic for TCP any more. So
>>>     the behavior is the same as I had before: TCP and TLS are both
>>>     not working and UDP is still working fine.
>>>
>>>     kamctl do not work any more... so kamctl trap do not work...
>>>
>>>     I have been able to type.. manually... for (all?) kamailio threads:
>>>
>>>     gdb /usr/sbin/kamailio 16500 -batch --eval-command="bt full" >>
>>>     kamailio-trap-tcp-down.txt
>>>
>>>     I'm temporarly puting the backtrace I have here:
>>>     https://sip.antisip.com/kamailio-trap-tcp-down.txt
>>>
>>>     You can see a thread stuck on the json command line: "tls_list"
>>>     And many other waiting on CRYPTO_THREAD_write_lock
>>>     ? might be related to:
>>>     https://github.com/openssl/openssl/issues/5376
>>>     SIDE NOTE:
>>>     Right before I was typing the last gdb command for the last
>>>     thread, kamailio
>>>     has crashed: This was around 5 minutes after the dead lock started.
>>>
>>>     Mar 26 14:47:11 sip kamailio[16493]: ERROR: <core>
>>>     [core/tcp_main.c:2561]: tcpconn_do_send(): failed to send on
>>>     0x7ff8dfc2fdc8 (91.121.30.149:5061->62.210.97.21:49351
>>>     <http://62.210.97.21:49351>): Broken pipe (32)
>>>     Mar 26 14:47:11 sip kamailio[16493]: ERROR: <core>
>>>     [core/tcp_read.c:1505]: tcp_read_req(): ERROR: tcp_read_req:
>>>     error reading - c: 0x7ff8dfc2fdc8 r: 0x7ff8dfc2fe48 (-1)
>>>     Mar 26 14:47:11 sip kamailio[16493]: WARNING: <core>
>>>     [core/tcp_read.c:1848]: handle_io(): F_TCPCONN connection marked
>>>     as bad: 0x7ff8dfa6a408 id 846 refcnt 3
>>>     Mar 26 14:47:11 sip kamailio[16371]: ALERT: <core> [main.c:755]:
>>>     handle_sigs(): child process 16374 exited by a signal 11
>>>     Mar 26 14:47:11 sip kamailio[16371]: ALERT: <core> [main.c:758]:
>>>     handle_sigs(): core was not generated
>>>     Mar 26 14:47:11 sip kamailio[16371]: INFO: <core> [main.c:781]:
>>>     handle_sigs(): terminating due to SIGCHLD
>>>     Mar 26 14:47:11 sip kamailio[16493]: INFO: <core> [main.c:836]:
>>>     sig_usr(): signal 15 received
>>>     Mar 26 14:47:11 sip kamailio[16500]: INFO: <core> [main.c:836]:
>>>     sig_usr(): signal 15 received
>>>     Mar 26 14:47:11 sip kamailio[16479]: INFO: <core> [main.c:836]:
>>>     sig_usr(): signal 15 received
>>>
>>>
>>>     Unfortunalty, even if I did my best to setup my service to
>>>     generate a core on crash, I still have "core was not
>>>     generated".... (debian stretch)
>>>
>>>     Tks for reading!
>>>     Regards
>>>     Aymeric
>>>
>>>
>>>
>>>     Le mar. 26 mars 2019 à 14:11, Kristijan Vrban
>>>     <vrban.lkml at gmail.com <mailto:vrban.lkml at gmail.com>> a écrit :
>>>
>>>         And again one more kamctl trap file where
>>>
>>>         set_reply_no_connect was set.
>>>
>>>         Am Di., 26. März 2019 um 08:53 Uhr schrieb Kristijan Vrban
>>>         <vrban.lkml at gmail.com <mailto:vrban.lkml at gmail.com>>:
>>>         >
>>>         > Attached also the output of kamctl trap
>>>         >
>>>         > Am Di., 26. März 2019 um 08:42 Uhr schrieb Kristijan Vrban
>>>         > <vrban.lkml at gmail.com <mailto:vrban.lkml at gmail.com>>:
>>>         > >
>>>         > > > Have you done a test with tools such as sipp, or was
>>>         this happening
>>>         > > > after a while, with usual phones registering?
>>>         > >
>>>         > > Usual variety of devices registering via TLS. But i can
>>>         not exclude
>>>         > > that some devices displaying behavioural problems.
>>>         > >
>>>         > > > Can you list the tcp connections and see if they are
>>>         listed?
>>>         > > > kamctl tcp core.tcp_list
>>>         > >
>>>         > > Need Kex module for that? So i can deliver next time.
>>>         But when i do
>>>         > > "lsof -u kamailio |grep TCP"
>>>         > > i get a long list of more then 2000 lines with:
>>>         > >
>>>         > > ...
>>>         > > kamailio 37561 kamailio 2105u     sock               
>>>         0,9      0t0
>>>         > > 27856287 protocol: TCP
>>>         > > kamailio 37561 kamailio 2106u     sock               
>>>         0,9      0t0
>>>         > > 27856305 protocol: TCP
>>>         > > kamailio 37561 kamailio 2107u     sock               
>>>         0,9      0t0
>>>         > > 27856306 protocol: TCP
>>>         > > kamailio 37561 kamailio 2108u     sock               
>>>         0,9      0t0
>>>         > > 27856914 protocol: TCP
>>>         > > ...
>>>         > >
>>>         > > So about the time Kamailio created a lot of socket in
>>>         the TCP domain,
>>>         > > but which are not bound to any port (eg via connect(2)
>>>         or listen(2) or
>>>         > > bind(2))
>>>         > > Until we get to the maximum number of 2048 connections.
>>>         > >
>>>         > > Best
>>>         > > Kristijan
>>>         > >
>>>         > > Am Mo., 25. März 2019 um 14:27 Uhr schrieb
>>>         Daniel-Constantin Mierla
>>>         > > <miconda at gmail.com <mailto:miconda at gmail.com>>:
>>>         > > >
>>>         > > > Have you done a test with tools such as sipp, or was
>>>         this happening
>>>         > > > after a while, with usual phones registering?
>>>         > > >
>>>         > > > Can you list the tcp connections and see if they are
>>>         listed?
>>>         > > >
>>>         > > > kamctl tcp core.tcp_list
>>>         > > >
>>>         > > > Cheers,
>>>         > > > Daniel
>>>         > > >
>>>         > > > On 25.03.19 08:03, Kristijan Vrban wrote:
>>>         > > > >> The solution here is to use set_reply_no_connect()
>>>         > > > > implemented it. Now the issue has shifted to:
>>>         > > > >
>>>         > > > > ERROR: <core> [core/tcp_main.c:3959]:
>>>         handle_new_connect(): maximum
>>>         > > > > number of connections exceeded: 2048/2048
>>>         > > > >
>>>         > > > > But not a single TCP connection is active between
>>>         Kamailio and any
>>>         > > > > device. Seems this counter for maximum number of
>>>         connections
>>>         > > > > now has an issue?
>>>         > > > >
>>>         > > > > Kristijan
>>>         > > > >
>>>         > > > > Am Mi., 20. März 2019 um 15:07 Uhr schrieb
>>>         Daniel-Constantin Mierla
>>>         > > > > <miconda at gmail.com <mailto:miconda at gmail.com>>:
>>>         > > > >> Hello,
>>>         > > > >>
>>>         > > > >> based on the trap output I think I could figure out
>>>         what happened there.
>>>         > > > >>
>>>         > > > >> You have tcp_children to very low value (1 or so),
>>>         the problem is not
>>>         > > > >> actually that one, but the fact that the connection
>>>         to upstream (the
>>>         > > > >> device/app sending the request) was closed after
>>>         receiving the request
>>>         > > > >> and routing of the reply gets stuck in the way of:
>>>         > > > >>
>>>         > > > >>   - a reply is received and has to be forwarded
>>>         > > > >>   - connection was lost, so Kamailio tries to
>>>         establish a new one, but
>>>         > > > >> takes time till fails because the upstream is
>>>         behind nat or so based on
>>>         > > > >> the via header:
>>>         > > > >>
>>>         > > > >> Via: SIP/2.0/TLS
>>>         > > > >>
>>>         10.1.0.4:10002;rport=55229;received=13.94.188.218;branch=z9hG4bK-3336-7f2927bfd703ae907348edff3611bfc9
>>>         > > > >>
>>>         > > > >>   - the reply is retransmitted and gets to another
>>>         worker, which tries
>>>         > > > >> to forward it again, but discovers a connection
>>>         structure for that
>>>         > > > >> destination exists (created by previous reply
>>>         worker) and now waits for
>>>         > > > >> the connection to be released (or better said, for
>>>         the mutex on writing
>>>         > > > >> buffer to be unlocked)
>>>         > > > >>
>>>         > > > >>   - as the second reply waits, there can be other
>>>         retransmissions of the
>>>         > > > >> reply ending up in other workers stuck on waiting
>>>         for the mutex of the
>>>         > > > >> connection write buffer
>>>         > > > >>
>>>         > > > >> The solution here is to use set_reply_no_connect()
>>>         -- you can put it
>>>         > > > >> first in request_route block. I think this would be
>>>         a good addition to
>>>         > > > >> the default configuration file as well, IMO, the
>>>         sip server should not
>>>         > > > >> connect for sending replies and should do it also
>>>         for requests that go
>>>         > > > >> behind nat.
>>>         > > > >>
>>>         > > > >> Cheers,
>>>         > > > >> Daniel
>>>         > > > >>
>>>         > > > >> On 19.03.19 10:53, Kristijan Vrban wrote:
>>>         > > > >>> So i had again the situation. But this time,
>>>         incoming udp was
>>>         > > > >>> affected. Kamailio was sending out OPTIONS (via
>>>         dispatcher module) to
>>>         > > > >>> a group of asterisk machines
>>>         > > > >>> but the 200 OK reply to the OPTIONS where not
>>>         processed, so the
>>>         > > > >>> dispatcher module set all asterisk to inactive,
>>>         even though they
>>>         > > > >>> replied 200 OK
>>>         > > > >>>
>>>         > > > >>> Attached the output of kamctl trap during the
>>>         situation. Hope there is
>>>         > > > >>> any useful in it. Because after "kamctl trap" it
>>>         was working again
>>>         > > > >>> without kamailio restart.
>>>         > > > >>>
>>>         > > > >>> Best
>>>         > > > >>> Kristijan
>>>         > > > >>>
>>>         > > > >>> Am Mo., 18. März 2019 um 12:27 Uhr schrieb
>>>         Daniel-Constantin Mierla
>>>         > > > >>> <miconda at gmail.com <mailto:miconda at gmail.com>>:
>>>         > > > >>>> Hello,
>>>         > > > >>>>
>>>         > > > >>>> setting tcp_children=1 is not a god option for
>>>         scallability, practically
>>>         > > > >>>> you set kamailio to process a single tcp message
>>>         at one time, on high
>>>         > > > >>>> traffic, that won't work well.
>>>         > > > >>>>
>>>         > > > >>>> Maybe try to set tcp_children to 2 or 4, that
>>>         should make an eventual
>>>         > > > >>>> race appear faster.
>>>         > > > >>>>
>>>         > > > >>>> Regarding the pid, if it is an outgoing
>>>         connection, then it can be
>>>         > > > >>>> created by any worker process, including a UDP
>>>         worker, if that was the
>>>         > > > >>>> one receiving the sip message over udp and sends
>>>         it out via tcp.
>>>         > > > >>>>
>>>         > > > >>>> Cheers,
>>>         > > > >>>> Daniel
>>>         > > > >>>>
>>>         > > > >>>> On 18.03.19 10:09, Kristijan Vrban wrote:
>>>         > > > >>>>> Hi Daniel,
>>>         > > > >>>>>
>>>         > > > >>>>> for testing, i now had set: "tcp_children=1" and
>>>         so far this issue did not occur
>>>         > > > >>>>> ever since. So now value to provide for "kamctl
>>>         trap" yet.
>>>         > > > >>>>>
>>>         > > > >>>>> "kamctl ps" show this two process to handle tcp:
>>>         > > > >>>>>
>>>         > > > >>>>> ...
>>>         > > > >>>>>     }, {
>>>         > > > >>>>>       "IDX":  25,
>>>         > > > >>>>>       "PID":  71929,
>>>         > > > >>>>>       "DSC":  "tcp receiver (generic) child=0"
>>>         > > > >>>>>     }, {
>>>         > > > >>>>>       "IDX":  26,
>>>         > > > >>>>>       "PID":  71933,
>>>         > > > >>>>>       "DSC":  "tcp main process"
>>>         > > > >>>>>     }
>>>         > > > >>>>> ...
>>>         > > > >>>>>
>>>         > > > >>>>>
>>>         > > > >>>>> Ok, but then is was wondering to see a TCP
>>>         connection on a udp receiver child:
>>>         > > > >>>>>
>>>         > > > >>>>>
>>>         > > > >>>>> netstat -ntp |grep 5061
>>>         > > > >>>>>
>>>         > > > >>>>> ...
>>>         > > > >>>>> tcp        0      0 172.17.217.10:5061
>>>         <http://172.17.217.10:5061>      195.70.114.125:18252
>>>         <http://195.70.114.125:18252>
>>>         > > > >>>>> ESTABLISHED 71895/kamailio
>>>         > > > >>>>> ...
>>>         > > > >>>>>
>>>         > > > >>>>> An pid 71895 is:
>>>         > > > >>>>>
>>>         > > > >>>>> }, {
>>>         > > > >>>>>       "IDX":  3,
>>>         > > > >>>>>       "PID":  71895,
>>>         > > > >>>>>       "DSC":  "udp receiver child=2
>>>         sock=127.0.0.1:5060 <http://127.0.0.1:5060>"
>>>         > > > >>>>>     }, {
>>>         > > > >>>>>
>>>         > > > >>>>>
>>>         > > > >>>>>
>>>         > > > >>>>> And if i look into it via "lsof -p 71895" (the
>>>         udp receiver child)
>>>         > > > >>>>>
>>>         > > > >>>>> ...
>>>         > > > >>>>> kamailio 71895 kamailio   14u  sock             
>>>           0,9      0t0
>>>         > > > >>>>> 8856085 protocol: TCP
>>>         > > > >>>>> kamailio 71895 kamailio   15u  sock             
>>>           0,9      0t0
>>>         > > > >>>>> 8886886 protocol: TCP
>>>         > > > >>>>> kamailio 71895 kamailio   16u  sock             
>>>           0,9      0t0
>>>         > > > >>>>> 8854886 protocol: TCP
>>>         > > > >>>>> kamailio 71895 kamailio   17u  sock             
>>>           0,9      0t0
>>>         > > > >>>>> 8828915 protocol: TCP
>>>         > > > >>>>> kamailio 71895 kamailio   18u  unix
>>>         0x000000005f73cb91      0t0
>>>         > > > >>>>> 1680314 type=DGRAM
>>>         > > > >>>>> kamailio 71895 kamailio   19u  IPv4           
>>>         1846523      0t0
>>>         > > > >>>>> TCP kamailio-preview:sip-tls->XXX:18252
>>>         (ESTABLISHED)
>>>         > > > >>>>> kamailio 71895 kamailio   20u  sock             
>>>           0,9      0t0
>>>         > > > >>>>> 8887192 protocol: TCP
>>>         > > > >>>>> kamailio 71895 kamailio   21u  sock             
>>>           0,9      0t0
>>>         > > > >>>>> 8813634 protocol: TCP
>>>         > > > >>>>> kamailio 71895 kamailio   22u  unix
>>>         0x00000000c19bd102      0t0
>>>         > > > >>>>> 1681407 type=STREAM
>>>         > > > >>>>> kamailio 71895 kamailio   23u  sock             
>>>           0,9      0t0
>>>         > > > >>>>> 8850488 protocol: TCP
>>>         > > > >>>>> ...
>>>         > > > >>>>>
>>>         > > > >>>>> Not only the ESTABLISHED TCP session. But also
>>>         this empty sockets
>>>         > > > >>>>> "protocol: TCP"
>>>         > > > >>>>> What are they doing there in the udp receiver?
>>>         Is that how it's supposed to be?
>>>         > > > >>>>>
>>>         > > > >>>>> Kristijan
>>>         > > > >>>>>
>>>         > > > >>>>> Am Do., 14. März 2019 um 14:48 Uhr schrieb
>>>         Daniel-Constantin Mierla
>>>         > > > >>>>> <miconda at gmail.com <mailto:miconda at gmail.com>>:
>>>         > > > >>>>>> Can you get file written by `kamctl trap`? It
>>>         should have the backtrace
>>>         > > > >>>>>> for all kamailio processes. You need latest
>>>         kamailio 5.2.
>>>         > > > >>>>>>
>>>         > > > >>>>>> Also, get the output for: kamctl ps
>>>         > > > >>>>>>
>>>         > > > >>>>>> Cheers,
>>>         > > > >>>>>> Daniel
>>>         > > > >>>>>>
>>>         > > > >>>>>> On 14.03.19 13:52, Kristijan Vrban wrote:
>>>         > > > >>>>>>> When i attach via gdb to one of the tcp
>>>         worker, i see this:
>>>         > > > >>>>>>>
>>>         > > > >>>>>>> (gdb) bt
>>>         > > > >>>>>>> #0  0x00007fdaf4d14470 in futex_wait
>>>         (private=<optimized out>,
>>>         > > > >>>>>>> expected=1, futex_word=0x7fdaeca92f8c) at
>>>         > > > >>>>>>> ../sysdeps/unix/sysv/linux/futex-internal.h:61
>>>         > > > >>>>>>> #1  futex_wait_simple (private=<optimized
>>>         out>, expected=1,
>>>         > > > >>>>>>> futex_word=0x7fdaeca92f8c) at
>>>         ../sysdeps/nptl/futex-internal.h:135
>>>         > > > >>>>>>> #2  __pthread_rwlock_wrlock_slow
>>>         (rwlock=0x7fdaeca92f80) at
>>>         > > > >>>>>>> pthread_rwlock_wrlock.c:67
>>>         > > > >>>>>>> #3  0x00007fdaf0912ee9 in
>>>         CRYPTO_THREAD_write_lock () from
>>>         > > > >>>>>>> /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
>>>         > > > >>>>>>> #4  0x00007fdaf08e1c08 in ?? () from
>>>         /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
>>>         > > > >>>>>>> #5  0x00007fdaf08a6f69 in ?? () from
>>>         /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
>>>         > > > >>>>>>> #6  0x00007fdaf08b36c7 in EVP_CIPHER_CTX_ctrl
>>>         () from
>>>         > > > >>>>>>> /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
>>>         > > > >>>>>>> #7  0x00007fdaf0c31144 in ?? () from
>>>         /usr/lib/x86_64-linux-gnu/libssl.so.1.1
>>>         > > > >>>>>>> #8  0x00007fdaf0c2bddb in ?? () from
>>>         /usr/lib/x86_64-linux-gnu/libssl.so.1.1
>>>         > > > >>>>>>> #9  0x00007fdaf0c22858 in ?? () from
>>>         /usr/lib/x86_64-linux-gnu/libssl.so.1.1
>>>         > > > >>>>>>> #10 0x00007fdaf0c1af61 in SSL_do_handshake () from
>>>         > > > >>>>>>> /usr/lib/x86_64-linux-gnu/libssl.so.1.1
>>>         > > > >>>>>>> #11 0x00007fdaf0e8d31b in tls_accept
>>>         (c=0x7fdaed26fa98,
>>>         > > > >>>>>>> error=0x7ffffe2a2df0) at tls_server.c:422
>>>         > > > >>>>>>> #12 0x00007fdaf0e96a1b in tls_read_f
>>>         (c=0x7fdaed26fa98,
>>>         > > > >>>>>>> flags=0x7ffffe2c318c) at tls_server.c:1116
>>>         > > > >>>>>>> #13 0x0000556ead5e7c46 in tcp_read_headers
>>>         (c=0x7fdaed26fa98,
>>>         > > > >>>>>>> read_flags=0x7ffffe2c318c) at core/tcp_read.c:469
>>>         > > > >>>>>>> #14 0x0000556ead5ef9cb in tcp_read_req
>>>         (con=0x7fdaed26fa98,
>>>         > > > >>>>>>> bytes_read=0x7ffffe2c3184,
>>>         read_flags=0x7ffffe2c318c) at
>>>         > > > >>>>>>> core/tcp_read.c:1496
>>>         > > > >>>>>>> #15 0x0000556ead5f575f in handle_io
>>>         (fm=0x7fdaf597aa98, events=1,
>>>         > > > >>>>>>> idx=-1) at core/tcp_read.c:1862
>>>         > > > >>>>>>> #16 0x0000556ead5e2053 in io_wait_loop_epoll
>>>         (h=0x556eadaaeec0 <io_w>,
>>>         > > > >>>>>>> t=2, repeat=0) at core/io_wait.h:1065
>>>         > > > >>>>>>> #17 0x0000556ead5f6b35 in tcp_receive_loop
>>>         (unix_sock=49) at
>>>         > > > >>>>>>> core/tcp_read.c:1974
>>>         > > > >>>>>>> #18 0x0000556ead4c8e24 in tcp_init_children ()
>>>         at core/tcp_main.c:4853
>>>         > > > >>>>>>> #19 0x0000556ead3c352a in main_loop () at
>>>         main.c:1735
>>>         > > > >>>>>>> #20 0x0000556ead3ca5f8 in main (argc=13,
>>>         argv=0x7ffffe2c3828) at main.c:2675
>>>         > > > >>>>>>>
>>>         > > > >>>>>>>
>>>         > > > >>>>>>>
>>>         > > > >>>>>>>
>>>         > > > >>>>>>>
>>>         > > > >>>>>>>
>>>         > > > >>>>>>>
>>>         > > > >>>>>>> Am Do., 14. März 2019 um 13:41 Uhr schrieb
>>>         Kristijan Vrban
>>>         > > > >>>>>>> <vrban.lkml at gmail.com
>>>         <mailto:vrban.lkml at gmail.com>>:
>>>         > > > >>>>>>>> Hi, with full debug is see this in log for
>>>         every incoming TCP SIP request:
>>>         > > > >>>>>>>>
>>>         > > > >>>>>>>> Mar 14 12:10:15 kamailio-preview
>>>         /usr/sbin/kamailio[17940]: DEBUG:
>>>         > > > >>>>>>>> <core> [core/tcp_main.c:3871]: send2child():
>>>         WARNING: no free tcp
>>>         > > > >>>>>>>> receiver, connection passed to the least busy
>>>         one (105)
>>>         > > > >>>>>>>> Mar 14 12:10:15 kamailio-preview
>>>         /usr/sbin/kamailio[17940]: DEBUG:
>>>         > > > >>>>>>>> <core> [core/tcp_main.c:3875]: send2child():
>>>         selected tcp worker 2
>>>         > > > >>>>>>>> 27(17937) for activity on
>>>         [tls:172.17.217.10:5061 <http://172.17.217.10:5061>],
>>>         0x7fdaeda8f928
>>>         > > > >>>>>>>>
>>>         > > > >>>>>>>> So the Kamailio TCP process is working, and
>>>         received TCP traffic. But
>>>         > > > >>>>>>>> the tcp workers are somehow busy.
>>>         > > > >>>>>>>>
>>>         > > > >>>>>>>> When i attach via strace to the TCP worker, i
>>>         do not see any activity. Just:
>>>         > > > >>>>>>>>
>>>         > > > >>>>>>>> futex(0x7fdaeca92f8c, FUTEX_WAIT_PRIVATE, 1, NULL
>>>         > > > >>>>>>>>
>>>         > > > >>>>>>>> and nothing, even when i see the main tcp
>>>         process choose this worker process.
>>>         > > > >>>>>>>>
>>>         > > > >>>>>>>> Kristijan
>>>         > > > >>>>>>>>
>>>         > > > >>>>>>>> Am Mi., 27. Feb. 2019 um 15:14 Uhr schrieb
>>>         Kristijan Vrban
>>>         > > > >>>>>>>> <vrban.lkml at gmail.com
>>>         <mailto:vrban.lkml at gmail.com>>:
>>>         > > > >>>>>>>>> first of all thanks for the feedback. i
>>>         prepared our system now to run
>>>         > > > >>>>>>>>> with debug=3
>>>         > > > >>>>>>>>> I hope to see more then then.
>>>         > > > >>>>>>>>>
>>>         > > > >>>>>>>>> Am Mi., 27. Feb. 2019 um 11:53 Uhr schrieb
>>>         Kristijan Vrban
>>>         > > > >>>>>>>>> <vrban.lkml at gmail.com
>>>         <mailto:vrban.lkml at gmail.com>>:
>>>         > > > >>>>>>>>>> Hi kamailios,
>>>         > > > >>>>>>>>>>
>>>         > > > >>>>>>>>>> i have a creepy situation with v5.2.1
>>>         stable Kamilio. After a day or
>>>         > > > >>>>>>>>>> so, Kamailio stop to process incoming SIP
>>>         traffic via TCP. The
>>>         > > > >>>>>>>>>> incoming TCP network packages get TCP-ACK
>>>         from the OS (Debian 9,
>>>         > > > >>>>>>>>>> 4.18.0-15-generic-Linux) but Kamailio does
>>>         not show any processing for
>>>         > > > >>>>>>>>>> the SIP-Traffic incoming via TCP. No logs,
>>>         nothing. While traffic via
>>>         > > > >>>>>>>>>> UDP is working just totally fine.
>>>         > > > >>>>>>>>>>
>>>         > > > >>>>>>>>>> When i look via command "netstat -ntp" is
>>>         see, that the Recv-Q get
>>>         > > > >>>>>>>>>> bigger and bigger. e.g.:
>>>         > > > >>>>>>>>>>
>>>         > > > >>>>>>>>>> Proto Recv-Q Send-Q Local Address Foreign
>>>         Address State PID/Program
>>>         > > > >>>>>>>>>> name tcp 4566 0 172.17.217.12:5060
>>>         <http://172.17.217.12:5060> xxx.xxx.xxx.xxx:57252 ESTABLISHED
>>>         > > > >>>>>>>>>> 31347/kamailio
>>>         > > > >>>>>>>>>>
>>>         > > > >>>>>>>>>> After Kamailio restart, all is working fine
>>>         again for a day. We have
>>>         > > > >>>>>>>>>> maybe 10-20 devices online via TCP and low
>>>         call volume (1-2 call per
>>>         > > > >>>>>>>>>> minute). The only settings for tcp we have
>>>         is "tcp_delayed_ack=no"
>>>         > > > >>>>>>>>>>
>>>         > > > >>>>>>>>>> How to could we debug this situation?
>>>         Again, no error, no warings in
>>>         > > > >>>>>>>>>> the log. Just nothing.
>>>         > > > >>>>>>>>>>
>>>         > > > >>>>>>>>>> Kristijan
>>>         > > > >>>>>>> _______________________________________________
>>>         > > > >>>>>>> Kamailio (SER) - Users Mailing List
>>>         > > > >>>>>>> sr-users at lists.kamailio.org
>>>         <mailto:sr-users at lists.kamailio.org>
>>>         > > > >>>>>>>
>>>         https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
>>>         > > > >>>>>> --
>>>         > > > >>>>>> Daniel-Constantin Mierla -- www.asipto.com
>>>         <http://www.asipto.com>
>>>         > > > >>>>>> www.twitter.com/miconda
>>>         <http://www.twitter.com/miconda> --
>>>         www.linkedin.com/in/miconda <http://www.linkedin.com/in/miconda>
>>>         > > > >>>>>> Kamailio World Conference - May 6-8, 2019 --
>>>         www.kamailioworld.com <http://www.kamailioworld.com>
>>>         > > > >>>>>> Kamailio Advanced Training - Mar 25-27, 2019,
>>>         in Washington, DC, USA -- www.asipto.com <http://www.asipto.com>
>>>         > > > >>>>>>
>>>         > > > >>>>> _______________________________________________
>>>         > > > >>>>> Kamailio (SER) - Users Mailing List
>>>         > > > >>>>> sr-users at lists.kamailio.org
>>>         <mailto:sr-users at lists.kamailio.org>
>>>         > > > >>>>>
>>>         https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
>>>         > > > >>>> --
>>>         > > > >>>> Daniel-Constantin Mierla -- www.asipto.com
>>>         <http://www.asipto.com>
>>>         > > > >>>> www.twitter.com/miconda
>>>         <http://www.twitter.com/miconda> --
>>>         www.linkedin.com/in/miconda <http://www.linkedin.com/in/miconda>
>>>         > > > >>>> Kamailio World Conference - May 6-8, 2019 --
>>>         www.kamailioworld.com <http://www.kamailioworld.com>
>>>         > > > >>>> Kamailio Advanced Training - Mar 25-27, 2019, in
>>>         Washington, DC, USA -- www.asipto.com <http://www.asipto.com>
>>>         > > > >>>>
>>>         > > > >> --
>>>         > > > >> Daniel-Constantin Mierla -- www.asipto.com
>>>         <http://www.asipto.com>
>>>         > > > >> www.twitter.com/miconda
>>>         <http://www.twitter.com/miconda> --
>>>         www.linkedin.com/in/miconda <http://www.linkedin.com/in/miconda>
>>>         > > > >> Kamailio World Conference - May 6-8, 2019 --
>>>         www.kamailioworld.com <http://www.kamailioworld.com>
>>>         > > > >> Kamailio Advanced Training - Mar 25-27, 2019, in
>>>         Washington, DC, USA -- www.asipto.com <http://www.asipto.com>
>>>         > > > >>
>>>         > > > --
>>>         > > > Daniel-Constantin Mierla -- www.asipto.com
>>>         <http://www.asipto.com>
>>>         > > > www.twitter.com/miconda
>>>         <http://www.twitter.com/miconda> --
>>>         www.linkedin.com/in/miconda <http://www.linkedin.com/in/miconda>
>>>         > > > Kamailio World Conference - May 6-8, 2019 --
>>>         www.kamailioworld.com <http://www.kamailioworld.com>
>>>         > > > Kamailio Advanced Training - Mar 25-27, 2019, in
>>>         Washington, DC, USA -- www.asipto.com <http://www.asipto.com>
>>>         > > >
>>>         _______________________________________________
>>>         Kamailio (SER) - Users Mailing List
>>>         sr-users at lists.kamailio.org <mailto:sr-users at lists.kamailio.org>
>>>         https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
>>>
>>>
>>>
>>>     -- 
>>>     Antisip - http://www.antisip.com
>>     -- 
>>     Daniel-Constantin Mierla -- www.asipto.com <http://www.asipto.com>
>>     www.twitter.com/miconda <http://www.twitter.com/miconda> -- www.linkedin.com/in/miconda <http://www.linkedin.com/in/miconda>
>>     Kamailio World Conference - May 6-8, 2019 -- www.kamailioworld.com <http://www.kamailioworld.com>
>>     Kamailio Advanced Training - Mar 25-27, 2019, in Washington, DC, USA -- www.asipto.com <http://www.asipto.com>
>
>     -- 
>     Daniel-Constantin Mierla -- www.asipto.com <http://www.asipto.com>
>     www.twitter.com/miconda <http://www.twitter.com/miconda> -- www.linkedin.com/in/miconda <http://www.linkedin.com/in/miconda>
>     Kamailio World Conference - May 6-8, 2019 -- www.kamailioworld.com <http://www.kamailioworld.com>
>
>
>
> -- 
> Antisip - http://www.antisip.com

-- 
Daniel-Constantin Mierla -- www.asipto.com
www.twitter.com/miconda -- www.linkedin.com/in/miconda
Kamailio World Conference - May 6-8, 2019 -- www.kamailioworld.com

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.kamailio.org/pipermail/sr-users/attachments/20190502/6d834bca/attachment.html>


More information about the sr-users mailing list