<div dir="ltr"><div dir="ltr">Hi!<div><br></div><div>I haven't used the workaround yet: I'm focusing on trying to make sure I have the same issue</div><div>or trying to figure out how to force it to happen.</div><div><br></div><div>I started checking the server again today, beginning with this command:</div><div> $> sudo kamcmd tls.list</div><div><br></div><div>In my previous description, the above was a deadlock. Today, it finally completed, but</div><div>after 5 minutes. (I suspect 5 minutes is abnormal)</div><div><br></div><div>While the command was running:</div><div>-> UDP was working</div><div>-> TCP was not: </div><div>-> The TCP connection was ESTABLISHED, but the SIP message got no reply.</div><div>    (this was the behavior I had before)</div><div><br></div><div>At the same time (during the deadlock), I took a trap with "sudo kamctl trap":</div><div>-> one thread is on "tls_list" (tls_rpc.c:154)</div><div>-> one thread is on tcpconn_get (core/tcp_main.c:1449) called from tcp_send (core/tcp_main.c:1716)</div><div>    and seems to be sending a 484 Address Incomplete on a TLS connection</div><div>-> 2 threads are on CRYPTO_THREAD_write_lock on a backtrace showing "SSL_do_handshake/tls_accept"</div><div><br></div><div>Suddenly, "sudo kamcmd tls.list" completed, and then my TCP agent received</div><div>4 answers from kamailio for the last 4 REGISTERs sent.</div><div><br></div><div>I have a network capture for my TCP agent.</div><div>I have a trap showing 2 threads waiting on "CRYPTO_THREAD_write_lock"</div><div><br></div><div>Conclusion:</div><div>The use-case showed that the lock was VERY long.</div><div>The use-case showed that the lock was TEMPORARY...</div><div><br></div><div>Side-note: From my understanding of the multi-fork/openssl issue, I would expect<br></div><div>to see the deadlock happen very soon after a kamailio restart?</div><div><br></div><div>Do you expect the preload workaround to help with such behavior?<br></div><div>Or do you 
consider that my issue is different?</div><div><br></div><div>Because there is no "real" deadlock, I don't understand why "my" issue would be related to libssl1.1...</div><div><br></div><div>My gdb trap and network capture are available privately if you need them! (please ask me by direct email)</div><div><br></div><div>Tks</div></div>Aymeric<div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, 13 May 2019 at 12:48, Daniel-Constantin Mierla <<a href="mailto:miconda@gmail.com">miconda@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
  
    
  
  <div bgcolor="#FFFFFF">
    <p>Hello,</p>
    <p>thanks for the feedback! It is good to know that it works well so
      far for you. I don't see any reason not to make the preload
      library part of the next release.</p>
    <p>Just to let everyone know, for now, the built packages are pinned
      to link against libssl 1.0.x.</p>
    <p>Soon, I will approach the openssl project in order to find a
      proper solution for the long term.</p>
    <p>Cheers,<br>
      Daniel<br>
    </p>
    <div class="gmail-m_4540465005220864230moz-cite-prefix">On 13.05.19 10:48, Floimair Florian
      wrote:<br>
    </div>
    <blockquote type="cite">
      
      
      
      
      <div class="gmail-m_4540465005220864230WordSection1">
        <p class="MsoNormal"><span lang="EN-US">Hi all!<u></u><u></u></span></p>
        <p class="MsoNormal"><span lang="EN-US"><u></u> <u></u></span></p>
        <p class="MsoNormal"><span lang="EN-US">We have used the work-around with the
            pre-loaded library and so far this seems to have fixed our
            problem (that my colleague Kristijan Vrban reported).<u></u><u></u></span></p>
        <p class="MsoNormal"><span lang="EN-US">At least we did not have a single failure
            within the last week, whereas before the issue happened
            about once every 2 days.<u></u><u></u></span></p>
        <p class="MsoNormal"><span lang="EN-US">It would be nice if this were part of the next
            Kamailio version.<u></u><u></u></span></p>
        <p class="MsoNormal"><span lang="EN-US"><u></u> <u></u></span></p>
        <div>
          <p class="MsoNormal"><span lang="EN-US"><u></u> <u></u></span></p>
          <p class="MsoNormal"><span lang="EN-US"> <u></u><u></u></span></p>
          <p class="MsoNormal"><span style="font-size:10pt;font-family:Arial,sans-serif;color:black" lang="EN-US">With best regards<u></u><u></u></span></p>
          <p class="MsoNormal"><span style="font-size:10pt;font-family:Arial,sans-serif;color:black" lang="EN-US"><br>
              <b>Florian Floimair<br>
              </b>Innovation - Software-Development<br>
              <br>
              <b>COMMEND INTERNATIONAL GMBH<br>
              </b>A-5020 Salzburg, Saalachstraße 51<br>
            </span><span lang="DE-AT"><a href="http://www.commend.com/" target="_blank"><span style="font-size:10pt;font-family:Arial,sans-serif;color:rgb(5,99,193)" lang="EN-US">http://www.commend.com</span></a></span><span style="font-size:10pt;font-family:Arial,sans-serif;color:black" lang="EN-US"><br>
              <br>
              <b>Security and Communication by Commend<br>
                <br>
              </b></span><span style="font-size:8pt;font-family:Arial,sans-serif;color:gray" lang="EN-US">FN 178618z | LG Salzburg</span><span lang="EN-US"><u></u><u></u></span></p>
        </div>
        <p class="MsoNormal"><span lang="EN-US"><u></u> <u></u></span></p>
        <div style="border-right:none;border-bottom:none;border-left:none;border-top:1pt solid rgb(181,196,223);padding:3pt 0cm 0cm">
          <p class="MsoNormal"><b><span style="font-size:12pt;color:black">From: </span></b><span style="font-size:12pt;color:black">sr-users
              <a class="gmail-m_4540465005220864230moz-txt-link-rfc2396E" href="mailto:sr-users-bounces@lists.kamailio.org" target="_blank"><sr-users-bounces@lists.kamailio.org></a> on behalf of
              Daniel-Constantin Mierla <a class="gmail-m_4540465005220864230moz-txt-link-rfc2396E" href="mailto:miconda@gmail.com" target="_blank"><miconda@gmail.com></a><br>
              <b>Reply-To: </b><a class="gmail-m_4540465005220864230moz-txt-link-rfc2396E" href="mailto:miconda@gmail.com" target="_blank">"miconda@gmail.com"</a>
              <a class="gmail-m_4540465005220864230moz-txt-link-rfc2396E" href="mailto:miconda@gmail.com" target="_blank"><miconda@gmail.com></a>, "Kamailio (SER) - Users Mailing
              List" <a class="gmail-m_4540465005220864230moz-txt-link-rfc2396E" href="mailto:sr-users@lists.kamailio.org" target="_blank"><sr-users@lists.kamailio.org></a><br>
              <b>Date: </b>Monday, 15 April 2019 at 09:07<br>
              <b>To: </b>Aymeric Moizard <a class="gmail-m_4540465005220864230moz-txt-link-rfc2396E" href="mailto:amoizard@gmail.com" target="_blank"><amoizard@gmail.com></a>,
              "Kamailio (SER) - Users Mailing List"
              <a class="gmail-m_4540465005220864230moz-txt-link-rfc2396E" href="mailto:sr-users@lists.kamailio.org" target="_blank"><sr-users@lists.kamailio.org></a><br>
              <b>Subject: </b>Re: [SR-Users] Kamailio stop to process
              incoming SIP traffic via TCP.<u></u><u></u></span></p>
        </div>
        <div>
          <p class="MsoNormal"><u></u> <u></u></p>
        </div>
        <p>Hello Aymeric,<u></u><u></u></p>
        <p>would you be able to test with tls module compiled against
          libssl 1.1 and using the pre-loaded shared object workaround?<u></u><u></u></p>
        <p>  * <a href="https://github.com/kamailio/kamailio/tree/master/src/modules/tls/utils/openssl_mutex_shared" target="_blank">
https://github.com/kamailio/kamailio/tree/master/src/modules/tls/utils/openssl_mutex_shared</a><u></u><u></u></p>
        <p>You should be able to use it with any version, no need to
          test with kamailio master branch.<u></u><u></u></p>
        <p>Just clone the master branch, then:<u></u><u></u></p>
        <p>cd src/modules/tls/utils/openssl_mutex_shared<u></u><u></u></p>
        <p>make<u></u><u></u></p>
        <p>Use openssl_mutex_shared.so either from there or copy it to a
          location you want, then pre-load it before starting your
          version of Kamailio.<u></u><u></u></p>
        <p>The README.md in the folder has some more details.<u></u><u></u></p>
        <p>I would like to have some validation that it works fine
          before approaching the libssl project with this topic, to ask
          for a way to init the locks with the process-shared option.<u></u><u></u></p>
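        <p>To illustrate what "init the locks with the process-shared
          option" means at the pthread level, here is a minimal sketch.
          It is not the code of openssl_mutex_shared; the struct and the
          function name run_shared_mutex_demo are hypothetical and only
          serve the example. It places a mutex in shared memory, marks it
          PTHREAD_PROCESS_SHARED, and lets several forked processes
          increment a counter under that lock.</p>

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo state: a lock and a counter living in shared memory. */
struct shared_state {
    pthread_mutex_t lock;
    int counter;
};

/* Forks nchildren processes that each increment the counter `iterations`
 * times under a process-shared mutex. Returns the final counter value,
 * or -1 on error. */
int run_shared_mutex_demo(int nchildren, int iterations)
{
    assert(nchildren >= 0 && iterations >= 0);

    /* The memory must be shared between processes, not copied on fork. */
    struct shared_state *st = mmap(NULL, sizeof(*st), PROT_READ | PROT_WRITE,
                                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (st == MAP_FAILED)
        return -1;

    /* The default attribute is PTHREAD_PROCESS_PRIVATE; a mutex that is
     * used by several processes has to be marked process-shared. */
    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0
            || pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED) != 0
            || pthread_mutex_init(&st->lock, &attr) != 0) {
        munmap(st, sizeof(*st));
        return -1;
    }
    pthread_mutexattr_destroy(&attr);
    st->counter = 0;

    int failed = 0;
    for (int i = 0; i < nchildren; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            failed = 1;
            break;
        }
        if (pid == 0) {
            /* Child: contend for the lock with the other processes. */
            for (int j = 0; j < iterations; j++) {
                pthread_mutex_lock(&st->lock);
                st->counter++;
                pthread_mutex_unlock(&st->lock);
            }
            _exit(0);
        }
    }

    /* Parent: wait for every child that was started. */
    while (wait(NULL) > 0) {
    }

    int result = failed ? -1 : st->counter;
    pthread_mutex_destroy(&st->lock);
    munmap(st, sizeof(*st));
    return result;
}
```

        <p>Calling run_shared_mutex_demo(4, 1000) from a small main()
          should give 4000 when the lock works across processes. On
          Linux, build with "cc -pthread".</p>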
        <p>Thanks,<br>
          Daniel<u></u><u></u></p>
        <div>
          <p class="MsoNormal">On 26.03.19 16:18, Daniel-Constantin
            Mierla wrote:<u></u><u></u></p>
        </div>
        <blockquote style="margin-top:5pt;margin-bottom:5pt">
          <p>Hello,<u></u><u></u></p>
          <p>yep, locking there is expected, as listing the TLS
            connections has to wait until no other process is changing
            the content of the internal TLS connection structures. So it
            is a side effect of libssl/libcrypto getting stuck and the
            other processes waiting for it to move on. I am giving the
            Kamailio training in the USA these days, so the trip and the
            daily schedule didn't allow me to look further into the
            libssl/libcrypto code in order to find a solution here. It is
            a high priority on my list, and I will look at it as soon as
            I get time in the next days.<u></u><u></u></p>
          <p>Cheers,<br>
            Daniel<u></u><u></u></p>
          <div>
            <p class="MsoNormal">On 26.03.19 15:55, Aymeric Moizard
              wrote:<u></u><u></u></p>
          </div>
          <blockquote style="margin-top:5pt;margin-bottom:5pt">
            <div>
              <div>
                <div>
                  <div>
                    <div>
                      <div>
                        <div>
                          <p class="MsoNormal">Hi All, <u></u><u></u></p>
                          <div>
                            <p class="MsoNormal"><u></u> <u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal">I was debugging a TCP
                              issue (most probably, I may start a thread
                              for this question).<u></u><u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal"><u></u> <u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal">I was trying to get
                              some info for TCP and TLS.<u></u><u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal"><u></u> <u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal">I typed:<u></u><u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal">$> sudo kamctl rpc
                              tls.list<u></u><u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal"><u></u> <u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal">And waited for a
                              while.... until... I realized that my
                              User-Agent, connected over TCP, was not
                              able to register any more. I think the rpc
                              command has triggered something wrong.<u></u><u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal"><u></u> <u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal">The device can
                              successfully "connect" and send the REGISTER
                              over the established TCP connection. The
                              REGISTER does not appear in the logs any
                              more, and I don't see any TCP traffic any
                              more. So the behavior is the same as I had
                              before: TCP and TLS are both not working
                              and UDP is still working fine.<u></u><u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal"><u></u> <u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal">kamctl does not work any
                              more... so kamctl trap does not work...<u></u><u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal"><u></u> <u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal">I have been able to
                              type.. manually... for (all?) kamailio
                              threads:<u></u><u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal"><u></u> <u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal">gdb /usr/sbin/kamailio
                              16500 -batch --eval-command="bt full"
                              >> kamailio-trap-tcp-down.txt<u></u><u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal"><u></u> <u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal">I'm temporarily putting
                              the backtrace I have here:<u></u><u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal"><a href="https://sip.antisip.com/kamailio-trap-tcp-down.txt" target="_blank">https://sip.antisip.com/kamailio-trap-tcp-down.txt</a><u></u><u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal"><u></u> <u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal">You can see a thread
                              stuck on the json command line: "<span style="color:black">tls_list"</span><u></u><u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal"><span style="color:black">And many other
                                waiting on CRYPTO_THREAD_write_lock</span><u></u><u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal"><span style="color:black">? might be related
                                to: <a href="https://github.com/openssl/openssl/issues/5376" target="_blank">
https://github.com/openssl/openssl/issues/5376</a></span><u></u><u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal">SIDE NOTE:<u></u><u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal">Right before I typed
                              the last gdb command for the last
                              thread, kamailio<u></u><u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal">crashed. This was
                              around 5 minutes after the deadlock
                              started.<u></u><u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal"><u></u> <u></u></p>
                          </div>
                          <div>
                            <div>
                              <p class="MsoNormal">Mar 26 14:47:11 sip
                                kamailio[16493]: ERROR: <core>
                                [core/tcp_main.c:2561]:
                                tcpconn_do_send(): failed to send on
                                0x7ff8dfc2fdc8 (91.121.30.149:5061->62.210.97.21:49351):
                                Broken pipe (32)<u></u><u></u></p>
                            </div>
                            <div>
                              <p class="MsoNormal">Mar 26 14:47:11 sip
                                kamailio[16493]: ERROR: <core>
                                [core/tcp_read.c:1505]: tcp_read_req():
                                ERROR: tcp_read_req: error reading - c:
                                0x7ff8dfc2fdc8 r: 0x7ff8dfc2fe48 (-1)<u></u><u></u></p>
                            </div>
                            <div>
                              <p class="MsoNormal">Mar 26 14:47:11 sip
                                kamailio[16493]: WARNING: <core>
                                [core/tcp_read.c:1848]: handle_io():
                                F_TCPCONN connection marked as bad:
                                0x7ff8dfa6a408 id 846 refcnt 3<u></u><u></u></p>
                            </div>
                            <div>
                              <p class="MsoNormal">Mar 26 14:47:11 sip
                                kamailio[16371]: ALERT: <core>
                                [main.c:755]: handle_sigs(): child
                                process 16374 exited by a signal 11<u></u><u></u></p>
                            </div>
                            <div>
                              <p class="MsoNormal">Mar 26 14:47:11 sip
                                kamailio[16371]: ALERT: <core>
                                [main.c:758]: handle_sigs(): core was
                                not generated<u></u><u></u></p>
                            </div>
                            <div>
                              <p class="MsoNormal">Mar 26 14:47:11 sip
                                kamailio[16371]: INFO: <core>
                                [main.c:781]: handle_sigs(): terminating
                                due to SIGCHLD<u></u><u></u></p>
                            </div>
                            <div>
                              <p class="MsoNormal">Mar 26 14:47:11 sip
                                kamailio[16493]: INFO: <core>
                                [main.c:836]: sig_usr(): signal 15
                                received<u></u><u></u></p>
                            </div>
                            <div>
                              <p class="MsoNormal">Mar 26 14:47:11 sip
                                kamailio[16500]: INFO: <core>
                                [main.c:836]: sig_usr(): signal 15
                                received<u></u><u></u></p>
                            </div>
                            <div>
                              <p class="MsoNormal">Mar 26 14:47:11 sip
                                kamailio[16479]: INFO: <core>
                                [main.c:836]: sig_usr(): signal 15
                                received<u></u><u></u></p>
                            </div>
                          </div>
                          <div>
                            <p class="MsoNormal"><u></u> <u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal"><u></u> <u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal">Unfortunately, even though I
                              did my best to set up my service to
                              generate a core on crash, I still get
                              "core was not generated".... (debian
                              stretch)<u></u><u></u></p>
                          </div>
                          <p class="MsoNormal"><u></u> <u></u></p>
                          <div>
                            <p class="MsoNormal">Tks for reading!<u></u><u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal">Regards<u></u><u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal">Aymeric<u></u><u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal"><u></u> <u></u></p>
                          </div>
                          <div>
                            <p class="MsoNormal"><u></u> <u></u></p>
                          </div>
                        </div>
                      </div>
                    </div>
                  </div>
                </div>
              </div>
            </div>
            <p class="MsoNormal"><u></u> </p></blockquote></blockquote></div></blockquote></div></blockquote></div><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><img src="http://sip.antisip.com/am48.png">Antisip - <a href="http://www.antisip.com" target="_blank">http://www.antisip.com</a><br></div></div></div>