<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Hello,</p>
<p>this kind of behaviour, blocking for a long time and then moving
on, is a symptom of the same issue. One of the observed behaviours
was that attaching gdb and then detaching makes the code run
further, and that is what kamctl trap does. I haven't looked deeper,
but my guess is that some signals are sent during the gdb
operations.</p>
<p>It would be good if you could test with the workaround and see the
results. There was already a report that the issue was no longer seen
after a rather long running time.</p>
<p>Cheers,<br>
Daniel<br>
</p>
<div class="moz-cite-prefix">On 17.05.19 16:03, Aymeric Moizard
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CALM7LKMjDcdSZGW5jQ206Vmq72QtRRixgjr2q4MKSkD6YTrN9Q@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div dir="ltr">Hi!
<div><br>
</div>
<div>I haven't used the workaround yet: I'm focusing on trying
to make sure I have the same issue</div>
<div>or trying to figure out how to force it to happen.</div>
<div><br>
</div>
<div>I started checking the server again today, beginning
with this command:</div>
<div> $> sudo kamcmd tls.list</div>
<div><br>
</div>
<div>In my previous description, the above was a deadlock.
Today, it finally completed, but only</div>
<div>after 5 minutes. (I suspect 5 minutes is abnormal.)</div>
<div><br>
</div>
<div>During the long running command:</div>
<div>-> UDP was working</div>
<div>-> TCP was not: </div>
<div>-> The TCP connection was ESTABLISHED, but the
SIP message got no reply.</div>
<div> (this was the behavior I had before)</div>
<div><br>
</div>
<div>At the same time (during the deadlock), I took a trap
with "sudo kamctl trap".</div>
<div>-> one thread is on "tls_list" (tls_rpc.c:154)</div>
<div>-> one thread is on tcpconn_get (core/tcp_main.c:1449)
called from tcp_send (core/tcp_main.c:1716)</div>
<div> and seems to be sending a 484 Address Incomplete on a
TLS connection</div>
<div>-> 2 threads are on CRYPTO_THREAD_write_lock on a
backtrace showing "SSL_do_handshake/tls_accept"</div>
<div><br>
</div>
<div>Suddenly, "sudo kamcmd tls.list" completed, and then, my
TCP Agent received</div>
<div>4 answers from kamailio for the last 4 REGISTER sent.</div>
<div><br>
</div>
<div>I have a network capture for my TCP agent.</div>
<div>I have a trap showing 2 threads waiting on
"CRYPTO_THREAD_write_lock"</div>
<div><br>
</div>
<div>Conclusion:</div>
<div>The use-case showed that the lock was VERY long.</div>
<div>The use-case showed that the lock was TEMPORARY...</div>
<div><br>
</div>
<div>Side-note: From my understanding of the
multi-fork/openssl issue, I would expect<br>
</div>
<div>to see the deadlock happen very soon after a kamailio
restart?</div>
<div><br>
</div>
<div>Do you expect the preload workaround to help with such
behavior?<br>
</div>
<div>Or do you consider that my issue is different?</div>
<div><br>
</div>
<div>Because there is no "real" deadlock, I don't understand
why "my" issue would be related to libssl1.1...</div>
<div><br>
</div>
<div>My gdb trap and network capture are available privately
if you need them! (please ask me by direct email)</div>
<div><br>
</div>
<div>Thanks</div>
</div>
Aymeric
<div><br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, 13 May 2019
at 12:48, Daniel-Constantin Mierla <<a
href="mailto:miconda@gmail.com" moz-do-not-send="true">miconda@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p>Hello,</p>
<p>thanks for the feedback! It is good to know that it
works well so far for you. I don't see any reason not
to ship the preload library as part of the next
release.</p>
<p>Just to let everyone know, for now, the built
packages are pinned to link against libssl 1.0.x.</p>
<p>Soon, I will approach the openssl project in order to
find a proper solution for the long term.</p>
<p>Cheers,<br>
Daniel<br>
</p>
<div class="gmail-m_4540465005220864230moz-cite-prefix">On
13.05.19 10:48, Floimair Florian wrote:<br>
</div>
<blockquote type="cite">
<div class="gmail-m_4540465005220864230WordSection1">
<p class="MsoNormal"><span lang="EN-US">Hi all!</span></p>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<p class="MsoNormal"><span lang="EN-US">We have used
the work-around with the pre-loaded library and
so far this seems to have fixed our problem
(that my colleague Kristijan Vrban reported).</span></p>
<p class="MsoNormal"><span lang="EN-US">At least we
did not have a single failure within the last
week, whereas before the issue happened about
once every 2 days.</span></p>
<p class="MsoNormal"><span lang="EN-US">It would be
nice if this were part of the next Kamailio
version.</span></p>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<div>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<p class="MsoNormal"><span
style="font-size:10pt;font-family:Arial,sans-serif;color:black"
lang="EN-US">With best regards</span></p>
<p class="MsoNormal"><span
style="font-size:10pt;font-family:Arial,sans-serif;color:black"
lang="EN-US"><br>
<b>Florian Floimair<br>
</b>Innovation - Software-Development<br>
<br>
<b>COMMEND INTERNATIONAL GMBH<br>
</b>A-5020 Salzburg, Saalachstraße 51<br>
</span><span lang="DE-AT"><a
href="http://www.commend.com/"
target="_blank" moz-do-not-send="true"><span
style="font-size:10pt;font-family:Arial,sans-serif;color:rgb(5,99,193)"
lang="EN-US">http://www.commend.com</span></a></span><span
style="font-size:10pt;font-family:Arial,sans-serif;color:black"
lang="EN-US"><br>
<br>
<b>Security and Communication by Commend<br>
<br>
</b></span><span
style="font-size:8pt;font-family:Arial,sans-serif;color:gray"
lang="EN-US">FN 178618z | LG Salzburg</span><span
lang="EN-US"></span></p>
</div>
<p class="MsoNormal"><span lang="EN-US"> </span></p>
<div
style="border-right:none;border-bottom:none;border-left:none;border-top:1pt
solid rgb(181,196,223);padding:3pt 0cm 0cm">
<p class="MsoNormal"><b><span
style="font-size:12pt;color:black">From: </span></b><span
style="font-size:12pt;color:black">sr-users <a
class="gmail-m_4540465005220864230moz-txt-link-rfc2396E"
href="mailto:sr-users-bounces@lists.kamailio.org"
target="_blank" moz-do-not-send="true"><sr-users-bounces@lists.kamailio.org></a>
on behalf of Daniel-Constantin Mierla <a
class="gmail-m_4540465005220864230moz-txt-link-rfc2396E"
href="mailto:miconda@gmail.com"
target="_blank" moz-do-not-send="true"><miconda@gmail.com></a><br>
<b>Reply-To: </b><a
class="gmail-m_4540465005220864230moz-txt-link-rfc2396E"
href="mailto:miconda@gmail.com"
target="_blank" moz-do-not-send="true">"miconda@gmail.com"</a>
<a
class="gmail-m_4540465005220864230moz-txt-link-rfc2396E"
href="mailto:miconda@gmail.com"
target="_blank" moz-do-not-send="true"><miconda@gmail.com></a>,
"Kamailio (SER) - Users Mailing List" <a
class="gmail-m_4540465005220864230moz-txt-link-rfc2396E"
href="mailto:sr-users@lists.kamailio.org"
target="_blank" moz-do-not-send="true"><sr-users@lists.kamailio.org></a><br>
<b>Date: </b>Monday, 15 April 2019 at 09:07<br>
<b>To: </b>Aymeric Moizard <a
class="gmail-m_4540465005220864230moz-txt-link-rfc2396E"
href="mailto:amoizard@gmail.com"
target="_blank" moz-do-not-send="true"><amoizard@gmail.com></a>,
"Kamailio (SER) - Users Mailing List" <a
class="gmail-m_4540465005220864230moz-txt-link-rfc2396E"
href="mailto:sr-users@lists.kamailio.org"
target="_blank" moz-do-not-send="true"><sr-users@lists.kamailio.org></a><br>
<b>Subject: </b>Re: [SR-Users] Kamailio stop
to process incoming SIP traffic via TCP.</span></p>
</div>
<div>
<p class="MsoNormal"> </p>
</div>
<p>Hello Aymeric,</p>
<p>would you be able to test with tls module
compiled against libssl 1.1 and using the
pre-loaded shared object workaround?</p>
<p> * <a
href="https://github.com/kamailio/kamailio/tree/master/src/modules/tls/utils/openssl_mutex_shared"
target="_blank" moz-do-not-send="true">
https://github.com/kamailio/kamailio/tree/master/src/modules/tls/utils/openssl_mutex_shared</a></p>
<p>You should be able to use it with any version; there is no
need to run the kamailio master branch itself.</p>
<p>Just clone the master branch, then:</p>
<p>cd src/modules/tls/utils/openssl_mutex_shared</p>
<p>make</p>
<p>Use openssl_mutex_shared.so from there or copy it
to a location of your choice, then pre-load it before
starting your version of Kamailio.</p>
<p>The README.md in the folder has some more
details.</p>
<p>I would like some validation that it
works fine before raising this topic with the
libssl project, asking them to allow initializing the locks with the
process-shared option.</p>
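<p>As a minimal sketch of what a process-shared lock option means at
the pthread level (an illustration only, not the code of the preloaded
library): the mutex is placed in memory that stays shared across fork()
and is initialized with PTHREAD_PROCESS_SHARED, so parent and child
really contend on the same lock. The names shared_state and
run_shared_counter are hypothetical and exist only for this example.</p>

```c
#define _DEFAULT_SOURCE
#include <pthread.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo state: the lock and the counter it protects both
 * live in a mapping that remains shared after fork(). */
struct shared_state {
    pthread_mutex_t lock;
    long counter;
};

/* Forks once; parent and child each increment the counter `iterations`
 * times under the process-shared mutex. Returns the final counter
 * value, or -1 on error. */
long run_shared_counter(long iterations)
{
    struct shared_state *st = mmap(NULL, sizeof(*st),
                                   PROT_READ | PROT_WRITE,
                                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (st == MAP_FAILED)
        return -1;

    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0) {
        munmap(st, sizeof(*st));
        return -1;
    }
    /* The key step: mark the mutex as usable from several processes. */
    if (pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED) != 0 ||
        pthread_mutex_init(&st->lock, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        munmap(st, sizeof(*st));
        return -1;
    }
    pthread_mutexattr_destroy(&attr);
    st->counter = 0;

    pid_t pid = fork();
    if (pid < 0) {
        pthread_mutex_destroy(&st->lock);
        munmap(st, sizeof(*st));
        return -1;
    }

    /* Both processes run this loop against the same mutex and counter. */
    for (long i = 0; i < iterations; i++) {
        pthread_mutex_lock(&st->lock);
        st->counter++;
        pthread_mutex_unlock(&st->lock);
    }

    if (pid == 0)
        _exit(0); /* child: done, skip the parent's cleanup */

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        munmap(st, sizeof(*st));
        return -1;
    }

    long result = st->counter;
    pthread_mutex_destroy(&st->lock);
    munmap(st, sizeof(*st));
    return result;
}
```

<p>Built with something like cc -pthread and called as
run_shared_counter(100000) from a small main(), this should return
200000. With a default (process-private) mutex, locking it from two
processes is undefined behaviour according to POSIX.</p>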
<p>Thanks,<br>
Daniel</p>
<div>
<p class="MsoNormal">On 26.03.19 16:18,
Daniel-Constantin Mierla wrote:</p>
</div>
<blockquote style="margin-top:5pt;margin-bottom:5pt">
<p>Hello,</p>
<p>yep, locking there is expected, as listing the
tls connections waits until no other process is
changing the content of the internal tls connection
structures. So it is a side effect of
libssl/libcrypto getting stuck and the other
processes waiting for it to move on. I have
the Kamailio training in the USA these days, so the
trip and the daily schedule didn't allow me to
look more at the libssl/libcrypto code in order
to find a solution here. It is a high priority
on my list, as soon as I get time during the next days.</p>
<p>Cheers,<br>
Daniel</p>
<div>
<p class="MsoNormal">On 26.03.19 15:55, Aymeric
Moizard wrote:</p>
</div>
<blockquote
style="margin-top:5pt;margin-bottom:5pt">
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<p class="MsoNormal">Hi All, </p>
<div>
<p class="MsoNormal"> </p>
</div>
<div>
<p class="MsoNormal">I was
debugging a TCP issue (I will most
probably start a separate thread
for that question).</p>
</div>
<div>
<p class="MsoNormal"> </p>
</div>
<div>
<p class="MsoNormal">I was
trying to get some info for
TCP and TLS.</p>
</div>
<div>
<p class="MsoNormal"> </p>
</div>
<div>
<p class="MsoNormal">I typed:</p>
</div>
<div>
<p class="MsoNormal">$> sudo
kamctl rpc tls.list</p>
</div>
<div>
<p class="MsoNormal"> </p>
</div>
<div>
<p class="MsoNormal">And waited
for a while... until I
realized that my User-Agent,
connected over TCP, was no longer
able to register. I
think the rpc command
triggered something wrong.</p>
</div>
<div>
<p class="MsoNormal"> </p>
</div>
<div>
<p class="MsoNormal">The device
can successfully "connect" and
send the REGISTER over the
established TCP connection.
The REGISTER does not appear in
the logs any more, and I don't see
any traffic for TCP any more.
So the behavior is the same as
I had before: TCP and TLS are
both not working and UDP is
still working fine.</p>
</div>
<div>
<p class="MsoNormal"> </p>
</div>
<div>
<p class="MsoNormal">kamctl does
not work any more... so kamctl
trap does not work either...</p>
</div>
<div>
<p class="MsoNormal"> </p>
</div>
<div>
<p class="MsoNormal">I was
able to run the following manually for
(all?) kamailio threads:</p>
</div>
<div>
<p class="MsoNormal"> </p>
</div>
<div>
<p class="MsoNormal">gdb
/usr/sbin/kamailio 16500
-batch --eval-command="bt
full" >>
kamailio-trap-tcp-down.txt</p>
</div>
<div>
<p class="MsoNormal"> </p>
</div>
<div>
<p class="MsoNormal">I'm
temporarily putting the
backtrace I have here:</p>
</div>
<div>
<p class="MsoNormal"><a
href="https://sip.antisip.com/kamailio-trap-tcp-down.txt"
target="_blank"
moz-do-not-send="true">https://sip.antisip.com/kamailio-trap-tcp-down.txt</a></p>
</div>
<div>
<p class="MsoNormal"> </p>
</div>
<div>
<p class="MsoNormal">You can see
a thread stuck on the json
command line: "<span
style="color:black">tls_list"</span></p>
</div>
<div>
<p class="MsoNormal"><span
style="color:black">And many
others waiting on
CRYPTO_THREAD_write_lock</span></p>
</div>
<div>
<p class="MsoNormal"><span
style="color:black">? might
be related to: <a
href="https://github.com/openssl/openssl/issues/5376"
target="_blank"
moz-do-not-send="true">
https://github.com/openssl/openssl/issues/5376</a></span></p>
</div>
<div>
<p class="MsoNormal">SIDE NOTE:</p>
</div>
<div>
<p class="MsoNormal">Right
before I typed the last
gdb command for the last
thread, kamailio</p>
</div>
<div>
<p class="MsoNormal">crashed.
This was around 5
minutes after the deadlock
started.</p>
</div>
<div>
<p class="MsoNormal"> </p>
</div>
<div>
<div>
<p class="MsoNormal">Mar 26
14:47:11 sip
kamailio[16493]: ERROR:
<core>
[core/tcp_main.c:2561]:
tcpconn_do_send(): failed to
send on 0x7ff8dfc2fdc8
(91.121.30.149:5061->62.210.97.21:49351):
Broken pipe (32)</p>
</div>
<div>
<p class="MsoNormal">Mar 26
14:47:11 sip
kamailio[16493]: ERROR:
<core>
[core/tcp_read.c:1505]:
tcp_read_req(): ERROR:
tcp_read_req: error reading
- c: 0x7ff8dfc2fdc8 r:
0x7ff8dfc2fe48 (-1)</p>
</div>
<div>
<p class="MsoNormal">Mar 26
14:47:11 sip
kamailio[16493]: WARNING:
<core>
[core/tcp_read.c:1848]:
handle_io(): F_TCPCONN
connection marked as bad:
0x7ff8dfa6a408 id 846 refcnt
3</p>
</div>
<div>
<p class="MsoNormal">Mar 26
14:47:11 sip
kamailio[16371]: ALERT:
<core> [main.c:755]:
handle_sigs(): child process
16374 exited by a signal 11</p>
</div>
<div>
<p class="MsoNormal">Mar 26
14:47:11 sip
kamailio[16371]: ALERT:
<core> [main.c:758]:
handle_sigs(): core was not
generated</p>
</div>
<div>
<p class="MsoNormal">Mar 26
14:47:11 sip
kamailio[16371]: INFO:
<core> [main.c:781]:
handle_sigs(): terminating
due to SIGCHLD</p>
</div>
<div>
<p class="MsoNormal">Mar 26
14:47:11 sip
kamailio[16493]: INFO:
<core> [main.c:836]:
sig_usr(): signal 15
received</p>
</div>
<div>
<p class="MsoNormal">Mar 26
14:47:11 sip
kamailio[16500]: INFO:
<core> [main.c:836]:
sig_usr(): signal 15
received</p>
</div>
<div>
<p class="MsoNormal">Mar 26
14:47:11 sip
kamailio[16479]: INFO:
<core> [main.c:836]:
sig_usr(): signal 15
received</p>
</div>
</div>
<div>
<p class="MsoNormal"> </p>
</div>
<div>
<p class="MsoNormal"> </p>
</div>
<div>
<p class="MsoNormal">Unfortunately,
even though I did my best to set up
my service to generate a core
on crash, I still get "core
was not generated"... (debian
stretch)</p>
</div>
<p class="MsoNormal"> </p>
<div>
<p class="MsoNormal">Thanks for
reading!</p>
</div>
<div>
<p class="MsoNormal">Regards</p>
</div>
<div>
<p class="MsoNormal">Aymeric</p>
</div>
<div>
<p class="MsoNormal"> </p>
</div>
<div>
<p class="MsoNormal"> </p>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<p class="MsoNormal"> </p>
</blockquote>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</div>
<div><br>
</div>
-- <br>
<div dir="ltr" class="gmail_signature"><img
src="http://sip.antisip.com/am48.png"
moz-do-not-send="true">Antisip - <a
href="http://www.antisip.com" target="_blank"
moz-do-not-send="true">http://www.antisip.com</a><br>
</div>
</div>
</div>
</blockquote>
<pre class="moz-signature" cols="72">--
Daniel-Constantin Mierla -- <a class="moz-txt-link-abbreviated" href="http://www.asipto.com">www.asipto.com</a>
<a class="moz-txt-link-abbreviated" href="http://www.twitter.com/miconda">www.twitter.com/miconda</a> -- <a class="moz-txt-link-abbreviated" href="http://www.linkedin.com/in/miconda">www.linkedin.com/in/miconda</a></pre>
</body>
</html>