<div dir="ltr"><div dir="ltr"><div>Tks Daniel,</div><div><br></div><div>I have installed the workaround.</div><div><br></div>lsof seems to indicate that I have installed and pre-loaded openssl_mutex_shared.so correctly.</div><div dir="ltr"><br></div><div>I will let you know if I see the issue again.</div><div><br></div><div>Tks!</div><div>Aymeric<br></div><div dir="ltr"><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, 20 May 2019 at 09:49, Daniel-Constantin Mierla <<a href="mailto:miconda@gmail.com">miconda@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p>Hello,</p>
<p>this kind of behaviour, blocking for a long time and then moving
on, is a symptom of the same issue. One of the observed behaviours
was that attaching with gdb and then detaching makes the code run
further; that is what kamctl trap does. I haven't looked deeper,
but my guess is that some signals are sent during the gdb
operations.</p>
<p>It would be good if you can test with the workaround and see the
results. There was already a report that the issue was not seen
after a rather long running time.</p>
<p>Cheers,<br>
Daniel<br>
</p>
<div class="gmail-m_3589024026897677032moz-cite-prefix">On 17.05.19 16:03, Aymeric Moizard
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">Hi!
<div><br>
</div>
<div>I haven't used the workaround yet: I'm focusing on trying
to make sure I have the same issue</div>
<div>or trying to figure out how to force it to happen.</div>
<div><br>
</div>
<div>I started checking the server again today, beginning with
this command:</div>
<div> $> sudo kamcmd tls.list</div>
<div><br>
</div>
<div>In my previous description, the above command was a deadlock.
Today, it finally completed, but</div>
<div>only after 5 minutes. (I suspect 5 minutes is abnormal)</div>
<div><br>
</div>
<div>During the long running command:</div>
<div>-> UDP was working</div>
<div>-> TCP was not: </div>
<div>-> The TCP connection was ESTABLISHED, but the
SIP message was never answered.</div>
<div> (this was the behavior I had before)</div>
<div><br>
</div>
<div>At the same time (during the deadlock), I took a trap with
"sudo kamctl trap":</div>
<div>-> one thread is on "tls_list" (tls_rpc.c:154)</div>
<div>-> one thread is on tcpconn_get (core/tcp_main.c:1449)
called from tcp_send (core/tcp_main.c:1716)</div>
<div> and seems to be sending a 484 Address Incomplete on a
TLS connection</div>
<div>-> 2 threads are on CRYPTO_THREAD_write_lock, in a
backtrace showing "SSL_do_handshake/tls_accept"</div>
<div><br>
</div>
<div>Suddenly, "sudo kamcmd tls.list" completed, and then, my
TCP Agent received</div>
<div>4 answers from kamailio for the last 4 REGISTER sent.</div>
<div><br>
</div>
<div>I have a network capture for my TCP agent.</div>
<div>I have a trap showing 2 threads waiting on
"CRYPTO_THREAD_write_lock"</div>
<div><br>
</div>
<div>Conclusion:</div>
<div>The use-case showed that the lock was VERY long.</div>
<div>The use-case showed that the lock was TEMPORARY...</div>
<div><br>
</div>
<div>Side-note: From my understanding of the
multi-fork/openssl issue, I would expect<br>
</div>
<div>to see the deadlock happen very soon after a kamailio
restart?</div>
<div><br>
</div>
<div>Do you expect the preload workaround to help with this
behavior?<br>
</div>
<div>Or do you consider that my issue is different?</div>
<div><br>
</div>
<div>Because there is no "real" dead-lock, I don't understand
why "my" issue would be related to libssl1.1...</div>
<div><br>
</div>
<div>My gdb trap and network capture are available privately
if you need them! (please ask me by direct email)</div>
<div><br>
</div>
<div>Tks</div>
</div>
Aymeric
</div></blockquote></div></blockquote></div><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><img src="http://sip.antisip.com/am48.png">Antisip - <a href="http://www.antisip.com" target="_blank">http://www.antisip.com</a><br></div></div></div>