[SR-Users] possible TCP deadlock (tls again?) // pike module not releasing IPs

Daniel-Constantin Mierla miconda at gmail.com
Mon Jan 6 20:52:20 CET 2020


You can download the code from the git repository and build
openssl_mutex_shared.so locally.

Or install from the nightly builds, which should include the version with
the fix embedded. After installation, run kamailio -I and check whether it
lists TLS_PTHREAD_MUTEX_SHARED.
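
A minimal sketch of that check, assuming kamailio is on the PATH. The function name has_mutex_fix is made up for illustration, and the LD_PRELOAD hint it prints refers to the openssl_mutex_shared.so workaround quoted further down in this thread:

```shell
# Succeeds (exit 0) if the text on stdin mentions the compile flag
# that marks the libssl mutex fix.
has_mutex_fix() {
    grep -q 'TLS_PTHREAD_MUTEX_SHARED'
}

# Only query the real binary when it is actually installed.
if command -v kamailio >/dev/null 2>&1; then
    if kamailio -I 2>&1 | has_mutex_fix; then
        echo "libssl mutex fix is compiled in"
    else
        echo "flag missing: consider LD_PRELOAD of openssl_mutex_shared.so"
    fi
fi
```

The flag test is kept in its own function so it can also be fed saved output, for example has_mutex_fix < kamailio-I.txt.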

Cheers,
Daniel

On 06.01.20 16:48, Andrew Chen wrote:
> I have seen similar issues with pike and am running 5.2.5 installed from
> the deb repo.  I don't see this openssl_mutex_shared directory:
>
> root@ashmainkama51:/usr/lib/x86_64-linux-gnu/kamailio # ls -lart
> total 400
> -rw-r--r--  1 root root  22528 Oct 10 12:29 libtrie.so.1.0
> lrwxrwxrwx  1 root root     14 Oct 10 12:29 libtrie.so.1 -> libtrie.so.1.0
> lrwxrwxrwx  1 root root     14 Oct 10 12:29 libtrie.so -> libtrie.so.1.0
> -rw-r--r--  1 root root  63864 Oct 10 12:29 libsrutils.so.1.0
> lrwxrwxrwx  1 root root     17 Oct 10 12:29 libsrutils.so.1 -> libsrutils.so.1.0
> lrwxrwxrwx  1 root root     17 Oct 10 12:29 libsrutils.so -> libsrutils.so.1.0
> -rw-r--r--  1 root root  51472 Oct 10 12:29 libsrdb2.so.1.0
> lrwxrwxrwx  1 root root     15 Oct 10 12:29 libsrdb2.so.1 -> libsrdb2.so.1.0
> lrwxrwxrwx  1 root root     15 Oct 10 12:29 libsrdb2.so -> libsrdb2.so.1.0
> -rw-r--r--  1 root root 211264 Oct 10 12:29 libsrdb1.so.1.0
> lrwxrwxrwx  1 root root     15 Oct 10 12:29 libsrdb1.so.1 -> libsrdb1.so.1.0
> lrwxrwxrwx  1 root root     15 Oct 10 12:29 libsrdb1.so -> libsrdb1.so.1.0
> drwxr-xr-x  4 root root   4096 Dec  3 21:32 .
> drwxr-xr-x  3 root root   4096 Dec  3 21:33 kamctl
> drwxr-xr-x  2 root root   4096 Dec  3 21:33 modules
> drwxr-xr-x 47 root root  36864 Dec 10 20:13 ..
> root@ashmainkama51:/usr/lib/x86_64-linux-gnu/kamailio #
>
> and here are my command outputs:
>
> root@ashmainkama51:~ # ldd /usr/lib/x86_64-linux-gnu/kamailio/modules/tls.so
>         linux-vdso.so.1 (0x00007ffd28de0000)
>         libssl.so.1.1 => /usr/lib/x86_64-linux-gnu/libssl.so.1.1 (0x00007f925de6d000)
>         libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f925da7c000)
>         libcrypto.so.1.1 => /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 (0x00007f925d5b1000)
>         libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f925d392000)
>         /lib64/ld-linux-x86-64.so.2 (0x00007f925e39d000)
>         libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f925d18e000)
> root@ashmainkama51:~ # kamailio -I
> Print out of kamailio internals
>   Version: kamailio 5.2.5 (x86_64/linux)
>   Default config: /etc/kamailio/kamailio.cfg
>   Default paths to modules: /usr/lib/x86_64-linux-gnu/kamailio/modules
>   Compile flags: STATS: Off, USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS,
> USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM,
> SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY,
> USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER,
> USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
>   MAX_RECV_BUFFER_SIZE=262144
>   MAX_URI_SIZE=1024
>   BUF_SIZE=65535
>   DEFAULT PKG_SIZE=8MB
>   DEFAULT SHM_SIZE=64MB
>   ADAPTIVE_WAIT_LOOPS=1024
>   TCP poll methods: poll, epoll_lt, epoll_et, sigio_rt, select
>   Source code revision ID: unknown
>   Compiled with: gcc 7.4.0
>   Compiled on:
> Thank you for flying kamailio!
>
>
> On Mon, Dec 16, 2019 at 12:51 PM Aymeric Moizard <amoizard at gmail.com
> <mailto:amoizard at gmail.com>> wrote:
>
>     Hi Daniel,
>
>     The file openssl_mutex_shared.so is there, with the same installation
>     time as the other files!
>
>     I have added
>
>     Environment='LD_PRELOAD=/usr/lib/x86_64-linux-gnu/kamailio/openssl_mutex_shared/openssl_mutex_shared.so'
>
>     to my kamailio.service and restarted!
>     I'm sure it will behave better now ;)
>
>     Thanks a lot for the help.
>     Regards,
>     Aymeric
>
>
>     On Mon, Dec 16, 2019 at 17:23, Daniel-Constantin Mierla
>     <miconda at gmail.com <mailto:miconda at gmail.com>> wrote:
>
>         I pinged Victor to see if he can figure out what in the deb
>         build process leaves the libssl mutex fix disabled.
>
>         The extra .so preload object should still be installed; check
>         whether it is at:
>
>         /usr/lib/x86_64-linux-gnu/kamailio/openssl_mutex_shared/openssl_mutex_shared.so
>
>         Cheers,
>         Daniel
>
>         On 16.12.19 12:09, Aymeric Moizard wrote:
>>         Good catch!
>>
>>         As I said in my first mail, I also had the issue with the latest
>>         5.2.X, so I suppose the deb package has the same issue for 5.2.X.
>>
>>         Is the extra binary to load still there? I will check that as
>>         soon as I'm online...
>>
>>         Thanks a lot!
>>         Aymeric
>>
>>         On Mon, Dec 16, 2019 at 11:16, Daniel-Constantin Mierla
>>         <miconda at gmail.com <mailto:miconda at gmail.com>> wrote:
>>
>>             Hello,
>>
>>             for some reason the binary doesn't seem to have the
>>             libssl mutex fix. On my system with libssl 1.1, it gives:
>>
>>             # kamailio -I
>>             Print out of kamailio internals
>>               Version: kamailio 5.3.1 (x86_64/linux) f36ac2
>>               Default config: /tmp/kamailio-5.3/etc/kamailio/kamailio.cfg
>>               Default paths to modules:
>>             /tmp/kamailio-5.3/lib64/kamailio/modules
>>               Compile flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS,
>>             USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK,
>>             SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC,
>>             DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT,
>>             USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR,
>>             USE_DST_BLACKLIST, HAVE_RESOLV_RES, TLS_PTHREAD_MUTEX_SHARED
>>               MAX_RECV_BUFFER_SIZE=262144
>>               MAX_URI_SIZE=1024
>>               BUF_SIZE=65535
>>               DEFAULT PKG_SIZE=8MB
>>               DEFAULT SHM_SIZE=64MB
>>               ADAPTIVE_WAIT_LOOPS=1024
>>               TCP poll methods: poll, epoll_lt, epoll_et, sigio_rt,
>>             select
>>               Source code revision ID: f36ac2
>>               Compiled with: gcc 9.2.1
>>               Compiled architecture: x86_64
>>               Compiled on: 11:11:20 Dec 16 2019
>>             Thank you for flying kamailio!
>>
>>             The important part above is the presence of the
>>             TLS_PTHREAD_MUTEX_SHARED compile-time flag in the output.
>>
>>             It needs to be investigated why the deb packages have the
>>             kamailio binary without the libssl mutex fix enabled.
>>
>>             Cheers,
>>             Daniel
>>
>>             On 16.12.19 09:22, Aymeric Moizard wrote:
>>>             Hi Daniel,
>>>
>>>             Thanks a lot for looking at it.
>>>
>>>             $ ldd /usr/lib/x86_64-linux-gnu/kamailio/modules/tls.so
>>>                     linux-vdso.so.1 (0x00007fff997dd000)
>>>                     libssl.so.1.1 => /usr/lib/x86_64-linux-gnu/libssl.so.1.1 (0x00007fe40b53c000)
>>>                     libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe40b19d000)
>>>                     libcrypto.so.1.1 => /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 (0x00007fe40ad03000)
>>>                     libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fe40aaff000)
>>>                     libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fe40a8e2000)
>>>                     /lib64/ld-linux-x86-64.so.2 (0x00007fe40ba4a000)
>>>
>>>             $ /usr/sbin/kamailio -I
>>>             Print out of kamailio internals
>>>               Version: kamailio 5.3.1 (x86_64/linux)
>>>               Default config: /etc/kamailio/kamailio.cfg
>>>               Default paths to modules:
>>>             /usr/lib/x86_64-linux-gnu/kamailio/modules
>>>               Compile flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS,
>>>             USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK,
>>>             SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC,
>>>             DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT,
>>>             USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR,
>>>             USE_DST_BLACKLIST, HAVE_RESOLV_RES
>>>               MAX_RECV_BUFFER_SIZE=262144
>>>               MAX_URI_SIZE=1024
>>>               BUF_SIZE=65535
>>>               DEFAULT PKG_SIZE=8MB
>>>               DEFAULT SHM_SIZE=64MB
>>>               ADAPTIVE_WAIT_LOOPS=1024
>>>               TCP poll methods: poll, epoll_lt, epoll_et, sigio_rt,
>>>             select
>>>               Source code revision ID: unknown
>>>               Compiled with: gcc 6.3.0
>>>               Compiled architecture: x86_64
>>>               Compiled on:
>>>             Thank you for flying kamailio!
>>>
>>>             Additional note:
>>>             I have tried to better understand the pike module, and
>>>             after reading the end of the module documentation,
>>>             I understand the "Tree of IP" and the settings better.
>>>
>>>             The pike documentation, for each setting and
>>>             description, should refer to the section "Chapter 3.
>>>             Developer Guide"; otherwise, the parameters cannot be
>>>             understood. Also, in my opinion, it's not possible to
>>>             tell how long it really takes to remove an IP from the
>>>             tree (removing it 100% or only the last node of the IP).
>>>
>>>             Looking again at my statistics, I feel the first graph
>>>             definitely shows an issue. This graph shows
>>>             "$stat(location-users)" and "$stat(location-contacts)".
>>>             During the 10 hours, many users were banned, unregistered,
>>>             etc., so the number of registered users is really not
>>>             expected to stay constant. From what I understand, the
>>>             fact that the stats went down when the deadlock
>>>             disappeared obviously means the kamailio threads were in
>>>             a bad state for the last 10 hours...
>>>
>>>             https://www.antisip.com/sip-antisip-com-register/status2.htm  
>>>
>>>             If you need more information, let me know...
>>>             Regards
>>>             Aymeric
>>>
>>>             On Mon, Dec 16, 2019 at 08:22, Daniel-Constantin Mierla
>>>             <miconda at gmail.com <mailto:miconda at gmail.com>> wrote:
>>>
>>>                 Hello,
>>>
>>>                 can you provide output of ldd for tls.so and output
>>>                 of "kamailio -I" (that's an uppercase i)?
>>>
>>>                 Cheers,
>>>                 Daniel
>>>
>>>                 On 13.12.19 16:39, Aymeric Moizard wrote:
>>>>                 Hi List,
>>>>
>>>>                 History:
>>>>                 * In the past, I had a deadlock which was, most
>>>>                 probably, related to ssl1.1.
>>>>                   We discussed this issue, and a fix is supposed
>>>>                 to work around the issue that was detected.
>>>>                 * With the latest 5.2.X, I experienced ONCE a
>>>>                 similar behavior, with TCP and TLS being mostly
>>>>                 stuck. I have not been using this version much, but
>>>>                 the fix was supposed to be in the core of kamailio.
>>>>
>>>>                 The status of the server last night:
>>>>                 * Today I'm running version: kamailio 5.3.1
>>>>                 (x86_64/linux)
>>>>                 * Installed on stretch using the
>>>>                 http://deb.kamailio.org/kamailio53 repository.
>>>>                 * This version uses libssl1.1
>>>>                 * A user reported that he can't connect with TCP
>>>>                 * An average of 5000 IPs per 10 minutes are being
>>>>                 banned by the pike module
>>>>                    (the same IP could be counted twice)
>>>>                 Yesterday/Today:
>>>>                 * at the end of the outage, I had 2479 IPs in my
>>>>                 ipban htable (which is consistent with my statistics
>>>>                 showing 2 bans/IP every 10 minutes = 5000)
>>>>                 * looking at my logs, it appears that most (ALL?)
>>>>                 IPs being banned... are my regular users.
>>>>                 * looking at my logs, I can't understand why pike
>>>>                 would block them.
>>>>
>>>>                 This is a graph for statistics on my service for
>>>>                 the last 24 hours:
>>>>                 https://www.antisip.com/sip-antisip-com-register/status2.html  
>>>>
>>>>                 Yesterday, at 22:18:39, kamailio started to BAN
>>>>                 some IPs. 52 IPs were banned in a period of 10
>>>>                 minutes. I can confirm this from my logs.
>>>>
>>>>                 My pike configuration is this one:
>>>>
>>>>                 modparam("pike", "sampling_time_unit", 2)
>>>>                 modparam("pike", "reqs_density_per_unit", 64)
>>>>                 modparam("pike", "remove_latency", 4)
>>>>
>>>>                 When detecting the issue, this morning, I typed:
>>>>
>>>>                 $> sudo kamctl stats
>>>>                 $> sudo kamcmd htable.dump ipban
>>>>                 //FAILURE (answer too large...)
>>>>                 $> sudo kamctl trap
>>>>
>>>>                 Then, I started an agent with TCP and it worked...???
>>>>                 Then, a few seconds, maybe a minute later:
>>>>
>>>>                 $> sudo kamcmd htable.dump ipban
>>>>                 //SUCCESS and shows 2479 banned ip.
>>>>
>>>>                 and... everything is back to normal in a few minutes.
>>>>
>>>>                 I haven't restarted kamailio, and all statistics
>>>>                 are as expected, as usual.
>>>>
>>>>                 Thus, it looks like "sudo kamctl trap" has
>>>>                 triggered something. I already experienced
>>>>                 similar behavior when testing my
>>>>                 ssl1.1 deadlock last year.
>>>>
>>>>                 2 questions:
>>>>                 1/ I believe my "pike" configuration should not ban
>>>>                 users. Is my pike configuration wrong?
>>>>                 As an example, pike has banned an IP sending one
>>>>                 message/second. I believe my configuration should
>>>>                 accept that?
>>>>
>>>>                 2/ Could there still be a TLS issue with libssl1.1?
>>>>
>>>>                 This is the result of the "kamctl trap":
>>>>
>>>>                 https://sip.antisip.com/kamailio-pike-or-tls-issue-13-12-2019.kamctl-trap
>>>>
>>>>                 Sorry for the long story & hoping to find a long
>>>>                 term solution or at least a workaround!
>>>>
>>>>                 Regards
>>>>                 Aymeric
>>>>
>>>>                 -- 
>>>>                 Antisip - http://www.antisip.com
>>>>
>>>>                 _______________________________________________
>>>>                 Kamailio (SER) - Users Mailing List
>>>>                 sr-users at lists.kamailio.org <mailto:sr-users at lists.kamailio.org>
>>>>                 https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
>>>
>>>                 -- 
>>>                 Daniel-Constantin Mierla -- www.asipto.com <http://www.asipto.com>
>>>                 www.twitter.com/miconda <http://www.twitter.com/miconda> -- www.linkedin.com/in/miconda <http://www.linkedin.com/in/miconda>
>>>                 Kamailio World Conference - April 27-29, 2020, in Berlin -- www.kamailioworld.com <http://www.kamailioworld.com>
>>>
>>>
>>>
>>
>>
>
>
>
>
>
>
> -- 
> Andy Chen
> Sr. Telephony Lead Engineer
> 415 516 5535 (M)
> achen@ <mailto:achen at thinkingphones.com>fuze.com <http://fuze.com>
>
>
>

-- 
Daniel-Constantin Mierla -- www.asipto.com
www.twitter.com/miconda -- www.linkedin.com/in/miconda
Kamailio World Conference - April 27-29, 2020, in Berlin -- www.kamailioworld.com
