Hello Daniel,

Thanks for the feedback!

Maybe I should have explained the leak better. Indeed, I couldn't clearly identify the leak itself, but there is an indication that it is happening in the SSL_CTX_load_verify_locations() function.

When I comment out that part of the load_crl() function, memory is no longer exhausted, so it's probably something SSL related.

Several other reports suggest the same; see:
https://groups.google.com/g/envoy-dev/c/JnWnH6HcsDU
https://github.com/pyca/pyopenssl/issues/1120
https://mta.openssl.org/pipermail/openssl-users/2015-April/001255.html
https://www.mail-archive.com/openssl-users@openssl.org/msg66240.html
https://www.mail-archive.com/openssl-users@openssl.org/msg57199.html
https://groups.google.com/g/mailing.openssl.users/c/R7bzJx167V4/m/lAGAPcDVmSMJ
https://github.com/twisted/twisted/issues/12125
https://stackoverflow.com/questions/29845527/how-to-properly-uninitialize-openssl

But I can't say for sure, because the exact same function is called for the CA list (maybe the exhaustion there is just much slower).

I don't know the internals of SSL, but when I enable memory debugging I see a lot of mallocs and reallocs in that call, and no matching free() for that memory on each tls.reload. The memory is freed only when Kamailio is terminated.

The idea of reusing instead of recreating came from this: https://stackoverflow.com/questions/67868098/how-to-duplicate-a-ssl-ctx-object-in-a-tls-application

>>  (e.g., by libssl internally on handling traffic)

According to https://github.com/openssl/openssl/discussions/24203 an SSL_CTX should not be changed once an SSL connection has been created from it. Maybe the per-process duplication was never needed, but then again I am not deeply familiar with how Kamailio uses these contexts.

What exactly should be configurable? Whether to initialize the SSL_CTX once and reuse it, versus initializing one per process?

Thanks,
Xenofon

From: Daniel-Constantin Mierla <miconda@gmail.com>
Sent: Tuesday, May 14, 2024 15:05
To: Kamailio (SER) - Development Mailing List <sr-dev@lists.kamailio.org>
Cc: Xenofon Karamanos <xk@gilawa.com>
Subject: Re: [sr-dev] TLS.reload and memory usage of SSL_CTX structs
 

Hello,

thanks for digging into it!

As I understand it, this is first of all about reducing the amount of memory used, not about clearly identifying the leak itself.

The duplication done per process was to avoid races if local changes have to be performed (e.g., by libssl internally on handling traffic), but maybe with libssl 3.x and the new threading approach it is no longer needed, if it ever was necessary.

What I would suggest is to make it configurable for now, so that if behaviour becomes unstable under high load, one can switch between modes.

Cheers,
Daniel


On 14.05.24 10:36, Xenofon Karamanos via sr-dev wrote:
Hello all,

I am currently looking into issue https://github.com/kamailio/kamailio/issues/3823 regarding the tls.reload and the constant increase in memory usage when it's called.

I tried to find what is causing it in the CRL handling and related functions, but nothing was too obvious. The only culprit I could find was the call to SSL_CTX_load_verify_locations() in load_crl(). But the exact same function is also called in load_ca_list() with no such behaviour.

The next thing I noticed is that the first RPC call of tls.reload causes a "significant" memory allocation.
What I mean is this: when a fresh Kamailio (5.8) is started (with -m 7, to keep shared memory low and make the issue easier to replicate, and children=2), the free shared memory reported by `kamcmd core.shmmem` is around 2196096 (bytes, I assume) on my system.

On the first call of tls.reload, it drops to 539248. That's a heavy drop, so I started digging into why so much memory is allocated. When loading a TLS configuration, ksr_fix_domain allocates a new SSL_CTX per process. From what I can see, this SSL_CTX is basically the same for each process, with the same configuration and no difference in settings.

So what I tried is to allocate it once (see the tls_reload branch in my repo) and share this context with all of the processes. The memory used is much lower, and the unbounded memory growth seems to be fixed, or at least slowed to the point that it is barely visible.
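
The difference between the two approaches can be modelled with a toy sketch. This is Python, not Kamailio code: `FakeCtx`, `build_contexts`, `distinct`, the `shared` flag and the settings keys are all hypothetical names for illustration, and `FakeCtx` only stands in for an SSL_CTX.

```python
class FakeCtx:
    """Stand-in for an SSL_CTX built from one TLS domain's settings."""

    def __init__(self, settings: dict):
        self.settings = dict(settings)


def build_contexts(settings: dict, nprocs: int, shared: bool) -> list:
    """Return one context slot per process.

    shared=False: build a separate context for every process.
    shared=True:  build a single context and point every slot at it.
    """
    if shared:
        ctx = FakeCtx(settings)
        return [ctx] * nprocs
    return [FakeCtx(settings) for _ in range(nprocs)]


def distinct(ctxs: list) -> int:
    """Number of distinct context objects actually allocated."""
    return len({id(c) for c in ctxs})


settings = {"verify_certificate": True, "crl": "crl.pem"}  # illustrative values
per_process = build_contexts(settings, nprocs=4, shared=False)
shared = build_contexts(settings, nprocs=4, shared=True)
print(distinct(per_process), distinct(shared))  # prints "4 1"
```

The shared variant only makes sense if every process really uses identical settings and nothing modifies the context afterwards.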

What are your thoughts on this? Is this something we can do? Is it safe to assume that the TLS config will be the same for every process, or can one process's config somehow differ from another's?

Thanks,
Xenofon

Some numbers for anyone interested:

Before the patch, on the 5.8 branch, with -m 7 (shared memory) and children=2 (with children=8 it doesn't even start; it fails to load ciphers in SSL_CTX):
Fresh kamailio start shared memory:   total: 7340032
                                      free:  2196096

After one tls.reload:                 free:   539248
After 100 tls.reload:                 free:   483776
                                             

After the patch, on master (kamailio -m 7 and children=2; with children=8 it starts normally):
Fresh kamailio start shared memory:   total: 7340032
                                      free:  3772000

After one tls.reload:                 free:  3690192
After 100 tls.reload:                 free:  3681056
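
For a rough sense of scale, the figures above can be turned into per-reload costs. The sketch below is only arithmetic on the numbers reported here. It assumes the values are bytes and that "after 100 tls.reload" means 99 further reloads after the first one. The two runs come from different branches (5.8 vs master), so the comparison is indicative only. `ShmRun` and its method names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShmRun:
    """Free shared memory (assumed bytes) reported by `kamcmd core.shmmem`."""

    label: str
    free_fresh: int      # right after startup
    free_after_1: int    # after the first tls.reload
    free_after_100: int  # after 100 tls.reload calls in total

    def first_reload_cost(self) -> int:
        return self.free_fresh - self.free_after_1

    def avg_cost_per_later_reload(self) -> float:
        # Reloads 2..100, i.e. 99 reloads after the first one (assumption).
        return (self.free_after_1 - self.free_after_100) / 99


before = ShmRun("5.8, per-process SSL_CTX", 2196096, 539248, 483776)
after = ShmRun("master + patch, shared SSL_CTX", 3772000, 3690192, 3681056)

for run in (before, after):
    print(f"{run.label}: first reload {run.first_reload_cost()} B, "
          f"then ~{run.avg_cost_per_later_reload():.0f} B per reload")
```

With these inputs the first reload costs 1656848 before the patch versus 81808 after it, and each later reload averages roughly 560 versus 92.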
                                         

_______________________________________________ Kamailio (SER) - Development Mailing List To unsubscribe send an email to sr-dev-leave@lists.kamailio.org
-- Daniel-Constantin Mierla (@ asipto.com) twitter.com/miconda -- linkedin.com/in/miconda Kamailio Consultancy, Training and Development Services -- asipto.com