Hello Xenofon,

 

thanks for the additional information regarding the load_crl() function. Regarding the question about making it configurable: I think the idea was to add a new configuration parameter that disables the per-process SSL context duplication. This would make the change easier to test and provide a way to deactivate it if it causes problems. It could then perhaps be made the default later on, if it proves stable. There will certainly be some code duplication, but you can probably judge better how easy or difficult it would be to implement.
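To make the two modes concrete, here is a minimal, language-neutral model in Python. It is not Kamailio code: `FakeCtx`, `build_contexts` and the `SHARE_SSL_CTX` switch are hypothetical names used only for illustration, and the real parameter name and default are not decided in this thread.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FakeCtx:
    """Stand-in for an SSL_CTX; it carries no real TLS state."""
    domain: str = "default"


def build_contexts(nprocs: int, share: bool,
                   factory: Callable[[], object] = FakeCtx) -> List[object]:
    """Return one context reference per worker process.

    share=False: current behaviour, one freshly built context per process.
    share=True:  proposed behaviour, build once and hand the same object to all.
    """
    if nprocs < 1:
        raise ValueError("nprocs must be >= 1")
    if share:
        ctx = factory()
        return [ctx] * nprocs
    return [factory() for _ in range(nprocs)]


# Hypothetical switch standing in for the proposed configuration parameter.
SHARE_SSL_CTX = False

contexts = build_contexts(nprocs=4, share=SHARE_SSL_CTX)
# Number of distinct context objects: 4 when duplicating, 1 when sharing.
print(len({id(c) for c in contexts}))
```

Keeping both paths behind one flag is what allows switching back if the shared mode turns out to be unstable under load.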

 

Cheers,

 

Henning

 

--

Henning Westerholt – https://skalatan.de/blog/

Kamailio services – https://gilawa.com

 

From: Xenofon Karamanos via sr-dev <sr-dev@lists.kamailio.org>
Sent: Tuesday, May 14, 2024 15:54
To: Kamailio (SER) - Development Mailing List <sr-dev@lists.kamailio.org>; miconda@gmail.com
Cc: Xenofon Karamanos <xk@gilawa.com>
Subject: [sr-dev] Re: TLS.reload and memory usage of SSL_CTX structs

 

Hello Daniel,

 

Thanks for the feedback!

 

Maybe I should have explained the leak better. Indeed, I couldn't clearly identify the leak itself, but there is an indication that it is happening in the SSL_CTX_load_verify_locations() function.

 

When I comment out that part of the load_crl() function, memory is no longer exhausted, so it's probably something SSL related.

 

Multiple reports suggest this as well; see:

https://groups.google.com/g/envoy-dev/c/JnWnH6HcsDU

https://github.com/pyca/pyopenssl/issues/1120

https://mta.openssl.org/pipermail/openssl-users/2015-April/001255.html

https://www.mail-archive.com/openssl-users@openssl.org/msg66240.html

https://www.mail-archive.com/openssl-users@openssl.org/msg57199.html

https://groups.google.com/g/mailing.openssl.users/c/R7bzJx167V4/m/lAGAPcDVmSMJ

https://github.com/twisted/twisted/issues/12125

https://stackoverflow.com/questions/29845527/how-to-properly-uninitialize-openssl

 

But I can't say for sure, because the exact same function is called for the CA list (maybe the exhaustion there is much slower).

 

I don't know the SSL internals, but when I enable memory debugging I see a lot of mallocs and reallocs in that call, and no free() for the respective memory on each new tls.reload. The memory is freed only when Kamailio is terminated.

 

The idea of reusing the context instead of recreating it came from https://stackoverflow.com/questions/67868098/how-to-duplicate-a-ssl-ctx-object-in-a-tls-application

 

>>  (e.g., by libssl internally on handling traffic)

 

SSL_CTX should not be changed once you have initialized an SSL connection, according to https://github.com/openssl/openssl/discussions/24203 . Maybe the duplication was never needed, but then again I am not deeply familiar with how Kamailio uses these contexts.

 

Make what configurable? Whether to initialize it once and reuse it versus initializing it per process?

 

Thanks,

Xenofon


From: Daniel-Constantin Mierla <miconda@gmail.com>
Sent: Tuesday, May 14, 2024 15:05
To: Kamailio (SER) - Development Mailing List <sr-dev@lists.kamailio.org>
Cc: Xenofon Karamanos <xk@gilawa.com>
Subject: Re: [sr-dev] TLS.reload and memory usage of SSL_CTX structs

 

Hello,

 

thanks for digging into it!

 

As I understand it, this is first of all about reducing the amount of memory used, not about clearly identifying the leak itself.

 

The per-process duplication was done to avoid races if local changes have to be performed (e.g., by libssl internally on handling traffic), but maybe with libssl 3.x and the new threading approach it is no longer needed, if it ever was necessary.

 

What I would suggest is to make it configurable for now, so that if the behaviour becomes unstable under high load, one can switch between modes.

 

Cheers,
Daniel

 

On 14.05.24 10:36, Xenofon Karamanos via sr-dev wrote:

Hello all,

 

I am currently looking into issue https://github.com/kamailio/kamailio/issues/3823 regarding tls.reload and the constant increase in memory usage each time it is called.

 

I looked through the CRL handling and related functions for the cause, but found nothing too obvious. The only culprit I could find was the call to SSL_CTX_load_verify_locations() in load_crl(). But the exact same function is also called in load_ca_list() with no such behaviour.

 

The next thing I noticed is that on the first RPC call of tls.reload there is a "significant" memory allocation.

What I mean by this: when a fresh Kamailio (5.8) is started (with -m 7, for low shared memory availability to try to replicate the issue, and children=2), the free shared memory reported by `kamcmd core.shmmem` is around 2196096 (bytes, I am assuming) on my system.

 

On the first call of tls.reload, it drops to 539248. That's a heavy drop, so I started digging into why so much memory is allocated. When loading a TLS configuration, ksr_fix_domain allocates a new SSL_CTX per process. From what I am seeing, this SSL_CTX is basically the same for each process, with the same configuration and no difference in settings.
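The before/after measurement described above can be scripted roughly as follows. This is a sketch under assumptions: it expects `kamcmd core.shmmem` to print `name: number` lines (as in the `total:` and `free:` figures quoted in this mail), and the helper names `parse_shmmem`, `kamcmd` and `free_after_reloads` are made up for this example.

```python
import re
import subprocess
from typing import Callable, Dict, List


def parse_shmmem(text: str) -> Dict[str, int]:
    """Extract `name: number` pairs from core.shmmem-style output."""
    return {key: int(value) for key, value in re.findall(r"(\w+):\s*(\d+)", text)}


def kamcmd(*args: str) -> str:
    """Run kamcmd and return its stdout (needs a running Kamailio)."""
    result = subprocess.run(["kamcmd", *args], check=True,
                            capture_output=True, text=True)
    return result.stdout


def free_after_reloads(reloads: int,
                       run: Callable[..., str] = kamcmd) -> List[int]:
    """Return free shared memory before any reload and after each reload."""
    samples = [parse_shmmem(run("core.shmmem"))["free"]]
    for _ in range(reloads):
        run("tls.reload")
        samples.append(parse_shmmem(run("core.shmmem"))["free"])
    return samples


# Usage against a live instance (not executed here):
#   samples = free_after_reloads(100)
#   print(samples[0], samples[1], samples[-1])
```

Passing the command runner as a parameter keeps the sampling logic testable without a running server.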

 

So what I tried is to allocate it once (see the tls_reload branch in my repo) and share this context with all of the processes. The memory used is far lower, and the unbounded memory growth also seems to be fixed, or at least slowed so much that it is barely visible.

 

What are your thoughts on this? Is this something we can do? Is the assumption correct that the TLS config will be the same for every process, or can one process somehow differ from another?

 

Thanks,

Xenofon

 

Some numbers for anyone interested:

 

Before the patch, on the 5.8 branch with -m 7 (shared memory) and children=2 (with children=8 it doesn't even start; it fails to load ciphers in SSL_CTX):

Fresh Kamailio start, shared memory:   total: 7340032
                                       free:  2196096

After one tls.reload:                  free:   539248
After 100 tls.reload:                  free:   483776

                                             

 

After the patch, on master with -m 7 and children=2 (with children=8 it starts normally):

Fresh Kamailio start, shared memory:   total: 7340032
                                       free:  3772000

After one tls.reload:                  free:  3690192
After 100 tls.reload:                  free:  3681056
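For easier comparison, the figures above can be reduced to two numbers per run: the cost of the first reload and the average growth per later reload. The sketch below assumes that "After 100 tls.reload" means 100 reloads in total, so 99 reloads follow the first one; `reload_cost` is a helper name made up for this example. Note that the two runs are on different branches (5.8 and master), so the comparison is indicative only.

```python
from typing import Tuple


def reload_cost(fresh: int, after_1: int, after_100: int) -> Tuple[int, float]:
    """Return (bytes consumed by the first reload,
    average bytes consumed by each of the 99 following reloads)."""
    first = fresh - after_1
    per_reload = (after_1 - after_100) / 99
    return first, per_reload


before_patch = reload_cost(2196096, 539248, 483776)
after_patch = reload_cost(3772000, 3690192, 3681056)

# First reload: 1656848 bytes before the patch, 81808 bytes after.
# Later reloads: roughly 560 bytes each before, roughly 92 bytes each after.
for label, (first, per_reload) in (("before", before_patch),
                                   ("after", after_patch)):
    print(f"{label}: first reload {first} bytes, "
          f"then about {per_reload:.0f} bytes per reload")
```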

                                         

Branch comparison: kamailio:master...xkaraman:tls_reload (kamailio/kamailio on github.com)



_______________________________________________
Kamailio (SER) - Development Mailing List
To unsubscribe send an email to sr-dev-leave@lists.kamailio.org
-- 
Daniel-Constantin Mierla (@ asipto.com)
twitter.com/miconda -- linkedin.com/in/miconda
Kamailio Consultancy, Training and Development Services -- asipto.com