On 16/12/14 12:12, Øyvind Kolbu wrote:

On 16.12.2014 11:51, Daniel-Constantin Mierla wrote:

The TLS module initializes libssl when it is loaded. db_postgres creates its connections at mod init, so that happens later and inside the same process (no fork at that moment).

Based on the new details, I understand that some connections succeed, some do not, and sometimes all connections succeed (when there are a lot of log messages that add delay). What operating system do you have? Is SELinux enabled? You can try to play with the fork_delay and modinit_delay core parameters:

- http://www.kamailio.org/wiki/cookbooks/4.2.x/core#fork_delay

I tried both fork_delay and modinit_delay; nothing changed. The OS is RHEL 6.6 with SELinux enabled, which
is mandatory for us to use.

I don't think the fork limit is an issue, because I tried children=40 without TLS and that was no problem. children=2
with TLS failed as usual. children=40 with TLS created a core dump...

Those parameters are not about the number of children, but about how fast some operations are done. I have had trouble with SELinux, because it has some obscure limits on how fast new connections can be opened. TLS uses quite a lot of memory, so I wonder whether SELinux has a limit on how fast one process can keep allocating... Have you tried with children=1?
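
For reference, a minimal sketch of how these core parameters might look in kamailio.cfg. The values are purely illustrative, not recommendations, and the units (microseconds, as far as the 4.2.x core cookbook describes them) are an assumption to check against your version:

```
# kamailio.cfg, global parameters section (illustrative values only)

# one worker per UDP socket, as suggested above
children=1

# wait before forking each process (assumed unit: microseconds)
fork_delay=5000

# wait after initializing each module (assumed unit: microseconds)
modinit_delay=100000
```

Larger delays slow down startup, so it probably makes sense to start small and increase only if the failures change.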


/usr/sbin/kamailio[23252]: : <core> [mem/q_malloc.c:468]: qm_free(): BUG: qm_free: freeing already freed pointer (0x7f9cbcfff030), called from tls: tls_init.c: ser_free(291), first free tls: tls_init.c: ser_free(291) - aborting
(gdb) bt
#0  0x0000003feea32625 in raise () from /lib64/libc.so.6
#1  0x0000003feea33e05 in abort () from /lib64/libc.so.6
#2  0x0000000000548e74 in qm_free (qm=0x7f9cbbe30000, p=0x7f9cbcfff030, file=0x7f9cc027ea50 "tls: tls_init.c", func=0x7f9cc0280153 "ser_free", line=291) at mem/q_malloc.c:470
#3  0x00007f9cc025d33d in ser_free (ptr=0x7f9cbcfff030) at tls_init.c:291
#4  0x000000349a66ad1d in CRYPTO_free () from /usr/lib64/libcrypto.so.10
#5  0x000000349a6e88b1 in ERR_clear_error () from /usr/lib64/libcrypto.so.10
#6  0x000000349a739099 in CONF_modules_load () from /usr/lib64/libcrypto.so.10
#7  0x000000349a739147 in CONF_modules_load_file () from /usr/lib64/libcrypto.so.10
#8  0x000000349a73922e in OPENSSL_config () from /usr/lib64/libcrypto.so.10
#9  0x0000003e4a81a457 in ?? () from /usr/lib64/libpq.so.5
#10 0x0000003e4a80c5db in PQconnectPoll () from /usr/lib64/libpq.so.5
#11 0x0000003e4a80c8ee in ?? () from /usr/lib64/libpq.so.5
#12 0x0000003e4a80d574 in PQsetdbLogin () from /usr/lib64/libpq.so.5
#13 0x00007f9cc06c1603 in db_postgres_new_connection (id=0x7f9cc1cbba28) at km_pg_con.c:77
#14 0x00007f9cc08e1eaa in db_do_init2 (url=0x7f9cc1cb3178, new_connection=0x7f9cc06c0bb7 <db_postgres_new_connection>, pooling=DB_POOLING_PERMITTED) at db.c:320
#15 0x00007f9cc08e16d5 in db_do_init (url=0x7f9cc1cb3178, new_connection=0x7f9cc06c0bb7 <db_postgres_new_connection>) at db.c:273
#16 0x00007f9cc06bab2c in db_postgres_init (_url=0x7f9cc1cb3178) at km_dbase.c:133
#17 0x00007f9cc0afaee8 in sql_connect () at sql_api.c:162
#18 0x00007f9cc0b035ae in child_init (rank=40) at sqlops.c:145
#19 0x00000000004f854a in init_mod_child (m=0x7f9cc1cafa78, rank=40) at sr_module.c:924
#20 0x00000000004f83ed in init_mod_child (m=0x7f9cc1cb00a0, rank=40) at sr_module.c:921
#21 0x00000000004f83ed in init_mod_child (m=0x7f9cc1cb0780, rank=40) at sr_module.c:921
#22 0x00000000004f86d0 in init_child (rank=40) at sr_module.c:948
#23 0x0000000000491c5f in fork_process (child_id=40, desc=0x7fff94773ab0 "udp receiver child=39 sock=127.0.0.1:5060", make_sock=1) at pt.c:343
#24 0x000000000046d470 in main_loop () at main.c:1609
#25 0x00000000004706a7 in main (argc=21, argv=0x7fff94773d18) at main.c:2547

Need anything else from the core?

It looks like a double free done inside libssl, which we cannot control. You can set mem_safety=1.
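
A minimal sketch of that setting in kamailio.cfg. As far as I understand the core cookbook, with mem_safety=1 the memory manager still logs a double free or an invalid free but does not call abort(), so the process keeps running. That avoids the crash, but it does not fix the underlying double free:

```
# kamailio.cfg, global parameters section
# log double frees / invalid frees instead of aborting
mem_safety=1
```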

Cheers,
Daniel
-- 
Daniel-Constantin Mierla
http://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda