Hello,
We're running three instances of Kamailio v5.14 as registrars, handling registrations from ~2000 SIP clients. One instance is the primary and the other two are backups.
All three use the dmq and dmq_usrloc modules to synchronize user locations. However, after a couple of days of operation the two failover instances show what looks like a memory leak, with the memory usage attributed to the core growing until it takes all available resources.
When this happens, we've noticed that:
- The shared memory used by the function "sip_msg_shm_clone" spikes (from 1 KB to 1.5 GB).
- The shared memory used by the function "dmq:worker.c:job_queue_push" also increases, but not as much (from 1 KB to 1 MB).
- DMQ requests are not answered (with a 200 OK) by the affected instance while the leak is happening, which makes us think that the DMQ module becomes unresponsive.
A few more notes:
- The failover instances are doing nothing except receiving replicated contacts.
- The shared memory grows at the same rate on both instances, but the critical behavior never happens on both at the same time.
- We allocate 1 GB of memory to each instance at startup.
- We store the location data in a PostgreSQL database and load it at startup.
- We didn't find any errors in syslog, even at debug level.
Has anyone experienced a similar issue, or can anyone suggest a possible solution?
Thanks,
Rogelio Perez
Telnyx