Hello,
We're running three instances of Kamailio v5.14 as registrars, handling registrations from ~2000 SIP clients. One instance is the primary and the other two are backups.
All three use the dmq and dmq_usrloc modules to synchronize user locations. However, after a couple of days of operation, the two failover instances show what looks like a memory leak: the memory usage attributed to the core grows until it takes all available resources.
When this happens we've noticed that:
- The shared memory used by the function "sip_msg_shm_clone" spikes (from 1 KB to 1.5 GB).
- The shared memory used by the function "dmq:worker.c:job_queue_push" also increases, but not as much (from 1 KB to 1 MB).
- DMQ requests are not answered (with a 200 OK) by the affected instance during this memory leak, which makes us think that the DMQ module becomes unresponsive.
A few more notes:
- The failover instances are doing nothing except receiving replicated contacts.
- The shared memory grows at the same rate on both instances, but the critical spike never hits both of them at the same time.
- We allocate 1 GB of memory to each instance at startup.
- We store the location table in a PostgreSQL database and load it at startup.
- We didn't find any errors in syslog, even at debug level.
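
For anyone who wants to track the shared-memory growth described above over time, a minimal Python sketch follows. It is not the tooling used to collect the figures in this message. It assumes that "kamcmd core.shmmem" prints one "key: value" pair per line, as in the sample string; the sample values, the function names and the polling idea are made up for illustration.

```python
import re
import subprocess


def parse_shmmem(text):
    """Parse 'key: value' lines (integer values only) into a dict."""
    stats = {}
    pattern = r"^\s*(\w+):\s*(\d+)\s*$"
    for match in re.finditer(pattern, text, flags=re.MULTILINE):
        stats[match.group(1)] = int(match.group(2))
    return stats


def used_fraction(stats):
    """Fraction of the shared-memory pool currently in use."""
    return stats["real_used"] / stats["total"]


def sample_kamcmd():
    """Run 'kamcmd core.shmmem' and parse its output.

    Assumption: kamcmd is on PATH and can reach the local instance.
    """
    result = subprocess.run(
        ["kamcmd", "core.shmmem"],
        capture_output=True, text=True, check=True,
    )
    return parse_shmmem(result.stdout)


# Hypothetical sample output for a 1 GB pool that is half full.
SAMPLE = """{
    total: 1073741824
    free: 536870912
    used: 536000000
    real_used: 536870912
    max_used: 540000000
    fragments: 12
}"""

if __name__ == "__main__":
    stats = parse_shmmem(SAMPLE)
    print(f"shared memory in use: {used_fraction(stats):.0%}")
```

Calling sample_kamcmd() from cron (or in a loop) on each failover instance and logging used_fraction() with a timestamp would show whether the growth is gradual or a sudden spike.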
Has anyone experienced a similar issue, or can anyone suggest a possible solution?
Thanks,
Rogelio Perez
Telnyx