Hello, we have kamailio 1.4 (r. 5728) running in production. We compiled kamailio with 4 times the default pkg memory pool size (#define PKG_MEM_POOL_SIZE 4*1024*1024 ).
We use snmpstats to monitor registrations and the number of calls with Nagios and PRTG. A few days ago, after almost 30 days of continuous operation without problems, we ran "snmpwalk -v2c -c public 192.168.88.22 .1.3.6.1.4.1.27483" from the shell, as we routinely do for a quick check of SNMP, and it failed. We then checked the kamailio logs and found this:
May 21 16:25:35 ipx022 /usr/local/sbin/kamailio[8781]: ERROR:snmpstats:handle_openserSIPServiceStartTime: failed to read sysUpTime file at /tmp/openSER_SNMPAgent.txt
I don't know what caused the above (maybe the file was mistakenly rm'd by someone; I failed to check at that moment), but it was followed by this:
May 21 16:25:35 ipx022 /usr/local/sbin/kamailio[8781]: ERROR:snmpstats:executeInterprocessBufferCmd: Received a request for contact: [protected]@[protected] for user: sip:[protected]@[protected] who doesn't exists
May 21 16:25:35 ipx022 /usr/local/sbin/kamailio[8781]: ERROR:snmpstats:executeInterprocessBufferCmd: Received a request to delete contact: [protected]@[protected] for user: sip:[protected]@[protected] who doesn't exist
... then the same "contact XXX who doesn't exist" message was logged several times per second until:
May 21 16:36:30 ipx022 /usr/local/sbin/kamailio[8781]: ERROR:snmpstats:createRegUserRow: failed to create a row for openserSIPRegUserTable
May 21 16:36:30 ipx022 /usr/local/sbin/kamailio[8781]: ERROR:snmpstats:updateUser: openserSIPRegUserTable ran out of memory. Not able to add user: [protected]@[protected]
May 21 16:36:30 ipx022 /usr/local/sbin/kamailio[8781]: ERROR:snmpstats:executeInterprocessBufferCmd: Received a request for contact: [protected]@[protected] for user: sip:[protected]@[protected] who doesn't exists
May 21 16:36:30 ipx022 /usr/local/sbin/kamailio[8781]: ERROR:snmpstats:insertContactRecord: no more pkg memory
May 21 16:36:30 ipx022 /usr/local/sbin/kamailio[8781]: ERROR:snmpstats:executeInterprocessBufferCmd: openserSIPRegUserTable was unable to allocate memory for adding contact: [protected]@[protected] to user sip:[protected]@[protected].
During this time, the snmpget queries issued by Nagios and PRTG were failing and the process spawned by the snmpstats module was using a lot of CPU. No other parts of kamailio were affected (calls and registrations were OK), but snmpstats did not recover on its own, so we decided to restart kamailio.
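To see how the snmpstats errors in the logs above developed over time (for example, how many "who doesn't exist" messages preceded the first "no more pkg memory"), the entries can be tallied per minute and per reporting function. This is only a rough sketch: the names `LINE_RE` and `tally` are made up for illustration, and the regular expression assumes the syslog layout of the lines quoted above.

```python
import re
from collections import Counter

# Assumed layout, taken from the quoted log lines:
#   "May 21 16:25:35 ipx022 /usr/local/sbin/kamailio[8781]: ERROR:snmpstats:<function>: <message>"
LINE_RE = re.compile(
    r"^(?P<minute>\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s+\S+\s+\S+\[\d+\]:\s+"
    r"ERROR:snmpstats:(?P<func>\w+):"
)


def tally(lines):
    """Count snmpstats ERROR entries per (minute, function).

    Lines that do not match the assumed layout are skipped. That includes
    syslog's "last message repeated N times" lines, so repeated messages
    are undercounted.
    """
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[(match.group("minute"), match.group("func"))] += 1
    return counts


# Example use (the log path is an assumption; adjust to the local syslog setup):
# with open("/var/log/messages") as log:
#     for (minute, func), n in sorted(tally(log).items()):
#         print(minute, func, n)
```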
I'm trying to recreate this on a lab machine, but no luck so far. I am keeping the lab server busy with registrations and calls while running snmpwalk in a loop. I've also run "rm /tmp/openSER_SNMPAgent.txt" to check whether that would trigger the problem, but it could not be recreated. If I push the lab server hard enough, I can see similar logs, like this:
May 27 20:16:54 ipx029 /usr/local/sbin/kamailio[14487]: ERROR:snmpstats:updateUser: openserSIPRegUserTable ran out of memory. Not able to add user: [protected]@[protected]
May 27 20:16:54 ipx029 /usr/local/sbin/kamailio[14487]: ERROR:snmpstats:executeInterprocessBufferCmd: Received a request for contact: [protected]@[protected] for user: sip:[protected]@[protected] who doesn't exists
May 27 20:17:05 ipx029 /usr/local/sbin/kamailio[14487]: ERROR:snmpstats:get_socket_list_from_proto: no more pkg memory
May 27 20:17:05 ipx029 last message repeated 9 times
May 27 20:17:59 ipx029 /usr/local/sbin/kamailio[14487]: ERROR:snmpstats:handle_openserSIPServiceStartTime: failed to read sysUpTime file at /tmp/openSER_SNMPAgent.txt
May 27 20:40:25 ipx029 /usr/local/sbin/kamailio[14487]: ERROR:snmpstats:handle_openserSIPServiceStartTime: failed to read sysUpTime file at /tmp/openSER_SNMPAgent.txt
But even with those messages, snmpget/snmpwalk keep returning responses.
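For anyone who wants to try the same lab test, the "snmpwalk in a loop" part could look roughly like the sketch below. It is not the exact script used here: `probe` and `poll` are hypothetical helper names, the host, community string and OID are simply copied from the snmpwalk command quoted earlier (replace the host with the lab server), and it assumes snmpwalk exits with a non-zero status when the agent does not answer.

```python
import subprocess
import time

# Taken from the snmpwalk command quoted earlier; the host is a placeholder for the lab server.
SNMPWALK_CMD = ["snmpwalk", "-v2c", "-c", "public", "192.168.88.22", ".1.3.6.1.4.1.27483"]


def probe(cmd, timeout=10.0):
    """Run cmd once and return (ok, detail).

    ok is False if the command exits non-zero, exceeds the timeout,
    or cannot be found on the PATH.
    """
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False, "timed out after %.1fs" % timeout
    except FileNotFoundError:
        return False, "command not found: %s" % cmd[0]
    if result.returncode != 0:
        return False, "exit code %d: %s" % (result.returncode, result.stderr.strip())
    return True, "%d lines" % len(result.stdout.splitlines())


def poll(cmd, interval=1.0, max_runs=None):
    """Probe repeatedly; return the 1-based number of the first failed run, or None."""
    run = 0
    while max_runs is None or run < max_runs:
        run += 1
        ok, detail = probe(cmd)
        if not ok:
            # The timestamp uses the same layout as syslog, to ease matching against kamailio logs.
            print("%s run %d failed: %s" % (time.strftime("%b %d %H:%M:%S"), run, detail))
            return run
        time.sleep(interval)
    return None


# To loop until the first failure:
# poll(SNMPWALK_CMD)
```

Stopping at the first failure keeps the timestamp of the failing run, which can then be compared with the snmpstats errors in the kamailio log.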
I was hoping someone could take a look at the snmpstats code. The log calls in snmpstats:executeInterprocessBufferCmd are marked with comments like "/* This should never happen. This is more of a sanity check. */", so my server probably hit a bug.
Regards,
takeshi