On Friday, 6 November 2009, Robin Vleij wrote:
Since 1.3.0 (now running 1.4.4) I'm seeing a very slow increase in SHM
memory usage on our low-traffic setup (less than 5 cps per machine). I'm
looking for a starting point for tracking down the cause. :)
For production I compiled Kamailio 1.4.4-notls with #define SHM_MEM_SIZE 4*32
in config.h. My test setup runs with the standard value of 32.
After about 3 weeks of uptime I start top, sort by memory size and find that
the kamailio processes (I'm running with 16 children) all have about 40 MB
in the SHR column. My understanding is that this figure should also go down
again, but it only goes up, slowly. More CPS (for example a benchmark using
sipp) makes it go up faster, but it never seems to go down.
I think this is wrong, but I could be wrong myself. :)
On a separate machine with no traffic I compiled with memory debugging enabled,
according to the "memory troubleshooting" page on the wiki. LOTS of info
in the logs. I also ran it under valgrind but didn't find anything interesting
(but I'm no dev myself really).
My plan now is to take away our acc module (compiled with radius
support) and see if it's maybe that module that's causing this. My test
on this traffic-less machine is as follows: start, run 20cps for a while
(we do no registers, just routing and auth) and note the SHR data from
top. Then, according to my understanding, this figure should drop
after a period of 20 minutes with no traffic. Is this assumption right?
[..]
As far as I know, the SHR figures never go down. When running with
very little SHM in config.h, the process runs out of shm memory and
complains, as expected.
Are my assumptions about all of this correct?
Hello Robin,
do you experience any problems in your setup when you use a reasonable SHM mem
size? In my experience the size of the SHM memory (as displayed from top)
depends on the load of the machine. But there is a certain level of shared
memory that is used regardless of the load. Even if the machine has been
completely passive for a longer time, it will not reclaim this memory. On a
certain test system, for example, there is one process that has 11 MB SHM at the
moment, even though it's completely idle.
For the VIRT column (again in top) it's another story: it will just show
something like SHM + PKG memory size, regardless of the actual load.
If you have a real memory leak in shared memory, then after a certain time
the server will report memory allocation errors. Otherwise I don't
think it's something to worry about.
Regards,
Henning
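
To tell a plateau from a real leak, it can help to log the shared resident size of one Kamailio process over the whole test run instead of reading it off top by hand. The following is only a minimal sketch, not something from the thread: it is Linux-only, it assumes that the third field of /proc/<pid>/statm is what top reports as SHR, and the names parse_statm and read_shr_mb are made up for this example.

```python
import os
import sys
import time


def parse_statm(text, page_size):
    """Return (virt, res, shr) in bytes from the contents of /proc/<pid>/statm.

    The first three statm fields are total program size, resident set size
    and resident shared pages, all counted in pages.
    """
    fields = text.split()
    size_pages, resident_pages, shared_pages = (int(f) for f in fields[:3])
    return (size_pages * page_size,
            resident_pages * page_size,
            shared_pages * page_size)


def read_shr_mb(pid):
    """Read the shared resident size of a running process, in MB."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    with open(f"/proc/{pid}/statm") as fh:
        _, _, shr = parse_statm(fh.read(), page_size)
    return shr / (1024 * 1024)


def main(argv):
    # argv[1]: PID of one kamailio child; argv[2]: sampling interval in seconds.
    pid = int(argv[1])
    interval = float(argv[2]) if len(argv) > 2 else 60.0
    while True:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(f"{stamp} pid={pid} SHR={read_shr_mb(pid):.1f} MB", flush=True)
        time.sleep(interval)


if __name__ == "__main__":
    # Only start sampling when a numeric PID was given on the command line.
    if len(sys.argv) > 1 and sys.argv[1].isdigit():
        main(sys.argv)
```

Saved under a name of your choice and started with a PID and an interval in seconds, this prints one line per sample. A curve that levels off under constant load would fit the behaviour Henning describes; one that keeps climbing until allocation errors show up in the log would point to a leak.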