On Friday, 6 November 2009, Robin Vleij wrote:
> Since 1.3.0 (now running 1.4.4) I'm seeing a very slow growth in SHM
> usage on our low traffic setup (less than 5 cps per machine). I'm
> looking for a starting point to investigate the cause further. :)
>
> I compiled Kamailio 1.4.4-notls with #define SHM_MEM_SIZE 4*32 in
> config.h in production. For my testing setup I'm running on the standard
> 32 there.
>
> After about 3 weeks of uptime I start top, sort on memory size and find
> that the kamailio processes (I'm running with 16 children) all have about
> 40 MB in the SHR column. My understanding is that this should also go
> down, but it only goes up, slowly. More CPS (for example a benchmark
> using sipp) makes it go up faster, but the figure never seems to go
> down. I think this is wrong, but I could be wrong myself. :)
>
> On a separate machine with no traffic I compiled the memory debugging
> according to the "memory troubleshooting" page on the wiki. LOTS of info
> in the logs. Also ran with valgrind, didn't find anything interesting
> (but I'm no dev myself really).
>
> My plan now is to take away our acc module (compiled with radius
> support) and see if it's maybe that module that's causing this. My test
> on this traffic-less machine is as follows: start, run 20 cps for a while
> (we do no registers, just routing and auth) and note the SHR data from
> top. Then, according to my understanding, this figure should drop
> after a period of 20 minutes with no traffic. Is this a correct assumption?
> [..]
> As far as I know, the SHR entries never go down. When running with
> very little SHM in config.h, the process runs out of shm memory and
> complains, as expected.
>
> Are my assumptions about all of this correct?
Hello Robin,
do you experience any problems in your setup when you use a reasonable SHM size? In my experience the amount of shared memory (as displayed by top) depends on the load of the machine. But there is a certain level of shared memory that stays in use regardless of the load. Even if the machine has been completely idle for a long time, this memory is not released. On one test system, for example, there is a process that shows 11 MB of shared memory at the moment, even though it is completely idle.
The VIRT column (again in top) is another story: it simply shows something like the SHM + PKG memory size, regardless of the actual load.
If you have a real memory leak in shared memory, then after a certain time the server will report memory allocation errors. Otherwise I don't think it's something to worry about.
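To tell a plateau from steady growth, it can help to log the shared figure at intervals instead of reading top by hand. The following is only a minimal sketch, assuming Linux, where the third field of /proc/<pid>/statm is the shared resident size in pages (the value top reports as SHR). The function names, the interval and the sample count are illustrative assumptions and not part of Kamailio.

```python
import os
import sys
import time


def shared_kib(statm_line, page_size=4096):
    """Return the shared resident size in KiB from one /proc/<pid>/statm line."""
    fields = statm_line.split()
    # statm values are page counts, in this order:
    # size resident shared text lib data dt
    shared_pages = int(fields[2])
    return shared_pages * page_size // 1024


def read_shared_kib(pid):
    """Read the current shared resident size of a running process (Linux only)."""
    with open(f"/proc/{pid}/statm") as f:
        return shared_kib(f.read(), os.sysconf("SC_PAGE_SIZE"))


def watch(pid, interval_s=60, samples=10):
    """Print a timestamped shared-memory reading every interval_s seconds."""
    for _ in range(samples):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(stamp, read_shared_kib(pid), "KiB shared")
        time.sleep(interval_s)


if __name__ == "__main__" and len(sys.argv) == 2:
    # Hypothetical usage: python shr_watch.py <pid of one kamailio child>
    watch(int(sys.argv[1]))
```

A figure that rises under load and then stays flat while idle matches the behaviour described above. A figure that keeps rising until the server logs allocation errors would point to a real leak.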
Regards,
Henning