[sr-dev] SHM memory usage Kamailio 1.4.4

Mon Nov 9 14:48:25 CET 2009

Henning Westerholt wrote:

Hi Henning!

> do you experience any problems in your setup when you use a reasonable
> SHM mem size? In my experience the size of the SHM memory (as displayed

I've had problems finding a "reasonable" SHM size. :) The default is
something like 32MB, which runs out quickly when customers do "funny
stuff" (read: loops). Now I'm compiling with #define SHM_MEM_SIZE 4*32;
128MB should last quite a while. So there's no immediate memory problem
and no crashes (when SHM is full, it gets errors and no longer processes
traffic correctly). But right now, for example, after a "funny"
customer, I'm seeing over 40MB per child in top (16 children). That
won't go down again, so we'll have to see how long it holds.
What do you suggest for SHM sizes?

> machine has been completely passive over a longer time, it will not
> reclaim this memory. On a certain test system for example there is one
> process that has 11MB SHM at the moment, even if it's completely idle.

OK. We often run for a very long time at 10-20MB per process (all
processes have about the same, at least the children that process UDP),
but on a day like today, when someone has a problem and it turns into
SIP spaghetti, it jumps up to 40MB and then continues to rise slowly
from there. It doesn't feel good to know we can hit some kind of ceiling
under the same traffic load.
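
To check whether the per-process size really keeps creeping up after a
spike, the numbers can be sampled outside of top. Below is a minimal
sketch, assuming a Linux host where /proc/<pid>/statm is available; the
function names parse_statm and sample are hypothetical, not part of
Kamailio.

```python
# Sketch: read virtual, resident and shared sizes for one process from
# /proc/<pid>/statm (Linux only). The first three statm fields are
# total program size, resident set size and resident shared pages,
# all counted in pages.
import os


def parse_statm(statm_text, page_size_bytes):
    """Convert the first three statm fields from pages to KiB."""
    size, resident, shared = (int(f) for f in statm_text.split()[:3])
    kb_per_page = page_size_bytes // 1024
    return {
        "virt_kb": size * kb_per_page,
        "res_kb": resident * kb_per_page,
        "shr_kb": shared * kb_per_page,
    }


def sample(pid):
    """Read and convert the current statm values for the given PID."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    with open(f"/proc/{pid}/statm") as fh:
        return parse_statm(fh.read(), page_size)
```

Calling sample() for each kamailio PID from cron every few minutes and
appending the result to a file would show whether the resident size
levels off or keeps rising under the same load.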

> For the VIRT column (again top) it's another story, here it will just
> show something like SHM + PKG memory size, regardless of the actual load.

VIRT shows 421MB for me right now. I figured that's what you describe:
the PKG memory of each process plus the SHM.

> If you've a real memory leak in shared memory then after a certain time
> interval the server will report memory allocation errors. Otherwise I
> don't think it's something to worry about.

It does, if I don't raise the limit. Say I'm running with 32MB: if I hit
that after some weeks of uptime, it starts reporting memory allocation
errors in different parts of my config and stops doing important stuff.
I also reproduced this by assigning a small amount of SHM on a dev
machine and then sending 20 cps to it.

On a test machine I have 4 processes, each using 600KB or so; after
20 calls they go up to something like:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
31409 root      15   0 94672 1936 1052 R  0.0  0.7   0:00.00 kamailio
31408 root      15   0 94784 3072 2068 S  0.0  1.2   0:00.00 kamailio
31407 root      15   0 94784 3072 2068 S  0.0  1.2   0:00.00 kamailio
31406 root      25   0 94672 5428 4556 S  0.0  2.1   0:00.02 kamailio

They shrink only a little after 15-20 minutes or so (often a bit
faster if load is low).
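
The top rows above can be split into their columns to see how much of
each process's resident size is counted as shared. This is a rough
sketch, assuming top's default column order (PID USER PR NI VIRT RES
SHR S %CPU %MEM TIME+ COMMAND) and plain KiB values; parse_top_rows is
a hypothetical helper, and RES - SHR is only an approximation of
non-shared memory, since SHR also counts shared libraries.

```python
# Sketch: parse top(1) rows and estimate non-shared resident memory
# per process as RES - SHR. Assumes the default top column layout and
# numeric KiB values (no "m"/"g" suffixes).

TOP_ROWS = """\
31409 root      15   0 94672 1936 1052 R  0.0  0.7   0:00.00 kamailio
31408 root      15   0 94784 3072 2068 S  0.0  1.2   0:00.00 kamailio
31407 root      15   0 94784 3072 2068 S  0.0  1.2   0:00.00 kamailio
31406 root      25   0 94672 5428 4556 S  0.0  2.1   0:00.02 kamailio
"""


def parse_top_rows(text):
    """Return one dict per process row with VIRT, RES and SHR in KiB."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the header row, which does not start
        # with a numeric PID.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        virt, res, shr = (int(f) for f in fields[4:7])
        rows.append({
            "pid": int(fields[0]),
            "virt_kb": virt,
            "res_kb": res,
            "shr_kb": shr,
            "private_kb": res - shr,
        })
    return rows


if __name__ == "__main__":
    for row in parse_top_rows(TOP_ROWS):
        print(row["pid"], row["res_kb"], row["shr_kb"], row["private_kb"])
```

For these four rows the difference comes out between roughly 870 and
1000 KB per process, so most of the resident size here is counted
under SHR.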

If this is a leak, it'll be almost impossible to find. I can't run
production with memlog or debug on, and in dev it seems quite hard to
reproduce. Not sure what to expect. :)

-- 
Robin Vleij
robin at swip.net