[SR-Users] Memory Leak on DB Errors?

Daniel-Constantin Mierla miconda at gmail.com
Mon Oct 31 17:28:23 CET 2011


Hello,

On 10/31/11 7:56 AM, Klaus Darilion wrote:
> Hi Daniel!
>
> The "out of memory" happened again. This time they were able to dump 
> the memory statistics before restarting the server.
>
> There are almost no allocations from other modules, but a lot of
> allocations from usrloc and snmpstats:
>
> # grep 'd from usrloc' syslog_core_dump|wc -l
> 138083
> # grep 'd from snmpstats' syslog_core_dump|wc -l
> 2837533
>
Is this command catching freed chunks as well?

Can you send the names of the files and the line numbers that allocate
memory chunks and repeat a lot?
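
To get that list, something like the following Python sketch could tally
the allocation sites. It assumes each record sits on a single line of the
dump in the form "<verb> from <module>: <file>: <function>(<line>)", as in
the qm_status excerpt quoted further down, and that freed chunks are
reported with the verb "freed". The regular expression and the name
count_alloc_sites are my own, not part of Kamailio. Note that a grep for
'd from usrloc' would also match a line containing "freed from usrloc".

```python
import re
from collections import Counter

# Assumed record shape: "<verb> from <module>: <file>: <function>(<line>)".
# The verb "freed" for released chunks is an assumption about the dump format.
SITE_RE = re.compile(r"(alloc'd|freed) from (\S+): (\S+): (\w+)\((\d+)\)")


def count_alloc_sites(lines):
    """Count chunks per allocation site, skipping records marked as freed."""
    sites = Counter()
    for text in lines:
        match = SITE_RE.search(text)
        if match is None or match.group(1) != "alloc'd":
            continue  # not a site record, or a chunk that is already free
        module, source_file, func, line_no = match.group(2, 3, 4, 5)
        sites[(module, source_file, func, int(line_no))] += 1
    return sites
```

Passing the open dump file to count_alloc_sites() and printing
most_common(20) on the result would list the twenty most frequent sites.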

> Thus, snmpstats seems guilty. What about usrloc? Around 2000 clients
> are registered to this Kamailio. I think 138,000 allocations for just
> 2000 clients is too much. Are those usrloc allocations related to the
> snmpstats problem you mentioned?
>
> AFAIS, your patch was done before the 3.2 branch, thus updating to 3.2
> should fix the issue (as the default is turned off), correct?

Yes, it is in 3.2.0 and I hope I caught it all; at least that is what
looked like the problem.

Cheers,
Daniel

>
> Thanks
> Klaus
>
> On 17.10.2011 09:33, Daniel-Constantin Mierla wrote:
>> Hi Klaus,
>>
>> Over the weekend I looked a bit at the snmpstats module. These
>> allocated chunks are for exporting location records. Are you pulling
>> them over snmp? At first sight, there should be a free of the memory
>> when the records are consumed.
>>
>> The fact is that they are not pulled from usrloc module at the time of
>> the request over snmp, but cached in snmp when registration happens.
>> Practically, it is a partial clone of usrloc commands, which is not the
>> best solution IMO, but I am not the developer. For the moment, I added a
>> parameter to control whether the location records should be cached by
>> snmpstats module or not (if not, they cannot be exported), to fix this
>> issue. If you actually pull the location records over snmp, let me know.
>>
>> I could not test it, but maybe you can give it a try (perhaps you
>> have a testbed for 3.2 with snmpstats) and see whether the memory
>> stays steady with export_registrar set to 0 (which is the default):
>>
>> http://kamailio.org/docs/modules/devel/modules_k/snmpstats.html#id2539456 
>>
>>
>> Cheers,
>> Daniel
>>
>> On 10/6/11 6:03 PM, Daniel-Constantin Mierla wrote:
>>> Hello,
>>>
>>> it seems the leak is in snmpstats; I see a lot of allocations like:
>>>
>>> ALERT: qm_status: 37599. N address=0xf30cdf74 frag=0xf30cdf5c size=20 used=1
>>> ALERT: qm_status: alloc'd from snmpstats: interprocess_buffer.c: handleContactCallbacks(143)
>>> ALERT: qm_status: start check=f0f0f0f0, end check= c0c0c0c0, abcdefed
>>> ALERT: qm_status: 37600. N address=0xf30cdfb8 frag=0xf30cdfa0 size=16 used=1
>>> ALERT: qm_status: alloc'd from snmpstats: utilities.c: convertStrToCharString(62)
>>> ALERT: qm_status: start check=f0f0f0f0, end check= c0c0c0c0, abcdefed
>>>
>>> There are some from usrloc, but very likely they are ok, because they
>>> are persistent in shm for a long time, unless snmpstats asks for some
>>> clones of the structures from usrloc and forgets to free them (I see
>>> one allocation is from handleContactCallbacks).
>>>
>>> No time to look in the sources, but this is a lead to follow if you
>>> want to investigate further.
>>>
>>> In general, for a memleak you have to look at allocated chunks that
>>> come from the same place in the code and of which there are many.
>>> Then decide whether it is something that should be there for a long
>>> time (like usrloc records) or something that should be freed more
>>> quickly, compared with the number of allocations.
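
The comparison described above can be sketched in Python as follows. It
assumes you already have, for two dumps taken some time apart, a mapping
from allocation site to number of chunks. The function name
growing_sites and the min_growth threshold are hypothetical and would
need tuning to the traffic on the server.

```python
from collections import Counter


def growing_sites(earlier, later, min_growth=1000):
    """Return (site, growth) pairs that grew by at least min_growth chunks.

    `earlier` and `later` map an allocation site to its chunk count in two
    dumps. min_growth is an arbitrary, illustrative threshold.
    """
    growth = {site: later[site] - earlier.get(site, 0) for site in later}
    suspects = [(site, n) for site, n in growth.items() if n >= min_growth]
    # Largest growth first, so the most likely leak is at the top.
    return sorted(suspects, key=lambda pair: pair[1], reverse=True)
```

Long-lived records such as usrloc contacts should stay roughly flat while
the number of registered clients is steady, whereas a leaking site keeps
climbing from one dump to the next.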
>>>
>>> The pkg log looks very clean: allocations only from startup time
>>> (maybe it is the main process).
>>>
>>> Cheers,
>>> Daniel
>>>
>>> On 10/6/11 5:31 PM, Klaus Darilion wrote:
>>>> Indeed, DBG_QM_MALLOC is defined. So I have set memlog=1 and dumped
>>>> mem_info with:
>>>> sercmd cfg.set_now_int core mem_dump_pkg 13286
>>>> sercmd cfg.set_now_int core mem_dump_shm 13286
>>>>
>>>> The dumps were done after ~1h uptime. I can not offload the traffic
>>>> and wait until transactions are freed, thus the logs are quite huge
>>>> (~15MByte)
>>>>
>>>> http://pernau.at/kd/memlog.zip
>>>>
>>>> I have no idea what I should look for - any hints on how to analyze
>>>> the mem_dump?
>>>>
>>>> Thanks
>>>> Klaus
>>>>
>>>>
>>>> On 06.10.2011 13:07, Daniel-Constantin Mierla wrote:
>>>>> Hello,
>>>>>
>>>>> On 10/5/11 11:18 AM, Klaus Darilion wrote:
>>>>>>
>>>>>>
>>>>>> On 04.10.2011 14:03, Daniel-Constantin Mierla wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> On 10/4/11 12:27 PM, Klaus Darilion wrote:
>>>>>>>> Meanwhile the server was restarted and the DB problems were
>>>>>>>> fixed. As it is a production server I can not reproduce it
>>>>>>>> anymore.
>>>>>>>
>>>>>>> So, once it started it didn't recover, and always continued with
>>>>>>> that error? How much shm did you configure?
>>>>>>>
>>>>>>> You can try to attach from time to time to one process (it can even
>>>>>>> be the main one, to avoid blocking a SIP worker) and walk through
>>>>>>> the shm allocated chunks, in order to see if there are unexpected
>>>>>>> repetitions of allocations from the same place in the sources.
>>>>>>>
>>>>>>> I posted the gdb script for walking through pkg at some point; the
>>>>>>> difference is to start from the head of the shm list (i.e.,
>>>>>>> starting with shm_block->first_frag instead of
>>>>>>> mem_block->first_frag):
>>>>>>>
>>>>>>> http://www.kamailio.org/dokuwiki/doku.php/troubleshooting:memory#walking_through_pkg_with_gdb 
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> Hi Daniel!
>>>>>>
>>>>>> After reading this wiki page I came to the conclusion that for
>>>>>> further debugging I have to recompile Kamailio (using the
>>>>>> DBG_QM_MALLOC memory manager instead of F_MALLOC). With the default
>>>>>> memory manager it is not possible to debug the problem. Is that
>>>>>> correct?
>>>>> In 3.1 malloc debug was left on (with the goal of catching buffer
>>>>> overflows quickly, after several years of development without using
>>>>> this flag in production), so unless you switched it off, you should
>>>>> get the reports. You can check in the output of kamailio -V.
>>>>>
>>>>> Cheers,
>>>>> Daniel
>>>>>
>>>
>>
>

-- 
Daniel-Constantin Mierla -- http://www.asipto.com
Kamailio Advanced Training, Dec 5-8, Berlin: http://asipto.com/u/kat
http://linkedin.com/in/miconda -- http://twitter.com/miconda



