Hi,
we have some Kamailio instances running (currently the latest 5.4 release), and we need to restart them from time to time. We have a Grafana graph showing the pkg memory usage of one random TCP listener process, and it increases slowly over time. The config is pure Python KEMI.
A mem dump directly after restarting Kamailio says this:
SipSeppBook22:tmp sdamm$ grep alloc pkgmem_before.log | awk '{ print substr( $0, 16, length($0) ) }' | sort | uniq -c | sort -k1n | tail -10
  16 sipproxy qm_status(): alloc'd from core: core/re.c: subst_parser(301)
  31 sipproxy qm_status(): alloc'd from core: core/sr_module.c: load_module(436)
  31 sipproxy qm_status(): alloc'd from core: core/sr_module.c: register_module(236)
  31 sipproxy qm_status(): alloc'd from core: core/sr_module.c: register_module(253)
  40 sipproxy qm_status(): alloc'd from core: core/pvapi.c: pv_init_buffer(2139)
  58 sipproxy qm_status(): alloc'd from core: core/cfg.lex: pp_define(1827)
 133 sipproxy qm_status(): alloc'd from core: core/rpc_lookup.c: rpc_hash_add(101)
 162 sipproxy qm_status(): alloc'd from core: core/counters.c: cnt_hash_add(339)
 211 sipproxy qm_status(): alloc'd from core: core/cfg.lex: addstr(1448)
 265 sipproxy qm_status(): alloc'd from core: core/pvapi.c: pv_table_add(236)
And after running for some weeks, the same dump looks like this:
SipSeppBook22:tmp sdamm$ grep alloc prod_pkgmem.log | awk '{ print substr( $0, 16, length($0) ) }' | sort | uniq -c | sort -k1n | tail -10
  31 ifens5 qm_status(): alloc'd from core: core/sr_module.c: register_module(253)
  40 ifens5 qm_status(): alloc'd from core: core/pvapi.c: pv_init_buffer(2139)
  59 ifens5 qm_status(): alloc'd from core: core/cfg.lex: pp_define(1827)
 133 ifens5 qm_status(): alloc'd from core: core/rpc_lookup.c: rpc_hash_add(101)
 161 ifens5 qm_status(): alloc'd from core: core/counters.c: cnt_hash_add(339)
 203 ifens5 qm_status(): alloc'd from core: core/cfg.lex: addstr(1448)
 265 ifens5 qm_status(): alloc'd from core: core/pvapi.c: pv_table_add(236)
 686 ifens5 qm_status(): alloc'd from core: core/pvapi.c: pv_parse_format(1173)
 694 ifens5 qm_status(): alloc'd from htable: ht_var.c: pv_parse_ht_name(158)
 707 ifens5 qm_status(): alloc'd from core: core/pvapi.c: pv_cache_add(349)
I know that there are currently a few lines in the code which look like this:
self.instance_name = KSR.pv.get("$sht(pbxdata=>ip.%s)" % (ip,))
This has been an issue in the past and I have replaced the code with something like this:
KSR.pv.sets("$var(tmpInstanceIp)", ip)
self.instance_name = KSR.pv.get("$sht(pbxdata=>ip.$var(tmpInstanceIp))")
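For clarity, this is the difference between the old and the new pattern (the comments reflect my understanding of what ends up in the PV cache; the IP in the comment is just an example):

# Old pattern: every distinct ip produces a new PV name string such as
# "$sht(pbxdata=>ip.10.0.0.1)"; each unique name is parsed once and then
# kept in the PV cache in pkg memory, so the cache grows with the number
# of distinct IPs ever seen by this worker.
self.instance_name = KSR.pv.get("$sht(pbxdata=>ip.%s)" % (ip,))

# New pattern: the PV name is one constant string, so only a single
# cached spec is created; the dynamic part travels in $var(...).
KSR.pv.sets("$var(tmpInstanceIp)", ip)
self.instance_name = KSR.pv.get("$sht(pbxdata=>ip.$var(tmpInstanceIp))")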
However, even after changing this, the memory still grows slowly but steadily.
The usage scenario is TLS-only on one side (clients) and TCP-only on the other side (PBXes).
Does anybody have a hint for me on how to debug this? It looks like there is a lot of PV-related stuff in memory, but I don't really know where it comes from.
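(For anyone who wants to pull the same numbers: something like the following should work, though the exact names are from memory, so treat them as an assumption; they need kamcmd access plus the debug-capable qm memory manager and memlog set low enough.

kamcmd pkg.stats
kamcmd cfg.set_now_int core mem_dump_pkg <pid>

pkg.stats prints used/free pkg memory per process, which is enough to see the growth trend, and setting mem_dump_pkg to a worker PID makes that process write a qm_status() dump like the ones above into the log.)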
Thanks for any hints, Sebastian
Hello,
the logs still suggest that you define a lot of htable-related variables:
694 ifens5 qm_status(): alloc'd from htable: ht_var.c: pv_parse_ht_name(158)
The number 694 suggests you have defined something like $sht(x=>1) ... $sht(x=>694) variables, and that number probably grows based on what you do in the KEMI script.
Then, you said:
"""
I know, currently there are a few lines in the code which look like this:
self.instance_name = KSR.pv.get("$sht(pbxdata=>ip.%s)" % (ip,))
"""
If you still have a few of them, then that is still the issue.
The latest version exports dedicated functions to KEMI for managing items in htables, see:
- https://www.kamailio.org/docs/tutorials/devel/kamailio-kemi-framework/module...
There are functions to get/set values without using the PV module and Kamailio-style variables. It is better to switch to them. They are probably in 5.4.x as well.
The assignment above should be something like:
self.instance_name = KSR.htable.sht_get("pbxdata", "ip." + ip)
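As a rough sketch (assuming the KSR.htable exports sht_get/sht_sets/sht_seti are available in your 5.4.x build; the table and key names are taken from your example), inside a KEMI callback reading and writing would look like:

# read: fetches the item directly from the shared-memory htable,
# no PV spec is parsed or cached per unique key
self.instance_name = KSR.htable.sht_get("pbxdata", "ip." + ip)

# write back: sht_sets() for string values (sht_seti() for integers)
KSR.htable.sht_sets("pbxdata", "ip." + ip, self.instance_name)

This keeps the key fully dynamic without adding anything to the PV cache, so the pkg memory of the workers should stay flat for this code path.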
Cheers, Daniel