### Description
Usage of `modparam("presence", "publ_cache", 2)` causes Kamailio to crash shortly after startup. This only happens when SUBSCRIBE traffic is high (400 requests per second on average).
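For context, the relevant part of the configuration looks like this (the `loadmodule` line is implied; only the `publ_cache` parameter is taken from this report):

```
loadmodule "presence.so"
# publ_cache mode 2 keeps presence records only in memory (ps_ptable);
# this is the setting that triggers the crash under load
modparam("presence", "publ_cache", 2)
```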
### Troubleshooting
#### Reproduction
The issue can be reproduced by starting Kamailio with the option above enabled and sending ~400 SUBSCRIBE requests per second. The crash happens ~20 seconds after startup.
#### Log Messages
```
kernel: kamailio[25401]: segfault at 0 ip 00007f4a0a22d4d7 sp 00007fff6064a370 error 4 in presence.so[7f4a0a21e000+fe000]
```
### Possible Solutions
Running with the default `publ_cache` value does not trigger the crash.
### Additional Information
* **Kamailio Version** - output of `kamailio -v`
```
version: kamailio 5.4.3 (x86_64/linux) e19ae3
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: e19ae3
compiled on 18:24:43 Dec 14 2020 with gcc 4.4.7
```
* **Operating System**:
```
Linux hostName 2.6.32-754.11.1.el6.x86_64 #1 SMP Tue Feb 26 15:38:56 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
```
I realize this is an older version, so if this is resolved in a newer one, please let me know.
If you can reproduce the crash, then enable core dumping and attach here the gdb backtrace. Some hints can be found at:
* http://www.kamailio.org/wiki/tutorials/troubleshooting/coredumpfile
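For example, the full report can be extracted non-interactively with gdb's batch mode (the binary and corefile paths are placeholders):

```
gdb /path/to/kamailio /path/to/corefile -batch -ex 'bt full' -ex 'info locals' -ex 'list'
```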
```
#0  0x00007f6ecceca4d7 in core_case_hash (s1=0x7ffc48039be8, s2=0x7ffc48039bf8, size=0) at ../../core/hashes.h:317
#1  0x00007f6ecced4fe9 in ps_ptable_search (ptm=0x7ffc48039be0, mmode=1, rmode=0) at hash.c:1250
#2  0x00007f6eccee6347 in ps_cache_get_p_notify_body (pres_uri=..., event=0x7f6dcc35da78, etag=0x0, contact=0x7ffc4803a730) at notify.c:894
#3  0x00007f6ecceea400 in get_p_notify_body (pres_uri=..., event=0x7f6dcc35da78, etag=0x0, contact=0x7ffc4803a730) at notify.c:1068
#4  0x00007f6eccef2841 in send_notify_request (subs=0x7ffc4803a660, watcher_subs=0x0, n_body=0x0, force_null_body=0) at notify.c:1658
#5  0x00007f6eccef577a in notify (subs=0x7ffc4803a660, watcher_subs=0x0, n_body=0x0, force_null_body=0, aux_body_processing=0) at notify.c:1817
#6  0x00007f6eccf77e3a in update_subscription (msg=0x7f6ed2abfdc0, subs=0x7ffc4803a660, to_tag_gen=1, sent_reply=0x7ffc4803a49c) at subscribe.c:679
#7  0x00007f6eccf7f541 in handle_subscribe (msg=0x7f6ed2abfdc0, watcher_user=..., watcher_domain=...) at subscribe.c:1216
#8  0x00007f6eccf7b17d in handle_subscribe0 (msg=0x7f6ed2abfdc0) at subscribe.c:993
#9  0x000000000043cab3 in do_action (h=0x7ffc4803d160, a=0x7f6ed29974f8, msg=0x7f6ed2abfdc0) at core/action.c:1076
#10 0x000000000044a137 in run_actions (h=0x7ffc4803d160, a=0x7f6ed29974f8, msg=0x7f6ed2abfdc0) at core/action.c:1581
#11 0x000000000043ca1c in do_action (h=0x7ffc4803d160, a=0x7f6ed2997880, msg=0x7f6ed2abfdc0) at core/action.c:1067
#12 0x000000000044a137 in run_actions (h=0x7ffc4803d160, a=0x7f6ed2997880, msg=0x7f6ed2abfdc0) at core/action.c:1581
#13 0x000000000043ca75 in do_action (h=0x7ffc4803d160, a=0x7f6ed29979d0, msg=0x7f6ed2abfdc0) at core/action.c:1071
#14 0x000000000044a137 in run_actions (h=0x7ffc4803d160, a=0x7f6ed297eea0, msg=0x7f6ed2abfdc0) at core/action.c:1581
#15 0x0000000000439032 in do_action (h=0x7ffc4803d160, a=0x7f6ed28e2e88, msg=0x7f6ed2abfdc0) at core/action.c:700
#16 0x000000000044a137 in run_actions (h=0x7ffc4803d160, a=0x7f6ed28d5ac8, msg=0x7f6ed2abfdc0) at core/action.c:1581
#17 0x000000000044a86e in run_top_route (a=0x7f6ed28d5ac8, msg=0x7f6ed2abfdc0, c=0x0) at core/action.c:1666
#18 0x000000000056d8c5 in receive_msg (buf=0xb6ed80 "SUBSCRIBE sip:mydomain SIP/2.0\r\nVia: SIP/2.0/UDP 10.10.0.54:5062;rport;branch=z9hG4bK1741788869\r\nFrom: "Open Classroom" sip:230-midvt@mydomain;tag=248246191\r\nTo: sip:mydomain\r\nCall-ID: 46"..., len=425, rcv_info=0x7ffc4803d8c0) at core/receive.c:423
#19 0x00000000006a3e7e in udp_rcv_loop () at core/udp_server.c:543
#20 0x0000000000429cfb in main_loop () at main.c:1683
#21 0x00000000004338ca in main (argc=13, argv=0x7ffc4803e1c8) at main.c:2856
```
It looks like the SUBSCRIBE message in that backtrace (frame #18) triggered the issue?
I also have another core dump which shows a SUBSCRIBE from another endpoint:
```
#0  0x00007f6ecceca4d7 in core_case_hash (s1=0x7ffc48039be8, s2=0x7ffc48039bf8, size=0) at ../../core/hashes.h:317
#1  0x00007f6ecced4fe9 in ps_ptable_search (ptm=0x7ffc48039be0, mmode=1, rmode=0) at hash.c:1250
#2  0x00007f6eccee6347 in ps_cache_get_p_notify_body (pres_uri=..., event=0x7f6dcc35da78, etag=0x0, contact=0x7ffc4803a730) at notify.c:894
#3  0x00007f6ecceea400 in get_p_notify_body (pres_uri=..., event=0x7f6dcc35da78, etag=0x0, contact=0x7ffc4803a730) at notify.c:1068
#4  0x00007f6eccef2841 in send_notify_request (subs=0x7ffc4803a660, watcher_subs=0x0, n_body=0x0, force_null_body=0) at notify.c:1658
#5  0x00007f6eccef577a in notify (subs=0x7ffc4803a660, watcher_subs=0x0, n_body=0x0, force_null_body=0, aux_body_processing=0) at notify.c:1817
#6  0x00007f6eccf77e3a in update_subscription (msg=0x7f6ed2abfdc0, subs=0x7ffc4803a660, to_tag_gen=1, sent_reply=0x7ffc4803a49c) at subscribe.c:679
#7  0x00007f6eccf7f541 in handle_subscribe (msg=0x7f6ed2abfdc0, watcher_user=..., watcher_domain=...) at subscribe.c:1216
#8  0x00007f6eccf7b17d in handle_subscribe0 (msg=0x7f6ed2abfdc0) at subscribe.c:993
#9  0x000000000043cab3 in do_action (h=0x7ffc4803d160, a=0x7f6ed29974f8, msg=0x7f6ed2abfdc0) at core/action.c:1076
#10 0x000000000044a137 in run_actions (h=0x7ffc4803d160, a=0x7f6ed29974f8, msg=0x7f6ed2abfdc0) at core/action.c:1581
#11 0x000000000043ca1c in do_action (h=0x7ffc4803d160, a=0x7f6ed2997880, msg=0x7f6ed2abfdc0) at core/action.c:1067
#12 0x000000000044a137 in run_actions (h=0x7ffc4803d160, a=0x7f6ed2997880, msg=0x7f6ed2abfdc0) at core/action.c:1581
#13 0x000000000043ca75 in do_action (h=0x7ffc4803d160, a=0x7f6ed29979d0, msg=0x7f6ed2abfdc0) at core/action.c:1071
#14 0x000000000044a137 in run_actions (h=0x7ffc4803d160, a=0x7f6ed297eea0, msg=0x7f6ed2abfdc0) at core/action.c:1581
#15 0x0000000000439032 in do_action (h=0x7ffc4803d160, a=0x7f6ed28e2e88, msg=0x7f6ed2abfdc0) at core/action.c:700
#16 0x000000000044a137 in run_actions (h=0x7ffc4803d160, a=0x7f6ed28d5ac8, msg=0x7f6ed2abfdc0) at core/action.c:1581
#17 0x000000000044a86e in run_top_route (a=0x7f6ed28d5ac8, msg=0x7f6ed2abfdc0, c=0x0) at core/action.c:1666
#18 0x000000000056d8c5 in receive_msg (buf=0xb6ed80 "SUBSCRIBE sip:mydomain:8060 SIP/2.0\r\nVia: SIP/2.0/UDP 10.10.0.34:5062;rport;branch=z9hG4bK2447480306\r\nFrom: "Cherie Blessing" sip:229-midvt@mydomain:8060;tag=3505784675\r\nTo: <sip:mydomain:8"..., len=444, rcv_info=0x7ffc4803d8c0) at core/receive.c:423
#19 0x00000000006a3e7e in udp_rcv_loop () at core/udp_server.c:543
#20 0x0000000000429cfb in main_loop () at main.c:1683
#21 0x00000000004338ca in main (argc=13, argv=0x7ffc4803e1c8) at main.c:2856
```
I replaced my actual domain with `mydomain` above. Let me know if you need more info. I also have the `bt full` output, but it seems too verbose to post here?
Thanks!
I do notice that in both of these examples there is no To user in the SUBSCRIBE request (after looking at the `bt full` output). The To header looks like this:
```
To: sip:mydomain:8060
```
which could cause an issue. If needed, I can add a check that filters out requests where no `$tU` exists.
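A guard like that could be sketched in kamailio.cfg as follows (the reply code/text and the surrounding route logic are assumptions, not part of this report):

```
# sketch: reject SUBSCRIBE requests without a To user before handle_subscribe()
if (is_method("SUBSCRIBE")) {
    if ($tU == $null) {
        sl_send_reply("400", "Missing To User");
        exit;
    }
    handle_subscribe();
}
```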
Do you need the `bt full` in order to troubleshoot further here?
Thanks.
I am not sure when I will have time to look deeper into it, as I am travelling for a while, but the `bt full` output is useful to have.
I pushed a safety-check commit to the master branch; can you try it and see if it's OK now?
Perfect, this will work. I forgot to mention that I was able to work around this by checking for a To user before handling the subscription, and that did resolve it. So this will definitely get us going.
Thanks!
The patch was backported to 5.5 branch. Closing this one, thanks!
Closed #3008.