stefan-mititelu-idt created an issue (kamailio/kamailio#4345)
### Description

Crash in Kamailio 5.8.5 due to a broken "next" pointer in the circular doubly linked list of expired timers.
### Troubleshooting
#### Reproduction

Not easily reproducible.
#### Debugging Data
```
(gdb) bt full
#0  timer_list_expire (slow_mark=<optimized out>, slow_l=0x7ff891b59d60, h=0x7ff891b56ff0, t=<optimized out>) at core/timer.c:850
        tl = 0x7ff89d214810
        ret = <optimized out>
        tl = <optimized out>
        ret = <optimized out>
#1  timer_handler () at core/timer.c:925
        saved_ticks = <optimized out>
        run_slow_timer = 0
        i = <optimized out>
        saved_ticks = <optimized out>
        run_slow_timer = <optimized out>
        i = <optimized out>
        __func__ = "timer_handler"
        __llevel = <optimized out>
        __kld = {v_facility = <optimized out>, v_level = <optimized out>, v_lname = <optimized out>, v_fname = <optimized out>, v_fline = <optimized out>, v_mname = <optimized out>, v_func = <optimized out>, v_locinfo = <optimized out>, v_pid = <optimized out>, v_pidx = <optimized out>}
        __llevel = <optimized out>
        __kld = {v_facility = <optimized out>, v_level = <optimized out>, v_lname = <optimized out>, v_fname = <optimized out>, v_fline = <optimized out>, v_mname = <optimized out>, v_func = <optimized out>, v_locinfo = <optimized out>, v_pid = <optimized out>, v_pidx = <optimized out>}
#2  timer_main () at core/timer.c:963
No locals.
#3  0x000055e671830461 in main_loop () at main.c:1933
        i = <optimized out>
        pid = <optimized out>
        si = 0x0
        si_desc = "udp receiver child=15 sock=x.x.x.x:5080 (y.y.y.y:5080)", '\000' <repeats 16 times>, "\002\000\000\000\004\000\000\000\n_\022:\372\177\000\000\003\000\000\000\000\000\000\000\000kt\234s8\b\rThu Jul \a\000\000\000\000\000\000"
        nrprocs = <optimized out>
        woneinit = 1
        __func__ = "main_loop"
        error = <optimized out>
#4  0x000055e671824ff2 in main (argc=<optimized out>, argv=<optimized out>) at main.c:3257
        cfg_stream = <optimized out>
        c = <optimized out>
        r = <optimized out>
        tmp = 0x7ffe2850be85 ""
        tmp_len = 0
        port = 5060
        proto = 0
        aproto = 0
        ahost = 0x0
        aport = 0
        options = 0x55e671bacd30 ":f:cm:M:dVIhEeb:B:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:"
        ret = -1
        seed = 1220594137
        rfd = <optimized out>
        debug_save = <optimized out>
        debug_flag = <optimized out>
        dont_fork_cnt = <optimized out>
        n_lst = <optimized out>
        p = <optimized out>
        st = {st_dev = 22, st_ino = 1081, st_nlink = 2, st_mode = 16888, st_uid = 109, st_gid = 115, __pad0 = 0, st_rdev = 0, st_size = 40, st_blksize = 4096, st_blocks = 0, st_atim = {tv_sec = 1751525619, tv_nsec = 801181720}, st_mtim = {tv_sec = 1751525619, tv_nsec = 117175705}, st_ctim = {tv_sec = 1751525636, tv_nsec = 937332387}, __glibc_reserved = {0, 0, 0}}
        l1 = <optimized out>
        tbuf = "\000\000\000\000\000\000\000\000\030"U(\376\177\000\000\000\000\000\000 ", '\000' <repeats 27 times>, "\001\000\000\000\000\000\000\000\366u\256\003\001", '\000' <repeats 67 times>, "\060Wj\242\372\177\000\000\004\000\000\024\000\000\000\000@!k\242\372\177", '\000' <repeats 138 times>, "\020\000\000\000\000\000\000\000 \265P(\376\177\000\000\020\000\000\000\376\177\000\000\060\265P(\376\177\000\000\370\264P(\376\177\000\000x\221\205\242\372\177\000\000\300\t\000\000\300\t\000\000x\221\205\242\372\177\000\000\300\t\000\000\300\t\000\000L\206\204\242\372\177\000\000\300\t\000\000\300\t\000\000L\206\204\242\372\177\000\000\300\t\000\000\300\t\000\000\300\t\000\000\300\t\000\000\377\377\377\377\000\000\000\000\020_\202\242\372\177\000\000H\000\000\000\000\000\000\000\312<k\242\372\177\000\000`\200\205\242d\000\000\000\000kt\234s8\b\r\377\377\377\377\000\000\000\000j[j\242\372\177\000\000\000\000\000\000\000\000\000\000@\000\000\000\000\000\000\000\000\000\200\000\000\000\000\000\377\377\377\377\377\377\377\377\377\265\360\000\000\000\000\000\302\000\000\000\000\000\000"
        option_index = 12
        long_options = {{name = 0x55e671bab333 "help", has_arg = 0, flag = 0x0, val = 104}, {name = 0x55e671bb4a8e "version", has_arg = 0, flag = 0x0, val = 118}, {name = 0x55e671bc48d1 "alias", has_arg = 1, flag = 0x0, val = 1024}, {name = 0x55e671bab338 "subst", has_arg = 1, flag = 0x0, val = 1025}, {name = 0x55e671bab33e "substdef", has_arg = 1, flag = 0x0, val = 1026}, {name = 0x55e671bab347 "substdefs", has_arg = 1, flag = 0x0, val = 1027}, {name = 0x55e671bab351 "server-id", has_arg = 1, flag = 0x0, val = 1028}, {name = 0x55e671bab35b "loadmodule", has_arg = 1, flag = 0x0, val = 1029}, {name = 0x55e671bab366 "modparam", has_arg = 1, flag = 0x0, val = 1030}, {name = 0x55e671bab36f "log-engine", has_arg = 1, flag = 0x0, val = 1031}, {name = 0x55e671bb4bab "debug", has_arg = 1, flag = 0x0, val = 1032}, {name = 0x55e671bab37a "cfg-print", has_arg = 0, flag = 0x0, val = 1033}, {name = 0x55e671bab384 "atexit", has_arg = 1, flag = 0x0, val = 1034}, {name = 0x55e671bab38b "all-errors", has_arg = 0, flag = 0x0, val = 1035}, {name = 0x0, has_arg = 0, flag = 0x0, val = 0}}
        __func__ = "main"
```
Some more debugging info:

```
(gdb) p tl
$203 = (struct timer_ln *) 0x7ff89d214810
(gdb) p h
$202 = (struct timer_head *) 0x7ff891b56ff0

(gdb) p *(struct timer_ln *) 0x7ff891e21af0
$6 = {next = 0x7ff891e24b00, prev = 0x7ff89d214810, expire = 1057066680, initial_timeout = 480, data = 0x7ff891e21af0, f = 0x55e671a570a0 <compat_old_handler>, flags = 512, slow_idx = 0}
(gdb) p *tl
$205 = {next = 0x0, prev = 0x7ff891b56ff0, expire = 1057066648, initial_timeout = 48, data = 0xfffffffe, f = 0x7ffa91f77e30 <retr_buf_handler>, flags = 512, slow_idx = 0}
(gdb) p *h
$204 = {next = 0x0, prev = 0x7ff891e00e10}
(gdb) p *(struct timer_ln *) 0x7ff891e00e10
$7 = {next = 0x7ff891b56ff0, prev = 0x7ff89e47d538, expire = 1057066680, initial_timeout = 16, data = 0x7ff891e00e10, f = 0x55e671a570a0 <compat_old_handler>, flags = 512, slow_idx = 0}

(gdb) p h->prev->prev->prev->prev->prev->prev->prev->prev->prev->prev->prev->prev->prev
$198 = (struct timer_ln *) 0x7ff891e21af0
(gdb) p h->prev->prev->prev->prev->prev->prev->prev->prev->prev->prev->prev->prev->prev->prev
$199 = (struct timer_ln *) 0x7ff89d214810
(gdb) p h->prev->prev->prev->prev->prev->prev->prev->prev->prev->prev->prev->prev->prev->prev->prev
$200 = (struct timer_ln *) 0x7ff891b56ff0
(gdb) p h->prev->prev->prev->prev->prev->prev->prev->prev->prev->prev->prev->prev->prev->prev->prev->prev
$201 = (struct timer_ln *) 0x7ff891e00e10
```
#### SIP Traffic
```
(gdb) p *(struct retr_buf *)((char *)(tl)-((size_t)((char *)&((struct retr_buf*)(0))->timer - (char *)0)))
$13 = {rbtype = 0, flags = 164, t_active = 0, branch = 0, buffer_len = 1021, buffer = 0x7ff897731ef0 "KDMQ sip:usrloc@x.x.x.x:5040;transport=tcp SIP/2.0\r\nVia: SIP/2.0/TCP y.y.y.y:5045;branch=z9hG4bK6411.2eec64c5", '0' <repeats 24 times>, ".0\r\nTo: sip:usrloc@x.x.x.x:5040;transport=tcp\r\nFrom: sip:usrloc@y.y.y.y:5040;transport=tcp;tag=ea5c1e4b006551edd6973f0c22cc35ec-9c6994fe\r\nCSeq: 10 KDMQ\r\nCall-ID: 3bc128ce41f9d5e5-3737@x.x.x.x\r\nContent-Length: 574\r\nUser-Agent: UNITE 3.0\r\nMax-Forwards: 1\r\nContent-Type: application/json\r\n\r\n{"action":1,"aor":"01537833832a5ccc75058ef5839f","ruid":"uloc-a71cb2-68662908-e99-e6b85","c":"sip:01537833832a5ccc75058ef5839f@z.z.z.z:11916;transport=TCP","received":"sip:x.x.x.x:11916;transport=tcp","path":"sip:z.z.z.z:5040;transport=tcp;lr;received=sip:w.w.w.w:11916%3Btransport%3Dtcp;socket=clientTCPListener","callid":"0_912990869@192.168.1.7","user_agent":"Yealink SIP-T30 t.t.t.t","instance":"","expires":1751822448,"cseq":28,"flags":0,"cflags":96,"q":-1,"last_modified":1751822148,"methods":16383,"reg_id":0,"server_id":10951858,"xavps":{}}", my_T = 0x7ff89d214510, timer = {next = 0x0, prev = 0x7ff891b56ff0, expire = 1057066648, initial_timeout = 48, data = 0xfffffffe, f = 0x7ffa91f77e30 <retr_buf_handler>, flags = 512, slow_idx = 0}, dst = {send_sock = 0x7ffa9264fae0, to = {s = {sa_family = 2, sa_data = "\023\260\n_\020\210\000\000\000\000\000\000\000"}, sin = {sin_family = 2, sin_port = 45075, sin_addr = {s_addr = 2282774282}, sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {sin6_family = 2, sin6_port = 45075, sin6_flowinfo = 2282774282, sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, sin6_scope_id = 0}, sas = {ss_family = 2, __ss_padding = "\023\260\n_\020\210", '\000' <repeats 111 times>, __ss_align = 0}}, id = 0, send_flags = {f = 4, blst_imask = 0}, proto = 2 '\002', proto_pad0 = 0 '\000', proto_pad1 = 0}, retr_expire = 1057066599, fr_expire = 1057066648}
```
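For readers unfamiliar with the pointer arithmetic in the gdb expression above: it recovers the enclosing `struct retr_buf` from the address of its embedded `timer` member by subtracting the member's offset. A minimal sketch of the same technique follows; `struct demo_link`, `struct demo_buf` and `buf_from_timer` are hypothetical stand-ins for illustration, not Kamailio definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a timer link embedded in a larger struct. */
struct demo_link {
    struct demo_link *next;
    struct demo_link *prev;
};

/* Hypothetical stand-in for a buffer that embeds a timer link. */
struct demo_buf {
    int branch;
    struct demo_link timer;
};

/* Step back from the embedded member to the start of the enclosing
 * struct, the same arithmetic the gdb expression spells out by hand. */
static struct demo_buf *buf_from_timer(struct demo_link *tl)
{
    assert(tl != NULL);
    return (struct demo_buf *)((char *)tl - offsetof(struct demo_buf, timer));
}
```

Calling `buf_from_timer(&b.timer)` on a `struct demo_buf b` returns `&b`, which is how the request buffer shown above was reached from `tl`.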
### Possible Solutions

Check whether the "next" chain is broken and, if so, recover its pointers from the "prev" chain, and vice versa.
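A minimal sketch of the "recover next from prev" direction, assuming the `prev` chain is still intact and circular (as in the dump above, where following `prev` from `h` leads back to `h` while `h->next` and `tl->next` are both `0x0`). `struct link` and `rebuild_next_from_prev` are hypothetical names for illustration, not existing Kamailio code, and locking or interaction with `timer_list_expire` is not addressed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a list node; only the link fields matter. */
struct link {
    struct link *next;
    struct link *prev;
};

/* Rebuild every `next` pointer from the `prev` chain of a circular list.
 * Returns the number of `next` pointers rewritten, or -1 if the `prev`
 * chain hits NULL or does not return to `head` within `max_nodes` steps
 * (in which case nothing is modified). */
static int rebuild_next_from_prev(struct link *head, size_t max_nodes)
{
    struct link *n;
    size_t steps = 0;
    int fixed = 0;

    assert(head != NULL);

    /* Pass 1: verify the prev chain closes on head before touching anything. */
    n = head;
    do {
        if (n->prev == NULL || ++steps > max_nodes)
            return -1;
        n = n->prev;
    } while (n != head);

    /* Pass 2: in a consistent list, n->prev->next == n for every node. */
    n = head;
    do {
        if (n->prev->next != n) {
            n->prev->next = n;
            fixed++;
        }
        n = n->prev;
    } while (n != head);

    return fixed;
}
```

The opposite direction (rebuilding `prev` from an intact `next` chain) would be symmetric. The `max_nodes` bound is there so that a `prev` chain that loops without reaching `head` cannot spin forever.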
### Additional Information

Using a 1s transaction timeout for KDMQs.
* **Kamailio Version** - output of `kamailio -v`
```
5.8.5 kamailio version
```
* **Operating System**:

```
Debian 11
```