### Description
While an INVITE is being processed, the CANCEL for that transaction is processed in another process at almost the same time.
### Troubleshooting
#### Reproduction
Quite difficult to reproduce, I would say.
#### Debugging Data
```
Reading symbols from /usr/sbin/kamailio...
Reading symbols from /usr/lib/debug/.build-id/e3/9bd8ad0900c980149b00579cce26035c9cb118.debug...

warning: Can't open file /dev/zero (deleted) during file-backed mapping note processing
[New LWP 4136750]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/kamailio -P /run/kamailio/kamailio.proxy.pid -f /etc/kamailio/proxy/k'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  atomic_dec_and_test_int (var=0x20) at ../../core/parser/../mem/../atomic/atomic_x86.h:222
222	../../core/parser/../mem/../atomic/atomic_x86.h: No such file or directory.
(gdb) backtrace
#0  atomic_dec_and_test_int (var=0x20) at ../../core/parser/../mem/../atomic/atomic_x86.h:222
#1  t_unref (p_msg=<optimized out>) at t_lookup.c:1514
#2  0x000055eee962d69b in exec_post_script_cb (msg=msg@entry=0x7f7e265822c8, type=type@entry=REQUEST_CB_TYPE) at core/script_cb.c:182
#3  0x000055eee95dfcbe in receive_msg (buf=buf@entry=0x55eee9a00740 <buf> "INVITE sip:...@fake.dom:5060;transport=udp SIP/2.0\r\nRecord-Route: sip:X.X.X.X;r2=on;lr=on;ftag=1E7F0816-6299BCF10003883E-7FFFF700;ngcplb=yes;socket=udp:Y.Y.Y.Y:5060\r\nRe"..., len=<optimized out>, rcv_info=rcv_info@entry=0x7ffc96d12af0) at core/receive.c:520
#4  0x000055eee96d4c40 in udp_rcv_loop () at core/udp_server.c:543
#5  0x000055eee94db634 in main_loop () at main.c:1730
#6  0x000055eee94d2d2c in main (argc=<optimized out>, argv=<optimized out>) at main.c:3053
```
#### Log Messages
Notice that both messages arrived within the same second (IPs and numbers have been changed):
```
Jun 3 09:49:05 sipwise2-prx03a proxy[4136750]: NOTICE: DEFAULT_ROUTE <script>: New request on proxy - M=INVITE R=«sip:+49......@X.X.X.X:5060;transport=udp» F=«sip:+49...@fake.dom» T=«sip:+49...@Y.Y.Y.Z» IP=«Y.Y.Y.Z»:«5060» («Y.Y.Y.Y»:«5060») ID=«31668-TN-030322e6-0904f8ba1@fake.dom_b2b-1» UA='Cirpack/v4.88 (gw_sip)' DESTIP=«Y.Y.Y.G»:«5062»
```
```
Jun 3 09:49:05 sipwise2-prx03a proxy[4136755]: NOTICE: DEFAULT_ROUTE <script>: New request on proxy - M=CANCEL R=«sip:+49...@X.X.X.X:5060;transport=udp» F=«sip:+49...@fake.dom» T=«sip:+49...@Y.Y.Y.Z» IP=«Y.Y.Y.Y»:«5060» («Y.Y.Y.Y»:«5060») ID=«31668-TN-030322e6-0904f8ba1@fake.dom_b2b-1» UA='<null>' DESTIP=«Z.Z.Z.Z»:«5062»
Jun 3 09:49:05 sipwise2-prx03a proxy[4136755]: NOTICE: DEFAULT_ROUTE <script>: Sending reply S=100 Trying M=CANCEL fs='«Z.Z.Z.Z»:«5062»' du='«Y.Y.Y.Y»:«5060»' - R=«sip:+49...@X.X.X.X:5060;transport=udp» ID=«31668-TN-030322e6-0904f8ba1@fake.dom_b2b-1» UA='<null>'
```
### Additional Information
This is the Sipwise flavor 5.5.1-1+0~mr9.5.3.2
Thanks for the report. It crashed during execution of the post-script callback. Are there any code patches included in the Sipwise version that are not in the main branch?
Yes, [this is the list](https://github.com/sipwise/kamailio/blob/mr9.5.3/debian/patches/series) for this particular version: https://github.com/sipwise/kamailio/tree/mr9.5.3/debian/patches
There are no changes to tm in this version. It is exactly the same code as vanilla 5.5.1.
Thanks, that is a lot of patches in many modules. Did this happen once, or do you observe it regularly on this particular version?
This was reported by several different customers, all on this same version.
```
(gdb) f 1
#1  t_unref (p_msg=<optimized out>) at t_lookup.c:1515
1515	t_lookup.c: No such file or directory.
(gdb) print T
$1 = (struct cell *) 0x0
```
```
(gdb) f 2
#2  t_unref (p_msg=<optimized out>) at t_lookup.c:1484
1484	in t_lookup.c
(gdb) print T
$2 = (struct cell *) 0x0
```
maybe this will help?
```
diff --git a/src/modules/tm/t_funcs.h b/src/modules/tm/t_funcs.h
index 6830b13..dbbdc19 100644
--- a/src/modules/tm/t_funcs.h
+++ b/src/modules/tm/t_funcs.h
@@ -110,7 +110,7 @@ int send_pr_buffer( struct retr_buf *rb, void *buf, int len);
 
 #define UNREF_NOSTATS(_T_cell) \
 	do{\
-		if (atomic_dec_and_test(&(_T_cell)->ref_count)){ \
+		if (_T_cell && atomic_dec_and_test(&(_T_cell)->ref_count)){ \
 			unlink_timers((_T_cell)); \
 			free_cell((_T_cell)); \
 		}\
```
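The backtrace shows `var=0x20` while `T` prints as `0x0`, which is consistent with taking the address of a struct member through a NULL pointer. A minimal sketch of the guarded unref follows. It uses a hypothetical `mock_cell` type and C11 atomics rather than Kamailio's own `struct cell` and `atomic_dec_and_test()`; the `0x20` padding is only an assumption chosen to mirror the value in the backtrace, not the real struct layout.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for tm's struct cell; the real layout differs. */
struct mock_cell {
	char pad[0x20];       /* assumption: puts ref_count at offset 0x20 */
	atomic_int ref_count;
	int freed;            /* set instead of calling a real free_cell() */
};

/* Guarded unref: a NULL cell is ignored instead of being dereferenced.
 * Returns 1 if this call dropped the last reference, 0 otherwise. */
static int mock_unref_guarded(struct mock_cell *cell)
{
	if (cell == NULL)
		return 0;
	assert(atomic_load(&cell->ref_count) > 0);
	/* atomic_fetch_sub returns the previous value; 1 means we hit zero */
	if (atomic_fetch_sub(&cell->ref_count, 1) == 1) {
		cell->freed = 1;
		return 1;
	}
	return 0;
}

/* Address that &cell->ref_count would evaluate to for a NULL cell. */
static size_t mock_null_ref_count_addr(void)
{
	return offsetof(struct mock_cell, ref_count);
}
```

With this layout `mock_null_ref_count_addr()` evaluates to `0x20`, and `mock_unref_guarded(NULL)` returns `0` without touching memory. The guard only avoids the crash; it does not explain how the cell pointer became NULL in the first place.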
You can add such a safety check, but it would be good to understand why it gets there.
Can you paste the parameters set for tm and tmx modules? Also, list the event route blocks that are defined in the configuration file.
```
modparam("tm", "auto_inv_100", 0)
modparam("tm", "reparse_on_dns_failover", 0)
modparam("tm", "fr_timer", 9000)
modparam("tm", "wt_timer", 5000)
modparam("tm", "fr_inv_timer", 60000)
modparam("tm", "max_inv_lifetime", 180000)
modparam("tm", "restart_fr_on_each_reply", 0)
modparam("tm", "failure_reply_mode", 3)
modparam("tm", "contacts_avp", "tm_contacts")
modparam("tm", "contact_flows_avp", "tm_contact_flows")
```
No tmx parameters.
```
root@prx01b:/etc/kamailio/proxy# grep -r event_route
headerrules.cfg:event_route[xhttp:request] {
registrar.cfg:event_route[usrloc:contact-expired]
probe.cfg:event_route[dispatcher:dst-down] {
probe.cfg:event_route[dispatcher:dst-up] {
kamailio.cfg:event_route[sl:filtered-ack]
dialog.cfg:event_route[dialog:start]
dialog.cfg:event_route[dialog:end]
dialog.cfg:event_route[dialog:failed]
```
It could be a race in calling the post-script callbacks, either the ones for request_route or those from one of the event routes, because I noticed that some of them can also trigger execution of these callbacks.
You can push the above patch to narrow down the race time frame for the moment. I will have to analyze the code a bit more and see what else has to be done there.
One more question: for INVITE, are you using functions that do async processing, like http_async_query() (or other functions from async, evapi, ...)?
No, there's no async processing involved here.
Hmm, a little bit strange, because t_unref() has safety checks at the beginning:
```c
int t_unref( struct sip_msg* p_msg )
{
	enum kill_reason kr;

	if (T==T_UNDEFINED || T==T_NULL_CELL)
		return -1;
	...
```
`T` is a per-process global, so it cannot be overwritten by another process...
Could it be something like this messing around? https://github.com/kamailio/kamailio/blob/master/src/modules/pv_headers/pv_h...
I am not sure it is the reason for this case, but something is wrong there: if the global `T` is already set, then no `REF` is done on `T`, yet the function always does an `UNREF` at the end.
It should do first `t = tmb.t_gett();` and then `if(t == NULL || t == T_UNDEFINED)` do `tmb.t_check(msg, &_branch)` and on success retrieve `T` with another `tmb.t_gett()`. If `t_check()` had to be done, then `REF` was done and `UNREF` has to be done at the end.
Since this should be useful in other places, I added a new inter-module API function that implements the logic above, `t_find(...)`, in commit a9cf4577c25d7933531b8969a1941bac4faf8d68.
It is completely independent and does not touch other internal functions, so it should be fine to backport as well.
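The lookup order described above (use `T` if it is already set, otherwise run the check and remember that a `REF` was taken) can be sketched as follows. This is a self-contained mock, not the tm module's code: `mock_cell`, `mock_gett()`, `mock_check()`, `mock_find()` and `mock_release()` are hypothetical names, and their return conventions are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

struct mock_cell { int ref_count; };

/* Sentinel standing in for T_UNDEFINED (assumption: any unique address). */
static struct mock_cell mock_undefined_sentinel;
#define MOCK_T_UNDEFINED (&mock_undefined_sentinel)

/* Per-process "current transaction", like the global T in the thread. */
static struct mock_cell *mock_T = MOCK_T_UNDEFINED;
/* One transaction in a pretend hash table; 1 is the table's own reference. */
static struct mock_cell mock_table_entry = { 1 };

static struct mock_cell *mock_gett(void) { return mock_T; }

/* Stand-in for t_check(): on a hit, set mock_T and take a reference.
 * Returns 1 on a hit, -1 on a miss (mock convention). */
static int mock_check(int found)
{
	if (!found) {
		mock_T = NULL;
		return -1;
	}
	mock_table_entry.ref_count++;
	mock_T = &mock_table_entry;
	return 1;
}

/* Returns the transaction and reports via *took_ref whether this call
 * took a reference that the caller must release later. */
static struct mock_cell *mock_find(int found, int *took_ref)
{
	struct mock_cell *t = mock_gett();

	*took_ref = 0;
	if (t != NULL && t != MOCK_T_UNDEFINED)
		return t; /* already set elsewhere: no REF taken here */
	if (mock_check(found) < 0)
		return NULL;
	t = mock_gett();
	if (t == NULL || t == MOCK_T_UNDEFINED)
		return NULL;
	*took_ref = 1;
	return t;
}

/* UNREF only when mock_find() itself took the reference. */
static void mock_release(struct mock_cell *t, int took_ref)
{
	if (t == NULL || !took_ref)
		return;
	assert(t->ref_count > 0);
	t->ref_count--;
	mock_T = MOCK_T_UNDEFINED;
}
```

The point of the `took_ref` flag is that a caller which found `T` already set leaves the reference count untouched on release, so the `REF`/`UNREF` pairs stay balanced.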
Closed #3156 as completed.
Closing, since there have been no more reports after the fixes were backported. Thank you!