### Description After upgrading from the 5.6.1 release by applying 5.6 branch changes through ebda78c92c, the `acc_cdr` entries for `start_time` and `end_time` in my PostgreSQL database are incorrect; the syslog output is also incorrect. The entries in the `acc` table contain the correct timestamps.
In PostgreSQL, each of these timestamp columns are `timestamp without time zone`.
``` kamailio=# SELECT id, start_time, end_time, duration, callid FROM acc_cdrs ORDER BY id DESC LIMIT 1; id | start_time | end_time | duration | callid -------+---------------------+---------------------+----------+----------------------------------------- 24867 | 1969-12-31 18:00:25 | 1969-12-31 18:00:25 | 25.41 | 0_1083748456@fd00:a::215:65ff:fe9e:e168 (1 row)
kamailio=# SELECT id, method, time, callid FROM acc ORDER BY id DESC LIMIT 2; id | method | time | callid --------+--------+---------------------+----------------------------------------- 108653 | BYE | 2022-09-17 20:54:43 | 0_1083748456@fd00:a::215:65ff:fe9e:e168 108652 | INVITE | 2022-09-17 20:54:18 | 0_1083748456@fd00:a::215:65ff:fe9e:e168 ``` #### Log Messages It appears that `duration` is being copied into `start_time` and `end_time`.
``` INFO: acc [acc_cdr.c:367]: log_write_cdr(): start_time=25.410; end_time=25.410; duration=25.410; callid=0_1083748456@fd00:a::215:65ff:fe9e:e168; from_tag=1137456145; to_tag=645d9eee-9c3a-420a-ac37-8d3c741a5ce9; ```
### Additional Information
* **Kamailio Version** - output of `kamailio -v`
``` version: kamailio 5.6.1-2.20220917gitebda78c9.fc36 (x86_64/linux) b36a13 flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLOCKLIST, HAVE_RESOLV_RES, TLS_PTHREAD_MUTEX_SHARED ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB poll method support: poll, epoll_lt, epoll_et, sigio_rt, select. id: b36a13 compiled on 00:00:00 Sep 18 2022 with gcc 12.2.1 ```
* **Operating System**:
<!-- Details about the operating system, the type: Linux (e.g.,: Debian 8.4, Ubuntu 16.04, CentOS 7.1, ...), MacOS, xBSD, Solaris, ...; Kernel details (output of `lsb_release -a` and `uname -a`) -->
``` Fedora 36 Linux 5.19.9-200.fc36.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Sep 15 09:49:52 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux ```
Looks like a regression from the patches cloning dialog vars to be safer on race conditions. I will review those patches.
Can you try with master branch or backporting the next commits?
* c5f71a45c7c55af0cf4a6aa482d8a57018af7900 * d616dc46b5373b1474bf855afde4c5687d2a2d09
If all ok, I will backport.
Unfortunately I see the same issue with those two commits backported to 5.6: ``` kamailio[352625]: INFO: acc [acc_cdr.c:392]: log_write_cdr(): start_time=14.234; end_time=14.234; duration=14.234; callid=201768_rel174ZTA3ODQyZWYyOTBmYWY4NTg3M2ZlZjI5M2I1MmM4NWU; ```
``` kamailio=# SELECT id, start_time, end_time, duration, callid FROM acc_cdrs ORDER BY id DESC LIMIT 1; id | start_time | end_time | duration | callid -------+---------------------+---------------------+----------+---------------------------------------------------------- 24869 | 1969-12-31 18:00:14 | 1969-12-31 18:00:14 | 14.234 | 201768_rel174ZTA3ODQyZWYyOTBmYWY4NTg3M2ZlZjI5M2I1MmM4NWU (1 row) ```
Can you try with master branch? I pushed a group of commits, if works fine, then I will backport.
Can you try with master branch? I pushed a group of commits, if works fine, then I will backport.
I was hoping not to have to pull down master on my (one and only) Kamailio production server, though I see the dialog changes in master are quite invasive -- difficult to backport. How soon will master turn into a new release?
Well, the changes are related to getting dialog variable value, they seem larger because I had to make a couple of variants to cover different use case. I do not use dialog variables myself (typically I rely on htable for "call-context" data), but given that there are quite some reports on random crashes with dialog, I looked over and noticed that access to dlg variable value is unprotected.
Practically, it was returning the pointer to the value in the memory, without keeping any lock on or having reference counters. That makes it vulnerable to race conditions (e.g., when another process changes the value or dialog gets destroyed) and it could have been the reason for those unpredictable crashes.
I tried a quick fix with a static variable and clone in pv buffer, but it had side effects to other modules based on how they used to access (and keep references) many variables at the same time. So I decided to go for a more elaborate solution with a couple for getter functions to cover better use cases.
A new major release from master branch is not yet planned, it could take a few months.
@miconda, thank you for the detailed follow up. I am looking at what it would take on my end to package up an install based on master to provide better feedback on this.
In the meantime, I do think it's interesting that the acc_cdrs (on the 5.6 branch with tip ebda78c92c) contains the correct call duration, and that the duration is duplicated into the start_time and end_time -- wondering if it's related to 60a62d6f7df233d61d4ccecc39504e77e25421b8.
The following are the only changes to the acc or dialog modules between the 5.6.1 release (which works) and ebda78c92c (which doesn't).
`$ git log --oneline 5.6.1..ebda78c92c -- src/modules/{acc,dialog}/` ad753f4657 modules: readme files regenerated - modules ... [skip ci] 644938a6be dialog: remove strlen() checks when setting a variable 2427f1951a dialog: wrapper function to set a dlg var with unsigned integer parameter 38cbf04afa dialog: fetch directly the int value for dlg variable used in dlg_cseq_update() 363a6267c4 dialog: safer version for getting dlg variable value 948f066718 dialog: updated to get the int value for cseq update operations 60a62d6f7d acc: duplicate dlg var values used in extra2strar_dlg_only() 5bcb685d19 dialog: dont reset dlg_db_mode in POSTCHILDINIT for DB_MODE_SHUTDOWN e8d797cb26 acc: add some explanation that dialogs need to be tracked for CDR accounting
I'm thinking I can try to revert these one-at-a-time and see what happens (probably starting with 60a62d6f7d). If there isn't a 5.7 version planned for some time, perhaps a 5.6 revert and/or other fix is necessary (assuming the master changes are too invasive to backport directly).
The changes are not very intrusive, they change only how the value for dialog variables can be retrieved. Nothing else in terms of setting variables or dialog management is changed, only more variants to get dlg var value were added.
If you revert, then you get back to the unsafe version which can lead to a crash at any time.
If the changes are not intrusive, they we probably would like to backport them to 5.6 branch, to be part of the 5.6.2 release. This probably should address the concerns about using git master from @amessina
It would have been good to have some basic testing based on master, but given that the state on 5.6 is broken in terms of acc-cdr core attributes, I back ported the commits and pushed to 5.6. Now testing with 5.6 branch should be easier.
Agree, its not the optimal solution. Lets hope they will get a bit more testing on 5.6 branch.
@miconda and @henningw, with 5.6 branch updates through bfc5c2aa, all is well again. I'll continue to watch and will create new issues if any arise. Otherwise, since the "issue" in this report is now resolved in the 5.6 branch, I will close. Thank you for the backport.
``` kamailio[214299]: INFO: acc [acc_cdr.c:392]: log_write_cdr(): start_time=1664327780.622; end_time=1664327803.902; duration=23.280; callid=0_1343402902@fd00:a::215:65ff:fe9e:e168 ``` ``` kamailio=# SELECT id, start_time, end_time, duration, callid FROM acc_cdrs ORDER BY id DESC LIMIT 1; id | start_time | end_time | duration | callid -------+---------------------+---------------------+----------+----------------------------------------- 24881 | 2022-09-27 20:16:20 | 2022-09-27 20:16:43 | 23.28 | 0_1343402902@fd00:a::215:65ff:fe9e:e168 ```
Closed #3243 as completed.