### Description
Carrierroute module takes now 16 minutes to load about 50'000 entries from postgres-db. In version 5.0.3 it was 10 seconds from the same db.
### Troubleshooting
#### Reproduction
Install kamailio 5.0.3 and fill your carrierroute table with thousands of entries. See how fast its loaded. Then install 5.0.4 and see how long it takes now. You can observe the query "SELECT * FROM carrierroute WHERE carrier=1 and domain=1 and prob>0" multiple times in the postgres-db log.
#### Debugging Data
none.
#### Log Messages
not needed.
#### SIP Traffic
not needed.
### Possible Solutions
Perhaps try to do the check "in-memory" instead generate a new sql-query.
### Additional Information
This bug exists in Kamilio versions >=5.0.4 and is (potentially) introduced by:
commit 9800aba65146b72623bb512049300d1beb8c8ec4 Author: Huseyin Dikme hueseyin.dikme@1und1.de Date: Tue Sep 12 15:37:17 2017 +0200 carrierroute: warning for the same carrier/domain having routes with only 0 probability - While starting kamailio or reloading the routes, if the same carrier/domain pairs do not have any route with a probability other than 0 (zero) then an error log will be printed on the screen. Besides, the log "invalid dice_max value" in the cr_func.c has been made more clear. (cherry picked from commit 9741bee7af8136b35af8e6279e530aa0ad54f574)
* **Operating System**:
Ubuntu 20.04.2 LTS Linux 5.4.0-66-generic #74-Ubuntu SMP Wed Jan 27 22:54:38 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
@lbalaceanu - do you have any remarks on this report? Any change to carrierroute module that you are aware of and may result in such behaviour?
Or maybe @hdikme as the patch author can comment as well.
For reference, the same issue was reported in #2613 -- referencing it here, as it provides some other details that could be useful for troubleshooting.
@miconda @henningw sorry for the late reply, we have started working on the issue as well.
Hallo. We have difficulties in reproducing this. We have ~500.000 entries in the carrierroute table and it does not take so long for us to load them. The difference is, we are using MariaDB (not Postgres). Maybe @kristina258 can tell what DB engine was used in the previous bug report? We are not sure whether it is a DBengine issue or some kind of special table content. Do you maybe have many entries where where the prob is set 0?
test-proxy2-bs ~ # **mysql ser**
Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Welcome to the MariaDB monitor. Commands end with ; or \g. Your MariaDB connection id is 1411 Server version: 10.3.27-MariaDB-0+deb10u1-log Debian 10 Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. MariaDB [ser]> **select count(*) from carrierroute;** +----------+ | count(*) | +----------+ | **501684** | +----------+ 1 row in set (0.000 sec) MariaDB [ser]> Bye test-proxy2-bs ~ # **time kamcmd cr.reload_routes** real **0m35.289s** user 0m0.005s sys 0m0.000s test-proxy2-bs ~ # **kamailio -v** version: **kamailio 5.3.8** (x86_64/linux) flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES, TLS_PTHREAD_MUTEX_SHARED ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB poll method support: poll, epoll_lt, epoll_et, sigio_rt, select. id: compiled with gcc 8.3.0 test-proxy2-bs ~ # **grep -r " and prob>0" /var/log/mysql/mysql.log** 210310 11:18:49 27 Query SELECT * FROM carrierroute WHERE carrier=9101 and domain=5 and prob>0 27 Query SELECT * FROM carrierroute WHERE carrier=9101 and domain=6 and prob>0 27 Query SELECT * FROM carrierroute WHERE carrier=9101 and domain=7 and prob>0 27 Query SELECT * FROM carrierroute WHERE carrier=9101 and domain=0 and prob>0 27 Query SELECT * FROM carrierroute WHERE carrier=9101 and domain=1 and prob>0 27 Query SELECT * FROM carrierroute WHERE carrier=9101 and domain=2 and prob>0
select count(*) from carrierroute where prob=0; 12822
select count(*) from carrierroute where prob=1; 30102
Thank you for fast reply. We will follow this path first and adapt our entries. We keep you informed.
Hi,
**select count(*) from carrierroute where prob=0;** 0
**select count(*) from carrierroute where prob=1;** 1673990
**select count(*) from carrierroute;** 1688229
**kamailio -v** version: kamailio 5.4.3 (x86_64/linux) 06bd17 flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES, TLS_PTHREAD_MUTEX_SHARED ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB poll method support: poll, epoll_lt, epoll_et, sigio_rt, select. id: 06bd17 compiled on 16:10:30 Jan 25 2021 with gcc 8.3.0
**mysql** mysql Ver 15.1 Distrib 10.2.17-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2
Thanks.
Hi @NiteCrwlr , just pushed commit 087c00a276623adad6b87242bc58f7c0b4d8ff33 as a possible fix for this bug. For us it worked since we were able to reproduce the situation by having a lot of entries in the carrierroute table with prob = 0.
Hi,
I have tested the with the new commit, but unfortunately the carrierroute preloading is still the same long for me.
Do you have any other advices for me how to debug such behaviour ?
Thank you.
Hi @kristina258, I will try to reproduce with postgres and will come back to you.
Hi @lbalaceanu: Sorry for late answer, but I did not have time to test it earlier.
I compiled kamailio with your commit. kamailio -v version: kamailio 5.5.0-dev4 (x86_64/linux)
It seems to be ok now! It loads again in 18 seconds. Same DB and Data. For crosschecking I compiled 5.0.3 and 5.0.4 on the same machine: - 5.0.3 -> 18s - 5.0.4 -> 19min - 5.5.0-dev4 -> 18s
Your fix seems to help. I did not perform any further tests.
Thanks to all for your work on kamailio project!
@kristina258 the problem seems to be fixed now for another reporter with the bug fix done. Can you maybe double check if you included the correct commit in your test setup?
Hi,
I am quite sure I tested with the correct commit. My problem seems to be a bit different, because I have no entries with 0 probability.
I tested the commit again, don't know if it helps:
One with prob 0: count | prob -------+------ 12822 | 0 3 | 0.5 6 | 0.01 30102 | 1
And one with prob 999: count | prob -------+------ 12822 | 999 3 | 0.5 6 | 0.01 30102 | 1
Same result in startup speed, around 17s.
Hello @kristina258 ,
Unfortunately, I can't reproduce your issue. I have built Postgres carrierroute tables using kamctlrc/kamdbctl and am testing using main branch (including new commit).
**time ./utils/kamcmd/kamcmd cr.reload_routes real 0m52.270s**
ser=# select count(*) from carrierroute where prob != 0; count -------- 501684
ser=# select count(*) from carrierroute where prob = 0; count ------- 0 (1 row)
Are you also using the carrierroute tables structures provided by the kamdbctl tool? Could you provide some logs from the start/reload process?
Thank you, Lucian
Hi @lbalaceanu,
probably you reproduced it. You had about 500 000 records in carrierroute table and it tooks to load 52s. I have about 1 700 000 and it took 3min. 3 times more records so the loading time is 3 times higher.
When I run kamailio on debian 8 with kamailio version 4.4.7 with the same DB and same carrierroute records it took to load aproximetly 20s.
I would like to kindly ask you, if it is possible for you to perform the test with kamailio version 4.4.7 and then check the time carrierroute needed for reloading.
Thank you for your patience.
@kristina258 in the initial report it was about 16 minutes for 50000 entries, which is clearly not optimal and a non-linear increase. This has been fixed, as i understood.
I think the times that you are observing could be more or less expected and are growing linearly with the number of records. I remember from many years ago that i saw similar load times for that large databases with carrierroute. It could be also caused from changes in the core, e.g. in the memory management area. In the end kamailio will allocate many millions of memory chunks just to store this data. It will also highly depend on the performance of the database etc..
But its in the end of course @lbalaceanu call to make, how he sees the situation for the module.
Hi @kristina258,
I have also tried with kamailio 4.4 but unfortunately this time I get 1min7 secs for the approx 500k entries in carrierroute tables created with kamdbctl tool. I don't know if I am doing something wrong as I don't have experience with postgres, but for me kamailio dev vs kamailio 4.4 are more ore less similar as far as load times are involved.
This being said, @kristina258, I would close the issue.
Closed #2653.