Hi Daniel,
After changing PROC_NOCHLDINIT for PROC_RPC in dmq.c no more errors!
One thing I'm curious, I do still see in the debug logs:
Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:423]: usrloc_dmq_execute_action(): Received DMQ_UPDATE. Update contact info... Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:75]: add_contact(): aor: 1063362 Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:76]: add_contact(): ci->ruid: uloc-2-5e542c06-34ac-1 Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:77]: add_contact(): aorhash: 928261667 Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:83]: add_contact(): Found contact Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: usrloc [ucontact.c:1697]: update_ucontact(): exists callback for type= UL_CONTACT_UPDATE Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: usrloc [ul_callback.h:83]: run_ul_callbacks(): contact=0x7fc404bc6d20, callback type 2/15, id 0 entered Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:800]: dmq_ul_cb_contact(): Callback from usrloc with type=2 Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:826]: dmq_ul_cb_contact(): Contact received from DMQ... skip Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: usrloc [ucontact.c:1009]: db_update_ucontact_ruid(): ruid:uloc-2-5e542c06-34ac-1 Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: usrloc [ucontact.c:1138]: db_update_ucontact_ruid(): contact:sip:1063362@84.121.150.224:61183 Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: <core> [db_query.c:72]: db_do_submit_query(): submitted query: update `location` set `expires`='2020-02-24 17:43:29',`q`=-1.000000 ,`cseq`=33578,`flags`=0,`cflags`=0 ,`user_agent`='Linksys/SPA941-5.1.8',`received`=NULL,`path`=NULL,`socket`='udp: 138.99.136.180:5060',`methods`=4767,`last_modified`='2020-02-24 17:42:29',`callid`='296d8dde-627ad048@84.121.150.224',`instance`= NULL,`reg_id`=0,`server_id`=1,`connection_id`=-1,`keepalive`=0,`contact`=' sip:1063362@84.121.150.224:61183' where `ruid`='uloc-2-5e542c06-34ac-1', result 0 Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:85]: add_contact(): Release record Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:87]: add_contact(): Unlock udomain
Having set modparam("usrloc", "server_id_filter", 1), shouldn't the node receiving locations records via DMQ *not* update the database (as the server_id param won't match with the one in the DMQ request)?
Thanks again! Joel.
H
On Mon, Feb 24, 2020 at 12:43 PM Joel Serrano joel@textplus.com wrote:
Absolutely, I'll report back shortly!
On Mon, Feb 24, 2020 at 12:19 PM Daniel-Constantin Mierla < miconda@gmail.com> wrote:
Hello,
the database connection is not initialized for that process based on the c code, as it is started with no child init callback, where the db connection is usually created: src/modules/dmq/dmq.c
283: newpid = fork_process(PROC_NOCHLDINIT, "DMQ WORKER", 0);
Can you try with PROC_RPC instead of PROC_NOCHLDINIT and see if works ok?
Cheers, Daniel On 24.02.20 21:04, Joel Serrano wrote:
Hi Daniel,
The process desc says: DMQ WORKER
Joel.
On Mon, Feb 24, 2020 at 11:55 AM Daniel-Constantin Mierla < miconda@gmail.com> wrote:
Can you get the type of the process throwing the error? The pid is printed in the log message and the type you can find in the description corresponding to the pid in the output of:
kamctl ps
Cheers, Daniel On 24.02.20 20:42, Joel Serrano wrote:
Ok so my findings so far:
1- db_mode=2 --> No errors. 2- db_mode=1 --> The reported error. 3- modparam("usrloc", "server_id_filter", 1) --> Irrelevant, errors happen with or without it on db_mode=1.
Anything else I can try in order to find out what is happening? My next step was to enable mysql logs or capture traffic and see what exactly is Kam sending...
Let me know your thoughts.
Thanks, Joel.
On Mon, Feb 24, 2020 at 11:26 AM Joel Serrano joel@textplus.com wrote:
Hi Henning,
Every time I register, NodeA says OK, sends location by DMQ to NodeB, nodeB logs the error.
I tried moving the VIP from NodeA to NodeB, and the error stops happening on NodeA, and starts happening on NodeB.
I did not see this happening with db_mode=2, and I *think* it started happening when I enabled the server_id related params? I'm not 100% sure of this so I'm going to try some different combinations to see if I can narrow it down a little and provide more accurate details for you guys to find the issue.
I'll report back shortly...
Thanks! Joel.
On Mon, Feb 24, 2020 at 11:20 AM Henning Westerholt hw@skalatan.de wrote:
Hello Joel,
Feb 24 13:58:46 csbc02 csbc[9280]: ERROR: <core> [db.c:481]: db_use_table(): invalid parameter value Feb 24 13:58:46 csbc02 csbc[9280]: ERROR: usrloc [ucontact.c:670]: db_insert_ucontact(): sql use_table failed for: location
So the table name seems to be ok - this function also log this error if the DB connection handle is not initialized. But still the question why this happens.
Cheers,
Henning
--
Henning Westerholt – https://skalatan.de/blog/
Kamailio services – https://gilawa.com
*From:* sr-users sr-users-bounces@lists.kamailio.org *On Behalf Of *Joel Serrano *Sent:* Monday, February 24, 2020 8:10 PM *To:* Daniel-Constantin Mierla miconda@gmail.com *Cc:* Kamailio (SER) - Users Mailing List <sr-users@lists.kamailio.org
*Subject:* Re: [SR-Users] Database errors with usrloc db_mode=1 and dmq_usrloc
Hello Daniel,
I built from master and re-tested, debug log:
Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core> [core/socket_info.c:635]: grep_sock_info(): checking if port 5060 (advertise 0) matches port 5060 Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc [usrloc_sync.c:423]: usrloc_dmq_execute_action(): Received DMQ_UPDATE. Update contact info... Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc [usrloc_sync.c:75]: add_contact(): aor: 1000107 Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc [usrloc_sync.c:76]: add_contact(): ci->ruid: uloc-2-5e507d6a-1bd5-3 Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc [usrloc_sync.c:77]: add_contact(): aorhash: 928379112 Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc [usrloc_sync.c:115]: add_contact(): '1000107' Not found in usrloc Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc [usrloc_sync.c:117]: add_contact(): Insert record Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: usrloc [ul_callback.h:83]: run_ul_callbacks(): contact=0x7f3a8cfdf170, callback type 1/15, id 0 entered Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc [usrloc_sync.c:800]: dmq_ul_cb_contact(): Callback from usrloc with type=1 Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc [usrloc_sync.c:826]: dmq_ul_cb_contact(): Contact received from DMQ... skip Feb 24 13:58:46 csbc02 csbc[9280]: ERROR: <core> [db.c:481]: db_use_table(): invalid parameter value Feb 24 13:58:46 csbc02 csbc[9280]: ERROR: usrloc [ucontact.c:670]: db_insert_ucontact(): sql use_table failed for: location Feb 24 13:58:46 csbc02 csbc[9280]: ERROR: usrloc [urecord.c:598]: insert_ucontact(): failed to insert in database Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc [usrloc_sync.c:121]: add_contact(): Insert ucontact Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc [usrloc_sync.c:124]: add_contact(): Release record Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc [usrloc_sync.c:126]: add_contact(): Unlock udomain Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: sl [sl.c:305]: send_reply(): reply in stateless mode (sl) Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core> [core/msg_translator.c:161]: check_via_address(): (10.2.1.181, 10.2.1.181, 0) Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dialog [dlg_cseq.c:116]: dlg_cseq_prepare_new_msg(): prepare new msg for cseq update operations Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core> [core/parser/msg_parser.c:620]: parse_msg(): SIP Reply (status): Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core> [core/parser/msg_parser.c:622]: parse_msg(): version: <SIP/2.0> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core> [core/parser/msg_parser.c:624]: parse_msg(): status: <200> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core> [core/parser/msg_parser.c:626]: parse_msg(): reason: <OK>
Thanks,
Joel.
On Mon, Feb 24, 2020 at 1:36 AM Daniel-Constantin Mierla < miconda@gmail.com> wrote:
Hello,
the error message:
Feb 21 00:11:25 csbc02 csbc[13916]: ERROR: <core> [db.c:481]: db_use_table(): invalid parameter value
Is printed if the database table name is not given as parameter (eg, null or empty). I pushed a patch to usrloc to print the table in the log message, can you try with latest master (or patch from next commit) and send again the log messages?
https://github.com/kamailio/kamailio/commit/3843ae18599bfc1806ccb62df44bbc8a...
Cheers, Daniel
On 21.02.20 22:40, Joel Serrano wrote:
Hello,
I have a setup with 2 kamailio in active/idle, locations are shared via dmq_usrloc, but database is also used as a cache where other systems can access and query "real-time" data.
The config we have is:
# ----- usrloc params ----- modparam("usrloc", "db_url", DBURL) modparam("usrloc", "db_mode", 1) modparam("usrloc", "db_load", 0) modparam("usrloc", "db_insert_update", 1) modparam("usrloc", "db_timer_clean", 1) modparam("usrloc", "server_id_filter", 1) modparam("usrloc", "timer_procs", 2) modparam("usrloc", "timer_interval", 60)
# ----- dmq_usrloc params ----- modparam("dmq_usrloc", "enable", 1) modparam("dmq_usrloc", "sync", 1) modparam("dmq_usrloc", "batch_msg_contacts", 50) # 50 contacts / message modparam("dmq_usrloc", "batch_size", 10000) # 10000 contacts / batch modparam("dmq_usrloc", "batch_usleep", 500000) # one batch every 500ms
With this setup, we assume:
1- almost real-time updated location info in the database.
2- as long as one of the two Kam is up, location information will not be lost (as they will sync with each other via DMQ)
3- if both servers go down, locations are lost (as they will not reload from database on startup).
Our tests work well with the above config, with one exception, on the idle node, we see a lot of (only happens on the node receiving location via dmq):
Feb 21 00:11:25 csbc02 csbc[13916]: ERROR: <core> [db.c:481]: db_use_table(): invalid parameter value Feb 21 00:11:25 csbc02 csbc[13916]: ERROR: usrloc [ucontact.c:1136]: db_update_ucontact_ruid(): sql use_table failed Feb 21 00:11:25 csbc02 csbc[13916]: ERROR: usrloc [ucontact.c:1657]: update_contact_db(): failed to update database
I added debug logs to see if I could understand the reason:
Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc [usrloc_sync.c:478]: usrloc_dmq_handle_msg(): dmq message received from sip:usrloc@10.2.1.181:5050 Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc [usrloc_sync.c:423]: usrloc_dmq_execute_action(): Received DMQ_UPDATE. Update contact info... Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc [usrloc_sync.c:75]: add_contact(): aor: 1063362 Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc [usrloc_sync.c:76]: add_contact(): ci->ruid: uloc-1-5e502cd3-5629-1 Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc [usrloc_sync.c:77]: add_contact(): aorhash: 928261667 Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc [usrloc_sync.c:115]: add_contact(): '1063362' Not found in usrloc Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc [usrloc_sync.c:117]: add_contact(): Insert record Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: usrloc [ul_callback.h:83]: run_ul_callbacks(): contact=0x7fe41ff41f40, callback type 1/15, id 0 entered Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc [usrloc_sync.c:800]: dmq_ul_cb_contact(): Callback from usrloc with type=1 Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc [usrloc_sync.c:826]: dmq_ul_cb_contact(): Contact received from DMQ... skip Feb 21 14:38:34 csbc02 csbc[30796]: ERROR: <core> [db.c:481]: db_use_table(): invalid parameter value Feb 21 14:38:34 csbc02 csbc[30796]: ERROR: usrloc [ucontact.c:669]: db_insert_ucontact(): sql use_table failed Feb 21 14:38:34 csbc02 csbc[30796]: ERROR: usrloc [urecord.c:598]: insert_ucontact(): failed to insert in database Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc [usrloc_sync.c:121]: add_contact(): Insert ucontact Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc [usrloc_sync.c:124]: add_contact(): Release record Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc [usrloc_sync.c:126]: add_contact(): Unlock udomain Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: sl [sl.c:305]: send_reply(): reply in stateless mode (sl) Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: <core> [core/msg_translator.c:161]: check_via_address(): (10.2.1.181, 10.2.1.181, 0)
Can anyone help me understand what's the problem? I have the server_id setting, and the modparam server_id_filter, so shouldn't the receiving node just ignore any database tasks?
Thanks,
Joel.
Kamailio (SER) - Users Mailing List
sr-users@lists.kamailio.org
https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
--
Daniel-Constantin Mierla -- www.asipto.com
www.twitter.com/miconda -- www.linkedin.com/in/miconda
Kamailio Advanced Training - March 9-11, 2020, Berlin - www.asipto.com
Kamailio World Conference - April 27-29, 2020, in Berlin -- www.kamailioworld.com
--
Daniel-Constantin Mierla -- www.asipto.comwww.twitter.com/miconda -- www.linkedin.com/in/miconda Kamailio Advanced Training - March 9-11, 2020, Berlin - www.asipto.com Kamailio World Conference - April 27-29, 2020, in Berlin -- www.kamailioworld.com
--
Daniel-Constantin Mierla -- www.asipto.comwww.twitter.com/miconda -- www.linkedin.com/in/miconda Kamailio Advanced Training - March 9-11, 2020, Berlin - www.asipto.com Kamailio World Conference - April 27-29, 2020, in Berlin -- www.kamailioworld.com