[SR-Users] Database errors with usrloc db_mode=1 and dmq_usrloc
Daniel-Constantin Mierla
miconda at gmail.com
Tue Feb 25 08:45:42 CET 2020
Hello,
ok, thanks fr testing, I pushed a commit for it in master and 5.3
branches for now.
Regarding the server id, I haven't implemented the dmq replication for
location records and I guess that the main scope was to work independent
database servers for a geo-distributed deployment, so inserting all
records was considered necessary.
If you want some other behaviour, describe it in a feature request and
it can be implemented.
Cheers,
Daniel
On 24.02.20 23:45, Joel Serrano wrote:
> Hi Daniel,
>
> After changing PROC_NOCHLDINIT for PROC_RPC in dmq.c no more errors!
>
> One thing I'm curious, I do still see in the debug logs:
>
> Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc
> [usrloc_sync.c:423]: usrloc_dmq_execute_action(): Received DMQ_UPDATE.
> Update contact info...
> Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc
> [usrloc_sync.c:75]: add_contact(): aor: 1063362
> Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc
> [usrloc_sync.c:76]: add_contact(): ci->ruid: uloc-2-5e542c06-34ac-1
> Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc
> [usrloc_sync.c:77]: add_contact(): aorhash: 928261667
> Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc
> [usrloc_sync.c:83]: add_contact(): Found contact
> Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: usrloc [ucontact.c:1697]:
> update_ucontact(): exists callback for type= UL_CONTACT_UPDATE
> Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: usrloc [ul_callback.h:83]:
> run_ul_callbacks(): contact=0x7fc404bc6d20, callback type 2/15, id 0
> entered
> Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc
> [usrloc_sync.c:800]: dmq_ul_cb_contact(): Callback from usrloc with type=2
> Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc
> [usrloc_sync.c:826]: dmq_ul_cb_contact(): Contact received from DMQ...
> skip
> Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: usrloc [ucontact.c:1009]:
> db_update_ucontact_ruid(): ruid:uloc-2-5e542c06-34ac-1
> Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: usrloc [ucontact.c:1138]:
> db_update_ucontact_ruid(): contact:sip:1063362 at 84.121.150.224:61183
> <http://contact:sip:1063362@84.121.150.224:61183>
> Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: <core> [db_query.c:72]:
> db_do_submit_query(): submitted query: update `location` set
> `expires`='2020-02-24 17:43:29',`q`=-1.000000
> ,`cseq`=33578,`flags`=0,`cflags`=0
> ,`user_agent`='Linksys/SPA941-5.1.8',`received`=NULL,`path`=NULL,`socket`='udp:138.99.136.180:5060
> <http://138.99.136.180:5060>',`methods`=4767,`last_modified`='2020-02-24
> 17:42:29',`callid`='296d8dde-627ad048 at 84.121.150.224
> <mailto:296d8dde-627ad048 at 84.121.150.224>',`instance`=
> NULL,`reg_id`=0,`server_id`=1,`connection_id`=-1,`keepalive`=0,`contact`='sip:1063362 at 84.121.150.224:61183
> <http://sip:1063362@84.121.150.224:61183>' where
> `ruid`='uloc-2-5e542c06-34ac-1', result 0
> Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc
> [usrloc_sync.c:85]: add_contact(): Release record
> Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc
> [usrloc_sync.c:87]: add_contact(): Unlock udomain
>
> Having set modparam("usrloc", "server_id_filter", 1), shouldn't the
> node receiving locations records via DMQ *not* update the database (as
> the server_id param won't match with the one in the DMQ request)?
>
> Thanks again!
> Joel.
>
>
> H
>
> On Mon, Feb 24, 2020 at 12:43 PM Joel Serrano <joel at textplus.com
> <mailto:joel at textplus.com>> wrote:
>
> Absolutely, I'll report back shortly!
>
> On Mon, Feb 24, 2020 at 12:19 PM Daniel-Constantin Mierla
> <miconda at gmail.com <mailto:miconda at gmail.com>> wrote:
>
> Hello,
>
> the database connection is not initialized for that process
> based on the c code, as it is started with no child init
> callback, where the db connection is usually created:
>
> src/modules/dmq/dmq.c
>
> 283: newpid = fork_process(PROC_NOCHLDINIT, "DMQ
> WORKER", 0);
>
> Can you try with PROC_RPC instead of PROC_NOCHLDINIT and see
> if works ok?
>
> Cheers,
> Daniel
>
> On 24.02.20 21:04, Joel Serrano wrote:
>> Hi Daniel,
>>
>> The process desc says: DMQ WORKER
>>
>> Joel.
>>
>> On Mon, Feb 24, 2020 at 11:55 AM Daniel-Constantin Mierla
>> <miconda at gmail.com <mailto:miconda at gmail.com>> wrote:
>>
>> Can you get the type of the process throwing the error?
>> The pid is printed in the log message and the type you
>> can find in the description corresponding to the pid in
>> the output of:
>>
>> kamctl ps
>>
>> Cheers,
>> Daniel
>>
>> On 24.02.20 20:42, Joel Serrano wrote:
>>> Ok so my findings so far:
>>>
>>> 1- db_mode=2 --> No errors.
>>> 2- db_mode=1 --> The reported error.
>>> 3- modparam("usrloc", "server_id_filter", 1) -->
>>> Irrelevant, errors happen with or without it on db_mode=1.
>>>
>>> Anything else I can try in order to find out what is
>>> happening? My next step was to enable mysql logs or
>>> capture traffic and see what exactly is Kam sending...
>>>
>>> Let me know your thoughts.
>>>
>>> Thanks,
>>> Joel.
>>>
>>>
>>>
>>> On Mon, Feb 24, 2020 at 11:26 AM Joel Serrano
>>> <joel at textplus.com <mailto:joel at textplus.com>> wrote:
>>>
>>> Hi Henning,
>>>
>>> Every time I register, NodeA says OK, sends location
>>> by DMQ to NodeB, nodeB logs the error.
>>>
>>> I tried moving the VIP from NodeA to NodeB, and the
>>> error stops happening on NodeA, and starts happening
>>> on NodeB.
>>>
>>> I did not see this happening with db_mode=2, and I
>>> *think* it started happening when I enabled the
>>> server_id related params? I'm not 100% sure of this
>>> so I'm going to try some different combinations to
>>> see if I can narrow it down a little and provide
>>> more accurate details for you guys to find the issue.
>>>
>>> I'll report back shortly...
>>>
>>> Thanks!
>>> Joel.
>>>
>>>
>>>
>>> On Mon, Feb 24, 2020 at 11:20 AM Henning Westerholt
>>> <hw at skalatan.de <mailto:hw at skalatan.de>> wrote:
>>>
>>> Hello Joel,
>>>
>>>
>>>
>>> Feb 24 13:58:46 csbc02 csbc[9280]: ERROR: <core>
>>> [db.c:481]: db_use_table(): invalid parameter value
>>> Feb 24 13:58:46 csbc02 csbc[9280]: ERROR: usrloc
>>> [ucontact.c:670]: db_insert_ucontact(): sql
>>> use_table failed for: location
>>>
>>>
>>>
>>> So the table name seems to be ok - this function
>>> also log this error if the DB connection handle
>>> is not initialized. But still the question why
>>> this happens.
>>>
>>>
>>>
>>> Cheers,
>>>
>>>
>>>
>>> Henning
>>>
>>>
>>>
>>> --
>>>
>>> Henning Westerholt – https://skalatan.de/blog/
>>>
>>> Kamailio services – https://gilawa.com
>>> <https://gilawa.com/>
>>>
>>>
>>>
>>> *From:* sr-users
>>> <sr-users-bounces at lists.kamailio.org
>>> <mailto:sr-users-bounces at lists.kamailio.org>>
>>> *On Behalf Of *Joel Serrano
>>> *Sent:* Monday, February 24, 2020 8:10 PM
>>> *To:* Daniel-Constantin Mierla
>>> <miconda at gmail.com <mailto:miconda at gmail.com>>
>>> *Cc:* Kamailio (SER) - Users Mailing List
>>> <sr-users at lists.kamailio.org
>>> <mailto:sr-users at lists.kamailio.org>>
>>> *Subject:* Re: [SR-Users] Database errors with
>>> usrloc db_mode=1 and dmq_usrloc
>>>
>>>
>>>
>>> Hello Daniel,
>>>
>>>
>>>
>>> I built from master and re-tested, debug log:
>>>
>>>
>>>
>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core>
>>> [core/socket_info.c:635]: grep_sock_info():
>>> checking if port 5060 (advertise 0) matches port
>>> 5060
>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG:
>>> dmq_usrloc [usrloc_sync.c:423]:
>>> usrloc_dmq_execute_action(): Received
>>> DMQ_UPDATE. Update contact info...
>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG:
>>> dmq_usrloc [usrloc_sync.c:75]: add_contact():
>>> aor: 1000107
>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG:
>>> dmq_usrloc [usrloc_sync.c:76]: add_contact():
>>> ci->ruid: uloc-2-5e507d6a-1bd5-3
>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG:
>>> dmq_usrloc [usrloc_sync.c:77]: add_contact():
>>> aorhash: 928379112
>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG:
>>> dmq_usrloc [usrloc_sync.c:115]: add_contact():
>>> '1000107' Not found in usrloc
>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG:
>>> dmq_usrloc [usrloc_sync.c:117]: add_contact():
>>> Insert record
>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: usrloc
>>> [ul_callback.h:83]: run_ul_callbacks():
>>> contact=0x7f3a8cfdf170, callback type 1/15, id 0
>>> entered
>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG:
>>> dmq_usrloc [usrloc_sync.c:800]:
>>> dmq_ul_cb_contact(): Callback from usrloc with
>>> type=1
>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG:
>>> dmq_usrloc [usrloc_sync.c:826]:
>>> dmq_ul_cb_contact(): Contact received from
>>> DMQ... skip
>>> Feb 24 13:58:46 csbc02 csbc[9280]: ERROR: <core>
>>> [db.c:481]: db_use_table(): invalid parameter value
>>> Feb 24 13:58:46 csbc02 csbc[9280]: ERROR: usrloc
>>> [ucontact.c:670]: db_insert_ucontact(): sql
>>> use_table failed for: location
>>> Feb 24 13:58:46 csbc02 csbc[9280]: ERROR: usrloc
>>> [urecord.c:598]: insert_ucontact(): failed to
>>> insert in database
>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG:
>>> dmq_usrloc [usrloc_sync.c:121]: add_contact():
>>> Insert ucontact
>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG:
>>> dmq_usrloc [usrloc_sync.c:124]: add_contact():
>>> Release record
>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG:
>>> dmq_usrloc [usrloc_sync.c:126]: add_contact():
>>> Unlock udomain
>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: sl
>>> [sl.c:305]: send_reply(): reply in stateless
>>> mode (sl)
>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core>
>>> [core/msg_translator.c:161]:
>>> check_via_address(): (10.2.1.181, 10.2.1.181, 0)
>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dialog
>>> [dlg_cseq.c:116]: dlg_cseq_prepare_new_msg():
>>> prepare new msg for cseq update operations
>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core>
>>> [core/parser/msg_parser.c:620]: parse_msg(): SIP
>>> Reply (status):
>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core>
>>> [core/parser/msg_parser.c:622]: parse_msg():
>>> version: <SIP/2.0>
>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core>
>>> [core/parser/msg_parser.c:624]: parse_msg():
>>> status: <200>
>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core>
>>> [core/parser/msg_parser.c:626]: parse_msg():
>>> reason: <OK>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Joel.
>>>
>>>
>>>
>>> On Mon, Feb 24, 2020 at 1:36 AM
>>> Daniel-Constantin Mierla <miconda at gmail.com
>>> <mailto:miconda at gmail.com>> wrote:
>>>
>>> Hello,
>>>
>>> the error message:
>>>
>>> Feb 21 00:11:25 csbc02 csbc[13916]: ERROR:
>>> <core> [db.c:481]: db_use_table(): invalid
>>> parameter value
>>>
>>> Is printed if the database table name is not
>>> given as parameter (eg, null or empty). I
>>> pushed a patch to usrloc to print the table
>>> in the log message, can you try with latest
>>> master (or patch from next commit) and send
>>> again the log messages?
>>>
>>> *
>>> https://github.com/kamailio/kamailio/commit/3843ae18599bfc1806ccb62df44bbc8ad41d60dd
>>>
>>> Cheers,
>>> Daniel
>>>
>>> On 21.02.20 22:40, Joel Serrano wrote:
>>>
>>> Hello,
>>>
>>>
>>>
>>> I have a setup with 2 kamailio in
>>> active/idle, locations are shared via
>>> dmq_usrloc, but database is also used as
>>> a cache where other systems can access
>>> and query "real-time" data.
>>>
>>>
>>>
>>> The config we have is:
>>>
>>>
>>>
>>> # ----- usrloc params -----
>>> modparam("usrloc", "db_url", DBURL)
>>> modparam("usrloc", "db_mode", 1)
>>> modparam("usrloc", "db_load", 0)
>>> modparam("usrloc", "db_insert_update", 1)
>>> modparam("usrloc", "db_timer_clean", 1)
>>> modparam("usrloc", "server_id_filter", 1)
>>> modparam("usrloc", "timer_procs", 2)
>>> modparam("usrloc", "timer_interval", 60)
>>>
>>>
>>>
>>> # ----- dmq_usrloc params -----
>>> modparam("dmq_usrloc", "enable", 1)
>>> modparam("dmq_usrloc", "sync", 1)
>>> modparam("dmq_usrloc",
>>> "batch_msg_contacts", 50) # 50 contacts
>>> / message
>>> modparam("dmq_usrloc", "batch_size",
>>> 10000) # 10000 contacts / batch
>>> modparam("dmq_usrloc", "batch_usleep",
>>> 500000) # one batch every 500ms
>>>
>>>
>>>
>>> With this setup, we assume:
>>>
>>>
>>>
>>> 1- almost real-time updated location
>>> info in the database.
>>>
>>> 2- as long as one of the two Kam is up,
>>> location information will not be lost
>>> (as they will sync with each other via DMQ)
>>>
>>> 3- if both servers go down, locations
>>> are lost (as they will not reload from
>>> database on startup).
>>>
>>>
>>>
>>> Our tests work well with the above
>>> config, with one exception, on the idle
>>> node, we see a lot of (only happens on
>>> the node receiving location via dmq):
>>>
>>>
>>>
>>> Feb 21 00:11:25 csbc02 csbc[13916]:
>>> ERROR: <core> [db.c:481]:
>>> db_use_table(): invalid parameter value
>>> Feb 21 00:11:25 csbc02 csbc[13916]:
>>> ERROR: usrloc [ucontact.c:1136]:
>>> db_update_ucontact_ruid(): sql use_table
>>> failed
>>> Feb 21 00:11:25 csbc02 csbc[13916]:
>>> ERROR: usrloc [ucontact.c:1657]:
>>> update_contact_db(): failed to update
>>> database
>>>
>>>
>>>
>>> I added debug logs to see if I could
>>> understand the reason:
>>>
>>>
>>>
>>> Feb 21 14:38:34 csbc02 csbc[30796]:
>>> DEBUG: dmq_usrloc [usrloc_sync.c:478]:
>>> usrloc_dmq_handle_msg(): dmq message
>>> received from sip:usrloc at 10.2.1.181:5050
>>> <http://sip:usrloc@10.2.1.181:5050>
>>> Feb 21 14:38:34 csbc02 csbc[30796]:
>>> DEBUG: dmq_usrloc [usrloc_sync.c:423]:
>>> usrloc_dmq_execute_action(): Received
>>> DMQ_UPDATE. Update contact info...
>>> Feb 21 14:38:34 csbc02 csbc[30796]:
>>> DEBUG: dmq_usrloc [usrloc_sync.c:75]:
>>> add_contact(): aor: 1063362
>>> Feb 21 14:38:34 csbc02 csbc[30796]:
>>> DEBUG: dmq_usrloc [usrloc_sync.c:76]:
>>> add_contact(): ci->ruid:
>>> uloc-1-5e502cd3-5629-1
>>> Feb 21 14:38:34 csbc02 csbc[30796]:
>>> DEBUG: dmq_usrloc [usrloc_sync.c:77]:
>>> add_contact(): aorhash: 928261667
>>> Feb 21 14:38:34 csbc02 csbc[30796]:
>>> DEBUG: dmq_usrloc [usrloc_sync.c:115]:
>>> add_contact(): '1063362' Not found in usrloc
>>> Feb 21 14:38:34 csbc02 csbc[30796]:
>>> DEBUG: dmq_usrloc [usrloc_sync.c:117]:
>>> add_contact(): Insert record
>>> Feb 21 14:38:34 csbc02 csbc[30796]:
>>> DEBUG: usrloc [ul_callback.h:83]:
>>> run_ul_callbacks():
>>> contact=0x7fe41ff41f40, callback type
>>> 1/15, id 0 entered
>>> Feb 21 14:38:34 csbc02 csbc[30796]:
>>> DEBUG: dmq_usrloc [usrloc_sync.c:800]:
>>> dmq_ul_cb_contact(): Callback from
>>> usrloc with type=1
>>> Feb 21 14:38:34 csbc02 csbc[30796]:
>>> DEBUG: dmq_usrloc [usrloc_sync.c:826]:
>>> dmq_ul_cb_contact(): Contact received
>>> from DMQ... skip
>>> Feb 21 14:38:34 csbc02 csbc[30796]:
>>> ERROR: <core> [db.c:481]:
>>> db_use_table(): invalid parameter value
>>> Feb 21 14:38:34 csbc02 csbc[30796]:
>>> ERROR: usrloc [ucontact.c:669]:
>>> db_insert_ucontact(): sql use_table failed
>>> Feb 21 14:38:34 csbc02 csbc[30796]:
>>> ERROR: usrloc [urecord.c:598]:
>>> insert_ucontact(): failed to insert in
>>> database
>>> Feb 21 14:38:34 csbc02 csbc[30796]:
>>> DEBUG: dmq_usrloc [usrloc_sync.c:121]:
>>> add_contact(): Insert ucontact
>>> Feb 21 14:38:34 csbc02 csbc[30796]:
>>> DEBUG: dmq_usrloc [usrloc_sync.c:124]:
>>> add_contact(): Release record
>>> Feb 21 14:38:34 csbc02 csbc[30796]:
>>> DEBUG: dmq_usrloc [usrloc_sync.c:126]:
>>> add_contact(): Unlock udomain
>>> Feb 21 14:38:34 csbc02 csbc[30796]:
>>> DEBUG: sl [sl.c:305]: send_reply():
>>> reply in stateless mode (sl)
>>> Feb 21 14:38:34 csbc02 csbc[30796]:
>>> DEBUG: <core>
>>> [core/msg_translator.c:161]:
>>> check_via_address(): (10.2.1.181,
>>> 10.2.1.181, 0)
>>>
>>>
>>>
>>> Can anyone help me understand what's the
>>> problem? I have the server_id setting,
>>> and the modparam server_id_filter, so
>>> shouldn't the receiving node just ignore
>>> any database tasks?
>>>
>>>
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Joel.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>>
>>> Kamailio (SER) - Users Mailing List
>>>
>>> sr-users at lists.kamailio.org <mailto:sr-users at lists.kamailio.org>
>>>
>>> https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
>>>
>>> --
>>>
>>> Daniel-Constantin Mierla -- www.asipto.com <http://www.asipto.com>
>>>
>>> www.twitter.com/miconda <http://www.twitter.com/miconda> -- www.linkedin.com/in/miconda <http://www.linkedin.com/in/miconda>
>>>
>>> Kamailio Advanced Training - March 9-11, 2020, Berlin - www.asipto.com <http://www.asipto.com>
>>>
>>> Kamailio World Conference - April 27-29, 2020, in Berlin -- www.kamailioworld.com <http://www.kamailioworld.com>
>>>
>> --
>> Daniel-Constantin Mierla -- www.asipto.com <http://www.asipto.com>
>> www.twitter.com/miconda <http://www.twitter.com/miconda> -- www.linkedin.com/in/miconda <http://www.linkedin.com/in/miconda>
>> Kamailio Advanced Training - March 9-11, 2020, Berlin - www.asipto.com <http://www.asipto.com>
>> Kamailio World Conference - April 27-29, 2020, in Berlin -- www.kamailioworld.com <http://www.kamailioworld.com>
>>
> --
> Daniel-Constantin Mierla -- www.asipto.com <http://www.asipto.com>
> www.twitter.com/miconda <http://www.twitter.com/miconda> -- www.linkedin.com/in/miconda <http://www.linkedin.com/in/miconda>
> Kamailio Advanced Training - March 9-11, 2020, Berlin - www.asipto.com <http://www.asipto.com>
> Kamailio World Conference - April 27-29, 2020, in Berlin -- www.kamailioworld.com <http://www.kamailioworld.com>
>
--
Daniel-Constantin Mierla -- www.asipto.com
www.twitter.com/miconda -- www.linkedin.com/in/miconda
Kamailio Advanced Training - March 9-11, 2020, Berlin - www.asipto.com
Kamailio World Conference - April 27-29, 2020, in Berlin -- www.kamailioworld.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.kamailio.org/pipermail/sr-users/attachments/20200225/dfc0b89c/attachment.html>
More information about the sr-users
mailing list