[SR-Users] Database errors with usrloc db_mode=1 and dmq_usrloc

Joel Serrano joel at textplus.com
Mon Feb 24 23:45:02 CET 2020


Hi Daniel,

After changing PROC_NOCHLDINIT for PROC_RPC in dmq.c no more errors!

One thing I'm curious, I do still see in the debug logs:

Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:423]:
usrloc_dmq_execute_action(): Received DMQ_UPDATE. Update contact info...
Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:75]:
add_contact(): aor: 1063362
Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:76]:
add_contact(): ci->ruid: uloc-2-5e542c06-34ac-1
Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:77]:
add_contact(): aorhash: 928261667
Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:83]:
add_contact(): Found contact
Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: usrloc [ucontact.c:1697]:
update_ucontact(): exists callback for type= UL_CONTACT_UPDATE
Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: usrloc [ul_callback.h:83]:
run_ul_callbacks(): contact=0x7fc404bc6d20, callback type 2/15, id 0 entered
Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:800]:
dmq_ul_cb_contact(): Callback from usrloc with type=2
Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:826]:
dmq_ul_cb_contact(): Contact received from DMQ... skip
Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: usrloc [ucontact.c:1009]:
db_update_ucontact_ruid(): ruid:uloc-2-5e542c06-34ac-1
Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: usrloc [ucontact.c:1138]:
db_update_ucontact_ruid(): contact:sip:1063362 at 84.121.150.224:61183
Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: <core> [db_query.c:72]:
db_do_submit_query(): submitted query: update `location` set
`expires`='2020-02-24 17:43:29',`q`=-1.000000
,`cseq`=33578,`flags`=0,`cflags`=0
,`user_agent`='Linksys/SPA941-5.1.8',`received`=NULL,`path`=NULL,`socket`='udp:
138.99.136.180:5060',`methods`=4767,`last_modified`='2020-02-24
17:42:29',`callid`='296d8dde-627ad048 at 84.121.150.224',`instance`=
NULL,`reg_id`=0,`server_id`=1,`connection_id`=-1,`keepalive`=0,`contact`='
sip:1063362 at 84.121.150.224:61183' where `ruid`='uloc-2-5e542c06-34ac-1',
result 0
Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:85]:
add_contact(): Release record
Feb 24 17:42:29 csbc02 csbc[17784]: DEBUG: dmq_usrloc [usrloc_sync.c:87]:
add_contact(): Unlock udomain

Having set modparam("usrloc", "server_id_filter", 1), shouldn't the node
receiving locations records via DMQ *not* update the database (as the
server_id param won't match with the one in the DMQ request)?

Thanks again!
Joel.


H

On Mon, Feb 24, 2020 at 12:43 PM Joel Serrano <joel at textplus.com> wrote:

> Absolutely, I'll report back shortly!
>
> On Mon, Feb 24, 2020 at 12:19 PM Daniel-Constantin Mierla <
> miconda at gmail.com> wrote:
>
>> Hello,
>>
>> the database connection is not initialized for that process based on the
>> c code, as it is started with no child init callback, where the db
>> connection is usually created:
>> src/modules/dmq/dmq.c
>>
>> 283:            newpid = fork_process(PROC_NOCHLDINIT, "DMQ WORKER", 0);
>>
>> Can you try with PROC_RPC instead of PROC_NOCHLDINIT and see if works ok?
>>
>> Cheers,
>> Daniel
>> On 24.02.20 21:04, Joel Serrano wrote:
>>
>> Hi Daniel,
>>
>> The process desc says: DMQ WORKER
>>
>> Joel.
>>
>> On Mon, Feb 24, 2020 at 11:55 AM Daniel-Constantin Mierla <
>> miconda at gmail.com> wrote:
>>
>>> Can you get the type of the process throwing the error? The pid is
>>> printed in the log message and the type you can find in the description
>>> corresponding to the pid in the output of:
>>>
>>> kamctl ps
>>>
>>> Cheers,
>>> Daniel
>>> On 24.02.20 20:42, Joel Serrano wrote:
>>>
>>> Ok so my findings so far:
>>>
>>> 1- db_mode=2 --> No errors.
>>> 2- db_mode=1 --> The reported error.
>>> 3- modparam("usrloc", "server_id_filter", 1) --> Irrelevant, errors
>>> happen with or without it on db_mode=1.
>>>
>>> Anything else I can try in order to find out what is happening? My next
>>> step was to enable mysql logs or capture traffic and see what exactly is
>>> Kam sending...
>>>
>>> Let me know your thoughts.
>>>
>>> Thanks,
>>> Joel.
>>>
>>>
>>>
>>> On Mon, Feb 24, 2020 at 11:26 AM Joel Serrano <joel at textplus.com> wrote:
>>>
>>>> Hi Henning,
>>>>
>>>> Every time I register, NodeA says OK, sends location by DMQ to NodeB,
>>>> nodeB logs the error.
>>>>
>>>> I tried moving the VIP from NodeA to NodeB, and the error stops
>>>> happening on NodeA, and starts happening on NodeB.
>>>>
>>>> I did not see this happening with db_mode=2, and I *think* it started
>>>> happening when I enabled the server_id related params? I'm not 100% sure of
>>>> this so I'm going to try some different combinations to see if I can narrow
>>>> it down a little and provide more accurate details for you guys to find the
>>>> issue.
>>>>
>>>> I'll report back shortly...
>>>>
>>>> Thanks!
>>>> Joel.
>>>>
>>>>
>>>>
>>>> On Mon, Feb 24, 2020 at 11:20 AM Henning Westerholt <hw at skalatan.de>
>>>> wrote:
>>>>
>>>>> Hello Joel,
>>>>>
>>>>>
>>>>>
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: ERROR: <core> [db.c:481]:
>>>>> db_use_table(): invalid parameter value
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: ERROR: usrloc [ucontact.c:670]:
>>>>> db_insert_ucontact(): sql use_table failed for: location
>>>>>
>>>>>
>>>>>
>>>>> So the table name seems to be ok - this function also log this error
>>>>> if the DB connection handle is not initialized. But still the question why
>>>>> this happens.
>>>>>
>>>>>
>>>>>
>>>>> Cheers,
>>>>>
>>>>>
>>>>>
>>>>> Henning
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Henning Westerholt – https://skalatan.de/blog/
>>>>>
>>>>> Kamailio services – https://gilawa.com
>>>>>
>>>>>
>>>>>
>>>>> *From:* sr-users <sr-users-bounces at lists.kamailio.org> *On Behalf Of *Joel
>>>>> Serrano
>>>>> *Sent:* Monday, February 24, 2020 8:10 PM
>>>>> *To:* Daniel-Constantin Mierla <miconda at gmail.com>
>>>>> *Cc:* Kamailio (SER) - Users Mailing List <sr-users at lists.kamailio.org
>>>>> >
>>>>> *Subject:* Re: [SR-Users] Database errors with usrloc db_mode=1 and
>>>>> dmq_usrloc
>>>>>
>>>>>
>>>>>
>>>>> Hello Daniel,
>>>>>
>>>>>
>>>>>
>>>>> I built from master and re-tested, debug log:
>>>>>
>>>>>
>>>>>
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core>
>>>>> [core/socket_info.c:635]: grep_sock_info(): checking if port 5060
>>>>> (advertise 0) matches port 5060
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc
>>>>> [usrloc_sync.c:423]: usrloc_dmq_execute_action(): Received DMQ_UPDATE.
>>>>> Update contact info...
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc
>>>>> [usrloc_sync.c:75]: add_contact(): aor: 1000107
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc
>>>>> [usrloc_sync.c:76]: add_contact(): ci->ruid: uloc-2-5e507d6a-1bd5-3
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc
>>>>> [usrloc_sync.c:77]: add_contact(): aorhash: 928379112
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc
>>>>> [usrloc_sync.c:115]: add_contact(): '1000107' Not found in usrloc
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc
>>>>> [usrloc_sync.c:117]: add_contact(): Insert record
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: usrloc [ul_callback.h:83]:
>>>>> run_ul_callbacks(): contact=0x7f3a8cfdf170, callback type 1/15, id 0 entered
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc
>>>>> [usrloc_sync.c:800]: dmq_ul_cb_contact(): Callback from usrloc with type=1
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc
>>>>> [usrloc_sync.c:826]: dmq_ul_cb_contact(): Contact received from DMQ... skip
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: ERROR: <core> [db.c:481]:
>>>>> db_use_table(): invalid parameter value
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: ERROR: usrloc [ucontact.c:670]:
>>>>> db_insert_ucontact(): sql use_table failed for: location
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: ERROR: usrloc [urecord.c:598]:
>>>>> insert_ucontact(): failed to insert in database
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc
>>>>> [usrloc_sync.c:121]: add_contact(): Insert ucontact
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc
>>>>> [usrloc_sync.c:124]: add_contact(): Release record
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dmq_usrloc
>>>>> [usrloc_sync.c:126]: add_contact(): Unlock udomain
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: sl [sl.c:305]: send_reply():
>>>>> reply in stateless mode (sl)
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core>
>>>>> [core/msg_translator.c:161]: check_via_address(): (10.2.1.181, 10.2.1.181,
>>>>> 0)
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: dialog [dlg_cseq.c:116]:
>>>>> dlg_cseq_prepare_new_msg(): prepare new msg for cseq update operations
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core>
>>>>> [core/parser/msg_parser.c:620]: parse_msg(): SIP Reply  (status):
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core>
>>>>> [core/parser/msg_parser.c:622]: parse_msg():  version: <SIP/2.0>
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core>
>>>>> [core/parser/msg_parser.c:624]: parse_msg():  status:  <200>
>>>>> Feb 24 13:58:46 csbc02 csbc[9280]: DEBUG: <core>
>>>>> [core/parser/msg_parser.c:626]: parse_msg():  reason:  <OK>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Joel.
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Feb 24, 2020 at 1:36 AM Daniel-Constantin Mierla <
>>>>> miconda at gmail.com> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> the error message:
>>>>>
>>>>> Feb 21 00:11:25 csbc02 csbc[13916]: ERROR: <core> [db.c:481]:
>>>>> db_use_table(): invalid parameter value
>>>>>
>>>>> Is printed if the database table name is not given as parameter (eg,
>>>>> null or empty). I pushed a patch to usrloc to print the table in the log
>>>>> message, can you try with latest master (or patch from next commit) and
>>>>> send again the log messages?
>>>>>
>>>>>   *
>>>>> https://github.com/kamailio/kamailio/commit/3843ae18599bfc1806ccb62df44bbc8ad41d60dd
>>>>>
>>>>> Cheers,
>>>>> Daniel
>>>>>
>>>>> On 21.02.20 22:40, Joel Serrano wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>>
>>>>>
>>>>> I have a setup with 2 kamailio in active/idle, locations are shared
>>>>> via dmq_usrloc, but database is also used as a cache where other systems
>>>>> can access and query "real-time" data.
>>>>>
>>>>>
>>>>>
>>>>> The config we have is:
>>>>>
>>>>>
>>>>>
>>>>> # ----- usrloc params -----
>>>>> modparam("usrloc", "db_url", DBURL)
>>>>> modparam("usrloc", "db_mode", 1)
>>>>> modparam("usrloc", "db_load", 0)
>>>>> modparam("usrloc", "db_insert_update", 1)
>>>>> modparam("usrloc", "db_timer_clean", 1)
>>>>> modparam("usrloc", "server_id_filter", 1)
>>>>> modparam("usrloc", "timer_procs", 2)
>>>>> modparam("usrloc", "timer_interval", 60)
>>>>>
>>>>>
>>>>>
>>>>> # ----- dmq_usrloc params -----
>>>>> modparam("dmq_usrloc", "enable", 1)
>>>>> modparam("dmq_usrloc", "sync", 1)
>>>>> modparam("dmq_usrloc", "batch_msg_contacts", 50)  # 50 contacts /
>>>>> message
>>>>> modparam("dmq_usrloc", "batch_size", 10000)       # 10000 contacts /
>>>>> batch
>>>>> modparam("dmq_usrloc", "batch_usleep", 500000)    # one batch every
>>>>> 500ms
>>>>>
>>>>>
>>>>>
>>>>> With this setup, we assume:
>>>>>
>>>>>
>>>>>
>>>>> 1- almost real-time updated location info in the database.
>>>>>
>>>>> 2- as long as one of the two Kam is up, location information will not
>>>>> be lost (as they will sync with each other via DMQ)
>>>>>
>>>>> 3- if both servers go down, locations are lost (as they will not
>>>>> reload from database on startup).
>>>>>
>>>>>
>>>>>
>>>>> Our tests work well with the above config, with one exception, on the
>>>>> idle node, we see a lot of (only happens on the node receiving location via
>>>>> dmq):
>>>>>
>>>>>
>>>>>
>>>>> Feb 21 00:11:25 csbc02 csbc[13916]: ERROR: <core> [db.c:481]:
>>>>> db_use_table(): invalid parameter value
>>>>> Feb 21 00:11:25 csbc02 csbc[13916]: ERROR: usrloc [ucontact.c:1136]:
>>>>> db_update_ucontact_ruid(): sql use_table failed
>>>>> Feb 21 00:11:25 csbc02 csbc[13916]: ERROR: usrloc [ucontact.c:1657]:
>>>>> update_contact_db(): failed to update database
>>>>>
>>>>>
>>>>>
>>>>> I added debug logs to see if I could understand the reason:
>>>>>
>>>>>
>>>>>
>>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc
>>>>> [usrloc_sync.c:478]: usrloc_dmq_handle_msg(): dmq message received from
>>>>> sip:usrloc at 10.2.1.181:5050
>>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc
>>>>> [usrloc_sync.c:423]: usrloc_dmq_execute_action(): Received DMQ_UPDATE.
>>>>> Update contact info...
>>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc
>>>>> [usrloc_sync.c:75]: add_contact(): aor: 1063362
>>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc
>>>>> [usrloc_sync.c:76]: add_contact(): ci->ruid: uloc-1-5e502cd3-5629-1
>>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc
>>>>> [usrloc_sync.c:77]: add_contact(): aorhash: 928261667
>>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc
>>>>> [usrloc_sync.c:115]: add_contact(): '1063362' Not found in usrloc
>>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc
>>>>> [usrloc_sync.c:117]: add_contact(): Insert record
>>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: usrloc [ul_callback.h:83]:
>>>>> run_ul_callbacks(): contact=0x7fe41ff41f40, callback type 1/15, id 0 entered
>>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc
>>>>> [usrloc_sync.c:800]: dmq_ul_cb_contact(): Callback from usrloc with type=1
>>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc
>>>>> [usrloc_sync.c:826]: dmq_ul_cb_contact(): Contact received from DMQ... skip
>>>>> Feb 21 14:38:34 csbc02 csbc[30796]: ERROR: <core> [db.c:481]:
>>>>> db_use_table(): invalid parameter value
>>>>> Feb 21 14:38:34 csbc02 csbc[30796]: ERROR: usrloc [ucontact.c:669]:
>>>>> db_insert_ucontact(): sql use_table failed
>>>>> Feb 21 14:38:34 csbc02 csbc[30796]: ERROR: usrloc [urecord.c:598]:
>>>>> insert_ucontact(): failed to insert in database
>>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc
>>>>> [usrloc_sync.c:121]: add_contact(): Insert ucontact
>>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc
>>>>> [usrloc_sync.c:124]: add_contact(): Release record
>>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: dmq_usrloc
>>>>> [usrloc_sync.c:126]: add_contact(): Unlock udomain
>>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: sl [sl.c:305]:
>>>>> send_reply(): reply in stateless mode (sl)
>>>>> Feb 21 14:38:34 csbc02 csbc[30796]: DEBUG: <core>
>>>>> [core/msg_translator.c:161]: check_via_address(): (10.2.1.181, 10.2.1.181,
>>>>> 0)
>>>>>
>>>>>
>>>>>
>>>>> Can anyone help me understand what's the problem? I have the server_id
>>>>> setting, and the modparam server_id_filter, so shouldn't the receiving node
>>>>> just ignore any database tasks?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Joel.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>>
>>>>> Kamailio (SER) - Users Mailing List
>>>>>
>>>>> sr-users at lists.kamailio.org
>>>>>
>>>>> https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
>>>>>
>>>>> --
>>>>>
>>>>> Daniel-Constantin Mierla -- www.asipto.com
>>>>>
>>>>> www.twitter.com/miconda -- www.linkedin.com/in/miconda
>>>>>
>>>>> Kamailio Advanced Training - March 9-11, 2020, Berlin - www.asipto.com
>>>>>
>>>>> Kamailio World Conference - April 27-29, 2020, in Berlin -- www.kamailioworld.com
>>>>>
>>>>> --
>>> Daniel-Constantin Mierla -- www.asipto.comwww.twitter.com/miconda -- www.linkedin.com/in/miconda
>>> Kamailio Advanced Training - March 9-11, 2020, Berlin - www.asipto.com
>>> Kamailio World Conference - April 27-29, 2020, in Berlin -- www.kamailioworld.com
>>>
>>> --
>> Daniel-Constantin Mierla -- www.asipto.comwww.twitter.com/miconda -- www.linkedin.com/in/miconda
>> Kamailio Advanced Training - March 9-11, 2020, Berlin - www.asipto.com
>> Kamailio World Conference - April 27-29, 2020, in Berlin -- www.kamailioworld.com
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.kamailio.org/pipermail/sr-users/attachments/20200224/3b73bfcf/attachment.html>


More information about the sr-users mailing list