[SR-Users] DMQ broadcasting crashes kamailio
Charles Chance
charles.chance at sipcentric.com
Fri Apr 24 20:57:35 CEST 2020
Hi,
Did you try the config snippet I provided?
Basically dmq_handle_message() must be called if the message is not your
own, otherwise the node discovery/health check will not work and you will
see nodes disappearing as you described.
Here it is again:
if(is_method("KDMQ")){
    if($rU =~ "userOnline"){
        //user came online in cluster, resume transactions if-any suspended
        $avp(remoteUser) = $rb;
    } else {
        dmq_handle_message();
    }
}
Notice that we check for your own/custom message first, and only call
dmq_handle_message() if it did not match.
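
For reference, here is a rough, untested sketch of how the whole handler could look. The route name DMQ_HANDLE is taken from your script. The sl_send_reply() call is my assumption and is not confirmed anywhere in this thread: it requires the sl module to be loaded, and it is only there so the sending node gets a final response to the custom message. Drop it if you find it is not needed.

```
route[DMQ_HANDLE] {
    // only KDMQ requests are handled here
    if(!is_method("KDMQ")) return;

    if($rU =~ "userOnline"){
        // own/custom broadcast: process it in script and
        // do NOT pass it on to dmq_handle_message()
        $avp(remoteUser) = $rb;
        // assumption: acknowledge the custom message ourselves
        sl_send_reply("200", "OK");
    } else {
        // everything else (node discovery, health checks, etc.)
        // must still reach the dmq module
        dmq_handle_message();
    }
    exit;
}
```

The point of the if/else is that each KDMQ request goes down exactly one path: either your script deals with it, or the dmq module does, never both.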
Let me know if it works.
Cheers,
Charles
On Fri, 24 Apr 2020 at 19:52, SamyGo <govoiper at gmail.com> wrote:
> Yes,
> I did read all of his replies on DMQ and DMQ USRLOC from the past 3+ years;
> only one matched this exact description, and it has no resolution.
> The open and closed GitHub issues for DMQ didn't have anything similar either.
> Could it be something I'm doing wrong?
>
> Additional info: one of the servers is directly on a public IP and the other
> is behind NAT. Another test setup where it is consistently reproducible is two
> servers behind NAT (AWS).
> Here are the mod params. Only usrloc sync is done via DMQ; no other
> module is using DMQ.
>
> listen=udp:LocalIP:5060 advertise PublicIP:5060
>
> modparam("dmq","server_address", DMQ_LOCAL_SERVER)
> modparam("dmq", "notification_address", DMQ_REMOTE_SERVER)
> modparam("dmq", "multi_notify", 0) //1 for DNS SRV
> modparam("dmq", "num_workers", 10)
> modparam("dmq", "ping_interval", 60)
>
> modparam("dmq_usrloc", "enable", 1)
> modparam("dmq_usrloc", "sync", 1)
> modparam("dmq_usrloc", "batch_size", 4000)
> modparam("dmq_usrloc", "batch_usleep", 1000)
> modparam("dmq_usrloc", "usrloc_domain", "location")
>
> Where: DMQ_REMOTE_SERVER = sip:PublicIP2:5060
>
> GDB info as requested:
>
> Core was generated by `/usr/local/sbin/kamailio -w /tmp/kamailio -P
> /var/run/kamailio/kamailio.pid -f'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0 0x00007f248c4cef15 in send_reply (msg=0x7f2469f88d40, code=0,
> reason=0x7ffd775e3ab8) at sl.c:276
> 276 if(reason->s[reason->len-1]=='\0') {
> (gdb)
> (gdb)
> (gdb) frame 0
> #0 0x00007f248c4cef15 in send_reply (msg=0x7f2469f88d40, code=0,
> reason=0x7ffd775e3ab8) at sl.c:276
> 276 if(reason->s[reason->len-1]=='\0') {
> (gdb) p *reason
> $1 = {s = 0x0, len = 0}
> (gdb)
> (gdb) frame 1
> #1 0x00007f24656c6549 in worker_loop (id=2) at worker.c:129
> 129 if(slb.freply(current_job->msg,
> peer_response.resp_code,
> (gdb) p *worker
> $3 = {queue = 0x7f2469f240a8, jobs_processed = 5, lock = {val = 2}, pid =
> 935}
> (gdb)
> (gdb)
> (gdb) p *current_job
> $6 = {f = 0x7f24656d6d8d <empty_peer_callback>, msg = 0x7f2469f88d40,
> orig_peer = 0x7f2469f6ed50, next = 0x0, prev = 0x0}
> (gdb)
>
>
> On Fri, Apr 24, 2020 at 1:30 PM Daniel-Constantin Mierla <
> miconda at gmail.com> wrote:
>
>> Hello,
>>
>> have you tried the suggestion from Charles in the other response? It can
>> help figure out where the problem resides.
>>
>> Now, from the C point of view, I would need the following output from gdb
>> for the core file:
>>
>> frame 0
>> p *reason
>>
>> frame 1
>> p *worker
>> p *current_job
>>
>> I would also need to know the modparams for dmq and the other dmq_* modules,
>> plus the list of modules for which you enabled dmq (e.g., htable, dialog,
>> presence, ...).
>>
>> Cheers,
>> Daniel
>> On 24.04.20 18:10, SamyGo wrote:
>>
>> Oops, apologies, I missed that:
>>
>> version: kamailio 5.3.3 (x86_64/linux) 44ccb9-dirty
>> flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS,
>> DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC,
>> F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT,
>> USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST,
>> HAVE_RESOLV_RES
>> ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024,
>> BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
>> poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
>> id: 44ccb9 -dirty
>> compiled on 17:04:55 Apr 17 2020 with gcc 4.9.2
>>
>> Tried this with versions 5.0, 5.2, and now 5.3: same situation.
>>
>> Thank you for looking into this,
>> Sammy
>>
>> On Fri, Apr 24, 2020 at 2:33 AM Daniel-Constantin Mierla <
>> miconda at gmail.com> wrote:
>>
>>> Hello,
>>>
>>> you have to provide the version of kamailio with each reported kamailio
>>> issue, otherwise it is hard to match with the source code. Use 'kamailio -v'
>>> to get version details.
>>>
>>> Cheers,
>>> Daniel
>>> On 23.04.20 23:36, SamyGo wrote:
>>>
>>> Hi,
>>>
>>> Is there a way to broadcast KDMQ to the cluster without expecting a reply
>>> back? As far as I've read the source code, dmq_bcast_message is exactly
>>> like dmq_send_message in that it expects a callback to be executed on
>>> response, i.e. it expects a reply.
>>>
>>> So, the situation I'm facing is I'm broadcasting message to cluster and
>>> I do not want a reply back. The following two options result in crash &
>>> core dump.
>>>
>>> 1 - If my script doesn't respond back, by use of dmq_handle_message, it
>>> marks the destined servers as "inactive" and stops usrloc sync process
>>> which isn't desirable.
>>> 2 - If I respond back with the dmq_handle_message it crashes the
>>> Kamailio which just received this broadcasted message.
>>>
>>> Here is how it's done in script:
>>>
>>> *broadcasting message to cluster:*
>>> dmq_bcast_message("userOnline", "$fu", "text/plain");
>>>
>>> *Receiving and handling a broadcast message:*
>>> route[DMQ_HANDLE] {
>>>     if(!(is_method("KDMQ") || $rm == "KDMQ")) return;
>>>
>>>     if(is_method("KDMQ") || $rm == "KDMQ"){
>>>         if($rU =~ "userOnline"){
>>>             //user came online in cluster, resume transactions if-any suspended
>>>             $avp(remoteUser) = $rb;
>>>         }
>>>         dmq_handle_message();
>>>         exit;
>>>     }
>>> }
>>>
>>> *Related log lines:*
>>> Apr 23 21:15:48 kamailio[916]: ALERT: <script>: [da2c1-2f499] ------
>>> DMQ_HANDLE: UserOnline Event Received ------
>>> Apr 23 21:15:48 kamailio[916]: DEBUG: dmq [message.c:53]:
>>> ki_dmq_handle_message_rc(): dmq_handle_message [KDMQ
>>> sip:userOnline@9.8.7.123:5060]
>>> Apr 23 21:15:48 kamailio[916]: DEBUG: dmq [message.c:66]:
>>> ki_dmq_handle_message_rc(): dmq_handle_message peer found: userOnline
>>> Apr 23 21:15:48 kamailio[916]: DEBUG: <core> [core/receive.c:437]:
>>> receive_msg(): request-route executed in: 401461 usec
>>> Apr 23 21:15:48 kamailio[935]: DEBUG: dmq [worker.c:87]: worker_loop():
>>> dmq_worker [2 935] lock acquired
>>> and crash/segfault..
>>>
>>> Core dump: https://pastebin.com/S7ekCPfF
>>>
>>> Any help or pointers to solve this would be really appreciated.
>>>
>>> Best Regards,
>>> Sammy
>>>
>>> _______________________________________________
>>> Kamailio (SER) - Users Mailing List
>>> sr-users at lists.kamailio.org
>>> https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
>>>
>>> --
>>> Daniel-Constantin Mierla -- www.asipto.com
>>> www.twitter.com/miconda -- www.linkedin.com/in/miconda
>>>
--
*Charles Chance*
Managing Director
t. 0330 120 1200 m. 07932 063 891
--
Sipcentric Ltd.
Company registered in England & Wales no. 7365592.
Registered office: Faraday Wharf, Innovation Birmingham Campus, Holt Street,
Birmingham Science Park, Birmingham B7 4BB.