[SR-Users] DMQ broadcasting crashes kamailio

Charles Chance charles.chance at sipcentric.com
Fri Apr 24 20:57:35 CEST 2020


Hi,

Did you try the config snippet I provided?

Basically dmq_handle_message() must be called if the message is not your
own, otherwise the node discovery/health check will not work and you will
see nodes disappearing as you described.

Here it is again:

    if(is_method("KDMQ")){

        if($rU =~ "userOnline"){
            //user came online in cluster, resume transactions if-any suspended
            $avp(remoteUser) = $rb;
        } else {
            dmq_handle_message();
        }
    }

Notice that we check for your own custom message first, and only call
dmq_handle_message() if it does not match.
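
For reference, here is a minimal sketch of how that check could sit inside
the route[DMQ_HANDLE] block from your original post (quoted below). The route
name and the userOnline match are taken from that post; the early return and
the trailing exit are assumptions about how the route is called, and whether
the custom branch should also send a reply is left open, so adjust to your
config:

```cfg
route[DMQ_HANDLE] {
    # only KDMQ requests are of interest in this route
    if (!is_method("KDMQ")) return;

    if ($rU =~ "userOnline") {
        # custom broadcast: consume it in the script and do NOT
        # pass it on to dmq_handle_message()
        $avp(remoteUser) = $rb;
    } else {
        # everything else (node discovery, health checks, usrloc sync)
        # belongs to the dmq module
        dmq_handle_message();
    }

    # assumption: stop processing here, as the original route did
    exit;
}
```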

Let me know if it works.

Cheers,

Charles


On Fri, 24 Apr 2020 at 19:52, SamyGo <govoiper at gmail.com> wrote:

> Yes,
> I did read all (past 3+ years) of his replies specific to DMQ and DMQ USRLOC,
> and only one matched the exact description, and there was no resolution to it.
> GitHub open and closed issues for DMQ didn't have anything similar either.
> Could it be something I'm doing wrong?
>
> Additional info: one of the servers is directly on a public IP and the other
> is behind NAT. Another test setup where it is consistently reproducible has
> two servers behind NAT (AWS).
> Here are the mod params. Only usrloc sync is done via DMQ; no other
> module is using DMQ.
>
> listen=udp:LocalIP:5060 advertise PublicIP:5060
>
> modparam("dmq","server_address", DMQ_LOCAL_SERVER)
> modparam("dmq", "notification_address", DMQ_REMOTE_SERVER)
> modparam("dmq", "multi_notify", 0) //1 for DNS SRV
> modparam("dmq", "num_workers", 10)
> modparam("dmq", "ping_interval", 60)
>
> modparam("dmq_usrloc", "enable", 1)
> modparam("dmq_usrloc", "sync", 1)
> modparam("dmq_usrloc", "batch_size", 4000)
> modparam("dmq_usrloc", "batch_usleep", 1000)
> modparam("dmq_usrloc", "usrloc_domain", "location")
>
> Where:  DMQ_REMOTE_SERVER  = sip:PublicIP2:5060
>
> GDB info as requested:
>
> Core was generated by `/usr/local/sbin/kamailio -w /tmp/kamailio -P
> /var/run/kamailio/kamailio.pid -f'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x00007f248c4cef15 in send_reply (msg=0x7f2469f88d40, code=0,
> reason=0x7ffd775e3ab8) at sl.c:276
> 276             if(reason->s[reason->len-1]=='\0') {
> (gdb)
> (gdb)
> (gdb) frame 0
> #0  0x00007f248c4cef15 in send_reply (msg=0x7f2469f88d40, code=0,
> reason=0x7ffd775e3ab8) at sl.c:276
> 276             if(reason->s[reason->len-1]=='\0') {
> (gdb) p *reason
> $1 = {s = 0x0, len = 0}
> (gdb)
> (gdb) frame 1
> #1  0x00007f24656c6549 in worker_loop (id=2) at worker.c:129
> 129                                     if(slb.freply(current_job->msg,
> peer_response.resp_code,
> (gdb) p *worker
> $3 = {queue = 0x7f2469f240a8, jobs_processed = 5, lock = {val = 2}, pid =
> 935}
> (gdb)
> (gdb)
> (gdb) p *current_job
> $6 = {f = 0x7f24656d6d8d <empty_peer_callback>, msg = 0x7f2469f88d40,
> orig_peer = 0x7f2469f6ed50, next = 0x0, prev = 0x0}
> (gdb)
>
>
> On Fri, Apr 24, 2020 at 1:30 PM Daniel-Constantin Mierla <
> miconda at gmail.com> wrote:
>
>> Hello,
>>
>> have you tried the suggestion from Charles in the other response? It can
>> help figuring out where the problem resides.
>>
>> Now, from C point of view, I would need the following output from gdb of
>> the core file:
>>
>> frame 0
>> p *reason
>>
>> frame 1
>> p *worker
>> p *current_job
>>
>> I would also need to know the modparams for dmq and the other dmq_* modules,
>> plus the list of modules for which you enabled dmq (e.g., htable, dialog,
>> presence, ...).
>>
>> Cheers,
>> Daniel
>> On 24.04.20 18:10, SamyGo wrote:
>>
>> Oops, apologies, I missed that:
>>
>> version: kamailio 5.3.3 (x86_64/linux) 44ccb9-dirty
>> flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS,
>> DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC,
>> F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT,
>> USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST,
>> HAVE_RESOLV_RES
>> ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024,
>> BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
>> poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
>> id: 44ccb9 -dirty
>> compiled on 17:04:55 Apr 17 2020 with gcc 4.9.2
>>
>> Tried this with versions 5.0, 5.2, and now 5.3: same situation.
>>
>> Thank you for looking into this,
>> Sammy
>>
>> On Fri, Apr 24, 2020 at 2:33 AM Daniel-Constantin Mierla <
>> miconda at gmail.com> wrote:
>>
>>> Hello,
>>>
>>> you have to provide the version of kamailio for each reported kamailio
>>> issue, otherwise it is hard to match with the source code. Use 'kamailio -v'
>>> to get version details.
>>>
>>> Cheers,
>>> Daniel
>>> On 23.04.20 23:36, SamyGo wrote:
>>>
>>> Hi,
>>>
>>> Is there a way to broadcast KDMQ to the cluster but not expect a reply
>>> back? As far as I've read the source code, dmq_bcast_message is exactly
>>> like dmq_send_message in that it expects a callback to be executed on
>>> response, i.e. it expects a reply.
>>>
>>> So, the situation I'm facing is that I'm broadcasting a message to the
>>> cluster and I do not want a reply back. The following two options result
>>> in crash & core dump.
>>>
>>> 1 - If my script doesn't respond back, by use of dmq_handle_message, it
>>> marks the destination servers as "inactive" and stops the usrloc sync
>>> process, which isn't desirable.
>>> 2 - If I respond back with dmq_handle_message, it crashes the Kamailio
>>> instance which just received the broadcast message.
>>>
>>> Here is how it's done in the script:
>>>
>>> *broadcasting message to cluster:*
>>>         dmq_bcast_message("userOnline", "$fu", "text/plain");
>>>
>>> *Receiving and handling a broadcast message:*
>>> route[DMQ_HANDLE] {
>>>     if(!(is_method("KDMQ") || $rm == "KDMQ")) return;
>>>
>>>     if(is_method("KDMQ") || $rm == "KDMQ"){
>>>             if($rU =~ "userOnline"){
>>>                     //user came online in cluster, resume transactions if-any suspended
>>>                     $avp(remoteUser) = $rb;
>>>             }
>>>             dmq_handle_message();
>>>             exit;
>>>     }
>>> }
>>>
>>> *Related log lines:*
>>> Apr 23 21:15:48  kamailio[916]: ALERT: <script>: [da2c1-2f499] ------
>>> DMQ_HANDLE: UserOnline Event Received ------
>>> Apr 23 21:15:48  kamailio[916]: DEBUG: dmq [message.c:53]:
>>> ki_dmq_handle_message_rc(): dmq_handle_message [KDMQ
>>> sip:userOnline at 9.8.7.123:5060]
>>> Apr 23 21:15:48  kamailio[916]: DEBUG: dmq [message.c:66]:
>>> ki_dmq_handle_message_rc(): dmq_handle_message peer found: userOnline
>>> Apr 23 21:15:48  kamailio[916]: DEBUG: <core> [core/receive.c:437]:
>>> receive_msg(): request-route executed in: 401461 usec
>>> Apr 23 21:15:48  kamailio[935]: DEBUG: dmq [worker.c:87]: worker_loop():
>>> dmq_worker [2 935] lock acquired
>>> and crash/segfault..
>>>
>>> Core dump: https://pastebin.com/S7ekCPfF
>>>
>>> Any help or pointers to solve this would be really appreciated.
>>>
>>> Best Regards,
>>> Sammy
>>>
>>> _______________________________________________
>>> Kamailio (SER) - Users Mailing List
>>> sr-users at lists.kamailio.org
>>> https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
>>>
>>> --
>>> Daniel-Constantin Mierla -- www.asipto.com
>>> www.twitter.com/miconda -- www.linkedin.com/in/miconda
>>>
-- 
*Charles Chance*
Managing Director

t. 0330 120 1200    m. 07932 063 891

-- 
Sipcentric Ltd.
Company registered in England & Wales no. 7365592. Registered office:
Faraday Wharf, Innovation Birmingham Campus, Holt Street, Birmingham
Science Park, Birmingham B7 4BB.