Hi,
we run multiple rtpengine servers to share the load. Whenever we need to take an rtpengine server offline, we used to just block the control port via iptables, then no new calls ended up on this instance of rtpengine. This worked pretty well in Kamailio 4.4.5.
However, since Kamailio 5.0, and the problem persists with 5.1.4, Kamailio hangs almost immediately after we block the control port traffic. In the log file there are almost no packets processed except every few seconds, which looks like a timeout thing.
Did we configure anything wrong there? Or is the "dead rtpengine detection" just broken?
Our configuration:
loadmodule "rtpengine.so"
modparam("rtpengine", "rtpengine_disable_tout", 120)
modparam("rtpengine", "setid_avp", "$avp(rtpsetid)")
modparam("rtpengine", "rtpengine_sock", "0 == udp:1.2.3.4:9001=2 udp:1.2.3.5:9001=2 udp:1.2.3.6:9001=2")
modparam("rtpengine", "rtpengine_sock", "1 == udp:2.3.4.5:9001=2")
Any help appreciated.
Regards, Sebastian
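For readers unfamiliar with the rtpengine_sock syntax used in the configuration above, here is a minimal Python sketch of how such a value breaks down into a set id and a list of weighted nodes. It is only an illustration, not the module's parser: the names Node and parse_rtpengine_sock are hypothetical, and the defaults (set 0 when no "==" is given, weight 1 when no "=N" suffix is given) are assumptions based on the module documentation.

```python
from typing import List, NamedTuple, Tuple


class Node(NamedTuple):
    url: str
    weight: int


def parse_rtpengine_sock(value: str) -> Tuple[int, List[Node]]:
    """Split 'SETID == url=weight url=weight ...' into (set id, nodes)."""
    set_id = 0  # assumed default set when no 'SETID ==' prefix is present
    nodes_part = value
    if "==" in value:
        left, nodes_part = value.split("==", 1)
        set_id = int(left.strip())

    nodes: List[Node] = []
    for token in nodes_part.split():
        # The weight is whatever follows the last '=' in the token.
        url, sep, weight = token.rpartition("=")
        if sep and weight.isdigit():
            nodes.append(Node(url, int(weight)))
        else:
            nodes.append(Node(token, 1))  # assumed default weight
    return set_id, nodes


if __name__ == "__main__":
    print(parse_rtpengine_sock(
        "0 == udp:1.2.3.4:9001=2 udp:1.2.3.5:9001=2 udp:1.2.3.6:9001=2"))
```

Read this way, the configuration above defines set 0 with three equally weighted nodes and set 1 with a single node.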
On Mon, Aug 06, 2018 at 12:58:00PM +0200, Sebastian Damm wrote:
we run multiple rtpengine servers to share the load. Whenever we need to take an rtpengine server offline, we used to just block the control port via iptables, then no new calls ended up on this instance of rtpengine. This worked pretty well in Kamailio 4.4.5.
No answer to your question (which sounds like a legit problem), but why not do it with: kamcmd rtpengine.enable udp:x.y.z.a:port 0/1 on the kamailio machines?
By simply blocking you might interrupt updates in SDP for running calls and prevent hangups from being sent.
Hi Daniel,
On Mon, Aug 6, 2018 at 2:17 PM Daniel Tryba d.tryba@pocos.nl wrote:
No answer to your question (which sounds like a legit problem), but why not do it with: kamcmd rtpengine.enable udp:x.y.z.a:port 0/1 on the kamailio machines?
Of course, that's probably the better way. But as far as I know, that command wasn't available before 5.0. So I guess our blocking of the control port was the way we did it before that.
But anyhow, if an rtpengine crashes, Kamailio shouldn't block.
Sebastian
On Mon, Aug 06, 2018 at 02:37:15PM +0200, Sebastian Damm wrote:
kamcmd rtpengine.enable udp:x.y.z.a:port 0/1 on the kamailio machines?
Of course, that's probably the better way. But as far as I know, that command wasn't available before 5.0. So I guess our blocking of the control port was the way we did it before that.
In the past (4.x) the command was: kamctl fifo nh_enable_rtpp udp:x.y.z.a:port 0/1
On Mon, Aug 06, 2018 at 02:37:15PM +0200, Sebastian Damm wrote:
Of course, that's probably the better way. But as far as I know, that command wasn't available before 5.0. So I guess our blocking of the control port was the way we did it before that.
But anyhow, if an rtpengine crashes, Kamailio shouldn't block.
BTW forgot to ask: are you REJECTing or DROPing packets? A reject should trigger a failover in the rtpengine_* calls immediately. A drop will result in a timeout mechanism triggering, which according to your description blocks the thread.
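The difference between the two rule types can be observed from the client side with a small probe. The following Python sketch is an illustration only and not how the rtpengine module is implemented: probe is a hypothetical helper, and the payload is a placeholder rather than a valid control protocol message. On a connected UDP socket, an ICMP port-unreachable (which a REJECT rule typically generates) usually surfaces as a "connection refused" error right away, while a DROP leaves the caller waiting for the full timeout.

```python
import socket


def probe(host: str, port: int, timeout_s: float = 1.0) -> str:
    """Send one datagram and classify what comes back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout_s)
        # A connected UDP socket gets ICMP errors reported by the kernel.
        sock.connect((host, port))
        try:
            sock.send(b"placeholder")  # not a real rtpengine command
            sock.recv(2048)
            return "reply"
        except ConnectionRefusedError:
            # Port unreachable came back: the caller can fail over at once.
            return "refused"
        except socket.timeout:
            # Nothing came back: the caller was blocked for timeout_s.
            return "timeout"


if __name__ == "__main__":
    # Hypothetical address; replace with a real control socket to try it.
    print(probe("127.0.0.1", 9001, timeout_s=1.0))
```

A result of "timeout" corresponds to the blocking case described above, where every request that picks the unreachable node costs the worker the full timeout.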
On Mon, Aug 6, 2018 at 6:56 PM Daniel Tryba d.tryba@pocos.nl wrote:
BTW forgot to ask: are you REJECTing or DROPing packets? A reject should trigger a failover in the rtpengine_* calls immediately.
Of course, we block the traffic with a REJECT rule.
A drop will result in a timeout mechanism triggering, which according to your description blocks the thread.
But since we have configured 120 seconds as disable_timeout, I would expect a short period of blocking, but after that it should run again for at least 120 seconds. But from what we see, this does not happen.
Oh, and we tested the disabling and enabling via kamctl before, but as far as I remember, while disabling still worked, Kamailio crashed reproducibly when enabling an rtpengine again.
Regards, Sebastian
On Tue, Aug 07, 2018 at 02:49:58PM +0200, Sebastian Damm wrote:
Oh, and we tested the disabling and enabling via kamctl before, but as far as I remember, while disabling still worked, Kamailio crashed reproducibly when enabling an rtpengine again.
We had the same problem, this was fixed in a late 4.4.x (6 or 7).
On 2018-08-06 06:58, Sebastian Damm wrote:
Hi,
we run multiple rtpengine servers to share the load. Whenever we need to take an rtpengine server offline, we used to just block the control port via iptables, then no new calls ended up on this instance of rtpengine. This worked pretty well in Kamailio 4.4.5.
However, since Kamailio 5.0, and the problem persists with 5.1.4, Kamailio hangs almost immediately after we block the control port traffic. In the log file there are almost no packets processed except every few seconds, which looks like a timeout thing.
Did we configure anything wrong there? Or is the "dead rtpengine detection" just broken?
Our configuration:
loadmodule "rtpengine.so"
modparam("rtpengine", "rtpengine_disable_tout", 120)
modparam("rtpengine", "setid_avp", "$avp(rtpsetid)")
modparam("rtpengine", "rtpengine_sock", "0 == udp:1.2.3.4:9001=2 udp:1.2.3.5:9001=2 udp:1.2.3.6:9001=2")
modparam("rtpengine", "rtpengine_sock", "1 == udp:2.3.4.5:9001=2")
When you query the running config via kamcmd for the value of rtpengine_tout_ms, what does it say? (Wondering if the default value of 1000 properly gets established or if some other value is in effect - it shouldn't block longer than this value)
On Tue, Aug 7, 2018 at 3:04 PM Richard Fuchs rfuchs@sipwise.com wrote:
On 2018-08-06 06:58, Sebastian Damm wrote: When you query the running config via kamcmd for the value of rtpengine_tout_ms, what does it say? (Wondering if the default value of 1000 properly gets established or if some other value is in effect - it shouldn't block longer than this value)
kamcmd> cfg.get rtpengine rtpengine_tout_ms
1000
I actually don't know how long it blocks for one request. But I know that whenever one RTPengine is gone, we get "SIP offline" notifications from our monitoring system (sending SIP OPTIONS) within minutes. I think, waiting for an RTPengine answer for a second is okay if it happens once every 120 seconds, but it's not okay if it happens every time.
Sebastian
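The behaviour expected in the message above (pay the timeout once, then skip the node for the disable period) can be written down as a small model. This Python sketch is an assumption about the intended semantics of rtpengine_disable_tout, not the module's actual code; NodeSelector, mark_failed and the injected clock are hypothetical names used for illustration.

```python
import time
from typing import Callable, Dict, List, Optional


class NodeSelector:
    """Skip a node for disable_tout_s seconds after it failed to answer."""

    def __init__(self, nodes: List[str], disable_tout_s: float = 120.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._nodes = list(nodes)
        self._disable_tout_s = disable_tout_s
        self._clock = clock
        # node -> earliest time at which it may be tried again
        self._retry_at: Dict[str, float] = {}

    def usable(self) -> List[str]:
        now = self._clock()
        return [n for n in self._nodes
                if self._retry_at.get(n, float("-inf")) <= now]

    def mark_failed(self, node: str) -> None:
        self._retry_at[node] = self._clock() + self._disable_tout_s

    def query(self, send: Callable[[str], bool]) -> Optional[str]:
        """Try usable nodes in order; return the first one that answers."""
        for node in self.usable():
            if send(node):
                return node
            # The caller waited for one timeout here; skip the node from now on.
            self.mark_failed(node)
        return None


if __name__ == "__main__":
    selector = NodeSelector(["udp:1.2.3.4:9001", "udp:1.2.3.5:9001"])
    # Pretend the first node never answers.
    print(selector.query(lambda node: node != "udp:1.2.3.4:9001"))
    print(selector.usable())
```

Under this model a dead node costs one timeout per disable period. The symptom described in the thread looks as if the timeout is instead paid on every request.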
Hi Sebastian,
You may need the following fix for your rtpengine module in Kamailio.
https://github.com/kamailio/kamailio/pull/1593
We had a similar issue with the rtpengine module in Kamailio, as it is using package memory and not shared memory.
Many Thanks
Regards Muhammad Zaka
-----Original Message----- From: sr-users sr-users-bounces@lists.kamailio.org On Behalf Of Sebastian Damm Sent: 08 August 2018 12:38 To: Kamailio (SER) - Users Mailing List sr-users@lists.kamailio.org Subject: Re: [SR-Users] UDP workers block when one or more rtpengine instances go offline
On Tue, Aug 7, 2018 at 3:04 PM Richard Fuchs rfuchs@sipwise.com wrote:
On 2018-08-06 06:58, Sebastian Damm wrote: When you query the running config via kamcmd for the value of rtpengine_tout_ms, what does it say? (Wondering if the default value of 1000 properly gets established or if some other value is in effect - it shouldn't block longer than this value)
kamcmd> cfg.get rtpengine rtpengine_tout_ms
1000
I actually don't know how long it blocks for one request. But I know that whenever one RTPengine is gone, we get "SIP offline" notifications from our monitoring system (sending SIP OPTIONS) within minutes. I think, waiting for an RTPengine answer for a second is okay if it happens once every 120 seconds, but it's not okay if it happens every time.
Sebastian
_______________________________________________ Kamailio (SER) - Users Mailing List sr-users@lists.kamailio.org https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
Hi Muhammad,
On Thu, Aug 9, 2018 at 8:34 AM Muhammad Zaka muhammad.zaka@cloudcall.com wrote:
You may need the following fix for your rtpengine module in Kamailio. https://github.com/kamailio/kamailio/pull/1593 We had a similar issue with the rtpengine module in Kamailio, as it is using package memory and not shared memory.
From the commit message I would expect this patch to apply only to deleted rtpengine nodes but not temporarily offline nodes? Or is the same code run when temporarily disabling a node as well?
Regards, Sebastian
Hi Sebastian
It looks like your issue is related to the module using package memory rather than shared memory. Each forked Kamailio process will pick the same rtpengine and block while sending a command to the offline node's socket.
Many Thanks
Regards Muhammad Zaka
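To illustrate why the memory type would matter here: in Kamailio, package memory is private to each process, while shared memory is visible to all of them. The following Python sketch shows the general effect on POSIX systems and is not taken from the rtpengine module; run_demo and both flags are hypothetical. A "node is disabled" flag written by one forked worker into private memory is never seen by the other processes, so each of them would have to run into the timeout on its own.

```python
import mmap
import os


def run_demo() -> tuple:
    """Return (private flag, shared flag) as seen by the parent process."""
    private_flag = bytearray(1)     # ordinary memory, copied on fork
    shared_flag = mmap.mmap(-1, 1)  # anonymous shared mapping, inherited on fork

    pid = os.fork()
    if pid == 0:
        # Child: this worker hit a timeout and marks the node as disabled.
        private_flag[0] = 1
        shared_flag[0] = 1
        os._exit(0)

    os.waitpid(pid, 0)
    result = (bool(private_flag[0]), bool(shared_flag[0]))
    shared_flag.close()
    return result


if __name__ == "__main__":
    # The parent only sees the update made through the shared mapping.
    print(run_demo())
```

Whether this is what happens for temporarily offline nodes in the versions discussed here is exactly the open question raised in the thread.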
-----Original Message----- From: sr-users sr-users-bounces@lists.kamailio.org On Behalf Of Sebastian Damm Sent: 09 August 2018 09:00 To: Kamailio (SER) - Users Mailing List sr-users@lists.kamailio.org Subject: Re: [SR-Users] UDP workers block when one or more rtpengine instances go offline
Hi Muhammad, On Thu, Aug 9, 2018 at 8:34 AM Muhammad Zaka muhammad.zaka@cloudcall.com wrote:
You may need the following fix for your rtpengine module in Kamailio. https://github.com/kamailio/kamailio/pull/1593 We had a similar issue with the rtpengine module in Kamailio, as it is using package memory and not shared memory.
From the commit message I would expect this patch to apply only to deleted rtpengine nodes but not temporarily offline nodes? Or is the same code run when temporarily disabling a node as well?
Regards, Sebastian