Hi All,
Under a very low load of 50 calls per second through Kamailio version
5.3.2 (dispatcher module with call-load-based routing, algorithm 10; the
machine runs CentOS 7.7 with kernel version 3.10), we are seeing Kamailio
crash continuously. The data below shows the 3 types of segfaults seen:
Crash-1:
--------
Apr 3 19:32:25 FE-A07-34-VM6 kernel: kamailio[10382]: segfault at
7ff300000078 ip 00007ff300000078 sp 00007ffe5f2eafc8 error 14 in
libbz2.so.1.0.6[7ff331582000+f000]
Apr 3 19:32:25 FE-A07-34-VM6 mysqld: 2020-04-03 19:32:25 6131 [Warning]
Aborted connection 6131 to db: 'kamailio' user: 'kamailio' host:
'localhost' (Got an error reading communication packets)
Apr 3 19:32:25 FE-A07-34-VM6 /usr/sbin/kamailio[10423]: CRITICAL: <core>
[core/pass_fd.c:277]: receive_fd(): EOF on 13
Apr 3 19:32:25 FE-A07-34-VM6 /usr/sbin/kamailio[10381]: ALERT: <core>
[main.c:767]: handle_sigs(): child process 10382 exited by a signal 11
Apr 3 19:32:25 FE-A07-34-VM6 /usr/sbin/kamailio[10381]: ALERT: <core>
[main.c:770]: handle_sigs(): core was not generated
===============================
Crash-2:
-------
Apr 3 19:36:11 FE-A07-34-VM6 kernel: kamailio[12838]: segfault at
7efe00000078 ip 00007efe00000078 sp 00007ffdc876b868 error 15 in zero
(deleted)[7efdc407f000+40000000]
Apr 3 19:36:11 FE-A07-34-VM6 mysqld: 2020-04-03 19:36:11 6175 [Warning]
Aborted connection 6175 to db: 'kamailio' user: 'kamailio' host:
'localhost' (Got an error reading communication packets)
Apr 3 19:36:11 FE-A07-34-VM6 /usr/sbin/kamailio[12874]: CRITICAL: <core>
[core/pass_fd.c:277]: receive_fd(): EOF on 18
Apr 3 19:36:11 FE-A07-34-VM6 /usr/sbin/kamailio[12826]: ALERT: <core>
[main.c:767]: handle_sigs(): child process 12838 exited by a signal 11
Apr 3 19:36:11 FE-A07-34-VM6 /usr/sbin/kamailio[12826]: ALERT: <core>
[main.c:770]: handle_sigs(): core was not generated
============================
Crash-3:
--------
Apr 3 19:40:53 FE-A07-34-VM6 kernel: kamailio[13542]: segfault at 80 ip
000000000065e193 sp 00007ffdae10c1f0 error 4 in kamailio[400000+476000]
Apr 3 19:40:53 FE-A07-34-VM6 mysqld: 2020-04-03 19:40:53 6222 [Warning]
Aborted connection 6222 to db: 'kamailio' user: 'kamailio' host:
'localhost' (Got an error reading communication packets)
Apr 3 19:40:53 FE-A07-34-VM6 /usr/sbin/kamailio[13516]: ALERT: <core>
[main.c:767]: handle_sigs(): child process 13542 exited by a signal 11
Apr 3 19:40:53 FE-A07-34-VM6 /usr/sbin/kamailio[13558]: CRITICAL: <core>
[core/pass_fd.c:277]: receive_fd(): EOF on 38
Apr 3 19:40:53 FE-A07-34-VM6 /usr/sbin/kamailio[13516]: ALERT: <core>
[main.c:770]: handle_sigs(): core was not generated
Any pointers for resolution are most welcome as we need to quickly resolve
this.
Regards,
Harneet Singh
--
"Once you eliminate the impossible, whatever remains, no matter how
improbable, must be the truth" - Sir Arthur Conan Doyle
Hi All,
When my VoLTE mobile is registered to the Kamailio IMS, which supports IPsec
with the AKAv1 algorithm, the content inside the ESP packets is clear text
(not encrypted). Is that because of a mistake in my configuration, or
because of my USIM card?
thanks in advance.
Mohammad
Hello,
I just realized that I had the dispatcher configured to use a hash of the
Call-ID. That means that after recvfrom() there must be extra processing
to find the Call-ID header in the message and calculate a hash before the
message is passed to forward(). The more processing there is, the more
cases where the 200 could arrive before the 180. I just changed it to round
robin, and the number decreased a lot, but it's still there. If I send a
burst of 1000 messages, about 5 of them leave out of order every time.
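A minimal sketch of that change, assuming a dispatcher set with id 1 and a
plain stateless route (both are illustrative, not taken from the actual
configuration). In the dispatcher module, algorithm 0 hashes the Call-ID
and algorithm 4 is round-robin:

```
loadmodule "dispatcher.so"

request_route {
    # before: algorithm 0 = hash over Call-ID
    # ds_select_dst("1", "0");

    # after: algorithm 4 = round-robin over the destinations in set 1
    # (set id 1 is an assumption for this sketch)
    if (!ds_select_dst("1", "4")) {
        # no destination available in the set
        exit;
    }

    # stateless forwarding to the selected destination
    forward();
}
```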
Best regards,
Luis
On 4/9/20 1:48 PM, Luis Rojas G. wrote:
> Hello,
>
> I have a lot of experience developing multithreaded applications, and I
> don't see it as unlikely at all that a process loses the CPU just after
> recvfrom(). It's just as probable as losing it just before, or when
> writing to a cache, or just before or after sendto(). If there are many
> messages going through, some of them will fall into this scenario. If I
> try sending a burst of 100 messages, I see two or three presenting the
> scenario.
>
> Just forward() with a single process does not give the capacity. I'm
> getting almost 1000 CAPS. More than that and I start getting errors,
> retransmissions, etc. And this is just one way. I need to receive the
> call to go back to the network (our application is a B2BUA), so I will
> be down to 500 CAPS, with a simple scenario, with no reliable
> responses, re-INVITEs, UPDATEs, etc. I will end up having as many
> standalone Kamailio processes as the servers I have now.
>
> I really think the simplest way would be to add a small delay to the 200
> OK. Very small, like 10 ms, should be enough. Simple, and it should
> work, as Alex Balashov commented he did for the ACK/re-INVITE case.
>
> I have to figure out how to make async_ms_sleep() work in reply_route().
>
> Thanks for all the comments and ideas
>
> Best regards,
>
> Luis
>
>
>
> On 4/9/20 12:17 PM, Daniel-Constantin Mierla wrote:
>>
>>
>> Hello,
>>
>> then the overtaking happens between reading from the socket and getting
>> to parsing the Call-ID value -- the CPU is lost by the first reader after
>> recvfrom() and the second process gets enough CPU time to go ahead
>> further. I haven't encountered this case; as I said previously,
>> it is very unlikely, but still possible. I added route_locks_size
>> because in the past I had cases where processing of some messages took
>> longer executing the config (e.g., due to authentication, accounting, ...)
>> and I needed to be sure they were processed in the order they entered
>> config execution.
>>
>> Then the option is to see if a single process with stateless sending
>> out (using forward()) gives the capacity, if you don't do any other
>> complex processing. Or if you do more complex processing, use a
>> dispatcher process with forwarding to local host or in a similar
>> manner try to use mqueue+rtimer for dispatching using shared memory
>> queues.
>>
>> Of course, it is open source and there is also the C coding way, to
>> add a synchronizing mechanism to protect against parallel execution
>> of the code from recvfrom() till call-id lock is acquired.
>>
>> Cheers,
>> Daniel
>>
--
Luis Rojas
Software Architect
Sixbell
Los Leones 1200
Providencia
Santiago, Chile
Phone: (+56-2) 22001288
mailto:luis.rojas@sixbell.com
http://www.sixbell.com
Hi there,
Can someone help figure out why there's this t_relay error if I decide to
drop the single master branch, or all branches belonging to the same
destination set, even though there are more serial sets to try later?
*Error:*
*tm [t_funcs.c:337]: t_relay_to(): t_forward_nonack returned error -6 (-6)*
*tm [t_funcs.c:355]: t_relay_to(): -6 error reply generation delayed *
*Example:*
request_route {
    seturi("sip:a@mydomain.net");
    append_branch("sip:b@mydomain.net", "0.5");
    append_branch("sip:c@mydomain.net", "0.5");
    append_branch("sip:d@10.22.0.30", "1.0");
    # no error if I add this branch with the same q=1.0:
    # append_branch("sip:e@mydomain.net", "1.0");
    t_load_contacts();
    t_next_contacts();
    t_on_branch("check_branch");
    t_on_failure("serial");
    t_relay();
    break;
}

branch_route[check_branch] {
    # drop every branch going to 10.22.0.30
    if ($rd == "10.22.0.30")
        drop;
}

failure_route["serial"] {
    # stop when there are no more serial contact sets to try
    if (!t_next_contacts()) {
        exit;
    }
    t_on_branch("check_branch");
    t_on_failure("serial");
    t_relay();
}
Thanks in advance.
--Sergiu
On Thu, Apr 09, 2020 at 02:26:45PM +0100, David Villasmil wrote:
> How about just not forwarding the 180 if it’s coming right after the
> 200ok? I know it’s a hack, but a 180 is a provisional response
> indicating the ring status, not the actual audio for the ring... so if
> a 200 is received, the call is already established, no need to forward
> the 180.
There is a general principle, particularly in PSTN interworking, that
signalling should be reliably conserved end-to-end. I think dropping
messages is the most cavalier option available, perhaps worth
considering only if every other logical possibility has been exhausted
first.
--
Alex Balashov | Principal | Evariste Systems LLC
Tel: +1-706-510-6800 / +1-800-250-5920 (toll-free)
Web: http://www.evaristesys.com/, http://www.csrpswitch.com/
On 09.04.20 09:47, Daniel-Constantin Mierla wrote:
> [...]
>>
>> Any idea why Async is not allowed in reply_route()?
>>
I didn't notice this question at first, so I did not respond to it in my
previous email.
Probably the developer who implemented it needed it only for SIP requests.
However, you can make a patch (pull request) to make it work for other
cases. The developers will review the changes and, if they don't break
anything, they will be merged.
Cheers,
Daniel
Hello,
it was a reply to my email where I mentioned the route_locks_size
parameter. As he said he had looked at that parameter, I assumed it was
about route_locks_size, because there was no other parameter listed
in the emails. So using the route_locks_size parameter doesn't require
using the dialog module.
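For illustration, a minimal sketch of setting it, assuming it goes in the
global parameters section of kamailio.cfg (the values 8 and 256 are example
values, not recommendations):

```
# global parameters section of kamailio.cfg
children = 8              # UDP worker processes per listen socket (example value)
route_locks_size = 256    # locks used to serialize config execution for
                          # messages sharing the same Call-ID (example value)

# note: no loadmodule "dialog.so" is needed for route_locks_size
```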
Cheers,
Daniel
On 09.04.20 10:29, Henning Westerholt wrote:
>
> Hello,
>
>
>
> In some of my earlier e-mails I mentioned tracking the state of a
> dialog, and acting depending on it, as one possible option.
>
>
>
> Cheers,
>
>
>
> Henning
>
>
>
> --
>
> Henning Westerholt – https://skalatan.de/blog/
>
> Kamailio services – https://gilawa.com
>
>
>
> *From:* Daniel-Constantin Mierla <miconda(a)gmail.com>
> *Sent:* Thursday, April 9, 2020 9:48 AM
> *To:* luis.rojas(a)sixbell.com; Kamailio (SER) - Users Mailing List
> <sr-users(a)lists.kamailio.org>; Henning Westerholt <hw(a)skalatan.de>
> *Subject:* Re: [SR-Users] Kamailio propagates 180 and 200 OK OUT OF ORDER
>
>
>
> Hello,
>
> On 08.04.20 23:03, Luis Rojas G. wrote:
>
> Hello, Daniel,
>
>
>
> I looked into that parameter, but I need to use with the dialog
> module, and I'm pretty afraid to use that.
>
> Who said, or where is it written, that you need to load the dialog module?
> You definitely don't.
>
> Cheers,
> Daniel
>
>
>
> I was looking more into the stateless proxy, because I need to
> process a lot of traffic.
>
>
>
> My target is 4200CAPS. with duration between 90s and 210. Let's
> say, 150 seconds. That would mean 630.000 simultaneous dialogs. I
> don't think the solution can go that way.
>
> it would really help me to be able to use completely stateless
> proxy plus Async in reply_route(), to introduce an artificial
> delay before forwarding 200 OK to Invite.. As someone mentioned,
> it would help me on request_route(), for race conditions between
> ACK and Re-Invite.
>
> Any idea why Async is not allowed in reply_route()?
>
> Best regards,
>
>
>
> Luis
>
>
>
> On 4/8/20 1:07 PM, Daniel-Constantin Mierla wrote:
>
> Hello,
>
> you have to keep in mind that Kamailio is a SIP packet router,
> not a telephony engine. Whether 180 and 200 replies are part of
> a call is not something that Kamailio recognizes at its core. Its
> main goal is to route out as fast as possible what is
> received, by executing the configuration file script. Now,
> depending on your configuration file, processing of some SIP
> messages can take longer than processing of others. And the
> processing is done in parallel, depending on the children
> parameter (and tcp_children, sctp_children).
>
> With that in mind, a way to try to cope better with the issue
> you face is to set route_locks_size parameter, see:
>
> * https://www.kamailio.org/wiki/cookbooks/devel/core#route_locks_size
>
> It is probably what you are looking for.
>
> But if you want tighter constraints, such as not routing out a
> 180 received after a 200 OK, you have to build the
> logic in the configuration file by combining modules such as
> dialog or htable (as already suggested).
>
> Cheers,
> Daniel
>
--
Daniel-Constantin Mierla -- www.asipto.com
www.twitter.com/miconda -- www.linkedin.com/in/miconda
Hi ,
This is the error I am facing in the Kamailio P-CSCF:
Apr 9 13:44:31 tel-VirtualBox kamailio[2847]: INFO: rr [rr_mod.c:515]:
pv_get_route_uri_f(): No route header present.
Apr 9 13:44:31 tel-VirtualBox kamailio[2811]: ERROR: <core>
[core/lvalue.c:353]: lval_pvar_assign(): setting pvar failed
Apr 9 13:44:31 tel-VirtualBox kamailio[2811]: ERROR: <core>
[core/lvalue.c:404]: lval_assign(): assignment failed at pos:
(106,30-106,48)
Apr 9 13:44:31 tel-VirtualBox kamailio[2811]: ERROR: <core>
[core/lvalue.c:353]: lval_pvar_assign(): setting pvar failed
Apr 9 13:44:31 tel-VirtualBox kamailio[2811]: ERROR: <core>
[core/lvalue.c:404]: lval_assign(): assignment failed at pos:
(114,31-114,52)
Kindly help me with this issue. Because of it, the S-CSCF is throwing a
403 "domain not served" error.