[SR-Users] Kamailio propagates 180 and 200 OK OUT OF ORDER

Thu Apr 9 21:58:10 CEST 2020

Hello,

The dispatcher module has nothing to do with handling SIP replies. It is
intended only for routing SIP requests. If you use dispatcher for replies,
you are doing it wrong; just let Kamailio route them based on the Via headers.

So maybe I was looking at the wrong message flow. I was speaking mainly
about the case when the caller sends the reINVITE quickly after the ACK
for the 200 OK to the initial INVITE, and the reINVITE gets to the callee
before the ACK. That was more of a side branch in the discussion, prompted
by Alex's remarks and by a situation that I encountered in the past and
that created trouble. I never had to deal with trouble caused by a change
of order between 180 and 200. In the IP world, if the time between 180 and
200 is very short, it doesn't matter at all, because the 180 is only the
trigger to start playing a ring tone, which a human may not even hear when
the 200 comes 50ms after it.

If you face re-ordering of replies, note that Kamailio doesn't do much
internally for a reply if you have no reply_route{} (and no onsend_route)
in the config file, provided that you do not use the tm module for sending
out (and therefore have no onreply_route or failure_route).

For a SIP reply, Kamailio parses the headers to find the second Via
header and uses that address to send out the reply. The request route is
not executed for SIP replies.
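
As a rough illustration of that Via-based reply routing, here is a small
Python sketch. It is not Kamailio's actual C code: the function name
second_via_address is made up for the example, and it ignores the
received/rport parameters, IPv6 literals and header folding, all of which
a real proxy has to handle.

```python
def second_via_address(raw_reply):
    """Return (host, port) from the second Via of a SIP reply, or None.

    Simplified sketch: no received/rport handling, no IPv6 literals,
    no folded header lines.
    """
    vias = []
    headers = raw_reply.split("\r\n\r\n", 1)[0]
    for line in headers.split("\r\n")[1:]:  # skip the status line
        name, sep, value = line.partition(":")
        if not sep:
            continue
        if name.strip().lower() in ("via", "v"):  # "v" is the compact form
            # One header line may carry several comma-separated Via values.
            vias.extend(v.strip() for v in value.split(","))
    if len(vias) < 2:
        return None  # no second Via: nothing to forward the reply to
    parts = vias[1].split(None, 1)  # "SIP/2.0/UDP host:port;branch=..."
    if len(parts) < 2:
        return None
    sent_by = parts[1].split(";", 1)[0].strip()
    host, _, port = sent_by.partition(":")
    return host, int(port) if port else 5060


reply = (
    "SIP/2.0 180 Ringing\r\n"
    "Via: SIP/2.0/UDP 192.0.2.10:5060;branch=z9hG4bKproxy\r\n"
    "Via: SIP/2.0/UDP 198.51.100.7:5080;branch=z9hG4bKuac\r\n"
    "Call-ID: abc\r\n"
    "\r\n"
)
print(second_via_address(reply))  # prints ('198.51.100.7', 5080)
```

The top Via belongs to the proxy itself and is removed before forwarding;
the second one identifies the previous hop, which is where the reply goes.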

What you can try is to set the number of Kamailio processes so that it does
not exceed the number of CPU cores, so there is "no real competition" for CPU
cycles. It could improve things a bit, but it still won't be 100% accurate
(i.e., there are other processes running on the system).
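
A back-of-the-envelope helper for picking that number could look like the
Python sketch below. It assumes the "children" core parameter is counted
per UDP listen socket; the function name suggested_children and the
reserved_cores knob are made up for the example, and Kamailio also starts
a few extra processes (timers and so on) that are not counted here, so
treat the result as a starting point only.

```python
import os


def suggested_children(listen_sockets, cores=None, reserved_cores=1):
    """Suggest a per-socket worker count that stays within the CPU cores.

    cores defaults to what the OS reports; reserved_cores (an assumption
    of this sketch) leaves room for other processes on the system.
    """
    if cores is None:
        cores = os.cpu_count() or 1
    usable = max(1, cores - reserved_cores)
    # Workers are forked per listen socket, so split the usable cores.
    return max(1, usable // max(1, listen_sockets))


# Example: 8 cores, one UDP socket, one core left for the rest of the system.
print(suggested_children(1, cores=8))  # prints 7
```

The value would then go into the config as the children setting, and the
burst test can be repeated to see whether the out-of-order count changes.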

Cheers,
Daniel

On 09.04.20 21:29, Luis Rojas G. wrote:
> Hello,
>
> I just realized that I had the dispatcher configured to use a hash of
> the Call-ID. That means that after recvfrom() there must be extra
> processing to find the Call-ID header in the message, calculate a hash
> and then forward() the message. The more processing there is, the more
> cases where the 200 could arrive before the 180. I just changed it to
> round robin, and the number decreased a lot, but it's still there. If I
> send a burst of 1000 messages, about 5 of them leave out of order every
> time.
>
> Best regards,
>
> Luis
>
>
>
> On 4/9/20 1:48 PM, Luis Rojas G. wrote:
>> Hello,
>>
>> I have a lot of experience developing multithreaded applications, and
>> I don't see it as unlikely at all that a process loses the CPU just
>> after recvfrom(). It's just as probable as losing it just before, or
>> when writing to a cache, or just before or after sendto(). If there are
>> many messages going through, some of them will fall into this scenario.
>> If I try sending a burst of 100 messages, I see two or three
>> presenting the scenario.
>>
>> Just forward() with a single process does not give the capacity. I'm
>> getting almost 1000 caps. More than that and I start getting errors,
>> retransmissions, etc. And this is just one way. I need to receive the
>> call going back to the network (our application is a B2BUA), so I
>> will be down to 500 caps, with a simple scenario, with no reliable
>> responses, reinvites, updates, etc. I will end up having as many
>> standalone Kamailio processes as the servers I have now.
>>
>> I really think the simplest way would be to add a small delay to the
>> 200 OK. Very small, like 10ms, should be enough. It is simple and it
>> should work, as Alex Balashov commented he did for the ACK/re-INVITE
>> case.
>>
>> I have to figure out how to make async_ms_sleep() work in reply_route().
>>
>> Thanks for all the comments and ideas
>>
>> Best regards,
>>
>> Luis
>>
>>
>>
>> On 4/9/20 12:17 PM, Daniel-Constantin Mierla wrote:
>>>
>>> Hello,
>>>
>>> then the overtaking happens between reading from the socket and
>>> getting to parsing the Call-ID value: the CPU is lost by the first
>>> reader after recvfrom() and the second process gets enough CPU time
>>> to go further ahead. I haven't encountered this case; as I said
>>> previously, it is very unlikely, but still possible. I added
>>> route_locks_size because in the past I had cases when processing of
>>> some messages took longer while executing the config (e.g., due to
>>> authentication, accounting, ...) and I needed to be sure they were
>>> processed in the order they entered config execution.
>>>
>>> Then the option is to see if a single process with stateless sending
>>> out (using forward()) gives the capacity, if you don't do any other
>>> complex processing. Or if you do more complex processing, use a
>>> dispatcher process with forwarding to local host or in a similar
>>> manner try to use mqueue+rtimer for dispatching using shared memory
>>> queues.
>>>
>>> Of course, it is open source and there is also the C coding way, to
>>> add a synchronizing mechanism to protect against parallel execution
>>> of the code from recvfrom() till call-id lock is acquired.
>>>
>>> Cheers,
>>> Daniel
>>>
>
> -- 
> Luis Rojas
> Software Architect
> Sixbell
> Los Leones 1200
> Providencia
> Santiago, Chile
> Phone: (+56-2) 22001288
> mailto:luis.rojas at sixbell.com
> http://www.sixbell.com

-- 
Daniel-Constantin Mierla -- www.asipto.com
www.twitter.com/miconda -- www.linkedin.com/in/miconda
