Good day,
I am testing the dispatcher module, using Kamailio as a stateless proxy. I have a pool of UACs (SIPp scripts) and a pool of UASs (also SIPp scripts) as the destinations. The Kamailio version is kamailio-5.3.3-4.1.x86_64.
The problem I have is that if the UAS responds to the INVITE with 180 and 200 OK immediately, they are sometimes propagated out of order, with the 200 OK before the 180 (see the attached trace, image001.png).
In the trace, the UAS is 172.30.4.195:5061, the UAC is 172.30.4.195:5080 and Kamailio is 192.168.253.4:5070.
The 180 and the 200 are only about 50 microseconds apart.
My guess is that the two messages are received by different Kamailio processes and that, because of context switches, the process handling the 180 finishes after the one handling the 200, even though the 180 was received first. However, I had assumed that the Kamailio processes synchronize with each other through shared memory precisely to avoid this kind of problem. I also tried a stateful proxy and got the same result.
By the way, does anyone know how Kamailio's shared memory is implemented? It clearly does not use the typical system calls shmget() and shmat(), because no segments are shown by the ipcs command.
Before posting here I searched the web, but I couldn't find anything related to this. I can't believe I am the only one who has ever had this problem, so I guess I am doing something wrong...
Any help would be appreciated. I'm really stuck on this.
Thanks.
Hello Luis,
as 1xx responses are usually sent unreliably (unless you use PRACK), you should not make any assumptions about the order, or even the arrival, of these messages. Reordering can also happen at the network level if they are sent over UDP.
Can you elaborate on why you think this reordering is a problem for you?
One idea to enforce some ordering would be to use the dialog module in combination with reply routes and the textops(x) module.
About the shared memory question: Kamailio implements its own memory manager (private memory and shared memory pool).
Cheers,
Henning
--
Henning Westerholt - https://skalatan.de/blog/
Kamailio services - https://gilawa.com
From: sr-users <sr-users-bounces@lists.kamailio.org> On Behalf Of Luis Rojas G.
Sent: Tuesday, April 7, 2020 10:43 PM
To: sr-users@lists.kamailio.org
Subject: [SR-Users] Kamailio propagates 180 and 200 OK OUT OF ORDER
--
Luis Rojas
Software Architect
Sixbell
Los Leones 1200
Providencia
Santiago, Chile
Phone: (+56-2) 22001288
mailto:luis.rojas@sixbell.com
Hello, Henning,
I am worried about this scenario because it is a symptom of what may happen in other cases. For instance, I've seen that this operator usually sends re-INVITEs immediately after sending the ACK. This may create race conditions like the one in section 3.1.5 of RFC 5407:
https://tools.ietf.org/html/rfc5407#page-22
I'd understand it if this happened because of packet loss, as that is in UDP's nature, but in this case it would be artificially created by Kamailio. If there is no problem at the network level (packet loss, or packets following different paths through the network and arriving out of order), why does Kamailio create one?
I'd expect the shared memory to be used precisely for this. If a Kamailio process receives a 200 OK, it could check the shm and say "hey, another process is handling a 180 for this call. Let's wait for it to finish" (*). I know there could still be a problem if the process handling the 180 undergoes a context switch just after it receives the message but before it writes to shm, but this would greatly reduce the chance.
In our applications we use a SIP stack that always delivers messages to the application in the same order it receives them, even though it is multi-threaded and messages from the network are received by different threads. So the threads really do synchronize with each other. Why don't the Kamailio processes?
I am evaluating Kamailio as a dispatcher to balance load across our several Application Servers, so as to present to the operator just a couple of entry points to our platform (they don't want to establish connections to each one of our servers). This operator is very difficult to deal with. I am sure they will complain with something like "why are you sending messages out of order? Fix that". The operator will be able to look at traces and see that messages entered the Kamailio nodes in order and left out of order. They will not accept it.
(*) Not really "wait", as that would delay the processing of all messages. It should be more like putting the message on a queue, continuing to process other messages, and going back to the queue later.
Well, thanks for your answer.
Luis
Hi Luis,
Kamailio architecture isn't going to change I'm sure. There is no central orchestrator - each worker process just grabs messages as fast as it can. If your processing is slow for some and fast for others then they can get out of order I reckon. 180s are really neither here nor there if there's a 200 OK right behind it.
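To make the trade-off concrete: with several receiver processes per socket, nothing orders one process against another. The fragment below is a minimal sketch, not a recommendation. It assumes UDP only and reuses the listen address from the first message. It relies on the core parameter children setting the number of UDP receiver processes per listening socket, so that a single receiver should read and forward messages from that socket one at a time, at the cost of throughput.

```
# global parameters section of kamailio.cfg

# one UDP receiver process per listening socket: messages arriving on
# that socket are then read and processed sequentially, in arrival order
children=1

# address taken from the test setup described in the first message
listen=udp:192.168.253.4:5070
```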
Perhaps a proxy like Drachtio would work better for you?
Steve
Kamailio (SER) - Users Mailing List sr-users@lists.kamailio.org https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
If you think about it, if the 200 OK is so close to the 180 it doesn’t really matter from a signalling standpoint if the 180 comes first or if it arrives after the 200 OK. It’s the 200 OK that is important. If the 180 comes after, it’s simply ignored and the dialog is established successfully.
The 1xx is seldom significant (unless you have PRACK but that’s another story).
Or do you really have a situation where the 180 is critical?
/O
Yes, I know that in this specific case, from the point of view of SIP, it's not "much" of a problem. It's just a symptom that I can't rely on Kamailio to keep the order of messages when they are very close in time. With this customer (a Brazilian mobile operator) I have seen scenarios where they send a re-INVITE immediately after the ACK, and sometimes it has caused us problems. I can't think of another scenario right now, but I'm afraid of finding one in production. From what I can see, the async module, as it is now, could help me deal with requests. However, even though it's not a problem for SIP, the operator will complain; I know them. Also, they will not accept simply dropping the 180, because there will be interworking scenarios in which the 180 needs to carry the ISUP ACM body, with parameters such as the backward call indicators.
Luis
Agreed that you need to conserve signalling messages end-to-end, as a matter of principle; can't just drop things.
As I mentioned earlier, we had the same problem--ironically, with a major mobile operator--sending reinvites and e2e ACKs almost contemporaneously. We solved it with 'async' by delaying all reinvites by 50 ms, and haven't had a single complaint since.
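For readers wondering what "delaying all reinvites by 50 ms" could look like in a script, here is a minimal sketch. It is not the script referred to above. It assumes the tm, sl, rr, siputils, textops and async modules are loaded, that the core parameter async_workers and the async module's ms_timer parameter are set, and that async_ms_route() behaves as in the async module documentation. The route name REINVITE_DELAYED is hypothetical.

```
request_route {
    # ... sanity checks, CANCEL and retransmission handling omitted ...

    if (has_totag() && is_method("INVITE")) {
        # in-dialog INVITE (re-INVITE): park it for about 50 ms so that
        # the ACK sent just before it has time to be relayed first
        if (!async_ms_route("REINVITE_DELAYED", "50")) {
            # the request could not be suspended: handle it right away
            route(REINVITE_DELAYED);
        }
        exit;
    }

    # ... rest of the request handling ...
}

# hypothetical continuation route, executed after the delay
route[REINVITE_DELAYED] {
    if (!loose_route()) {
        sl_send_reply("404", "Not here");
        exit;
    }
    t_relay();
    exit;
}
```

The delay is applied only to in-dialog INVITEs, so initial INVITEs and every other request take the normal path.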
Aside from that specific scenario, we haven't seen ordering problems, and have never had cause to call into question whether we can 'rely' on Kamailio to conserve ordering.
There are valid questions raised in this thread about whether any user-space SIP element subject to the vicissitudes and realities of packet-switched networking can be relied upon to preserve ordering in a consistent and universal way.
At the end of the day, this is a network problem. PSTN interworking is imperfect; it imposes synchronous assumptions upon asynchronous media, and we see this play out in many areas, e.g. fax. Not much you can do about it, but thankfully the corner cases are relatively few.
-- Alex
Hello, Alex,
How specifically did you solve that scenario (ACK-Re-Invite)? Would you mind sharing that part of your script? I have been trying to do that, but can't make it work. I was working on the requests, to see how Async works.
If I put it just in request_route, for instance, where record_route is put :
    if (is_method("INVITE|SUBSCRIBE")) {
        if (is_method("INVITE")) {
            async_ms_sleep(20);
        }
        record_route();
    }
I get this error :
1(18720) ERROR: async [async_mod.c:246]: w_async_ms_sleep(): cannot be executed as last action in a route block
I tried in route[DISPATCH]
    route[DISPATCH] {
        # round robin dispatching on gateways group '1'
        if(!ds_select_dst("1", "0")) {
            send_reply("404", "No destination");
            exit;
        }
        xdbg("--- SCRIPT: going to <$ru> via <$du> (attrs: $xavp(_dsdst_=>attrs))\n");
        t_on_failure("RTF_DISPATCH");
        async_ms_sleep(10);
        forward();
        # route(RELAY);
        exit;
    }
I get this error :
17(18736) WARNING: <core> [core/async_task.c:244]: async_task_push(): async task pushed, but no async workers - ignoring
No async workers? Why? I have this at the beginning of file :
    loadmodule "async.so"
    modparam("async", "workers", 4)
    modparam("async", "ms_timer", 5)
Please, any advice.
Luis
Hi Luis,
Rather confusingly, there is an 'async_workers' parameter in the core as well, which needs to be set:
https://www.kamailio.org/wiki/cookbooks/5.3.x/core#async_workers
There is some relationship between 'async_workers' in the core and the 'workers' modparam in the 'async' module which is explained by Daniel somewhere in a past mailing list thread, but I do not remember it offhand. I really should dig it out and update the documentation with this nuance.
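For reference, a minimal sketch of the two settings side by side. The values are arbitrary, the comments reflect my reading of the core and 'async' module documentation, and it assumes the "no async workers" warning comes only from the missing core parameter:

```
# Core parameter, set in the global section (outside any route block).
# It starts the core's pool of async worker processes.
async_workers=4

loadmodule "tm.so"       # 'async' suspends and resumes transactions through tm
loadmodule "async.so"

# Module parameters, separate from the core setting above.
modparam("async", "workers", 4)
modparam("async", "ms_timer", 5)    # enables the millisecond timer used by async_ms_sleep()
```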
Having said that, I did not use the 'async' framework for my fix, but rather 'mqueue' and 'rtimer'. I have no real justification for that; just custom, habit and comfort with those mechanisms. I grew accustomed to them at a time when I had some issues with 100% CPU utilisation in a virtualised (Xen) environment when using early versions of the 'async' concepts.
1) The first thing is to create a reinvite queue; a single one will do, since it's specifically designed to be multiprocess-safe:
    loadmodule "mqueue"
    modparam("mqueue", "mqueue", "name=reinvite_q");
2) Then I create 12 'rtimer' processes to consume this queue, each having a 10,000 usec re-invocation delay:
    loadmodule "rtimer"
    modparam("rtimer", "timer", "name=reinvite_q1;interval=10000u;mode=1;")
    modparam("rtimer", "exec", "timer=reinvite_q1;route=REINVITE_DEQUEUE")
    modparam("rtimer", "timer", "name=reinvite_q2;interval=10000u;mode=1;")
    modparam("rtimer", "exec", "timer=reinvite_q2;route=REINVITE_DEQUEUE")
    modparam("rtimer", "timer", "name=reinvite_q3;interval=10000u;mode=1;")
    modparam("rtimer", "exec", "timer=reinvite_q3;route=REINVITE_DEQUEUE")
    modparam("rtimer", "timer", "name=reinvite_q4;interval=10000u;mode=1;")
    modparam("rtimer", "exec", "timer=reinvite_q4;route=REINVITE_DEQUEUE")
    modparam("rtimer", "timer", "name=reinvite_q5;interval=10000u;mode=1;")
    modparam("rtimer", "exec", "timer=reinvite_q5;route=REINVITE_DEQUEUE")
    modparam("rtimer", "timer", "name=reinvite_q6;interval=10000u;mode=1;")
    modparam("rtimer", "exec", "timer=reinvite_q6;route=REINVITE_DEQUEUE")
    modparam("rtimer", "timer", "name=reinvite_q7;interval=10000u;mode=1;")
    modparam("rtimer", "exec", "timer=reinvite_q7;route=REINVITE_DEQUEUE")
    modparam("rtimer", "timer", "name=reinvite_q8;interval=10000u;mode=1;")
    modparam("rtimer", "exec", "timer=reinvite_q8;route=REINVITE_DEQUEUE")
    modparam("rtimer", "timer", "name=reinvite_q9;interval=10000u;mode=1;")
    modparam("rtimer", "exec", "timer=reinvite_q9;route=REINVITE_DEQUEUE")
    modparam("rtimer", "timer", "name=reinvite_q10;interval=10000u;mode=1;")
    modparam("rtimer", "exec", "timer=reinvite_q10;route=REINVITE_DEQUEUE")
    modparam("rtimer", "timer", "name=reinvite_q11;interval=10000u;mode=1;")
    modparam("rtimer", "exec", "timer=reinvite_q11;route=REINVITE_DEQUEUE")
    modparam("rtimer", "timer", "name=reinvite_q12;interval=10000u;mode=1;")
    modparam("rtimer", "exec", "timer=reinvite_q12;route=REINVITE_DEQUEUE")
3) I then add this handling for reinvites to the loose_route() block in the main request route -- logging and other extraneous matter omitted:
---
if(has_totag()) {
    if(loose_route()) {
        if(is_method("INVITE")) {
            if(!t_suspend()) {
                sl_send_reply("500", "Internal Server Error");
                exit;
            }

            mq_add("reinvite_q", "$T(id_index):$T(id_label)", "");
        } else {
            # Normal in-dialog request handling, t_relay() and that.
            route(IN_DLG_REQ);
        }
    }
}
---
And the handler on the other side, when the transaction is reanimated, as it were:
---
route[REINVITE_DEQUEUE] {
    while(mq_fetch("reinvite_q")) {
        xlog("L_INFO", "[R-REINVITE-DEQUEUE:$ci] -> Resuming re-invite handling ($TV(Sn)) in PID $pp\n");

        $var(id) = $(mqk(reinvite_q){s.select,0,:}{s.int});
        $var(label) = $(mqk(reinvite_q){s.select,1,:}{s.int});

        xlog("L_INFO", "[R-REINVITE-DEQUEUE:$ci] -> Resuming re-invite handling ($TV(Sn)) in PID $pp\n");

        # Call route[IN_DLG_REQ] to do what we otherwise would have done
        # immediately, were it any other kind of in-dialog request.
        t_continue("$var(id)", "$var(label)", "IN_DLG_REQ");
        return;
    }
}
---
The idea behind 12 whole 'rtimer' processes is to massively overprovision the amount of reinvite handlers available relative to the actual number of reinvites passing through the system, even at high volumes.
The danger with too few processes is that the intended 10 ms delay may not apply, because while(mq_fetch(...)) will just spin, always dequeueing new reinvites and never allowing the execution route to return control to 'rtimer'.
As a practical matter, the system is not extremely high-volume and this is a very academic concern.
Overall, I would describe this method as crude but effective. It was developed some years ago. 'async' provides some simplification and removes some of the manual labour around this kind of management, from what I understand, and overall I think you'd find it easier to use that.
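A rough, untested sketch of what an 'async'-based equivalent might look like. It assumes the module's async_ms_route() function is available in the version in use, that 'async_workers' is set in the core, and that the rr, sl and other modules used by the original block are loaded elsewhere. REINVITE_RESUME is a made-up route name; IN_DLG_REQ is the same route as above, and 50 ms is the delay mentioned earlier in the thread:

```
async_workers=4

loadmodule "tm.so"
loadmodule "async.so"
modparam("async", "ms_timer", 5)

request_route {
    if(has_totag()) {
        if(loose_route()) {
            if(is_method("INVITE")) {
                # Suspend the transaction and resume it in
                # route[REINVITE_RESUME] after roughly 50 ms.
                # On success the script stops here; the branch
                # below is reached only on an internal error.
                if(!async_ms_route("REINVITE_RESUME", "50")) {
                    sl_send_reply("500", "Internal Server Error");
                }
                exit;
            } else {
                # Normal in-dialog request handling.
                route(IN_DLG_REQ);
            }
        }
    }
}

# Hypothetical resume route: do what would otherwise
# have been done immediately.
route[REINVITE_RESUME] {
    route(IN_DLG_REQ);
}
```

The queue, the 'rtimer' processes and the manual t_suspend()/t_continue() pair are replaced here by the module's own timer, at the cost of less control over how many processes service the delayed requests.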
¡Suerte! I'd be happy to answer any additional questions.
-- Alex
Nice hack! :)
-ovidiu
Hi, Alex,
Thanks for all the details. Wow, such a nice implementation. It requires a deep knowledge of Kamailio.
I see, this is the part where the transaction is saved.
mq_add("reinvite_q", "$T(id_index):$T(id_label)", "");
I didn't know much about pseudo-variables. I am reading now.
https://www.kamailio.org/wiki/cookbooks/5.3.x/pseudovariables#tmx_module_pse...
I guess, this approach could help me also in case of replies, not just requests, to add a delay to 200 OK to Invite.
By the way, I assigned a value to "async_workers", and now the Invite (I was delaying any Invite, not just reinvite, just to test the api) is propagated delayed, which is great, but at the same time a
SIP/2.0 500 I'm terribly sorry, server error occurred (1/TM)
is sent back to the origin, just killing the call.
I'll start playing with mqueue and rtimer.
Thanks again!
Luis
On Wed, Apr 15, 2020 at 08:38:42AM -0400, Luis Rojas G. wrote:
Thanks for all the details. Wow, such a nice implementation. It requires a deep knowledge of Kamailio.
Well, thank you -- I think that's much too generous. :-)
I see, this is the part where the transaction is saved.
mq_add("reinvite_q", "$T(id_index):$T(id_label)", "");
I didn't know much about pseudo-variables. I am reading now.
https://www.kamailio.org/wiki/cookbooks/5.3.x/pseudovariables#tmx_module_pse...
'Pseudovariables' is a confusing term, laden with historical baggage.
Historically, PVs referred to specific (and mostly read-only) variables which target parts of the message buffer, e.g.
https://www.kamailio.org/wiki/cookbooks/5.3.x/pseudovariables#ct_-_contact_h...
But now they just refer to variables of all kinds, including the namespace containers exposed by different modules (e.g. $T(...), $shv(...), $dlg_ctx(...)), $vars, $avps, you name it.
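A small illustrative sketch of the different kinds side by side. The variable name 'delay_ms' and the log text are made up; $T(id_index) and $T(id_label) come from the 'tmx' module and are only meaningful once a transaction exists:

```
request_route {
    # Classic PVs that read parts of the SIP message buffer.
    xlog("L_INFO", "Call-ID=$ci method=$rm R-URI=$ru\n");

    # Script variable, private to the current process.
    $var(delay_ms) = 50;

    # Module-provided container: tmx exposes the internal index and
    # label of the current transaction, as used in the mq_add() call.
    xlog("L_INFO", "transaction $T(id_index):$T(id_label), delay $var(delay_ms) ms\n");
}
```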
I guess, this approach could help me also in case of replies, not just requests, to add a delay to 200 OK to Invite.
I'm not sure that's possible. Remember that the request route is fully imperative; a request is dropped into it, and the list of actions in the config script determines what happens to it, including any relaying (or lack thereof).
onreply_routes, by contrast, are callbacks/hooks from normal, built-in transaction-stateful behaviour; they expose the reply and allow you to modify or drop it, but otherwise they are optional, because the proxy already has an automatic reply-handling behaviour which will play out without manual invocation from you.
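To make the contrast concrete, a minimal sketch of how such a hook is typically armed, assuming the tm, sl and xlog modules are loaded. The route names RELAY and REPLY_WATCH are illustrative only:

```
route[RELAY] {
    # Arm the callback before relaying; tm invokes it for
    # replies that match this transaction.
    t_on_reply("REPLY_WATCH");
    if(!t_relay()) {
        sl_reply_error();
    }
    exit;
}

onreply_route[REPLY_WATCH] {
    if(status == "180") {
        xlog("L_INFO", "[$ci] 180 Ringing seen in PID $pp\n");
    }
    # Nothing here forwards the reply explicitly: once the block
    # ends, the built-in reply handling carries on by itself.
}
```

Note that nothing in the callback decides when the reply is sent onward, which is why inserting a delay there is not as straightforward as in the request route.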
By the way, I assigned a value to "async_workers", and now the Invite (I was delaying any Invite, not just reinvite, just to test the api) is propagated delayed, which is great, but at the same time a
SIP/2.0 500 I'm terribly sorry, server error occurred (1/TM)
This sounds like possibly an attempt to relay twice, but I'm not sure.
-- Alex