Hello Daniel,

Related to this module, I ran into an issue this week on a different topic.
We have been testing this module with the application we wrote to send events from Kamailio to Kafka, and it was working fine.
But the other day I tried to send messages to a damaged Kafka cluster, so the application got congested and the socket between Kamailio and the application had some issues.

I didn't see the Send-Q buffer grow on the socket from Kamailio's point of view, but I did see Recv-Q grow on the socket from the application's point of view:
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name     Timer
tcp    22802      0 APP_IP:40068      KAMAILIO_IP:8448      ESTABLISHED 128595/evapi_kafka   off (0.00/0/0)

After a while of timeouts when trying to produce to Kafka, with the socket buffer congested, Kamailio stopped processing any UDP messages.

Would you have any idea about the reason for this behaviour?
I would expect the evapi socket not to interfere with the UDP receivers.

thanks a lot and regards
david


On Tue, 15 Dec 2020 at 15:05, David Escartin (<descartin@sonoc.io>) wrote:
Hello Daniel

Thanks a lot for the feedback.
I tried the netstring option, but since I had to reparse in the application, I didn't use it in the end.
But I will reconsider it, since I will be testing the performance under heavy load during the next days.

About the tcp_nodelay option, I remember seeing no frames with more than one message (JSON object) relayed to evapi and sent out to the socket :$, but in any case, I removed that option and changed the application side.

best regards
david

On Fri, 11 Dec 2020 at 13:51, Daniel-Constantin Mierla (<miconda@gmail.com>) wrote:

Hello,

TCP is a streaming protocol; it is the receiving application that has to parse the stream content and split it into logical messages from its point of view. As you noticed, socket options can enforce some sending policy (whether or not to wait for a specific amount of data to accumulate, so as not to send too many small packets), but when too much data is written in a very short time, they make no difference.
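
As an illustration of what splitting the stream on the receiving side can look like when the payloads are bare JSON objects with nothing between them (as in the log excerpt from the original report), here is a minimal C sketch. It is not taken from Kamailio or from the application discussed in this thread: `json_scan_t` and `json_scan_feed` are hypothetical names, and the sketch assumes every message is exactly one top-level JSON object.

```c
#include <stdbool.h>
#include <stddef.h>

/* Scanner state that must persist across reads, because one JSON object
 * may arrive split over several recv() calls. (Hypothetical helper.) */
typedef struct {
    int depth;      /* current nesting level of { } */
    bool in_string; /* currently inside a "..." string literal */
    bool escaped;   /* previous byte was a backslash inside a string */
} json_scan_t;

/* Feed `len` bytes from `buf`. Returns the number of bytes up to and
 * including the '}' that closes the current top-level object, or 0 if the
 * object is not complete yet (the state is kept for the next call). */
size_t json_scan_feed(json_scan_t *st, const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        char c = buf[i];

        if (st->in_string) {
            /* Braces inside string values must not be counted. */
            if (st->escaped)
                st->escaped = false;
            else if (c == '\\')
                st->escaped = true;
            else if (c == '"')
                st->in_string = false;
            continue;
        }

        if (c == '"') {
            st->in_string = true;
        } else if (c == '{') {
            st->depth++;
        } else if (c == '}' && st->depth > 0) {
            st->depth--;
            if (st->depth == 0)
                return i + 1; /* one complete object ends here */
        }
    }
    return 0; /* need more data */
}
```

The caller would append each read to a buffer, call `json_scan_feed` on the not-yet-scanned bytes, hand every completed object to the JSON parser and keep the remainder for the next read. A length-prefixed or delimiter-based framing is simpler and more robust than scanning for braces.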

As a side note, your patch below sets TCP_NODELAY on the listen socket, not on the socket associated with the client connection (the one returned by accept()). I do not think it is inherited by accept() from the listen socket, in which case your patch practically made no difference in behaviour.
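
For reference, setting the option on the connection socket could look like the generic POSIX sketch below. It is not the evapi code, and `accept_nodelay` is a hypothetical helper name. Whether accept() inherits TCP_NODELAY from the listening socket depends on the platform, so setting it explicitly on the accepted descriptor removes the doubt.

```c
#include <stddef.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Accept one client on `listen_fd` and disable Nagle's algorithm on the
 * connection socket itself. Returns the client descriptor, or -1 on error.
 * (Hypothetical helper, for illustration only.) */
int accept_nodelay(int listen_fd)
{
    int yes = 1;
    int fd = accept(listen_fd, NULL, NULL);

    if (fd < 0)
        return -1;

    /* The option is applied to the descriptor returned by accept(),
     * not to the listening socket. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &yes, sizeof(yes)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Even with the option set, the receiver can still get several messages in one read, so it does not replace message framing.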

The evapi module has an option (modparam) to serialize the packets in netstring format, which makes it easier to split the stream into messages, and I would recommend that mode for heavy traffic.
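
To show why netstring framing is easy to split: a netstring is the payload length in decimal, a colon, the payload bytes and a trailing comma, for example `5:hello,`. The C sketch below decodes one such frame from a receive buffer. `netstring_decode` and the `NETSTRING_MAX` limit are hypothetical names chosen for this example, not part of Kamailio, and the sketch assumes the sender follows the standard netstring layout.

```c
#include <stddef.h>

/* Arbitrary upper bound on the payload size accepted by this sketch. */
#define NETSTRING_MAX (1u << 20)

/* Try to decode one netstring ("<len>:<payload>,") from buf[0..len).
 * On success, sets *payload and *payload_len and returns the total number
 * of bytes consumed. Returns 0 if more data is needed, -1 if malformed. */
long netstring_decode(const char *buf, size_t len,
                      const char **payload, size_t *payload_len)
{
    size_t i = 0;
    size_t n = 0;

    /* Parse the decimal length prefix. */
    while (i < len && buf[i] >= '0' && buf[i] <= '9') {
        n = n * 10 + (size_t)(buf[i] - '0');
        if (n > NETSTRING_MAX)
            return -1;
        i++;
    }
    if (i == len)
        return 0; /* length prefix (or its ':') has not fully arrived */
    if (i == 0 || buf[i] != ':')
        return -1; /* no digits, or digits not followed by ':' */
    i++; /* skip ':' */

    if (len - i < n + 1)
        return 0; /* payload or trailing ',' has not fully arrived */
    if (buf[i + n] != ',')
        return -1;

    *payload = buf + i;
    *payload_len = n;
    return (long)(i + n + 1);
}
```

A receiver would call this in a loop on its accumulated buffer, drop the consumed bytes after each positive return value, and wait for more data when it returns 0. Because the length is known up front, the payload never has to be scanned for delimiters.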

In the end, we can introduce modparams for evapi to set TCP_NODELAY. It can be useful to reduce delays when sending short data messages from time to time, but it won't help in your case.

Cheers,
Daniel

On 11.12.20 09:39, David Escartin wrote:
Dear all

It seems the issue was not in the module or related to Kamailio, but in the application we were using to read from the TCP socket.
I saw that some messages sent with evapi_relay were encapsulated in the same frame, and I even tried to force the TCP_NODELAY option on the evapi socket by compiling Kamailio with this patch:
--- a/src/modules/evapi/evapi_dispatch.c
+++ b/src/modules/evapi/evapi_dispatch.c
@@ -30,8 +30,8 @@
 #include <netinet/in.h>
 #include <arpa/inet.h>
 #include <fcntl.h>
-
 #include <ev.h>
+#include <netinet/tcp.h>
 
 #include "../../core/sr_module.h"
 #include "../../core/dprint.h"
@@ -690,6 +691,15 @@ int evapi_run_dispatcher(char *laddr, int lport)
                freeaddrinfo(ai_res);
                return -1;
        }
+      
+        if(setsockopt(evapi_srv_sock, IPPROTO_TCP, TCP_NODELAY,
+               &yes_true, sizeof(int)) < 0) {
+               LM_INFO("cannot set TCP_NODELAY option on descriptor\n");
+               close(evapi_srv_sock);
+               freeaddrinfo(ai_res);
+               return -1;
+       }
+
 
        if (bind(evapi_srv_sock, ai_res->ai_addr, ai_res->ai_addrlen) < 0) {
                LM_ERR("cannot bind to local address and port [%s:%d]\n", laddr, lport);

and I saw that with this change we always had one frame for each message published to evapi, but the issue was still there.
So regardless of whether this option was enabled in Kamailio, I had to tune the application (in Erlang) to delimit the received messages by reading them in line mode. This way we could reach up to 1000 processed messages per second.

best regards
david

 

On Mon, 30 Nov 2020 at 11:19, David Escartin (<descartin@sonoc.io>) wrote:
Dear all

We have been testing this module with the following setup:
Kamailio 5.3.2
evapi params:
modparam("evapi", "workers", 4)
modparam("evapi", "netstring_format", 0)
modparam("evapi", "bind_addr", "127.0.0.1:8448")
modparam("evapi", "max_clients", 32)

Then in the configuration we call evapi_relay with an AVP containing JSON data (which can be quite long), like this:
{"key" : "aarp2q0tcpqhs0cpucuhukjs2ah2j00q@10.18.5.64" , "msg" : {"rg_in":"701","ani_init":{"ani_source":"pai", ....... }}}

We have an application listening on the TCP socket and writing those messages to a Kafka cluster. This works fine, and no issue was found in the manual tests we did previously.
But when running some load tests and passing some live traffic, we see some issues.

It seems that sometimes, when several messages are to be sent to the TCP socket at the same time, they are sent together in one message, whereas normally each payload sent with evapi_relay goes out as one message.
We sometimes see something like this in the application consuming from the TCP socket:
2020-11-25 15:20:01.744 UTC [error] <0.706.0>@evapi_kafka_listener:handle_info:167 body "{\"key\" : \"6142651aa63616c6c04a783cd@72.21.24.130\" , \"msg\" : {\"rg_in\":\"677\",\"ani_init\":{\"ani_source\":\"fro\",.......}}}{\"key\" : \"isbc7caT4001915251VabcGhEfHdNiF0i@172.16.120.1\" , \"msg\" : {\"rg_in\":\"22\",\"ani_init\":{\"ani_source\":\"pai\", ....... ,\"translate" not valid json; error = {691,invalid_trailing_data}
2020-11-25 15:20:01.745 UTC [error] <0.706.0>@evapi_kafka_listener:handle_info:167 body "dPartition\":\"-1\",......}}}" not valid json; error = {1,invalid_json}

and we see that the application cannot parse the JSON message, because two JSON objects arrive together: ......{\"ani_source\":\"fro\",.......}}}{\"key\" : \"isbc7caT4001915251Vabc............
This happens with 2 different UDP receivers processing messages and calling evapi_relay at the same time, but I don't think it happens all the time. It looks like some issue when several processes try to use the evapi workers at the same time.
We tried increasing the number of evapi workers and the result is the same.

We also saw what I think is another issue. It seems that when the AVP sent to the evapi socket is longer than ~1680 characters, the JSON is also truncated, and this also happens when we use the socket on the lo interface, which has an MTU of 65535.

Could you please take a look to see if there is any problem or limitation, or if we are using something wrong?

thanks and best regards 
david

--
David Escartín Almudévar
VoIP/Switch Engineer
descartin@sonoc.io

SONOC
C/ Josefa Amar y Borbón, 10, 4ª · 50001 Zaragoza, España
Tlf: +34 917019888 ·
 www.sonoc.io



_______________________________________________
Kamailio (SER) - Development Mailing List
sr-dev@lists.kamailio.org
https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-dev
-- 
Daniel-Constantin Mierla -- www.asipto.com
www.twitter.com/miconda -- www.linkedin.com/in/miconda
Funding: https://www.paypal.me/dcmierla

