Hi,
I recently ran into an issue with a single TCP connection between two Kamailio servers:

1. KAM1 sends two forked INVITEs to KAM2.
2. KAM2 starts config route processing for INVITE1, but blocks for ~1s because the rtpengine module is pinging some inactive IPs.
3. KAM1 retransmits forked INVITE2.

Worth mentioning:

1. KAM2 uses the same TCP connection for receiving KDMQ requests too. During that period I noticed the KDMQ default_callback error being triggered due to a timeout, so clearly no KDMQ requests were processed during that time.
2. No errors related to the TCP connection were logged.
3. Kamailio version is 5.8 with tcp_reuse_port=yes; route_locks_size is not set, and 4 socket workers are used for that specific TCP connection.

I spent quite a while in tcp_main.c and tcp_read.c trying to figure out how TCP connections are handled in general, and came to the following conclusion:

The TCP connection structure is held by a TCP socket worker process until the SIP request has been completely received in the buffer, parsed *and* run through the routing config. Only afterwards does the TCP socket worker release the TCP connection structure, by signalling this back to the TCP_MAIN process. At that point another TCP socket worker can handle the *next* SIP request on *the same* TCP connection.

...but while one TCP socket worker is executing the config route, no other TCP socket worker can handle the *next* SIP request on *the same* TCP connection.
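
To make that conclusion concrete, here is a toy model in Python. It is not Kamailio code: the function `simulate`, its `hold_conn_during_route` flag, and the millisecond timings are all made up for illustration, and the "release early" mode is purely hypothetical, included only for contrast. It assumes one connection, a pool of workers, and that a request can start only once both a worker and the connection are free.

```python
def simulate(requests, n_workers, hold_conn_during_route=True):
    """Toy model of requests arriving on ONE TCP connection.

    requests: list of (arrival_ms, route_ms) tuples, in arrival order.
    Returns the time (ms) at which each request starts route processing.
    Hypothetical illustration only, not how Kamailio is implemented.
    """
    worker_free_at = [0] * n_workers   # when each worker becomes idle
    conn_free_at = 0                   # when the connection is released
    starts = []
    for arrival, route_ms in requests:
        # Pick the worker that becomes idle first.
        w = min(range(n_workers), key=worker_free_at.__getitem__)
        # A request needs an idle worker AND the released connection.
        start = max(arrival, worker_free_at[w], conn_free_at)
        end = start + route_ms
        worker_free_at[w] = end
        # Assumed behaviour: the connection stays owned by the worker
        # until routing finishes. The alternative releases it at once.
        conn_free_at = end if hold_conn_during_route else start
        starts.append(start)
    return starts


# INVITE1 blocks ~1s in its route; INVITE2 and a KDMQ arrive shortly after.
traffic = [(0, 1000), (10, 5), (20, 1)]
blocked = simulate(traffic, n_workers=4, hold_conn_during_route=True)
contrast = simulate(traffic, n_workers=4, hold_conn_during_route=False)
print(blocked)
print(contrast)
```

In the first mode, three of the four workers sit idle while INVITE2 and the KDMQ wait behind INVITE1, which would match both the retransmission and the KDMQ timeout described above.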

My questions are:

1. Is the above conclusion correct? It would explain the issue described above, and I want to double-check that I understood the core TCP code correctly.
2. Can async socket workers solve this?

Thank you,
Stefan