Re: [sr-dev] Problem with TCP and EPOLL

17 Feb 2012

Hello,

On 2/17/12 11:35 AM, Paul Pankhurst wrote:
...
  I now understand what is going wrong....

 To make the xcap server work with the size of documents generated by 
 the SIP client, I had to significantly increase the size of 
 tcp_rd_buf_size.
 Increasing this value is what causes the problem described.
 Returning tcp_rd_buf_size to its default size resolves the problem,
 but causes the upload of documents to the xcap server to fail.

 One way of solving this would be to allow the buffer size to be 
 settable on a per connection basis, or perhaps separately for local 
 connections.
 Does anyone have any thoughts, or other suggestions?

Perhaps the size of the buffer has to stay big in order to be able to
receive mixed sip-xcap traffic. However, detection whether it is sip or
http is done in tcp read code, so maybe the solution is to have a limit
for read size and set it lower for sip, larger for http. I haven't
checked the source code to see if it is possible, though.

If not, the only way I see now is to use different listen sockets (e.g., 
ports) and based on that, set the read buffer size, maybe similar to the 
new option I added for worker processes per socket.
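
A minimal sketch in C of the two ideas above: pick a smaller read limit for sip and a larger one for http, with a dedicated listen port as an override. Everything here is hypothetical and for illustration only: the names `guess_proto` and `read_limit_for`, the limit values, the port number and the " HTTP/" heuristic are assumptions, not taken from the tcp read code, which has not been checked.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical values, chosen only for illustration. */
#define SIP_READ_LIMIT   4096u    /* small cap for SIP messages */
#define HTTP_READ_LIMIT  65536u   /* larger cap for HTTP/XCAP uploads */
#define XCAP_PORT        5080     /* assumed dedicated XCAP listen port */

enum proto_hint { PROTO_SIP, PROTO_HTTP };

/* Guess the protocol from the first line of a request: a request line
 * containing " HTTP/" is treated as HTTP, anything else as SIP.
 * This heuristic is an assumption, not the real detection logic. */
static enum proto_hint guess_proto(const char *buf, size_t len)
{
    static const char marker[] = " HTTP/";
    const size_t mlen = sizeof(marker) - 1;
    size_t i;

    assert(buf != NULL);
    /* Scan only the first line; stop at the first newline. */
    for (i = 0; i + mlen <= len && buf[i] != '\n'; i++) {
        if (memcmp(buf + i, marker, mlen) == 0)
            return PROTO_HTTP;
    }
    return PROTO_SIP;
}

/* Pick the read limit: traffic on the dedicated port always gets the
 * large limit; otherwise the detected protocol decides. */
static size_t read_limit_for(enum proto_hint proto, unsigned short local_port)
{
    if (local_port == XCAP_PORT)
        return HTTP_READ_LIMIT;
    return proto == PROTO_HTTP ? HTTP_READ_LIMIT : SIP_READ_LIMIT;
}
```

With these assumed values, a request line such as "PUT /xcap-root/doc HTTP/1.1" would get the large limit, while an INVITE arriving on port 5060 would get the small one.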

Cheers,
Daniel

...

 Thanks

 Paul

 -----Original Message----- From: Andrei Pelinescu-Onciul
 Sent: Thursday, February 16, 2012 1:53 PM
 To: Daniel-Constantin Mierla
 Cc: Development mailing list of the sip-router project ; Paul Pankhurst
 Subject: Re: [sr-dev] Problem with TCP and EPOLL

 On Feb 15, 2012 at 12:05, Daniel-Constantin Mierla <miconda(a)gmail.com>
 wrote:
  Hello,

 I am cc-ing Andrei, since he authored that part, maybe he is
 available these days and can give a quick answer regarding the
 issue.

 Cheers,
 Daniel

 On 2/14/12 6:06 PM, Paul Pankhurst wrote:
 >Sorry this was originally posted incorrectly, so I'm reposting....
 >
 >I have been having problems with TCP under load. What I have been
 >seeing is TCP buffers failing to be serviced and, when wr_timeout
 >exceeds the configured value for tcp_send_timeout, kamailio kills the
 >connection. Increasing tcp_send_timeout doesn't help, even setting this
 >to a big value (such as 45 seconds) just delays the disconnection.
 >
 >Putting some tracing into the code shows that wbufq_add() is repeatedly
 >called, but wbufq_run() is called for that connection far less than I
 >would expect. wbufq_run() is frequently called for other connections.
 >It looks like wbufq_run() doesn't get called when lots of wbufq_add()s
 >are happening for a connection? wbufq_run() only appears to be called
 >for a connection after some time has passed from the last wbufq_add().

 It's called when the kernel says it can write again on the respective
 socket.
 It might be that your consumer cannot read fast enough and so the
 buffers fill on ser/kamailio side.
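
The kernel behaviour described above can be observed with a small Linux-only sketch in C. It is not Kamailio code: it uses a Unix socketpair as a stand-in for a TCP connection, and the helper names (`fill_until_blocked`, `drain`, `writable_now`, `demo_writability`) are made up for the example. The writer fills the socket until the kernel returns EAGAIN, at which point level-triggered epoll no longer reports EPOLLOUT; once the reader drains the data, EPOLLOUT is reported again.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode; returns 0 on success. */
static int set_nonblock(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    return flags < 0 ? -1 : fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Write until the kernel refuses more data (EAGAIN).
 * Returns the number of bytes queued, or -1 on any other error. */
static long fill_until_blocked(int fd)
{
    char chunk[4096];
    long total = 0;
    ssize_t n;

    memset(chunk, 'x', sizeof(chunk));
    for (;;) {
        n = write(fd, chunk, sizeof(chunk));
        if (n > 0) {
            total += n;
            continue;
        }
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return total;
        return -1;
    }
}

/* Read everything currently pending, as a fast consumer would. */
static long drain(int fd)
{
    char chunk[4096];
    long total = 0;
    ssize_t n;

    while ((n = read(fd, chunk, sizeof(chunk))) > 0)
        total += n;
    return total;
}

/* 1 if epoll reports the socket writable right now, 0 if not, -1 on error. */
static int writable_now(int epfd)
{
    struct epoll_event ev;
    int n = epoll_wait(epfd, &ev, 1, 0);   /* timeout 0: only sample the state */

    if (n < 0)
        return -1;
    return (n == 1 && (ev.events & EPOLLOUT)) ? 1 : 0;
}

/* Fill the writer side while the reader is idle, then let the reader
 * catch up, sampling writability before and after. Returns 0 on success. */
static int demo_writability(int *before, int *after)
{
    int sv[2], epfd, rc = -1;
    struct epoll_event ev;
    long queued, drained;

    assert(before != NULL && after != NULL);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    epfd = epoll_create1(0);
    if (epfd < 0 || set_nonblock(sv[0]) < 0 || set_nonblock(sv[1]) < 0)
        goto out;

    memset(&ev, 0, sizeof(ev));
    ev.events = EPOLLOUT;                 /* level-triggered by default */
    ev.data.fd = sv[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev) < 0)
        goto out;

    queued = fill_until_blocked(sv[0]);   /* reader is idle: buffers fill */
    *before = writable_now(epfd);
    drained = drain(sv[1]);               /* reader catches up */
    *after = writable_now(epfd);
    rc = (queued > 0 && drained == queued) ? 0 : -1;
out:
    if (epfd >= 0)
        close(epfd);
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

In this sketch nothing wakes the writer while the reader is idle, which is consistent with a flush routine that only runs once the peer has read some data.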

 >
 >The connection in question is a local loopback between the RLS and
 >Presence modules (both running in the same Kamailio instance). However,
 >it may just be a coincidence that this is the affected connection as it
 >is also the one with the most traffic.

 You might do something much more resource intensive on the receive side
 and it might not be able to keep up with the traffic (one connection is
 handled by one process, so if that process is too slow for some reason
 it might not read fast enough => on the transmit side the send buffers
 will fill-up).
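
The argument above comes down to simple arithmetic, shown here as a deliberately simplified model in C. It is not Kamailio code; the function name, the per-tick rates and the capacity are hypothetical. One sender queues `produce` bytes per tick, the single reading process consumes at most `consume` bytes per tick, and the send side can hold `capacity` bytes.

```c
#include <assert.h>

/* Returns the first tick at which the backlog exceeds `capacity`,
 * or -1 if that never happens within `max_ticks`. */
static long ticks_until_full(long produce, long consume,
                             long capacity, long max_ticks)
{
    long backlog = 0;
    long t;

    assert(produce >= 0 && consume >= 0 && capacity >= 0);
    for (t = 1; t <= max_ticks; t++) {
        backlog += produce;                               /* sender queues data */
        backlog -= (consume < backlog) ? consume : backlog; /* reader drains what it can */
        if (backlog > capacity)
            return t;                                     /* send buffers are full */
    }
    return -1;
}
```

With 100 bytes produced and 80 consumed per tick, a capacity of 1000 is exceeded at tick 51 and a capacity of 2000 at tick 101, so a larger limit only postpones the failure, much as the larger tcp_send_timeout in the report only delayed the disconnection. If the reader keeps up (100 produced, 100 consumed), the function returns -1.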

 >
 >My suspicion is that the bug is in the io_wait_loop_epoll() routine.

 You could try changing the poll method and see if that makes any
 difference, e.g.:
 tcp_poll_method = sigio_rt in the .cfg file.
 The default is epoll_lt, so try "epoll_et", "sigio_rt" and maybe "poll"
 (slow for lots of connections).

 Andrei

Can anybody with experience of this part of the code help?

Paul Pankhurst
Engineering Director
Crocodile RCS Ltd

_______________________________________________
sr-dev mailing list
sr-dev(a)lists.sip-router.org
http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev 
 -- 
 Daniel-Constantin Mierla -- http://www.asipto.com
 http://linkedin.com/in/miconda -- http://twitter.com/miconda

