[sr-dev] Problem with TCP and EPOLL

Andrei Pelinescu-Onciul andrei at iptel.org
Tue Feb 21 17:20:42 CET 2012


On Feb 17, 2012 at 10:35, Paul Pankhurst <paul at crocodile-rcs.com> wrote:
> I now understand what is going wrong....
> 
> To make the xcap server work with the size of documents generated by
> the SIP client, I had to significantly increase the size of
> tcp_rd_buf_size.
> Increasing this value is what causes the problem described.
> Returning tcp_rd_buf_size to its default size resolves the problem,
> but causes the upload of documents to the xcap server to fail.
> 
> One way of solving this would be to allow the buffer size to be
> settable on a per connection basis, or perhaps separately for local
> connections.
> Does anyone have any thoughts, or other suggestions?

It's very strange that increasing the _receive_ buffer size would cause
problems.
Have you also tried increasing tcp_conn_wq_max (e.g. to 128k) and possibly
tcp_wq_max?
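
As a rough sketch, that tuning could look like this in the .cfg file. All
values are illustrative assumptions, not tested recommendations: 131072 is
just the 128k mentioned above, and the other two numbers are placeholders.

```cfg
# Illustrative values only (assumptions, not recommendations).

# Receive buffer, enlarged for big xcap documents (placeholder value).
tcp_rd_buf_size = 65536

# Per-connection write queue limit in bytes: 128k, as suggested above.
tcp_conn_wq_max = 131072

# Global limit in bytes on data queued for write across all connections
# (placeholder value).
tcp_wq_max = 20971520
```

A larger write queue presumably only gives a slow reader more room before
queued writes are rejected; it does not make the reader any faster.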

What's the output of sercmd core.tcp_info when you start to see
problems (ideally just before that)?

Andrei

> 
> Thanks
> 
> Paul
> 
> -----Original Message----- From: Andrei Pelinescu-Onciul
> Sent: Thursday, February 16, 2012 1:53 PM
> To: Daniel-Constantin Mierla
> Cc: Development mailing list of the sip-router project ; Paul Pankhurst
> Subject: Re: [sr-dev] Problem with TCP and EPOLL
> 
> On Feb 15, 2012 at 12:05, Daniel-Constantin Mierla
> <miconda at gmail.com> wrote:
> >Hello,
> >
> >I am cc-ing Andrei, since he authored that part, maybe he is
> >available these days and can give a quick answer regarding the
> >issue.
> >
> >Cheers,
> >Daniel
> >
> >On 2/14/12 6:06 PM, Paul Pankhurst wrote:
> >>Sorry this was originally posted incorrectly, so I'm reposting....
> >>I have been having problems with TCP under load.  What I have been
> >>seeing is TCP buffers failing to be serviced and, when wr_timeout
> >>exceeds the configured value for tcp_send_timeout, kamailio kills the
> >>connection.  Increasing tcp_send_timeout doesn't help; even setting
> >>this to a large value (such as 45 seconds) just delays the
> >>disconnection.
> >>
> >>Putting some tracing into the code shows that wbufq_add() is repeatedly
> >>called, but wbufq_run() is called for that connection far less than I
> >>would expect.  wbufq_run() is frequently called for other connections.
> >>It looks like wbufq_run() doesn't get called when lots of wbufq_add()s
> >>are happening for a connection?  wbufq_run() only appears to be called
> >>for a connection after some time has passed from the last wbufq_add().
> 
> It's called when the kernel says it can write again on the respective
> socket.
> It might be that your consumer cannot read fast enough and so the
> buffers fill on ser/kamailio side.
> 
> >>
> >>The connection in question is a local loopback between the RLS and
> >>Presence modules (both running in the same Kamailio instance).  However,
> >>it may just be a coincidence that this is the affected connection as it
> >>is also the one with the most traffic.
> 
> You might do something much more resource intensive on the receive side
> and it might not be able to keep up with the traffic (one connection is
> handled by one process, so if that process is too slow for some reason
> it might not read fast enough => on the transmit side the send buffers
> will fill up).
> 
> >>
> >>My suspicion is that the bug is in the io_wait_loop_epoll() routine.
> 
> You could try changing the poll method and see if that makes any
> difference, e.g.:
> tcp_poll_method = sigio_rt in the .cfg file.
> The default is epoll-lt, so try "epoll-et", "sigio_rt" and maybe "poll"
> (slow for lots of connections).
> 
> 
> Andrei
> 
> >>
> >>Can anybody with experience of this part of the code help?
> >>
> >>Paul Pankhurst
> >>Engineering Director
> >>Crocodile RCS Ltd
> >>
> >>
> >>_______________________________________________
> >>sr-dev mailing list
> >>sr-dev at lists.sip-router.org
> >>http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
> >
> >-- 
> >Daniel-Constantin Mierla -- http://www.asipto.com
> >http://linkedin.com/in/miconda -- http://twitter.com/miconda
> >


