[Serdev] TCP problems

Andrei Pelinescu-Onciul andrei at iptel.org
Mon Jan 15 16:55:57 UTC 2007


On Jan 12, 2007 at 14:17, Klaus Darilion <klaus.mailinglists at pernau.at> wrote:
> Nils Ohlmeier wrote:
> >On Friday 12 January 2007 11:01, Klaus Darilion wrote:
> >>Yes, I've seen this problem too. Somehow we get a deadlock. The
> >>applications don't read anymore as they are blocked during sending.
> >>
> >>Not sure if the problem is in ser, sipp or in both applications - I
> >>could not find a solution. Maybe you have more luck.
> >>
> >>Nevertheless it shows that an application must be programmed in a
> >>non-blocking way.
> >
> >Sorry just for my own understanding: how should non-blocking sending 
> >solve this problem?
> >If the application sends in a non-blocking way and the OS can't send the 
> >data over the link as fast as the application is trying to send, the OS 
> >has to store the data in some kernel internal buffer, right?
> Right
> >So what happens if this internal buffer is full? I guess the OS would 
> >either give an error back to the application or would block the 
> >application until the internal buffer has enough space to take the new 
> >data.
> Right
> >So to me it seems like sending in a non-blocking way only increases the 
> >delay until the application "recognizes" the problem (unless the link 
> >somehow speeds up in the meantime and the internal buffer is freed - then 
> >the buffer would bridge the short "outage" of the link - but we are not 
> >talking here about short "outages" of links).
> 
> The problem is that when an application blocks because it waits until 
> the sending socket accepts new data, the application often also stops 
> reading from the input sockets (as the sending part waits).
> 
> I think the problem is the following (please correct me if I'm wrong).
> Example: sipp sends INVITE to ser over TCP. ser replies 404.
> 
> Under heavy load ser can't read fast enough from the socket and 
> ser's OS sets the window size to 0. Thus, sipp's OS does not send 
> anymore. sipp still writes into the socket until the buffer is full and 
> sipp blocks - thus, sipp does not read anymore from the TCP socket.
> 
> As sipp does not read anymore, sipp's OS sets the window size to 0 as 
> the input buffer is full. ser still tries to send the 404 responses, 
> which gives an error after some timeout. Then ser thinks the TCP 
> connection is broken and establishes a new TCP connection.
> 
> Then something inside ser starts to block (either the new 
> TCP connection setup or the sending on the new TCP stream - I do not 
> know which). As ser blocks, it does not read anymore from the 
> input buffer, the window is still 0 and sipp can't send. Thus sipp does 
> not read, thus ser can't send and is still blocking.
> 
> If you stop sipp, it takes some time until ser starts working again.
> I'm not sure if this problem is still in ottendorf, but I have at least 
> seen it in ser stable and openser, and it is easy to reproduce 
> (INVITE-404 over TCP).
> 

ser should never remain blocked. Both sends and connects have timeouts,
and the receives are done independently.
Have you tried changing tcp_send_timeout and tcp_connect_timeout to a
lower value (this might be the interval you see until ser starts working
again)?
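
To make the effect of a send timeout concrete, here is a minimal Python
sketch (not ser's code; the helper name send_with_timeout and the values
are only illustrative assumptions). A blocking send to a peer that never
reads gives up after the timeout instead of hanging forever:

```python
import socket


def send_with_timeout(sock, data, timeout_s):
    """Try to send all of `data`; give up after `timeout_s` seconds.

    Returns True if everything was handed to the kernel, False if the
    send stalled (for example because the peer stopped reading).
    """
    sock.settimeout(timeout_s)
    try:
        sock.sendall(data)
        return True
    except socket.timeout:
        # The peer's receive buffer (and our send buffer) stayed full.
        return False


if __name__ == "__main__":
    a, b = socket.socketpair()  # b never reads, like a stalled peer
    ok = send_with_timeout(a, b"x" * (8 * 1024 * 1024), 0.2)
    print("send completed" if ok else "send timed out")
    a.close()
    b.close()
```

The lower the timeout, the sooner the sender notices the stalled peer and
can go back to other work.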
Apart from this delay, the other problem might be explained entirely by
sipp's behaviour (it does not receive while a send blocks).
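
For comparison, a client that keeps receiving while its send is stalled
could look roughly like the following Python sketch. This is neither
sipp's nor ser's code; the function name pump and the max_wait parameter
are hypothetical. It uses a non-blocking socket and select() so that a
full send buffer never stops the reads:

```python
import select
import socket
import time


def pump(sock, outgoing, max_wait=1.0):
    """Send `outgoing` without blocking and keep reading meanwhile.

    Returns (bytes_sent, bytes_received). Stops when everything is
    sent, the peer closes, or `max_wait` seconds have passed.
    """
    sock.setblocking(False)
    pending = memoryview(outgoing)
    sent_total = 0
    received = bytearray()
    deadline = time.monotonic() + max_wait
    while len(pending) > 0:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        # Wait until the socket is readable or writable, whichever first.
        readable, writable, _ = select.select([sock], [sock], [], remaining)
        if readable:
            try:
                chunk = sock.recv(65536)
            except BlockingIOError:
                chunk = None
            if chunk == b"":
                break  # peer closed the connection
            if chunk:
                received += chunk
        if writable:
            try:
                n = sock.send(pending)
            except BlockingIOError:
                n = 0  # buffer filled up in the meantime, retry later
            sent_total += n
            pending = pending[n:]
    return sent_total, bytes(received)


if __name__ == "__main__":
    a, b = socket.socketpair()
    b.sendall(b"SIP/2.0 404 Not Found\r\n\r\n")  # peer replies, never reads
    sent, got = pump(a, b"x" * (8 * 1024 * 1024), max_wait=0.3)
    print(sent, got)
    a.close()
    b.close()
```

Even though the peer never drains the data, the reply is still read, so
the receive window on this side does not close.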


Andrei

