[sr-dev] [Fwd: Re: [Kamailio-Users] TCP supervisor process in Kamailio]

Andrei Pelinescu-Onciul andrei at iptel.org
Mon Jul 13 12:42:19 CEST 2009


On Jul 09, 2009 at 18:14, Vadim Lebedev <vadim at mbdsys.com> wrote:
> Andrei Pelinescu-Onciul wrote:
> >
> > In async mode the sending can be done directly by a "worker" (a tcp reader
> > process, a udp or sctp receiver) or by the supervisor.
> > On a fresh connection (no write data queued), a worker will attempt to
> > send directly. If it fails it enters async mode and it will queue the
> > data.
> > On a connection with data already queued (in "async mode"), the worker will
> > directly queue the data (will not attempt sending directly anymore).
> > All the "async" queued data is sent by the supervisor (in the future I
> > might add some write workers if it proves to make a difference in
> > tests). When all the async data was sent, the connection exits "async
> > mode" (the next worker will try again to send directly).
> >
> > Almost all ser-side initiated connections will enter "async" mode when
> > they are first opened (because the connect() takes time and during the
> > connect phase the kernel does not queue any data on the socket, so we
> >  have to do it in ser).
> >
> > In async mode the send never blocks (with the same disclaimers as for
> > the non-blocking read).  If no "real" send happens for tcp_send_timeout
> > (or tcp_connect_timeout if this is a not yet connected connection),
> > the connection will be closed, a failure will be reported and the
> > destination will be blacklisted. Same thing happens if the per
> > connection queued data exceeds tcp_conn_wq_max or the total queued data
> > in ser exceed tcp_wq_max.
> >
> > Note that there are two data queues: one in the kernel (the socket write
> > buffer) and one in ser. Above by queued data I meant the data queued in
> >
> >   
> Andrei,
> 
> Could you please elaborate on SSL-based connections: how are they handled?
> The same as TCP-based ones?
> 

In principle yes. However, async mode is not supported yet for SSL,
so if an SSL write blocks, it will block the whole process.
There is also a problem with reads. On SSL a read might block because it
wants to write data (SSL_ERROR_WANT_WRITE) while the kernel socket send
buffer is full. This can happen during a key renegotiation. This case
is not handled, so a read blocked because it wants to write might
never be woken up (this read-waiting-for-write condition ends only if
the peer sends more data or some ser process tries to send something on
tls; if neither happens, the read stays blocked until the connection
lifetime timeout hits and the connection is closed). Luckily, in
practice key renegotiation is very rare, and even then there is a very
low probability of meeting all the conditions needed to trigger this
bug.
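
To illustrate the "read that wants to write" case in general terms, here
is a minimal sketch using Python's standard ssl module rather than the
ser tls code. The helper name try_tls_io is hypothetical. It only shows
that a non-blocking TLS read can ask the caller to wait for writability,
and that an event loop has to honour that request instead of waiting for
readability alone:

```python
import selectors
import ssl


def try_tls_io(op):
    """Run one non-blocking TLS call (hypothetical helper).

    Returns (result, event), where event is None on success, or the
    selectors event the caller must wait for before retrying op.
    """
    try:
        return op(), None
    except ssl.SSLWantReadError:
        # The TLS layer needs more bytes from the peer.
        return None, selectors.EVENT_READ
    except ssl.SSLWantWriteError:
        # The TLS layer must send bytes first (for example during a
        # renegotiation) and the socket send buffer is full, so a
        # *read* has to wait for the socket to become writable.
        return None, selectors.EVENT_WRITE
```

With a real non-blocking ssl.SSLSocket, op would be something like
lambda: sock.recv(4096), and the returned event would be registered with
a selector before the call is retried.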

Support for making tls async and fixing the read problem is partially
committed; however, this is low priority, so it might take me a while
until I get back to it.
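
For reference, the async send mode described in the quoted text (a
worker tries a direct send on a connection with nothing queued,
otherwise queues; the supervisor drains the queue; per-connection and
global queue limits apply) can be sketched roughly as follows. This is
an illustrative model in Python, not the ser implementation: the class
and function names are hypothetical, send_fn stands in for a
non-blocking socket send, the two limits mirror tcp_conn_wq_max and
tcp_wq_max, and the timeouts and blacklisting are left out.

```python
from collections import deque


class WriteQueueFull(Exception):
    """A queue limit was exceeded; the caller would close the connection."""


class Conn:
    def __init__(self, send_fn):
        # send_fn(data) -> number of bytes accepted (0 if it would block).
        self.send_fn = send_fn
        self.wq = deque()      # per-connection write queue
        self.wq_bytes = 0


class AsyncSender:
    def __init__(self, conn_wq_max, wq_max):
        self.conn_wq_max = conn_wq_max   # models tcp_conn_wq_max
        self.wq_max = wq_max             # models tcp_wq_max
        self.total_queued = 0

    def _enqueue(self, conn, data):
        size = len(data)
        if (conn.wq_bytes + size > self.conn_wq_max
                or self.total_queued + size > self.wq_max):
            raise WriteQueueFull()
        conn.wq.append(data)
        conn.wq_bytes += size
        self.total_queued += size

    def worker_send(self, conn, data):
        """Called by a worker: direct send unless data is already queued."""
        if conn.wq:                      # already in async mode
            self._enqueue(conn, data)
            return
        sent = conn.send_fn(data)
        if sent < len(data):             # partial or no send: enter async mode
            self._enqueue(conn, data[sent:])

    def supervisor_flush(self, conn):
        """Called by the supervisor; returns True once the queue is empty."""
        while conn.wq:
            chunk = conn.wq[0]
            sent = conn.send_fn(chunk)
            conn.wq_bytes -= sent
            self.total_queued -= sent
            if sent < len(chunk):
                conn.wq[0] = chunk[sent:]
                return False             # socket is full again, retry later
            conn.wq.popleft()
        return True                      # connection leaves async mode
```

Once supervisor_flush returns True the queue is empty, so the next
worker_send on that connection attempts a direct send again.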

Another big problem is the licence of the module. The module is GPL
licensed but uses OpenSSL, and since the OpenSSL license adds additional
restrictions we need an OpenSSL exemption granted by all of the (c)
holders. However, one of the (c) holders is the FSF, so we might need to
remove all of that code (I don't think there is much remaining in the
tls modules, but still someone would need to do the checking).


Note that the above problems are common to all *ser versions. The main
differences between sip-router/ser 2.* and other versions are:
- proper locking (vs. relying on luck and few simultaneous connections)
- more workarounds for various openssl bugs
- support for domain config file (certificate and various options on a
  per domain basis), which can be reloaded at runtime
- less work in tcp_main (supervisor) (vs. more work in tcp_main and less
  in the workers)

Andrei


