[sr-dev] Migration of Open IMS Core to sip-router

Dragos Vingarzan dragos.vingarzan at gmail.com
Fri Jul 31 15:32:41 CEST 2009


Andrei Pelinescu-Onciul wrote:
> On Jul 29, 2009 at 14:58, Dragos Vingarzan <dragos.vingarzan at gmail.com> wrote:
>   
>> I see... so it seems quite complicated to add all the required locks and 
>> to redesign the process_no and my_pid() for not much of a benefit. I did 
>> not see this before.
>>
>> Well, if this is final and the conclusion is that the restrictions 
>> should be in place there on dynamically forked processes, then I'll 
>> start redesigning my module. It's not a huge deal, but right now the
>> code is much clearer, easier to manage and also potentially faster if
>> each Diameter TCP connection has its own process.
>>     
>
> We might be able to add some limited dynamic processes support. It
> depends a lot on when you fork and what you want to be able to do
> from the dynamically forked processes.
> For example we could make drop_my_process() only mark an entry in the
> process table as empty and make fork_process() search the table for
> empty entries. This would keep process_no valid, but it won't allow tcp
> use for processes forked after startup (you probably already have this
> problem) and it might have some strange side effects with statistics (a
> dyn. forked process might "inherit" the stats of another terminated dyn.
> forked process).
>   
I don't really want to introduce some nasty hack just for this, but if it
works and the performance is better then maybe it's worth it.

In short the cdp module works like this:
1. Initialize on module init, but don't fork.
2. On module child init, for rank==PROC_MAIN, call fork_process()
multiple times (roughly as in the sketch after this list) for the
following:
 - 1x acceptor, for all the accepting TCP sockets
 - Nx workers, which actually process the incoming Diameter messages
after they've been received. Being ser processes, they can do everything
a ser process can.
 - 1x timer
3. Later on, when connections are established with other Diameter peers,
fork_process() is called from the acceptor process for incoming
connections or from the timer process for outgoing ones:
 - 1x receiver per peer, which receives all incoming messages and passes
them immediately to a task queue for the workers. It also has a named
pipe, where pointers to shm-allocated Diameter messages to be sent out
are signaled. On disconnection, the process is terminated.
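
Just to make the child_init part concrete, step 2 looks roughly like
this. It's only a simplified sketch: error handling is stripped, the
child ids are made up, workers_no stands for the module parameter with
the number of workers, and worker_loop()/timer_loop()/acceptor_loop()
are placeholders for the real cdp functions. I'm assuming the
fork_process(child_id, desc, make_sock) prototype from pt.h, returning
the pid in the parent and 0 in the child:

static int cdp_child_init(int rank)
{
    int k, pid;

    if (rank != PROC_MAIN)
        return 0;

    /* Nx workers - normal ser processes, so they can do everything
     * a ser process can (tm, db, etc.) */
    for (k = 0; k < workers_no; k++) {
        pid = fork_process(1000 + k, "cdp worker", 1);
        if (pid < 0)
            return -1;              /* fork failed */
        if (pid == 0) {
            worker_loop(k);         /* never returns */
            return 0;
        }
    }

    /* 1x timer */
    pid = fork_process(1100, "cdp timer", 1);
    if (pid < 0)
        return -1;
    if (pid == 0) {
        timer_loop();               /* never returns */
        return 0;
    }

    /* 1x acceptor, watching all the listening Diameter TCP sockets */
    pid = fork_process(1200, "cdp acceptor", 1);
    if (pid < 0)
        return -1;
    if (pid == 0) {
        acceptor_loop();            /* never returns */
        return 0;
    }

    return 0;
}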

Generally the receivers are pretty light and don't do too much (the main
loop is sketched right after this list):
- watching the TCP socket
- receiving Diameter messages, doing a quick binary decode and putting
the message in the task queue for the workers
- in case a message is part of the base protocol, running it through a
simple state machine
- watching a named pipe over which messages to be sent out are signaled.
With this in place, any ser process can send out Diameter messages.
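
So conceptually the receiver's main loop is not much more than a
select() over two fds, the peer TCP connection and the named pipe.
Something like this (again just a sketch; AAAMessage is the cdp message
type, and receive_diameter_msg()/is_base_protocol()/
handle_base_protocol()/enqueue_task()/peer_send() are placeholders for
the real cdp code):

#include <sys/select.h>
#include <unistd.h>

static void receiver_loop(int tcp_fd, int pipe_fd)
{
    fd_set rfds;
    AAAMessage *msg;
    int maxfd = (tcp_fd > pipe_fd ? tcp_fd : pipe_fd) + 1;

    for (;;) {
        FD_ZERO(&rfds);
        FD_SET(tcp_fd, &rfds);
        FD_SET(pipe_fd, &rfds);

        if (select(maxfd, &rfds, 0, 0, 0) < 0)
            continue;

        if (FD_ISSET(tcp_fd, &rfds)) {
            /* read + quick binary decode of one Diameter message */
            msg = receive_diameter_msg(tcp_fd);
            if (!msg)
                break;                      /* peer disconnected */
            if (is_base_protocol(msg))
                handle_base_protocol(msg);  /* CER/CEA, DWR/DWA, DPR... */
            else
                enqueue_task(msg);          /* hand over to the workers */
        }

        if (FD_ISSET(pipe_fd, &rfds)) {
            /* another ser process wrote the pointer of a shm-allocated
             * message that should go out on this connection */
            if (read(pipe_fd, &msg, sizeof(msg)) == sizeof(msg))
                peer_send(tcp_fd, msg);
        }
    }
    /* on disconnection the receiver process terminates */
}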

The trouble that we have now is with these receiver processes, which are
forked later from the acceptor or timer and can also terminate on
disconnection, so they are a bit dynamic during the execution time. As
for only marking the process table entries, I don't know if that would
be a clean and safe procedure...


>> But this is not a must and 
>> one universal acceptor/receiver forked at the beginning could do all the 
>> ops, much like the TCP structure from ser, right? Were there any
>> performance issues due to some bottlenecks or something like that?
>>     
>
> There are 2 possibilities:
>  - 1 process that handles all the I/O (based on
>    epoll/kqueue/poll/sigio). This is fast but does not scale well with the
>    number of cpus.
>  - 1 process that only accept()s new connections and then sends them to
>    some workers (similar to ser tcp). This is fast and scales well and
>    doesn't have the disadvantage of running one process or thread for
>    each connection.
> The main disadvantage is much more complex code.
>
>   
The first quick fix would then be to have a single receiver process 
forked at startup, or a combined acceptor/receiver. My initial reasoning 
for forking was that each connection would get a dedicated process, so 
that multiple connections wouldn't be bottlenecked by busy interfaces.

The Diameter connections are also more stable over time than the SIP 
ones (they are always kept alive between peers). Still, I don't know if 
I understand exactly how your 2nd suggestion would work... I already 
have the workers, and I could make a static pool of receivers or reuse 
the workers. But then I would have to pass the descriptors from accept() 
between the two processes. This does not seem very portable across 
different kernels, so how does ser do it? (could you please just point 
me to the lines of code that do the descriptor exchanges?)
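
From what I gathered so far, the portable way (and I'm guessing this is
what ser does internally as well, somewhere in the tcp main/children
communication) is to pass the descriptor as SCM_RIGHTS ancillary data
over an AF_UNIX socket pair, along these lines:

#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

static int send_fd_to(int unix_sock, int fd)
{
    struct msghdr msg;
    struct iovec iov;
    struct cmsghdr *cmsg;
    char byte = 0;
    char cbuf[CMSG_SPACE(sizeof(int))];

    memset(&msg, 0, sizeof(msg));
    iov.iov_base = &byte;               /* must send at least 1 byte */
    iov.iov_len = 1;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = cbuf;
    msg.msg_controllen = sizeof(cbuf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;       /* "I am passing an fd" */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(unix_sock, &msg, 0); /* <0 on error */
}

static int recv_fd_from(int unix_sock)
{
    struct msghdr msg;
    struct iovec iov;
    struct cmsghdr *cmsg;
    char byte;
    char cbuf[CMSG_SPACE(sizeof(int))];
    int fd = -1;

    memset(&msg, 0, sizeof(msg));
    iov.iov_base = &byte;
    iov.iov_len = 1;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = cbuf;
    msg.msg_controllen = sizeof(cbuf);

    if (recvmsg(unix_sock, &msg, 0) <= 0)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg && cmsg->cmsg_level == SOL_SOCKET
             && cmsg->cmsg_type == SCM_RIGHTS)
        memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;                          /* new descriptor in this process */
}

If that's indeed the mechanism, then I could keep a socket pair between
the acceptor and each receiver (or worker) and dispatch the accept()ed
connections that way.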

This wouldn't be that bad in the end, as in a well-provisioned 
environment each peer should anyway have all its possible and trusted 
peers pre-configured. Then I would pre-fork a receiver process at 
startup for each of them, plus one for all the other "unknown" eventual 
peers.


For the close_extra_socks() issue: right now the acceptor/timer opens 
the named pipe before forking, and its fd happened to be the same as a 
unix_sock of another process, so it got closed on fork. I could maybe 
open it after the fork, although that won't be quite as safe and I'll 
need to double-check it... But without the assumption that all forks are 
done only at start-up, this looked a bit like a bug.
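
To make sure we mean the same thing, the workaround I have in mind would
be roughly this (sketch only; the child id and pipe_name handling are
illustrative, and receiver_loop() is the placeholder loop from above):

#include <fcntl.h>

static int fork_receiver(char *pipe_name, int tcp_fd, int peer_id)
{
    int pid, pipe_fd;

    pid = fork_process(1300 + peer_id, "cdp receiver", 1);
    if (pid < 0)
        return -1;                      /* fork failed */
    if (pid == 0) {
        /* child: open the signaling pipe only now, after the fork, so
         * its fd can no longer collide with a unix_sock inherited from
         * the acceptor/timer and get closed by close_extra_socks() */
        pipe_fd = open(pipe_name, O_RDWR);
        if (pipe_fd < 0)
            return -1;
        receiver_loop(tcp_fd, pipe_fd); /* never returns */
    }
    return pid;                         /* parent continues */
}

But as I said, I'd have to double-check that nothing tries to write to
the pipe before the child actually gets around to opening it.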


Cheers,
-Dragos


