Hello,
I have investigated this a bit.
The problem seems to be related to t_pick_branch.
In the case described it returns -2, because it still finds an unfinished branch (last_received<200): the branch that has lost its TCP connection.
I think a first step could be to assign a value >= 200 to last_received for any branch that fails to send out the INVITE.
In t_forward_nonack:

        } else {
            if (SEND_BUFFER( &t->uac[i].request)==0) {
                ser_error = 0;
                break;
            }
            LM_ERR("sending request failed\n");
            ser_error=E_SEND;
        }
        /* get next dns entry */
        if ( t->uac[i].proxy==0 ||
                get_next_su( t->uac[i].proxy, &t->uac[i].request.dst.to,
                    (ser_error==E_IP_BLOCKED)?0:1)!=0 )
            break;

I have added something like this here:

        if (ser_error==E_SEND)
            t->uac[i].last_received=700; // 700 ???

and now parallel forking works.
I don't know if it is enough, but what do you think about this small change? Could it be dangerous in other circumstances?
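
To make the reasoning above easier to check, here is a small stand-alone model of it. This is only a sketch and not the real tm code: toy_branch, toy_pick_branch and toy_mark_send_failed are hypothetical names, and the rule "pick the lowest final status" is a simplifying assumption. The only behaviour taken from the description above is that an unfinished branch (last_received<200) makes the picker return -2.

```c
#include <stddef.h>

/* Toy stand-in for a tm branch; only the field discussed above. */
struct toy_branch {
    int last_received;   /* last SIP status seen on this branch, 0 if none */
};

/*
 * Simplified picker (assumption, not the real t_pick_branch):
 * returns -2 if any branch is still unfinished (last_received < 200),
 * otherwise the index of the branch with the lowest final status,
 * or -1 if there are no branches at all.
 */
static int toy_pick_branch(const struct toy_branch *b, size_t n)
{
    int best = -1;
    int best_code = 999;

    for (size_t i = 0; i < n; i++) {
        if (b[i].last_received < 200)
            return -2;               /* still waiting on this branch */
        if (b[i].last_received < best_code) {
            best = (int)i;
            best_code = b[i].last_received;
        }
    }
    return best;
}

/* Hypothetical helper mirroring the proposed change: a branch whose
 * send failed is marked as finished, using the same 700 as above. */
static void toy_mark_send_failed(struct toy_branch *b)
{
    b->last_received = 700;
}
```

With two branches, one that never got its INVITE out (last_received 0) and one that timed out with 408, toy_pick_branch returns -2 until toy_mark_send_failed is called on the first branch; after that it returns the index of the 408 branch.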
-----Original Message-----
From: Klaus Darilion [mailto:klaus.mailinglists@pernau.at]
Sent: Fri 25/07/2008 12.35
To: Zappasodi Daniele
Cc: users@lists.openser.org
Subject: Re: [OpenSER-Users] Error with parallel forking and TCP
Zappasodi Daniele wrote:
> Hello,
> I have a problem with parallel forking and TCP when one connection isn't
> available and the inv_timeout expires.
> I have two clients registered with the same username and transport=TCP.
> If I make a call to this number when one of them is not reachable, the
> INVITE goes to the reachable phone. If the call timeout
> (fr_inv_timer_avp) expires, Openser sends the CANCEL but doesn't send
> anything to the caller.
> The failure_route isn't hit.
Looks like a bug.
>
> There is also a delay (around 3 seconds) between the t_relay and the
> forwarding of the TCP packet on the network. The TCP packet is forwarded
> to the available connection only after Openser detects the failure of the
> closed TCP connection (the INVITE goes out on the network only after the
> message ERROR:tm:t_forward_nonack: sending request failed).
I suspect the problem is the synchronous, blocking TCP operation. Openser
tries to send to the first contact, then to the second contact. As sending
to the first contact fails (probably with a TCP timeout of 3 seconds - check
the core book documentation for setting TCP timeouts), it takes 3 seconds
until it sends the request on the second connection.
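
The effect can be illustrated with a tiny model. This is only a sketch under the assumption that branches are tried strictly one after another and that a failed send blocks for the full timeout; toy_dest and toy_forward_all are hypothetical names, not Openser functions.

```c
#include <stddef.h>

/* Toy destination: only whether its TCP connection is usable. */
struct toy_dest {
    int reachable;
};

/*
 * Model of sequential, blocking sends. For each destination,
 * sent_at_ms[i] receives the time (in ms since the relay started)
 * at which the request went out, or -1 if the send failed.
 * A failed send is assumed to block for fail_timeout_ms.
 * Returns the total elapsed time in ms.
 */
static long toy_forward_all(const struct toy_dest *d, size_t n,
                            long fail_timeout_ms, long *sent_at_ms)
{
    long now = 0;

    for (size_t i = 0; i < n; i++) {
        if (d[i].reachable) {
            sent_at_ms[i] = now;      /* goes out immediately */
        } else {
            now += fail_timeout_ms;   /* blocked until the timeout */
            sent_at_ms[i] = -1;
        }
    }
    return now;
}
```

With the first contact unreachable and a 3000 ms timeout, the second contact's request goes out at 3000 ms, which matches the delay seen in the trace.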
regards
klaus