[OpenSER-Users] 200 OK retransmissions on missing ACK can cause subsequent calls to fail
Bogdan-Andrei Iancu
bogdan at voice-system.ro
Wed May 7 21:12:50 CEST 2008
Hi Sean,
Yes, t_check() sets T to NULL if no transaction is matched, but
reply_received() (which calls t_check()) will, if T was set to NULL,
jump to the "not_found" label and reset T to T_UNDEFINED.
Do you agree on this? If so, we can start adding some more debug logs
to see where the problem is.
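To make the two cases concrete, here is a minimal stand-alone C sketch.
It is a simplified model, not the actual TM code: model_t_check(),
model_reply_received(), second_reply_matches() and the
reset_on_not_found flag are hypothetical names used only for
illustration, and the sentinel is modelled as the address of a dummy
object. Resetting T after a matched reply is also an assumption of the
model.

```c
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-in for the TM transaction cell (model only). */
struct cell { unsigned int label; };

/* Sentinel meaning "no lookup done yet for the current message".
 * Modelled here as the address of a dummy object. */
static struct cell t_undefined_marker;
#define T_UNDEFINED (&t_undefined_marker)

/* Per-process cached lookup result, like the static T in t_lookup.c. */
static struct cell *T = T_UNDEFINED;

/* Model of t_check(): do a lookup only if T is undefined, otherwise
 * reuse the cached value. 'match' is what the lookup would return
 * (NULL means no matching transaction). */
static int model_t_check(struct cell *match)
{
    if (T == T_UNDEFINED) {
        T = match;
    } else if (T == NULL) {
        printf("t_check: T previously sought and not found\n");
    }
    return T != NULL ? 1 : -1;
}

/* Model of reply_received(): returns 1 if the reply was matched to a
 * transaction, 0 otherwise. reset_on_not_found selects whether the
 * "not_found" path puts T back to T_UNDEFINED. */
static int model_reply_received(struct cell *match, int reset_on_not_found)
{
    if (model_t_check(match) < 0) {
        /* "not_found" path */
        if (reset_on_not_found)
            T = T_UNDEFINED;
        return 0;
    }
    /* Model assumption: a matched reply leaves no stale state behind. */
    T = T_UNDEFINED;
    return 1;
}

/* Scenario: a late 200 retransmission with no transaction, followed by
 * the 200 of a new call handled by the same child process. */
static int second_reply_matches(int reset_on_not_found)
{
    struct cell new_call = { 1u };

    T = T_UNDEFINED;
    (void)model_reply_received(NULL, reset_on_not_found);
    return model_reply_received(&new_call, reset_on_not_found);
}
```

In this model, second_reply_matches(1) returns 1 (T is reset, so the
next reply is looked up normally), while second_reply_matches(0)
returns 0 and prints the "previously sought and not found" line, which
is the behaviour you are reporting.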
Regards,
Bogdan
Sean O'Donnell wrote:
> Hi all,
>
> I’m using openser as a call distributor/proxy between a soft-switch/SBC and a
> voicemail platform. I’m seeing a problem with openser in that it sometimes
> cancels an in-progress call (fr_inv_timer firing) because it didn’t match the
> 200/OK with the call.
>
> After some investigation, I noticed that this was happening after a missing ACK
> on a previous call caused the voicemail platform to retransmit 200/OK responses
> beyond the TM wt_timer expiration, which in turn left several openser child
> processes (those that received a 200 after wt_timer expiration) in a state such
> that they might not properly match transactions on subsequent calls.
>
> My setup:
> I have openser 1.2.0 operating on a Linux box with two network interfaces, with
> one interface (call it the outside interface) taking incoming calls from the
> soft-switch, and the other (inside) connected to the VM platform. I have
> openser configured to use both interfaces (see config below) and the TM wt_timer
> set to 5 seconds (default). As this is a voicemail system, all of the call
> traffic is inbound from the soft-switch. Given the traffic flow, for the most
> part the openser child processes servicing the inside interface are handling
> responses (180,183,200) from the VM platform.
>
> Call scenario:
> When an INVITE arrives from the soft-switch, openser forwards it to the VM
> platform. The VM platform responds with a 180 and then a 200. I've noticed
> several instances where the soft-switch did not respond with an ACK. This
> caused the VM platform to retransmit the 200 several times over a 10-second
> period. These were absorbed correctly by openser for the duration of wt_timer.
> After the timer expired, however, each openser child process that received a
> retransmitted 200 logged something like this:
> 4(2715) DEBUG: t_reply_matching: hash 45870 label 727647196 branch 0
> 4(2715) DEBUG: t_reply_matching: no matching transaction exists
> 4(2715) DEBUG: t_reply_matching: failure to match a transaction
> 4(2715) DEBUG: t_check: end=(nil)
>
> When I look at the TM code, the static variable T in t_lookup.c is now NULL for
> this child process.
>
> On a subsequent inbound call, the INVITE is passed to the VM correctly, and the
> 180 transaction matches (causing the fr_inv_timer to be armed). If the 200 is
> read by child proc 2715, I see:
> 4(2715) DEBUG: t_check: start=(nil)
> 4(2715) DEBUG: t_check: T previously sought and not found
>
> The 200 is forwarded back to the soft-switch, which responds with an ACK. Both
> end-points think the call is up, but since openser never matched the 200 with
> the call, the fr_inv_timer fires and cancels the call. Basically, child proc
> 2715 won’t match any transaction after this unless it happens to process a
> request.
>
> I think this problem is made worse by the fact that I’m using two network
> interfaces, and that the openser children on the inside interface handle (for
> the most part) only responses. This problem was touched on here:
> http://lists.openser.org/pipermail/users/2007-November/014188.html but I
> didn’t see any follow-up. Also, I’ve checked openser 1.2.3 and 1.3.1 for
> fixes, but I don’t think this has been addressed.
>
> I think I have a workaround, namely raising the wt_timer to something like 15
> seconds, but I was wondering if there is any scenario in which leaving T=NULL is
> desirable.
>
> Thanks in advance
> Sean
>
>