[Users] Re: [Devel] "detached" timer
T.R. Missner
trmissner at bandwidth.com
Thu Mar 29 23:22:21 CEST 2007
Is it possible the locked state I am seeing with openser leads to the
"detached" timer?
Since the "detached" timer is a race, it would make sense to see the
race condition after openser locks up and messages buffer up in the stack.
When a bunch of messages are processed all at once by multiple threads
the race condition would occur.
Does this make sense?
Maybe I have been focusing on the wrong place.
Ignoring the "detached" timer, what could cause openser to hang for a
couple of seconds and then clear, every 5 - 10 minutes?
Ideas?
We are seeing this on 3 different production servers.
Thanks
TR
using openser 1.1.1
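
For reference, here is a minimal, single-threaded sketch of the interleaving Bogdan describes below (a timer expiring while something tries to insert it on a timer list again). It is only an illustrative model, not OpenSER's timer.c: the class names, the DETACHED sentinel, and the method signatures are all assumptions made for the example. Only the list numbers (1 = fr_timer, 3 = wt_timer) and the log text come from this thread.

```python
# Illustrative model only; every name here is hypothetical, not OpenSER's API.
DETACHED = object()  # sentinel: the expiry path has already claimed this timer


class Timer:
    def __init__(self):
        # None = never armed, DETACHED = claimed by expiry, else a list id
        self.on_list = None


class TimerLists:
    def __init__(self):
        # 1 = final response timer list, 3 = wait timer list (per the thread)
        self.lists = {1: [], 3: []}
        self.log = []

    def set_timer(self, timer, list_id):
        """Arm (or re-arm) a timer; refuse if expiry already detached it."""
        if timer.on_list is DETACHED:
            self.log.append(
                f'set_timer for {list_id} list called on a "detached" timer -- ignoring'
            )
            return False
        if timer.on_list is not None:
            # Re-arming: take the timer off the list it currently sits on.
            self.lists[timer.on_list].remove(timer)
        self.lists[list_id].append(timer)
        timer.on_list = list_id
        return True

    def expire(self, list_id):
        """Detach every timer on the list so handlers can run on them."""
        expired = self.lists[list_id]
        self.lists[list_id] = []
        for t in expired:
            t.on_list = DETACHED
        return expired


# The losing interleaving: expiry wins, then a late re-arm arrives.
tl = TimerLists()
t = Timer()
first = tl.set_timer(t, 1)   # timer armed on list 1
expired = tl.expire(1)       # timer process detaches it
late = tl.set_timer(t, 1)    # late re-arm is refused and logged
print(tl.log[0])
```

In this model the late set_timer call is simply dropped and nothing blocks, which matches Bogdan's point that the race by itself should not lock anything.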
T.R. Missner wrote:
> Bogdan,
>
> I have been chasing this for days and done lots of debugging.
> using 1.1.1
> While looking at the network trace at the time of these messages ( I
> usually see at least 5 in a row with differing hex values ) I see many
> incoming packets coming into the box and no response from the proxy
> for somewhere between 5 - 10 seconds, then a flood of responses from
> the proxy.
> I can email you a sample pcap file if you like.
> As part of my debugging I forced a 100 reply at the very top of my cfg
> file.
> The forced 100 was not sent during the locked up time leading me to
> believe openser was not processing incoming packets.
> I have now seen this on multiple servers in different locations.
> Likely a particular customer call flow is causing this but I have not
> been able to pin it down to the exact customer. These proxies run
> pretty fast during the day so finding a pattern leading up to this
> issue is difficult. What could I add to the Log output to identify the
> offending sip-callid? Is sip-callid or branch tag or anything similar
> easily accessible in any of the data structs in timer.c?
>
> TR
>
> Bogdan-Andrei Iancu wrote:
>> Hi TR,
>>
>> it is a race between the expire event (from the timer) and inserting
>> again on a timer list.
>> 1 is the final response timer list (fr_timer)
>> 3 is the wait timer list (wt_timer)
>>
>> I would say there is no way this could lead to any kind of lock.
>>
>> what version are you using? what makes you say it locks?
>>
>> regards,
>> bogdan
>>
>> T.R. Missner wrote:
>>> Does anyone know what causes this?
>>>
>>> */set_timer for 1 list called on a "detached" timer -- ignoring /*
>>>
>>> I also see
>>>
>>> */set_timer for 3 list called on a "detached" timer -- ignoring /*
>>>
>>>
>>>
>>> When this happens Openser seems to lock up for 10 seconds or so.
>>>
>>> From searching it appears this is caused by a race but I am not
>>> sure what the race is or why this results in an unresponsive openser
>>> instance for multiple seconds.
>>>
>>> Transaction expiration racing reply?
>>>
>>>
>>> Desperately need to understand how this could be triggered so I can
>>> get customer to adjust system.
>>>
>>> Any way to adjust?
>>>
>>> tried tweaking fr_inv_timer but no joy.
>>>
>>>
>>>
>>> TR
>>> ------------------------------------------------------------------------
>>>
>>>
>>> _______________________________________________
>>> Devel mailing list
>>> Devel at openser.org
>>> http://openser.org/cgi-bin/mailman/listinfo/devel
>>>
>>
>
> _______________________________________________
> Users mailing list
> Users at openser.org
> http://openser.org/cgi-bin/mailman/listinfo/users