[Users] Re: [Devel] "detached" timer
T.R. Missner
trmissner at bandwidth.com
Fri Mar 30 01:35:57 CEST 2007
FYI All
This turned out to be a database write (acc) that was blocking because
of a RAID card problem.
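
For anyone who lands here with the same symptom, here is a rough sketch of why one slow synchronous write looks like a multi-second hang followed by a flood of replies. It is a toy model, not openser code: it assumes a single worker that handles messages strictly in order, and every name and number in it (simulate_worker, slow_write_time, the arrival times) is made up for illustration.

```python
def simulate_worker(arrival_times, service_time, slow_index, slow_write_time):
    """Return the completion time of each message for one in-order worker."""
    clock = 0.0
    completions = []
    for i, arrival in enumerate(arrival_times):
        # The worker cannot start before the packet arrives or before it is free.
        start = max(clock, arrival)
        cost = service_time
        if i == slow_index:
            # Hypothetical accounting write that blocks on slow storage.
            cost += slow_write_time
        clock = start + cost
        completions.append(clock)
    return completions


# One message per second; the second message hits a 5 second blocking write.
arrivals = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
done = simulate_worker(arrivals, service_time=0.01,
                       slow_index=1, slow_write_time=5.0)
delays = [round(d - a, 2) for a, d in zip(arrivals, done)]
print(delays)
```

In this model every message that arrives during the blocked write waits in the receive buffer, and all of them complete within a few hundredths of a second of each other once the write returns. That matches the silence-then-flood pattern in the trace, under the stated single-worker assumption.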
T.R. Missner wrote:
> Is it possible the locked state I am seeing with openser leads to the
> "detached" timer?
> Since the "detached" timer is a race, it would make sense to see the
> race condition after openser locks up and messages buffer up in the
> stack.
> When a bunch of messages are processed all at once by multiple threads
> the race condition would occur.
> Does this make sense?
>
> Maybe I have been focusing on the wrong place.
>
> Ignoring the "detached" timer what could cause openser to hang for a
> couple seconds then clear every 5 - 10 minutes?
>
> Ideas?
>
> We are seeing this on 3 different production servers.
>
> Thanks
>
> TR
>
> using openser1.1.1
>
>
>
> T.R. Missner wrote:
>> Bogdan,
>>
>> I have been chasing this for days and done lots of debugging.
>> using 1.1.1
>> While looking at the network trace at the time of these messages ( I
>> usually see at least 5 in a row with differing hex values ) I see
>> many incoming packets coming into the box and no response from the
>> proxy for somewhere between 5 - 10 seconds, then a flood of responses
>> from the proxy.
>> I can email you a sample pcap file if you like.
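
A small sketch of how the stall windows described above could be pulled out of a trace automatically. It assumes the capture has already been reduced to (timestamp, direction) pairs sorted by time; the function name find_stalls, the 'in'/'out' labels and the 2 second threshold are my own choices for illustration and do not come from any capture tool.

```python
def find_stalls(packets, min_gap=2.0):
    """Find periods where packets arrived but the proxy sent nothing.

    packets: iterable of (timestamp_seconds, direction) sorted by time,
             where direction is 'in' (to the proxy) or 'out' (from it).
    Returns a list of (last_out_time, next_out_time, inbound_count).
    """
    stalls = []
    last_out = None   # time of the most recent outgoing packet
    pending_in = 0    # inbound packets seen since that outgoing packet
    for ts, direction in packets:
        if direction == "in":
            pending_in += 1
            continue
        # Outgoing packet: report the gap only if traffic was waiting.
        if last_out is not None and pending_in > 0 and ts - last_out >= min_gap:
            stalls.append((last_out, ts, pending_in))
        last_out = ts
        pending_in = 0
    return stalls


trace = [(0.0, "in"), (0.1, "out"), (1.0, "in"), (2.0, "in"),
         (3.0, "in"), (7.5, "out"), (7.6, "out")]
stalls = find_stalls(trace)
print(stalls)
```

Each reported tuple gives the start and end of a silent period plus the number of inbound packets that piled up during it, which makes it easier to line the stalls up against the log timestamps.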
>> As part of my debugging I forced a 100 reply at the very top of my
>> cfg file.
>> The forced 100 was not sent during the locked up time leading me to
>> believe openser was not processing incoming packets.
>> I have now seen this on multiple servers in different locations.
>> Likely a particular customer call flow is causing this but I have not
>> been able to pin it down to the exact customer. These proxies run
>> pretty fast during the day so finding a pattern leading up to this
>> issue is difficult. What could I add to the Log output to identify
>> the offending sip-callid? Is sip-callid or branch tag or anything
>> similar easily accessible in any of the data structs in timer.c?
>>
>> TR
>>
>> Bogdan-Andrei Iancu wrote:
>>> Hi TR,
>>>
>>> it is a race between the expire event (from the timer) and inserting
>>> again on a timer list.
>>> 1 is the final response timer list (fr_timer)
>>> 3 is the wait timer list (wt_timer)
>>>
>>> I would say there is no way this could lead to any kind of lock.
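
To make the race described above concrete, here is a minimal single-threaded sketch of the sequence of events. It is not the code from timer.c: the class names, the DETACHED sentinel and the method bodies are assumptions made purely for illustration. Only the wording of the log line is taken from the messages quoted in this thread.

```python
DETACHED = object()  # hypothetical marker: the expiry path has taken this timer


class TimerLink:
    """One timer entry; owner is None, a TimerList, or DETACHED."""

    def __init__(self):
        self.owner = None


class TimerList:
    def __init__(self, list_id):
        self.list_id = list_id
        self.links = []

    def set_timer(self, link):
        """Put link on this list, or return a warning if it was detached."""
        if link.owner is DETACHED:
            # The expire event won the race; the re-insert is dropped.
            return ('set_timer for %d list called on a "detached" timer'
                    ' -- ignoring' % self.list_id)
        if link.owner is not None:
            link.owner.links.remove(link)
        self.links.append(link)
        link.owner = self
        return None

    def expire(self, link):
        """Expiry path: unlink the timer and mark it detached."""
        self.links.remove(link)
        link.owner = DETACHED


fr_list = TimerList(1)            # 1 = final response timer list
link = TimerLink()
first = fr_list.set_timer(link)   # normal arming
fr_list.expire(link)              # expire event fires first
warning = fr_list.set_timer(link) # late re-insert hits the detached timer
print(warning)
```

In this sketch the losing side of the race only returns a warning and does nothing else, so nothing here waits on anything. That would be consistent with the message being a symptom of a backlog rather than the cause of a hang, though the real code would need to be read to confirm it.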
>>>
>>> what version are you using? what makes you say it locks?
>>>
>>> regards,
>>> bogdan
>>>
>>> T.R. Missner wrote:
>>>> Does anyone know what causes this?
>>>>
>>>> */set_timer for 1 list called on a "detached" timer -- ignoring /*
>>>>
>>>> I also see
>>>>
>>>> */set_timer for 3 list called on a "detached" timer -- ignoring /*
>>>>
>>>>
>>>>
>>>> When this happens Openser seems to lock up for 10 seconds or so.
>>>>
>>>> From searching it appears this is caused by a race but I am not
>>>> sure what the race is or why this results in an unresponsive
>>>> openser instance for multiple seconds.
>>>>
>>>> Transaction expiration racing reply?
>>>>
>>>>
>>>> Desperately need to understand how this could be triggered so I can
>>>> get customer to adjust system.
>>>>
>>>> Any way to adjust?
>>>>
>>>> tried tweaking fr_inv_timer but no joy.
>>>>
>>>>
>>>>
>>>> TR
>>>> ------------------------------------------------------------------------
>>>>
>>>>
>>>> _______________________________________________
>>>> Devel mailing list
>>>> Devel at openser.org
>>>> http://openser.org/cgi-bin/mailman/listinfo/devel
>>>>
>>>
>>
>> _______________________________________________
>> Users mailing list
>> Users at openser.org
>> http://openser.org/cgi-bin/mailman/listinfo/users