[Kamailio-Devel] [Kamailio-Users] Kamailio 1.4.2 : Weird Retransmission

Aurelien Grimaud gstelzz at yahoo.fr
Fri Dec 12 10:48:34 CET 2008


Daniel-Constantin Mierla a écrit :
> one question for you, is your system multi-processor?
>
Yes

cat /proc/cpuinfo

processor    : 0
vendor_id    : GenuineIntel
cpu family    : 6
model        : 15
model name    : Intel(R) Xeon(R) CPU            5130  @ 2.00GHz
stepping    : 6
cpu MHz        : 2000.180
cache size    : 4096 KB
physical id    : 0
siblings    : 2
core id        : 0
cpu cores    : 2
fdiv_bug    : no
hlt_bug        : no
f00f_bug    : no
coma_bug    : no
fpu        : yes
fpu_exception    : yes
cpuid level    : 10
wp        : yes
flags        : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe lm 
constant_tsc arch_perfmon pebs bts pni monitor ds_cpl vmx tm2 ssse3 cx16 
xtpr dca lahf_lm
bogomips    : 4002.90
clflush size    : 64

processor    : 1
vendor_id    : GenuineIntel
cpu family    : 6
model        : 15
model name    : Intel(R) Xeon(R) CPU            5130  @ 2.00GHz
stepping    : 6
cpu MHz        : 2000.180
cache size    : 4096 KB
physical id    : 0
siblings    : 2
core id        : 1
cpu cores    : 2
fdiv_bug    : no
hlt_bug        : no
f00f_bug    : no
coma_bug    : no
fpu        : yes
fpu_exception    : yes
cpuid level    : 10
wp        : yes
flags        : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe lm 
constant_tsc arch_perfmon pebs bts pni monitor ds_cpl vmx tm2 ssse3 cx16 
xtpr dca lahf_lm
bogomips    : 4000.14
clflush size    : 64

processor    : 2
vendor_id    : GenuineIntel
cpu family    : 6
model        : 15
model name    : Intel(R) Xeon(R) CPU            5130  @ 2.00GHz
stepping    : 6
cpu MHz        : 2000.180
cache size    : 4096 KB
physical id    : 3
siblings    : 2
core id        : 0
cpu cores    : 2
fdiv_bug    : no
hlt_bug        : no
f00f_bug    : no
coma_bug    : no
fpu        : yes
fpu_exception    : yes
cpuid level    : 10
wp        : yes
flags        : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe lm 
constant_tsc arch_perfmon pebs bts pni monitor ds_cpl vmx tm2 ssse3 cx16 
xtpr dca lahf_lm
bogomips    : 4000.18
clflush size    : 64

processor    : 3
vendor_id    : GenuineIntel
cpu family    : 6
model        : 15
model name    : Intel(R) Xeon(R) CPU            5130  @ 2.00GHz
stepping    : 6
cpu MHz        : 2000.180
cache size    : 4096 KB
physical id    : 3
siblings    : 2
core id        : 1
cpu cores    : 2
fdiv_bug    : no
hlt_bug        : no
f00f_bug    : no
coma_bug    : no
fpu        : yes
fpu_exception    : yes
cpuid level    : 10
wp        : yes
flags        : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe lm 
constant_tsc arch_perfmon pebs bts pni monitor ds_cpl vmx tm2 ssse3 cx16 
xtpr dca lahf_lm
bogomips    : 4000.18
clflush size    : 64


> Cheers,
> Daniel
>
> On 12/12/08 10:30, Aurelien Grimaud wrote:
>> Daniel-Constantin Mierla a écrit :
>>> can you try this new patch? It adds a new debug messages for 
>>> troubleshooting -- you have to revert the previous patch as this one 
>>> includes it. you can send it to me without ngrep, maybe i can catch it.
>> First try, without ngrep or tcpdump.
>> BYE for CALL-ID=3-18225-127.0.0.1 is retransmitted.
>>
>> I'll send the sipp messages but timestamps it provides seems wrong on 
>> messages sent.
>>
>> Aurelien
>>>
>>> Thanks,
>>> Daniel
>>>
>>> On 12/11/08 18:22, Aurelien Grimaud wrote:
>>>> Well, it seems hard to reproduce with ngrep running ..
>>>> It seems like the timing is not the same anymore and ngrep slow 
>>>> down the computer just enough for bug not to trigger.
>>>> I tried many times and the result is that when ngrep is not 
>>>> running, the bug shows up, while wth ngrep it stay hidden.
>>>> I hope last log from sipp were enough.
>>>>
>>>> However, this retransmission should not disturb the remote user and 
>>>> while nasty race should be handled, this is minor for me.
>>>> My real problem, concerns transaction module too and is more 
>>>> troubling.
>>>> This is the one I related in 
>>>> http://lists.kamailio.org/pipermail/users/2008-December/020882.html
>>>>
>>>> I use this timer triggering to fail over another end user and 
>>>> implement serial forking.
>>>> If the timer triggers while it should not, I will initiate a second 
>>>> call to another destination, which is wrong.
>>>>
>>>> Any idea on this one ?
>>>>
>>>> This is while trying to reproduce it that I found those weird 
>>>> retransmission.
>>>> I'll try and trigger this initial bug again.
>>>>
>>>> Regards
>>>>
>>>> Aurelien
>>>>
>>>> Aurelien Grimaud a écrit :
>>>>> Sure, meanwhile here are messages provided by sipp
>>>>> UAS side and UAC side.
>>>>> UAC contains only the 10 calls done
>>>>> UAS contains much more as I was trying to reproduce it with 1 call 
>>>>> at a time.
>>>>>
>>>>> Aurelien
>>>>>
>>>>> Daniel-Constantin Mierla a écrit :
>>>>>> Can you please make the test again and send along with the debug 
>>>>>> messages the sip trace?
>>>>>>
>>>>>> ngrep -d any -qt -W byline port 5060
>>>>>>
>>>>>> I want to check the sip messages as well.
>>>>>>
>>>>>> Thank you,
>>>>>> Daniel
>>>>>>
>>>>>>
>>>>>> On 12/11/08 16:19, Aurelien Grimaud wrote:
>>>>>>> Answer waiting for approval : logs too big !
>>>>>>>
>>>>>>> Here is a lighter one.
>>>>>>>
>>>>>>> My answer was
>>>>>>>> Thanks for the patch.
>>>>>>>>
>>>>>>>> With 1 call at a time, the bug does not trigger anymore.
>>>>>>>> However, with 2 calls at a time it was triggered again on BYE.
>>>>>>>> Attached log is result of my testing.
>>>>>>>> 1 sipp as uac make 10 calls with 2 simultaneous calls allowed.
>>>>>>>>
>>>>>>>> The call callid=7-22285-127.0.0.1 request resending of BYE 
>>>>>>>> message at 14:21:07.563004, though we have a 200 ok on BYE at 
>>>>>>>> 14:21:07.156865 (pid=21493)
>>>>>>>> Bye request (pid=21495) was not finished to be treated at the 
>>>>>>>> time 200 ok was received.
>>>>>>>>
>>>>>>>> This was done with my module active.
>>>>>>>> I'll make new tests without it.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Aurelien 
>>>>>>>
>>>>>>>
>>>>>>> Daniel-Constantin Mierla a écrit :
>>>>>>>> ... disregard the previous patch, please use this one. It was 
>>>>>>>> not the latest. Thanks,
>>>>>>>>
>>>>>>>> Daniel
>>>>>>>>
>>>>>>>>
>>>>>>>> On 12/10/08 23:52, Daniel-Constantin Mierla wrote:
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> On 12/09/08 20:41, Aurelien Grimaud wrote:
>>>>>>>>>> Daniel-Constantin Mierla a écrit :
>>>>>>>>>>  
>>>>>>>>>>> On 12/09/08 18:52, Aurelien Grimaud wrote:
>>>>>>>>>>>  
>>>>>>>>>>>> I am able to reproduce it with 1 call / second without my 
>>>>>>>>>>>> module on BYE requests.
>>>>>>>>>>>> here are traces.
>>>>>>>>>>>>       
>>>>>>>>>>> there is a race (at least), indeed. It happens when there is 
>>>>>>>>>>> fast reply. I am going to send you a patch soon for testing, 
>>>>>>>>>>> you use svn branch 1.4 or the tarball?
>>>>>>>>>>>     
>>>>>>>>>> Great, I use the kamailio-1.4.2-notls tarball.
>>>>>>>>>> But I can test any SVN branch / trunk if you wish.
>>>>>>>>>>   
>>>>>>>>> can you test the attached patch with SVN trunk? Let me know 
>>>>>>>>> the results. Pay attention to see if breaks something else, 
>>>>>>>>> not just if fixes the reported issue. I let there some debug 
>>>>>>>>> messages, to help troubleshooting, if the fix is ok, I'll 
>>>>>>>>> remove them before committing.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Daniel
>>>>>>>>>
>>>>>>>>>> Aurelien
>>>>>>>>>>
>>>>>>>>>>  
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Daniel
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>  
>>>>>>>>>>>> ps: I added the ms on Logs.
>>>>>>>>>>>>
>>>>>>>>>>>> Aurelien
>>>>>>>>>>>>
>>>>>>>>>>>> Daniel-Constantin Mierla a écrit :
>>>>>>>>>>>>  
>>>>>>>>>>>>> On 12/09/08 17:56, Klaus Darilion wrote:
>>>>>>>>>>>>>  
>>>>>>>>>>>>>  
>>>>>>>>>>>>>> Daniel-Constantin Mierla schrieb:
>>>>>>>>>>>>>>      
>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 12/09/08 17:31, Klaus Darilion wrote:
>>>>>>>>>>>>>>>            
>>>>>>>>>>>>>>>> Aurelien Grimaud schrieb:
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>                  
>>>>>>>>>>>>>>>>> Fair enough.
>>>>>>>>>>>>>>>>> If no one already experienced this strange behavior, 
>>>>>>>>>>>>>>>>> it should be my module ...
>>>>>>>>>>>>>>>>> I'll try to make it again without my module.
>>>>>>>>>>>>>>>>>                                       
>>>>>>>>>>>>>>>> See my other email.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>                  
>>>>>>>>>>>>>>>>> However, in the log, after the 200 response, there is 
>>>>>>>>>>>>>>>>> a cleanup_uac_timers: RETR/FR timers reset.
>>>>>>>>>>>>>>>>> So those timers are cleared.
>>>>>>>>>>>>>>>>>                                       
>>>>>>>>>>>>>>>> But the problem is, that the process which handles the 
>>>>>>>>>>>>>>>> INVITE has not finished yet and those (re)SETS the timer.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> @Daniel - Have you investigated the problem?
>>>>>>>>>>>>>>>>                               
>>>>>>>>>>>>>>> so this is the half of the issue reported via:
>>>>>>>>>>>>>>> https://sourceforge.net/tracker/index.php?func=detail&aid=2105813&group_id=139143&atid=743020 
>>>>>>>>>>>>>>>                       
>>>>>>>>>>>>>> yes.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can it be related to other modules which register 
>>>>>>>>>>>>>> callbacks (e.g. pua module or Aurelien's module?
>>>>>>>>>>>>>>                 
>>>>>>>>>>>>> what is the requests/second rate when the issue appears?
>>>>>>>>>>>>>
>>>>>>>>>>>>> At first look, between sending and setting retransmission 
>>>>>>>>>>>>> timer, there is no much processing for the request. The 
>>>>>>>>>>>>> callback executed there is in use by siptrace, are you 
>>>>>>>>>>>>> using this module?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Daniel
>>>>>>>>>>>>>     
>>>>>>>>>>>>>>> This one got lost, but as I started to fix the other 
>>>>>>>>>>>>>>> half (replying using proper mode to do retransmission), 
>>>>>>>>>>>>>>> will investigate this as well ...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>> Daniel
>>>>>>>>>>>>>>>             
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Users mailing list
>>>>>>>>>> Users at lists.kamailio.org
>>>>>>>>>> http://lists.kamailio.org/cgi-bin/mailman/listinfo/users
>>>>>>>>>>
>>>>>>>>>>   
>>>>>>>>>
>>>>>>>>> ------------------------------------------------------------------------ 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Users mailing list
>>>>>>>>> Users at lists.kamailio.org
>>>>>>>>> http://lists.kamailio.org/cgi-bin/mailman/listinfo/users
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------ 
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Devel mailing list
>>>>> Devel at lists.kamailio.org
>>>>> http://lists.kamailio.org/cgi-bin/mailman/listinfo/devel
>>>>>   
>>>>
>>>>
>>>
>>
>





More information about the Devel mailing list