[Kamailio-Users] [Kamailio-Devel] Kamailio 1.4.2 : Weird Retransmission

Aurelien Grimaud gstelzz at yahoo.fr
Fri Dec 12 11:13:27 CET 2008


uname -a
Linux testg5 2.6.25.14-108.fc9.i686 #1 SMP Mon Aug 4 14:08:11 EDT 2008 
i686 i686 i386 GNU/Linux


Daniel-Constantin Mierla a écrit :
>
>
> On 12/12/08 11:36, Daniel-Constantin Mierla wrote:
>> one question for you, is your system multi-processor?
>>   
> actually, please give the specs of your system: number of cpu and 
> architecture, os, kernel, ...
>
> Cheers,
> Daniel
>
>> Cheers,
>> Daniel
>>
>> On 12/12/08 10:30, Aurelien Grimaud wrote:
>>  
>>> Daniel-Constantin Mierla a écrit :
>>>    
>>>> can you try this new patch? It adds a new debug messages for 
>>>> troubleshooting -- you have to revert the previous patch as this 
>>>> one includes it. you can send it to me without ngrep, maybe i can 
>>>> catch it.
>>>>       
>>> First try, without ngrep or tcpdump.
>>> BYE for CALL-ID=3-18225-127.0.0.1 is retransmitted.
>>>
>>> I'll send the sipp messages but timestamps it provides seems wrong 
>>> on messages sent.
>>>
>>> Aurelien
>>>    
>>>> Thanks,
>>>> Daniel
>>>>
>>>> On 12/11/08 18:22, Aurelien Grimaud wrote:
>>>>      
>>>>> Well, it seems hard to reproduce with ngrep running ..
>>>>> It seems like the timing is not the same anymore and ngrep slow 
>>>>> down the computer just enough for bug not to trigger.
>>>>> I tried many times and the result is that when ngrep is not 
>>>>> running, the bug shows up, while wth ngrep it stay hidden.
>>>>> I hope last log from sipp were enough.
>>>>>
>>>>> However, this retransmission should not disturb the remote user 
>>>>> and while nasty race should be handled, this is minor for me.
>>>>> My real problem, concerns transaction module too and is more 
>>>>> troubling.
>>>>> This is the one I related in 
>>>>> http://lists.kamailio.org/pipermail/users/2008-December/020882.html
>>>>>
>>>>> I use this timer triggering to fail over another end user and 
>>>>> implement serial forking.
>>>>> If the timer triggers while it should not, I will initiate a 
>>>>> second call to another destination, which is wrong.
>>>>>
>>>>> Any idea on this one ?
>>>>>
>>>>> This is while trying to reproduce it that I found those weird 
>>>>> retransmission.
>>>>> I'll try and trigger this initial bug again.
>>>>>
>>>>> Regards
>>>>>
>>>>> Aurelien
>>>>>
>>>>> Aurelien Grimaud a écrit :
>>>>>        
>>>>>> Sure, meanwhile here are messages provided by sipp
>>>>>> UAS side and UAC side.
>>>>>> UAC contains only the 10 calls done
>>>>>> UAS contains much more as I was trying to reproduce it with 1 
>>>>>> call at a time.
>>>>>>
>>>>>> Aurelien
>>>>>>
>>>>>> Daniel-Constantin Mierla a écrit :
>>>>>>          
>>>>>>> Can you please make the test again and send along with the debug 
>>>>>>> messages the sip trace?
>>>>>>>
>>>>>>> ngrep -d any -qt -W byline port 5060
>>>>>>>
>>>>>>> I want to check the sip messages as well.
>>>>>>>
>>>>>>> Thank you,
>>>>>>> Daniel
>>>>>>>
>>>>>>>
>>>>>>> On 12/11/08 16:19, Aurelien Grimaud wrote:
>>>>>>>            
>>>>>>>> Answer waiting for approval : logs too big !
>>>>>>>>
>>>>>>>> Here is a lighter one.
>>>>>>>>
>>>>>>>> My answer was
>>>>>>>>              
>>>>>>>>> Thanks for the patch.
>>>>>>>>>
>>>>>>>>> With 1 call at a time, the bug does not trigger anymore.
>>>>>>>>> However, with 2 calls at a time it was triggered again on BYE.
>>>>>>>>> Attached log is result of my testing.
>>>>>>>>> 1 sipp as uac make 10 calls with 2 simultaneous calls allowed.
>>>>>>>>>
>>>>>>>>> The call callid=7-22285-127.0.0.1 request resending of BYE 
>>>>>>>>> message at 14:21:07.563004, though we have a 200 ok on BYE at 
>>>>>>>>> 14:21:07.156865 (pid=21493)
>>>>>>>>> Bye request (pid=21495) was not finished to be treated at the 
>>>>>>>>> time 200 ok was received.
>>>>>>>>>
>>>>>>>>> This was done with my module active.
>>>>>>>>> I'll make new tests without it.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Aurelien                 
>>>>>>>> Daniel-Constantin Mierla a écrit :
>>>>>>>>              
>>>>>>>>> ... disregard the previous patch, please use this one. It was 
>>>>>>>>> not the latest. Thanks,
>>>>>>>>>
>>>>>>>>> Daniel
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 12/10/08 23:52, Daniel-Constantin Mierla wrote:
>>>>>>>>>                
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> On 12/09/08 20:41, Aurelien Grimaud wrote:
>>>>>>>>>>                  
>>>>>>>>>>> Daniel-Constantin Mierla a écrit :
>>>>>>>>>>>  
>>>>>>>>>>>                    
>>>>>>>>>>>> On 12/09/08 18:52, Aurelien Grimaud wrote:
>>>>>>>>>>>>  
>>>>>>>>>>>>                      
>>>>>>>>>>>>> I am able to reproduce it with 1 call / second without my 
>>>>>>>>>>>>> module on BYE requests.
>>>>>>>>>>>>> here are traces.
>>>>>>>>>>>>>                               
>>>>>>>>>>>> there is a race (at least), indeed. It happens when there 
>>>>>>>>>>>> is fast reply. I am going to send you a patch soon for 
>>>>>>>>>>>> testing, you use svn branch 1.4 or the tarball?
>>>>>>>>>>>>                           
>>>>>>>>>>> Great, I use the kamailio-1.4.2-notls tarball.
>>>>>>>>>>> But I can test any SVN branch / trunk if you wish.
>>>>>>>>>>>                       
>>>>>>>>>> can you test the attached patch with SVN trunk? Let me know 
>>>>>>>>>> the results. Pay attention to see if breaks something else, 
>>>>>>>>>> not just if fixes the reported issue. I let there some debug 
>>>>>>>>>> messages, to help troubleshooting, if the fix is ok, I'll 
>>>>>>>>>> remove them before committing.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Daniel
>>>>>>>>>>
>>>>>>>>>>                  
>>>>>>>>>>> Aurelien
>>>>>>>>>>>
>>>>>>>>>>>  
>>>>>>>>>>>                    
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> Daniel
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>  
>>>>>>>>>>>>                      
>>>>>>>>>>>>> ps: I added the ms on Logs.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Aurelien
>>>>>>>>>>>>>
>>>>>>>>>>>>> Daniel-Constantin Mierla a écrit :
>>>>>>>>>>>>>  
>>>>>>>>>>>>>                        
>>>>>>>>>>>>>> On 12/09/08 17:56, Klaus Darilion wrote:
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>                          
>>>>>>>>>>>>>>> Daniel-Constantin Mierla schrieb:
>>>>>>>>>>>>>>>                                  
>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 12/09/08 17:31, Klaus Darilion wrote:
>>>>>>>>>>>>>>>>                                          
>>>>>>>>>>>>>>>>> Aurelien Grimaud schrieb:
>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>                                                  
>>>>>>>>>>>>>>>>>> Fair enough.
>>>>>>>>>>>>>>>>>> If no one already experienced this strange behavior, 
>>>>>>>>>>>>>>>>>> it should be my module ...
>>>>>>>>>>>>>>>>>> I'll try to make it again without my module.
>>>>>>>>>>>>>>>>>>                                       
>>>>>>>>>>>>>>>>>>                                   
>>>>>>>>>>>>>>>>> See my other email.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>                                                  
>>>>>>>>>>>>>>>>>> However, in the log, after the 200 response, there is 
>>>>>>>>>>>>>>>>>> a cleanup_uac_timers: RETR/FR timers reset.
>>>>>>>>>>>>>>>>>> So those timers are cleared.
>>>>>>>>>>>>>>>>>>                                       
>>>>>>>>>>>>>>>>>>                                   
>>>>>>>>>>>>>>>>> But the problem is, that the process which handles the 
>>>>>>>>>>>>>>>>> INVITE has not finished yet and those (re)SETS the timer.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> @Daniel - Have you investigated the problem?
>>>>>>>>>>>>>>>>>                               
>>>>>>>>>>>>>>>>>                                 
>>>>>>>>>>>>>>>> so this is the half of the issue reported via:
>>>>>>>>>>>>>>>> https://sourceforge.net/tracker/index.php?func=detail&aid=2105813&group_id=139143&atid=743020 
>>>>>>>>>>>>>>>>                                                     
>>>>>>>>>>>>>>> yes.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Can it be related to other modules which register 
>>>>>>>>>>>>>>> callbacks (e.g. pua module or Aurelien's module?
>>>>>>>>>>>>>>>                                             
>>>>>>>>>>>>>> what is the requests/second rate when the issue appears?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> At first look, between sending and setting retransmission 
>>>>>>>>>>>>>> timer, there is no much processing for the request. The 
>>>>>>>>>>>>>> callback executed there is in use by siptrace, are you 
>>>>>>>>>>>>>> using this module?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> Daniel
>>>>>>>>>>>>>>                               
>>>>>>>>>>>>>>>> This one got lost, but as I started to fix the other 
>>>>>>>>>>>>>>>> half (replying using proper mode to do retransmission), 
>>>>>>>>>>>>>>>> will investigate this as well ...
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>> Daniel
>>>>>>>>>>>>>>>>                                           
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Users mailing list
>>>>>>>>>>> Users at lists.kamailio.org
>>>>>>>>>>> http://lists.kamailio.org/cgi-bin/mailman/listinfo/users
>>>>>>>>>>>
>>>>>>>>>>>                       
>>>>>>>>>> ------------------------------------------------------------------------ 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Users mailing list
>>>>>>>>>> Users at lists.kamailio.org
>>>>>>>>>> http://lists.kamailio.org/cgi-bin/mailman/listinfo/users
>>>>>>>>>>                   
>>>>>> ------------------------------------------------------------------------ 
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Devel mailing list
>>>>>> Devel at lists.kamailio.org
>>>>>> http://lists.kamailio.org/cgi-bin/mailman/listinfo/devel
>>>>>>             
>>>>>         
>>
>>   
>






More information about the Users mailing list