Hi All,
We are looking to enable nathelper's keepalive_timeout parameter, which removes an AOR's contact if no reply to the keepalive is received from the UA.
The problem we have is that our registrars are separated from our proxies, so nathelper sends its OPTIONS messages to the UA via the proxy. If the UA has lost connectivity, the Kamailio proxy sends a 408 Request Timeout back to the registrar, and the nathelper module treats that as a response to the OPTIONS message it originally sent, so it never actually removes the contact from the location table.
Is there a way to stop the proxy from sending any responses for these specific NAT ping/OPTIONS messages? Or is there a way to tell the nathelper module that a 408 response should be treated as the UA being down, so that the contact is removed from the location table after the timeout interval?
I've read through the nathelper, tm and sl module documentation, but I didn't see anything that would let me achieve either of the solutions mentioned above.
Does anyone have any suggestions or pointers on how we can achieve this with the registrars being separate from the edge proxies?
Any pointers/tips etc. would be greatly appreciated.
Thanks
On Thu, May 17, 2018 at 11:43:36AM +0100, Asgaroth wrote: [OPTIONS and separate proxy/registrar]
Does anyone have any suggestions or pointers on how we can achieve this with the registrars being separate from the edge proxies?
Any pointers/tips etc. would be greatly appreciated.
I noticed the same problem (it mattered most with multiple registrars, where I didn't like the partitioning logic), and my solution was to send my own OPTIONS.
In your favorite scripting language, select all location entries, construct an OPTIONS for each and send it directly to the correct proxy with a preloaded Route (based on the Path headers inserted on the proxy). For the response, I defined a valid reply as any response code other than 408 and below 500.
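As a rough illustration of the approach above, here is a minimal Python sketch. The function names (build_options, parse_status, is_alive) and all addresses are made up for the example; reading the location entries and actually sending the request over the network are left out. It only shows how the stored Path URIs become preloaded Route headers and how the "other than 408 and below 500" rule might be applied to the final response.

```python
import uuid


def build_options(contact_uri, aor_uri, path_uris, from_uri, local_host, local_port):
    """Build an OPTIONS request whose Route set is preloaded from the stored Path.

    All parameters are illustrative; in practice they would come from the
    location entry (contact, AOR, path) and the host running the script.
    """
    branch = "z9hG4bK" + uuid.uuid4().hex  # RFC 3261 magic-cookie branch
    lines = [
        f"OPTIONS {contact_uri} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_host}:{local_port};branch={branch}",
    ]
    # One Route header per Path URI, in the order they were stored.
    lines += [f"Route: <{uri}>" for uri in path_uris]
    lines += [
        "Max-Forwards: 70",
        f"From: <{from_uri}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <{aor_uri}>",
        f"Call-ID: {uuid.uuid4().hex}@{local_host}",
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_status(response_text):
    """Return the numeric status code from a SIP response's status line."""
    status_line = response_text.split("\r\n", 1)[0]
    parts = status_line.split(" ", 2)
    if len(parts) < 2 or not parts[0].startswith("SIP/"):
        raise ValueError(f"not a SIP response: {status_line!r}")
    return int(parts[1])


def is_alive(status_code):
    """Apply the rule from the mail to the final response.

    None means nothing came back at all; 408 and anything >= 500 are
    treated as "UA not reachable".
    """
    if status_code is None:
        return False
    return status_code != 408 and status_code < 500
```

With this rule a 404 or 405 from the UA still counts as alive (the UA answered), while a 408 generated by the proxy does not.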
On 17/05/18 14:23, Daniel Tryba wrote:
I noticed the same problem (it mattered most with multiple registrars, where I didn't like the partitioning logic), and my solution was to send my own OPTIONS.
We have it running with multiple (3) registrars that replicate with dmq usrloc, works perfectly so far for us.
In your favorite scripting language, select all location entries, construct an OPTIONS for each and send it directly to the correct proxy with a preloaded Route (based on the Path headers inserted on the proxy). For the response, I defined a valid reply as any response code other than 408 and below 500.
I'm not sure I'm following your logic here. The registrars with nathelper loaded are already doing what you describe above: they send the OPTIONS keepalives using the path stored by the location/registrar module. The OPTIONS messages are getting to the UAs, and that part is working as intended.
The part we're having an issue with is when we enable the keepalive_timeout module parameter. My understanding is that with this module parameter enabled, if a UA is sent a keepalive message and nathelper does *not* receive a response back from the client after a predefined number of attempts, then nathelper will flush the contact from location, as it deems it "down".
When the OPTIONS message is sent via the correct proxy to a UA that is "down", the proxy retransmits the OPTIONS 3 to 4 times to the UA, and because the UA does not respond (it's down), Kamailio generates a 408 Request Timeout response back to the registrar for that keepalive request. This is expected behaviour for the proxy under normal conditions. In this scenario, however, it causes a problem: the 408 generated by the proxy and sent back to the registrar is interpreted by the nathelper module as a successful response from the UA, so the contact is not removed from the location table even though the request timed out.
Are you saying that you manually remove your contacts from the database using your custom script, based on the responses that come back? We are currently running in memory-only mode, which makes that a little more difficult.
I think it would be nice if nathelper had an option similar to what dispatcher has, in the sense that you could define, via a module parameter, which response codes are deemed a "failure" in contacting the UA. That way, in this scenario, we could say that a 408 timeout should be considered identical to a non-response from the UA.
I can't think of a way to disable sending these 408s from the proxy for these specific types of messages, or whether this is even possible.
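To make the idea concrete, here is a small Python model of the behaviour I mean. It is purely illustrative: KeepaliveTracker and its failure_codes argument are invented for this example and are not nathelper's implementation or an existing module parameter. It just shows why a proxy-generated 408 keeps the contact alive today, and how a configurable set of failure codes would change that.

```python
class KeepaliveTracker:
    """Toy model of keepalive-based contact expiry (hypothetical, not nathelper).

    A contact is dropped when no acceptable reply has been seen for more
    than `timeout` seconds. Replies whose code is in `failure_codes` are
    ignored, i.e. treated the same as no reply at all.
    """

    def __init__(self, timeout, failure_codes=frozenset()):
        self.timeout = timeout
        self.failure_codes = frozenset(failure_codes)
        self.last_ok = {}  # contact -> time of last acceptable reply

    def add_contact(self, contact, now):
        # Registration counts as the first sign of life.
        self.last_ok[contact] = now

    def on_reply(self, contact, code, now):
        # Only refresh the timestamp for replies not configured as failures.
        if contact in self.last_ok and code not in self.failure_codes:
            self.last_ok[contact] = now

    def expire(self, now):
        """Remove and return contacts that have been silent for too long."""
        dead = [c for c, t in self.last_ok.items() if now - t > self.timeout]
        for c in dead:
            del self.last_ok[c]
        return dead


# Any reply counts as alive: the proxy's 408s keep refreshing the contact.
naive = KeepaliveTracker(timeout=60)
# 408 configured as a failure: the same 408s no longer refresh it.
strict = KeepaliveTracker(timeout=60, failure_codes={408})

for tracker in (naive, strict):
    tracker.add_contact("sip:alice@192.0.2.10:5060", now=0)
    for now in (30, 60, 90):
        tracker.on_reply("sip:alice@192.0.2.10:5060", 408, now)
```

After those three 408s, naive.expire(now=100) removes nothing, while strict.expire(now=100) removes the contact, which is the behaviour we are after.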