### Description
When relaying an `INVITE` from a Kamailio proxy to an interconnect, we use DNS SRV records for load balancing and failover. The proxy listens on both a private interface with an RFC 1918 address (e.g. `10.0.0.14`) and a public interface with a public IP address (e.g. `185.0.0.34`).
The first branch (before DNS SRV failover) is working as expected. The message will be relayed from the received socket (`185.0.0.34`) to the interconnect. When this branch results in a timeout, the proxy will try to do DNS SRV failover. This new branch and any subsequent branches will no longer use the initial received socket as source. In our case we see that the private address (`10.0.0.14`) is now being used as source address.
### Troubleshooting
#### Reproduction
DNS SRV:

```
_sip._udp.transit.net. SRV 10 10 5060 transit1.net.
_sip._udp.transit.net. SRV 20 10 5060 transit2.net.
transit1.net. A 185.10.20.30
transit2.net. A 185.10.20.31
```

Kamailio:

```
$du = "sip:transit.net;transport=udp";
xinfo("Relaying [$rm] request: [$ru] with Call-ID [$ci]");
t_set_fr(0, 1000);
if (not t_relay()) {
    sl_reply_error();
}
```

Network flow:

```
12:30:00 INVITE udp:10.0.0.18:5060 => udp:185.0.0.34:5060 (internal request to proxy)
12:30:00 INVITE udp:185.0.0.34:5060 => udp:185.10.20.30:5060 (relaying from proxy to interconnect)
                    ^^^^^^^^^^
(request times out after 1 second, proxy will do a failover to the next endpoint)
12:30:01 INVITE udp:10.0.0.14:5060 => udp:185.10.20.31:5060 (relaying to next interconnect address)
                    ^^^^^^^^^
```
#### Log Messages
Attempt to see where it goes wrong:

```
onsend_route {
    xinfo("[$RAut] [$Rut] [$sas]\n");
    xinfo("$snd(buf)\n");
}
```

```
INFO: [sip:185.0.0.34:5060;transport=udp] [sip:185.0.0.34:5060;transport=udp] [udp:10.0.0.18:5060]
INFO: INVITE sip:+1234567890@transit.net;user=phone SIP/2.0#015#012Record-Route: sip:185.0.0.34;lr;ftag=tDr7m6erX1N3D#015#012Via: SIP/2.0/UDP 10.0.0.14;branch=z9hG4bKafe7.7fb590e263fa44677514193a6a1156ce.1#015#012Via: SIP/2.0/UDP 10.0.0.18;received=10.0.0.18;rport=5060;branch=z9hG4bK6t59a17N60FcB ...
```

So the `Record-Route` seems to be correct, but the topmost `Via` header shows the private IP address. The message is sent from the private IP address as well and never reaches the second address of the interconnect.
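As far as I can tell from the pseudo-variables cookbook, `$RAut`, `$Rut` and `$sas` all describe the socket and source of the *received* request, so they would not change when a later branch leaves from a different interface. A minimal sketch of logging the outgoing side instead, assuming the `$sndfrom(...)` and `$sndto(...)` pseudo-variables behave as documented for `onsend_route` (the log wording is my own):

```
onsend_route {
    # $sndfrom(...) is documented as the local socket the message will be sent from,
    # $sndto(...) as the address it will be sent to.
    xinfo("sending from [$sndfrom(ip):$sndfrom(port)] to [$sndto(ip):$sndto(port)]\n");
}
```

If the failover branches log `10.0.0.14` here while the first branch logs `185.0.0.34`, that would match what is seen on the wire.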
### Possible Solutions
As a workaround, setting `$fs = "udp:185.0.0.34:5060"` in the `onsend_route` seems to be effective.
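For reference, this is roughly what the workaround looks like in the config. It is only a sketch of what is described above, with the public socket hardcoded, so it assumes every outgoing message handled by this block should leave from `185.0.0.34`:

```
onsend_route {
    # Workaround: pin the send socket to the public listen address.
    # Assumption: "udp:185.0.0.34:5060" matches one of the configured listen sockets.
    $fs = "udp:185.0.0.34:5060";
}
```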
### Additional Information
* **Kamailio Version** - output of `kamailio -v`
```
version: kamailio 5.2.3 (x86_64/linux)
flags: STATS: Off, USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144 MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: unknown
compiled with gcc 7.4.0
```
* **Operating System**:
```
Ubuntu 18.04 LTS
Linux proxy4 4.15.0-64-generic #73-Ubuntu SMP Thu Sep 12 13:16:13 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
```
First, you have to set mhomed=1.
However, if you have an IP route from the first network interface to the second target address, then it is not going to help. The kernel will say that the first network interface is still good to use. You have to isolate IP routing between the two networks.
With DNS SRV there is no way to specify which local socket/network interface to use; Kamailio relies on the kernel to say which local IP has a route to the target.
If you cannot isolate the IP routing to get it to work with mhomed=1, the alternative is to use dispatcher, where you can specify the local socket for each destination.
If you want to discuss this further, please use the sr-users mailing list. This doesn't look like an issue in the code.
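A minimal sketch of the relevant global parameters, assuming the two listen addresses from the report above (ports and transport are taken from the network flow; the rest of the config is omitted):

```
# kamailio.cfg, global parameters section
listen=udp:10.0.0.14:5060     # private interface
listen=udp:185.0.0.34:5060    # public interface

# With mhomed enabled, the local socket is chosen based on which
# local address the kernel routes to the destination.
mhomed=1
```

Whether this helps depends on the kernel routing table: on Linux, something like `ip route get 185.10.20.31` should show which source address the kernel would pick for that target.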
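A rough sketch of that dispatcher alternative, assuming the `socket` attribute of the dispatcher destination list as described in the module documentation. The set id `1`, the file path, the route names and the round-robin algorithm `4` are hypothetical choices for illustration, and `tm` and `sl` are assumed to be loaded as in the original config. The destination list could look like this:

```
# /etc/kamailio/dispatcher.list (hypothetical path)
# setid destination flags priority attributes
1 sip:185.10.20.30:5060 0 0 socket=udp:185.0.0.34:5060
1 sip:185.10.20.31:5060 0 0 socket=udp:185.0.0.34:5060
```

And the matching routing logic:

```
loadmodule "dispatcher.so"
modparam("dispatcher", "list_file", "/etc/kamailio/dispatcher.list")
modparam("dispatcher", "flags", 2)    # enable failover support

route[TO_TRANSIT] {
    # Pick a destination from set 1; the send socket comes from its attributes.
    if (!ds_select_dst("1", "4")) {
        sl_send_reply("503", "No destination available");
        exit;
    }
    t_set_fr(0, 1000);
    t_on_failure("TRANSIT_FAILOVER");
    if (!t_relay()) {
        sl_reply_error();
    }
    exit;
}

failure_route[TRANSIT_FAILOVER] {
    # On timeout, try the next destination of the same set.
    if (t_branch_timeout() && ds_next_dst()) {
        t_on_failure("TRANSIT_FAILOVER");
        t_relay();
        exit;
    }
}
```

This replaces DNS SRV based failover with an explicit destination list, so the SRV priorities and weights would have to be mirrored by hand.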
Closed #2152.
Thanks Daniel! For our configuration `mhomed` is not working well. Besides, the documentation states that the incoming socket would be used by default (as long as we don't switch protocol), which is why I reported this as a bug. I would expect consistency between the first and any subsequent branches in a DNS SRV failover scenario.

```
When deactivated, the incoming socket will be used or the first one for a different protocol, disregarding the destination location.
```

Source: https://www.kamailio.org/wiki/cookbooks/5.2.x/core#mhomed
Reopened #2152.
I misunderstood the issue a bit. Please provide the full debug messages (with debug=3 in kamailio.cfg) printed by Kamailio when handling such an INVITE.
No problem and thanks for looking into this. The debug logs can be found here. The IP addresses and hostnames are anonymised (as required by my client). In this scenario, traffic should go to `185.10.20.29` first, and failover to `185.10.20.30` or `185.10.20.31` based on DNS SRV.
What I was seeing with tcpdump:

```
1: 185.0.1.34 => udp:185.10.20.29:5060
2: 10.0.1.14 => udp:185.10.20.30:5060
3: 10.0.1.14 => udp:185.10.20.31:5060
```
https://gist.github.com/ThomasLobker/1bd279853e7b0d60add2fa9016e79f28
Can you share all the config or the relevant parts of your config file related to routing out: branch routes and failure routes? Or maybe you can make a minimal config that can be used to reproduce the issue?
I noticed info messages suggesting that the source address is the right one even for targets where the private address ends up being used:

```
Nov 28 09:33:23 proxy4.ams1.mysipnetwork outbound1[16849]: INFO: Sending [INVITE] message from [udp:185.0.1.34:5060] to [udp:185.10.20.31:5060]
```

The socket in that line is the result of `$RAut` in the `onsend_route` and it shows the expected IP address, but the actual outgoing packet is coming from a private address anyway. The current configuration is huge, so I will create a minimal config to reproduce the issue. I need one or two days to get this done.
Any chance of getting a minimal config soon to try to reproduce the issue?
A patch that might have been the cause was reverted. Since this is an old issue without follow-up, I am going to close it. Reopen it if it is still an issue, providing new details using the latest versions from the git branches.
Closed #2152.