[SR-Users] Path MTU issues over UDP/IPv6

Fri May 20 10:01:34 CEST 2022

Hello Rick,

Thanks for looking into it.

You have already opened an issue for this, which is a good way to keep track of it.

Cheers,

Henning

-- 
Henning Westerholt - https://skalatan.de/blog/
Kamailio services - https://gilawa.com

-----Original Message-----
From: Rick van Rein <rick+kamailio.org at vanrein.org> 
Sent: Thursday, May 19, 2022 12:21 PM
To: Kamailio (SER) - Users Mailing List <sr-users at lists.kamailio.org>
Cc: Richard Fuchs <rfuchs at sipwise.com>; Henning Westerholt <hw at gilawa.com>
Subject: Re: [SR-Users] Path MTU issues over UDP/IPv6

Hello Henning and Richard,

Henning Westerholt pointed me to the relevant code:

> You find the implementation of the MTU handling in the src/core/udp_server.c file. It's just setting the appropriate socket option right now.

I think I found a few bugs, centering around
https://github.com/kamailio/kamailio/blob/master/src/core/udp_server.c#L331-L349

The file clearly shows how the option is processed,

    (pmtu_discovery) ? IP_PMTUDISC_DO : IP_PMTUDISC_DONT

This is IPv4-only, and it looks like a bug that the address family is not checked before the option is set.  Note that Linux defines

    /usr/include/linux/in6.h: #define IPV6_MTU_DISCOVER  23
    /usr/include/linux/in.h:  #define IP_MTU_DISCOVER    10
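
To illustrate what a family check could look like, here is a minimal sketch.  It is not taken from Kamailio; the struct pmtu_opt type and both function names are hypothetical, and it assumes the Linux constants quoted above as exposed through <netinet/in.h>.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper type for this sketch, not a Kamailio structure. */
struct pmtu_opt {
    int level;   /* IPPROTO_IP or IPPROTO_IPV6 */
    int optname; /* IP_MTU_DISCOVER or IPV6_MTU_DISCOVER */
    int value;   /* ..._PMTUDISC_DO or ..._PMTUDISC_DONT */
};

/* Pick the socket option that matches the address family.
 * Returns 0 on success, -1 for a family this sketch does not cover. */
int pmtu_opt_for_family(int family, int pmtu_discovery, struct pmtu_opt *out)
{
    switch (family) {
    case AF_INET:
        out->level = IPPROTO_IP;
        out->optname = IP_MTU_DISCOVER;
        out->value = pmtu_discovery ? IP_PMTUDISC_DO : IP_PMTUDISC_DONT;
        return 0;
    case AF_INET6:
        out->level = IPPROTO_IPV6;
        out->optname = IPV6_MTU_DISCOVER;
        out->value = pmtu_discovery ? IPV6_PMTUDISC_DO : IPV6_PMTUDISC_DONT;
        return 0;
    default:
        return -1;
    }
}

/* Apply the family-specific option to an already created socket. */
int set_pmtu_discovery(int fd, int family, int pmtu_discovery)
{
    struct pmtu_opt o;

    if (pmtu_opt_for_family(family, pmtu_discovery, &o) < 0) {
        errno = EAFNOSUPPORT;
        return -1;
    }
    return setsockopt(fd, o.level, o.optname, &o.value, sizeof(o.value));
}
```

Keeping the selection separate from the setsockopt() call makes the family mapping easy to check without opening a socket.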

In general, Path MTU discovery only applies to connected sockets, and udp_server.c does not use connected sockets.  The IPv4 version sets the DF flag, which made me wonder whether the resulting errors actually get handled at all.
The IP_RECVERR flag described in ip(7) is used and is intended for such connectionless MTU handling.  For IPv6, there is an IPV6_RECVERR,

    /usr/include/linux/in6.h: #define IPV6_RECVERR  25
    /usr/include/linux/in.h:  #define IP_RECVERR    11

The IPV6 variant is absent, which would be another bug.
(FYI, I use an IPv6-only setup, probably why this turns up.)

Since this is the mechanism for handling MTU discovery on unconnected sockets, I read ip(7), which mentions a flag MSG_ERRQUEUE to be used with recvmsg().  I could not find this flag in Kamailio, so I suspect that this handling was never completed after the IP_RECVERR flag was added.

An approach that would always be safe, AFAIK, is to switch to a connected socket for a destination that hits this kind of error and set the lower MTU on that socket, then continue sending.  Connecting over UDP is more or less free, and it avoids relying on another protocol in the peer.
The cost is an extra socket, which is why it may be better to wait for a Path MTU failure before doing this.
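
As a sketch of the connected-socket idea, the following opens a UDP socket, connects it to the peer, and asks the kernel for its current path MTU estimate.  The function names are hypothetical and this is not how Kamailio is structured.  It assumes the Linux behaviour documented in ip(7) and ipv6(7), where IP_MTU and IPV6_MTU are only valid for getsockopt() on a connected socket and IPV6_MTU can also be set.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Connect a fresh UDP socket to peer and report the kernel's known path MTU.
 * Returns the connected fd (the caller closes it), or -1 on failure. */
int connected_udp_path_mtu(const struct sockaddr *peer, socklen_t peerlen,
                           int *mtu_out)
{
    int level, opt, fd;
    socklen_t len = sizeof(*mtu_out);

    if (peer->sa_family == AF_INET) {
        level = IPPROTO_IP;
        opt = IP_MTU;
    } else if (peer->sa_family == AF_INET6) {
        level = IPPROTO_IPV6;
        opt = IPV6_MTU;
    } else {
        errno = EAFNOSUPPORT;
        return -1;
    }

    fd = socket(peer->sa_family, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    /* connect() on UDP sends nothing; it only fixes the destination. */
    if (connect(fd, peer, peerlen) < 0 ||
        getsockopt(fd, level, opt, mtu_out, &len) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Lower the MTU used by a connected IPv6 socket (see ipv6(7), IPV6_MTU).
 * The IPv4 counterpart IP_MTU is read-only, so there is no IPv4 twin. */
int cap_ipv6_mtu(int fd, int mtu)
{
    return setsockopt(fd, IPPROTO_IPV6, IPV6_MTU, &mtu, sizeof(mtu));
}
```

Whether one extra socket per affected peer is acceptable is a design decision; this only shows that the kernel side of it is small.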

Richard Fuchs explained in detail what happens:

> 5. The application wants to send another packet to the same destination
>    (e.g. in Kamailio's case probably a retransmission of the first one,
>    as that packet was never acknowledged).
> 6. The application does exactly the same thing as in step 1.
> 7. The kernel now knows about the smaller PMTU to that packet's
>    destination and will therefore fragment the packet appropriately
>    before sending the fragments out.

These last steps, however, only apply to a _connected_ UDP socket.  I searched for that in the given file but did not find it.

I suppose there are also problems with Linux's dual use of the MTU as an implied MRU: it means that you cannot be conservative in what you send and liberal in what you accept, which would have been a useful OS-level strategy.  In lieu of that, I suppose it is an application problem :'-(

This in general feels like it is outside my reach.  I can understand it, but cannot fix it.  Have I hereby submitted a bug, or is an issue on GitHub the proper path?

Thanks,

Rick van Rein