[sr-dev] [kamailio/kamailio] Path MTU handling - suggested solution for IPv6 (Issue #3119)

vanrein notifications at github.com
Sat Jun 11 09:48:24 CEST 2022


I got one thing wrong, and correcting it saves a lot of work.  This is from the [experimental code](https://gitlab.com/0cpm/mtugames/-/blob/master/submit-mtu-hilo.c):

```
/*
 * Confusingly, ip(7) states
 *
 * IP_MTU (since Linux 2.2)
 *    Retrieve the current known path MTU of the current socket.
 *    Returns an integer.  IP_MTU  is valid only for getsockopt(2) and
 *    can be employed only when the socket has been connected.
 *
 * Similarly, ipv6(7) states
 *
 * IPV6_MTU
 *     getsockopt(): Retrieve the current known path MTU of the current
 *     socket.  Valid only when the socket has been connected.  Returns
 *     an integer.
 *     
 *     setsockopt():  Set  the  MTU to be used for the socket.  The MTU
 *     is limited by the device MTU or the path MTU when path MTU
 *     discovery is enabled.  Argument is a pointer to integer.
 *
 * This suggests that IP_MTU is a socket property.  However, it makes
 * more sense as a shared global property, which indeed seems to apply:
 *
 * The ipv6(7) entry for IPV6_MTU_DISCOVER references IP_MTU_DISCOVER;
 * the ip(7) entry for IP_MTU_DISCOVER states
 *
 * IP_MTU_DISCOVER (since Linux 2.2)
 *    When PMTU discovery is enabled, the kernel automatically keeps track
 *    of the path MTU  per destination host.  When it is connected to a
 *    specific peer with connect(2), the currently known path MTU can be
 *    retrieved conveniently using the IP_MTU socket option (e.g.,  after
 *    an  EMSGSIZE  error  occurred).   The  path MTU may change over time.
 *    For connectionless sockets with many destinations, the new MTU for a
 *    given destination can also be  accessed using  the  error  queue (see
 *    IP_RECVERR).  A new error will be queued for every incoming MTU update.
 *
 *    While MTU discovery is in progress, initial packets from datagram
 *    sockets may be dropped.  Applications  using  UDP  should  be aware
 *    of this and not take it into account for their packet retransmit strategy.
 *
 * Retransmission is common in UDP applications.  Ideally, IP_RECVERR or
 * IPV6_RECVERR is used to resend immediately, without waiting for timers
 * to expire, and without limiting the number of Path MTU lessons learnt
 * to the number of timer rounds.
 *
 * For IPv6, where fragmentation is required to accommodate the Path MTU,
 * and for unconnected applications, the lessons from Path MTU discovery
 * have a major impact on behaviour; we should always let the socket
 * fragment frames when so desired, so:
 *
 * IP_MTU_DISCOVER (since Linux 2.2)
 *    IP_PMTUDISC_WANT will fragment a datagram if needed according to the
 *    path MTU, [IPv4-only: or will set the don't-fragment flag otherwise].
 *
 *    Path MTU discovery value   Meaning
 *    IP_PMTUDISC_WANT           Use per-route settings.
 *    IP_PMTUDISC_DONT           Never do Path MTU Discovery.
 *    IP_PMTUDISC_DO             Always do Path MTU Discovery.
 *    IP_PMTUDISC_PROBE          Set DF but ignore Path MTU.
 *
 */
```
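
To make the `IPV6_MTU` part of the quote concrete, here is a minimal sketch, assuming Linux, of reading the kernel's current Path MTU for one destination through a connected UDP socket. The function name `query_path_mtu6` is made up for this illustration; it is not taken from the experimental code or from Kamailio.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: return the Path MTU the kernel currently knows
 * for the textual IPv6 address addr6, or -1 on any error.
 * Linux-specific, because it relies on the IPV6_MTU socket option. */
int query_path_mtu6(const char *addr6, unsigned short port)
{
    struct sockaddr_in6 dst;
    int mtu = -1;
    socklen_t len = sizeof mtu;

    memset(&dst, 0, sizeof dst);
    dst.sin6_family = AF_INET6;
    dst.sin6_port = htons(port);
    if (inet_pton(AF_INET6, addr6, &dst.sin6_addr) != 1)
        return -1;                      /* not a valid IPv6 address */

    int fd = socket(AF_INET6, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    /* connect() on a UDP socket sends nothing; it only fixes the peer,
     * which is what IPV6_MTU needs before it can be read. */
    if (connect(fd, (struct sockaddr *)&dst, sizeof dst) < 0 ||
        getsockopt(fd, IPPROTO_IPV6, IPV6_MTU, &mtu, &len) < 0)
        mtu = -1;

    close(fd);
    return mtu;
}
```

A call such as `query_path_mtu6("2001:db8::1", 5060)` (a documentation address, so expect `-1` unless a route exists) only reports what the kernel has learnt so far; it does not probe the path by itself.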

*I'm documenting it here so that the knowledge is not lost to the project.  This is difficult stuff.*

It would seem that Path MTU knowledge is not maintained per socket (which would benefit locality and proper cleanup of the knowledge) but as a global kernel property of the route (which benefits reuse of the knowledge; IOW, a useful form of caching).

## Conclusions for Kamailio on IPv6

 1. The idea to [set different MTU values for two sockets](https://gitlab.com/0cpm/mtugames/-/blob/master/submit-mtu-hilo.c) failed for unconnected sockets.  And to have multiple MTUs you need unconnected sockets.
 2. This means that the idea of a secondary socket is not going to work in Kamailio either.
 3. It does seem to be true that the kernel keeps track of Path MTU *if asked*.
 4. For IPv6, not learning from Path MTU feedback (ICMPv6 *Packet Too Big*) always leads to the same effect: once a frame is dropped, it is *always lost, regardless of resends*.  Kamailio then comes across as unstable, especially because SIP message sizes vary, so some things work while others fail.
 5. Enabling Path MTU discovery for IPv6 *never causes packet drops*; it only gives the kernel a reason to fragment, which is at most an efficiency issue.  Note that IPv6 has no "Don't Fragment" option; the equivalent behaviour is always active.
 6. This means that enabling Path MTU discovery for IPv6 *can only add value*.  Even if the setting could also be made via `sysctl()`, Kamailio stability demands it for IPv6, AFAIK.
 7. Path MTU discovery for IPv4 continues to be an option and a matter of taste, unlike for IPv6.
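
As an illustration of conclusions 5 and 6, this is a minimal sketch, assuming Linux, of an unconnected IPv6 UDP socket that asks the kernel to fragment according to the learnt Path MTU. The helper names `open_udp6_pmtu_want` and `get_pmtu_mode6` are invented for this example; how Kamailio would wire such a setting into its socket setup is not something this sketch claims to know.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open an unconnected IPv6 UDP socket and select
 * IPV6_PMTUDISC_WANT, so datagrams are fragmented to fit the Path MTU
 * that the kernel tracks per destination.  Returns the fd, or -1. */
int open_udp6_pmtu_want(void)
{
    int fd = socket(AF_INET6, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    int mode = IPV6_PMTUDISC_WANT;
    if (setsockopt(fd, IPPROTO_IPV6, IPV6_MTU_DISCOVER,
                   &mode, sizeof mode) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: read back the discovery mode of a socket,
 * or return -1 if the option cannot be read. */
int get_pmtu_mode6(int fd)
{
    int mode = -1;
    socklen_t len = sizeof mode;

    if (getsockopt(fd, IPPROTO_IPV6, IPV6_MTU_DISCOVER, &mode, &len) < 0)
        return -1;
    return mode;
}
```

The socket stays unconnected, so one socket can still serve many destinations, each with its own Path MTU in the kernel's cache.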

## Perfection for Kamailio over IPv6

 1. The first packet sent to an IPv6 host may be dropped and answered with an ICMPv6 *Packet Too Big* message.  This may happen again whenever the kernel discards its Path MTU knowledge.  Some SIP processing is an hour apart, which gives the kernel time to discard that knowledge.
 2. Use of `IPV6_RECVERR` enables immediate resending, with improved Path MTU knowledge.  This involves an extra polling mechanism and links into the `tm` logic, both of which are beyond my reach.  For `sl` replies there will probably be a 2nd round if Path MTU problems arise, because the reply was sent-then-forgotten and needs to wait for the request to be retransmitted.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/kamailio/kamailio/issues/3119#issuecomment-1152876570