.. SPDX-License-Identifier: GPL-2.0

=====================================================
Netdev features mess and how to get out from it alive
=====================================================

Author:
	Michał Mirosław <mirq-linux@rere.qmqm.pl>



Part I: Feature sets
====================

Long gone are the days when a network card would just take and give packets
verbatim.  Today's devices add multiple features and bugs (read: offloads)
that relieve an OS of various tasks like generating and checking checksums,
splitting packets, and classifying them.  Those capabilities and their state
are commonly referred to as netdev features in the Linux kernel world.

There are currently three sets of features relevant to the driver, and
one used internally by the network core:

 1. The netdev->hw_features set contains features whose state may be
    changed (enabled or disabled) for a particular device at the user's
    request.  This set should be initialized in the ndo_init callback and
    not changed later.

 2. The netdev->features set contains features which are currently enabled
    for a device.  This should be changed only by the network core or in
    error paths of the ndo_set_features callback.

 3. The netdev->vlan_features set contains features whose state is inherited
    by child VLAN devices (limits the netdev->features set).  This is
    currently used for all VLAN devices whether tags are stripped or
    inserted in hardware or software.

 4. The netdev->wanted_features set contains the feature set requested by
    the user.  This set is filtered by the ndo_fix_features callback
    whenever it or some device-specific conditions change.  This set is
    internal to the networking core and should not be referenced in drivers.

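As a rough illustration of how the four sets relate, the following userspace
sketch models them as plain bitmasks.  The struct, the EX_F_* bits and both
helpers are hypothetical stand-ins, not kernel definitions, and the VLAN
helper only captures the "limits netdev->features" rule stated above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for feature bits; the real definitions live in
 * include/linux/netdev_features.h. */
typedef uint64_t ex_features_t;
#define EX_F_SG      (1ULL << 0)
#define EX_F_IP_CSUM (1ULL << 1)
#define EX_F_TSO     (1ULL << 2)
#define EX_F_RXCSUM  (1ULL << 3)

/* Simplified model of the four per-device feature sets. */
struct ex_netdev {
	ex_features_t hw_features;     /* what the user may toggle */
	ex_features_t features;        /* what is enabled right now */
	ex_features_t vlan_features;   /* what child VLAN devices may inherit */
	ex_features_t wanted_features; /* what the user asked for (core only) */
};

/* A feature is user-changeable only if it is in hw_features. */
bool ex_user_can_toggle(const struct ex_netdev *dev, ex_features_t bit)
{
	return (dev->hw_features & bit) != 0;
}

/* vlan_features limits the parent's enabled set for a child VLAN device. */
ex_features_t ex_vlan_child_features(const struct ex_netdev *dev)
{
	return dev->features & dev->vlan_features;
}
```

With hw_features = SG | TSO, a feature such as RXCSUM can be enabled in
features yet remain outside the user's control.
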


Part II: Controlling enabled features
=====================================

When the current feature set (netdev->features) is to be changed, a new set
is calculated and filtered by calling the ndo_fix_features callback
and netdev_fix_features().  If the resulting set differs from the current
set, it is passed to the ndo_set_features callback and (if the callback
returns success) replaces the value stored in netdev->features.
A NETDEV_FEAT_CHANGE notification is issued after that whenever the current
set might have changed.

The following events trigger recalculation:

 1. device's registration, after ndo_init returned success
 2. user requested changes in features state
 3. netdev_update_features() is called

The ndo_*_features callbacks are called with rtnl_lock held.  Missing
callbacks are treated as always returning success.

A driver that wants to trigger recalculation must do so by calling
netdev_update_features() while holding rtnl_lock.  This should not be done
from the ndo_*_features callbacks.  netdev->features should not be modified
by the driver except by means of the ndo_fix_features callback.

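The recalculation sequence described in this part can be modelled in
userspace roughly as follows.  This is a simplified sketch, not the core's
code: the types and ex_* names are made up, the starting value is assumed to
be the user's request plus whatever enabled bits the user cannot toggle, and
the core-side filter is a placeholder that imposes no limits.

```c
#include <stdint.h>

typedef uint64_t ex_features_t;

struct ex_netdev;

/* Hypothetical callback table standing in for the ndo_*_features hooks. */
struct ex_ops {
	ex_features_t (*fix_features)(struct ex_netdev *dev, ex_features_t f);
	int (*set_features)(struct ex_netdev *dev, ex_features_t f);
};

struct ex_netdev {
	ex_features_t hw_features;
	ex_features_t features;
	ex_features_t wanted_features;
	const struct ex_ops *ops;
};

/* Placeholder for the core-imposed limits; this model imposes none. */
ex_features_t ex_core_fix_features(ex_features_t f)
{
	return f;
}

/*
 * Recalculate the enabled set.  Returns 1 if dev->features might have
 * changed (a notification would follow), 0 if nothing was attempted.
 */
int ex_update_features(struct ex_netdev *dev)
{
	/* Start from the user's request, keeping bits the user cannot toggle. */
	ex_features_t f = (dev->features & ~dev->hw_features) |
			  dev->wanted_features;
	int err = 0;

	if (dev->ops && dev->ops->fix_features)
		f = dev->ops->fix_features(dev, f);
	f = ex_core_fix_features(f);

	if (f == dev->features)
		return 0;          /* same set: callback is not invoked */

	/* A missing callback counts as success. */
	if (dev->ops && dev->ops->set_features)
		err = dev->ops->set_features(dev, f);
	if (err == 0)
		dev->features = f; /* success: adopt the new set */

	return 1;                  /* attempted: state might have changed */
}
```

In a driver the corresponding entry point is netdev_update_features(), called
with rtnl_lock held as described above.
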


Part III: Implementation hints
==============================

 * ndo_fix_features:

All dependencies between features should be resolved here.  The resulting
set can be reduced further by limitations imposed by the networking core (as
coded in netdev_fix_features()).  For this reason it is safer to disable a
feature when its dependencies are not met instead of forcing the dependency
on.

This callback should modify neither hardware nor driver state (it should be
stateless).  It can be called multiple times between successive
ndo_set_features calls.

The callback must not alter features contained in the NETIF_F_SOFT_FEATURES
or NETIF_F_NEVER_CHANGE sets.  The exception is NETIF_F_VLAN_CHALLENGED, but
care must be taken as the change won't affect already configured VLANs.

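A minimal userspace sketch of the dependency rule above, assuming a made-up
device on which TSO needs both scatter-gather and checksum offload.  The
EX_F_* bits and the function name are hypothetical; a real callback would
have the ndo_fix_features signature and use the NETIF_F_* definitions.

```c
#include <stdint.h>

typedef uint64_t ex_features_t;
#define EX_F_SG      (1ULL << 0)
#define EX_F_HW_CSUM (1ULL << 1)
#define EX_F_TSO     (1ULL << 2)

/*
 * Stateless: the result depends only on the requested set, so calling it
 * repeatedly is harmless.  Hypothetical device rule: TSO needs both
 * scatter-gather and checksum offload.
 */
ex_features_t ex_fix_features(ex_features_t requested)
{
	const ex_features_t tso_deps = EX_F_SG | EX_F_HW_CSUM;

	/* Drop the dependent feature rather than forcing its dependencies on. */
	if ((requested & EX_F_TSO) && (requested & tso_deps) != tso_deps)
		requested &= ~EX_F_TSO;

	return requested;
}
```

Because the function only ever clears bits, a later reduction by the core
cannot leave the result with an unmet dependency that this function
introduced.
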
 * ndo_set_features:

Hardware should be reconfigured to match the passed feature set.  The set
should not be altered unless some error condition happens that can't
be reliably detected in ndo_fix_features.  In this case, the callback
should update netdev->features to match the resulting hardware state.
Errors returned are not (and cannot be) propagated anywhere except dmesg.
(Note: a successful return is zero; >0 means a silent error.)

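The error path above might look like this in a userspace model.  The device
struct, the fault flag and the EX_F_* bits are all hypothetical; the point is
only the return-value convention and who writes the features field in each
case.

```c
#include <stdint.h>

typedef uint64_t ex_features_t;
#define EX_F_RXCSUM (1ULL << 0)
#define EX_F_TSO    (1ULL << 1)

struct ex_netdev {
	ex_features_t features;
	int tso_engine_broken;   /* hypothetical fault seen only at runtime */
};

/*
 * Returns 0 on success.  Returns >0 (silent error) after recording in
 * dev->features what the hardware actually ended up with.
 */
int ex_set_features(struct ex_netdev *dev, ex_features_t requested)
{
	ex_features_t applied = requested;

	if ((requested & EX_F_TSO) && dev->tso_engine_broken)
		applied &= ~EX_F_TSO;   /* hardware refused this one */

	if (applied != requested) {
		dev->features = applied;
		return 1;
	}
	return 0;   /* on success the caller stores the requested set */
}
```

On the success path the callback leaves the features field alone, since the
caller replaces it with the requested set.
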


Part IV: Features
=================

For the current list of features, see include/linux/netdev_features.h.
This section describes the semantics of some of them.

 * Transmit checksumming

For a complete description, see the comments near the top of
include/linux/skbuff.h.

Note: NETIF_F_HW_CSUM is a superset of NETIF_F_IP_CSUM + NETIF_F_IPV6_CSUM.
It means that the device can fill in a TCP/UDP-like checksum anywhere in the
packet, whatever headers there might be.

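To make "TCP/UDP-like checksum anywhere in the packet" concrete, here is a
userspace sketch of what such a fill amounts to.  It assumes the checksum is
the 16-bit one's-complement sum used by TCP and UDP, taken from a start
position to the end of the packet and stored at an offset from that start.
The function names and parameters are hypothetical, and the pseudo-header
seed a real stack would place in the field beforehand is left at zero.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit one's-complement sum of buf[0..len), folded and inverted. */
uint16_t ex_csum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (i < len)
		sum += (uint32_t)buf[i] << 8;   /* odd trailing byte, zero-padded */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/*
 * Checksum everything from 'start' to the end of the packet and store the
 * result 'offset' bytes after 'start', in network byte order.  Nothing
 * before 'start' is inspected, so any headers may precede it.
 */
void ex_fill_csum(uint8_t *pkt, size_t len, size_t start, size_t offset)
{
	uint16_t c = ex_csum(pkt + start, len - start);

	pkt[start + offset] = (uint8_t)(c >> 8);
	pkt[start + offset + 1] = (uint8_t)(c & 0xff);
}
```

A receiver can verify the result by running the same sum over the region
with the checksum in place; a correct packet yields zero.
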
 * Transmit TCP segmentation offload

NETIF_F_TSO_ECN means that hardware can properly split packets with the CWR
bit set, be it TCPv4 (when NETIF_F_TSO is enabled) or TCPv6 (NETIF_F_TSO6).

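One way to read "properly" here, stated as an assumption of this sketch
rather than something the text above spells out: when a packet with CWR set
is split, CWR stays on the first resulting segment only.  The helper below
is a hypothetical userspace model of that rule and ignores all other flags.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_TCP_FLAG_CWR 0x80   /* CWR bit in the TCP flags byte */

/*
 * Flags for segment 'index' of a split packet whose original flags byte
 * is 'flags'.  Assumption of this sketch: CWR survives only on the first
 * segment.
 */
uint8_t ex_segment_flags(uint8_t flags, size_t index)
{
	if (index > 0)
		flags &= (uint8_t)~EX_TCP_FLAG_CWR;
	return flags;
}
```
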
 * Transmit UDP segmentation offload

NETIF_F_GSO_UDP_L4 accepts a single UDP header with a payload that exceeds
gso_size.  On segmentation, it splits the payload on gso_size boundaries and
replicates the network and UDP headers (fixing up the last one if its
payload is shorter than gso_size).

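The arithmetic behind this split can be sketched in userspace as follows.
The helper names are hypothetical; the only outside fact used is that a UDP
header is 8 bytes and its length field covers header plus payload.

```c
#include <stddef.h>

#define EX_UDP_HDR_LEN 8u

/* Number of segments produced when splitting on gso_size boundaries. */
size_t ex_udp_gso_segs(size_t payload_len, size_t gso_size)
{
	if (gso_size == 0)
		return 0;   /* invalid input */
	return (payload_len + gso_size - 1) / gso_size;
}

/* Payload bytes carried by segment 'index' (0-based). */
size_t ex_udp_gso_seg_len(size_t payload_len, size_t gso_size, size_t index)
{
	size_t segs = ex_udp_gso_segs(payload_len, gso_size);

	if (index >= segs)
		return 0;
	if (index + 1 < segs)
		return gso_size;                    /* every full segment */
	return payload_len - index * gso_size;      /* the last one */
}

/* Value of the UDP length field in a segment's replicated header. */
size_t ex_udp_len_field(size_t seg_payload_len)
{
	return EX_UDP_HDR_LEN + seg_payload_len;
}
```

For a 3000-byte payload and a gso_size of 1400, this gives three segments of
1400, 1400 and 200 bytes; only the last header needs a different length.
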
 * Transmit DMA from high memory

On platforms where this is relevant, NETIF_F_HIGHDMA signals that
ndo_start_xmit can handle skbs with frags in high memory.

 * Transmit scatter-gather

These features say that ndo_start_xmit can handle fragmented skbs:
NETIF_F_SG --- paged skbs (skb_shinfo()->frags), NETIF_F_FRAGLIST ---
chained skbs (skb->next/prev list).

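The practical consequence of the two scatter-gather flags is whether a
fragmented packet has to be copied into one linear buffer before it reaches
the driver.  The userspace sketch below models that decision; the packet
summary struct, the EX_F_* bits and the function name are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t ex_features_t;
#define EX_F_SG       (1ULL << 0)
#define EX_F_FRAGLIST (1ULL << 1)

/* Hypothetical summary of a packet's layout. */
struct ex_pkt {
	unsigned int nr_frags;   /* paged fragments */
	bool has_frag_list;      /* chained buffers */
};

/* True if the packet must be linearized before transmit. */
bool ex_needs_linearize(const struct ex_pkt *p, ex_features_t features)
{
	if (p->has_frag_list && !(features & EX_F_FRAGLIST))
		return true;
	if (p->nr_frags > 0 && !(features & EX_F_SG))
		return true;
	return false;
}
```

The two checks are independent: a device with only the SG bit still needs
chained packets flattened.
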
 * Software features

Features contained in NETIF_F_SOFT_FEATURES are features of the networking
stack.  Drivers should not change behaviour based on them.

 * LLTX driver (deprecated for hardware drivers)

NETIF_F_LLTX is meant to be used by drivers that don't need locking at all,
e.g. software tunnels.

This is also used in a few legacy drivers that implement their
own locking; don't use it for new (hardware) drivers.

 * netns-local device

NETIF_F_NETNS_LOCAL is set for devices that are not allowed to move between
network namespaces (e.g. loopback).

Don't use it in drivers.

 * VLAN challenged

NETIF_F_VLAN_CHALLENGED should be set for devices which can't cope with VLAN
headers.  Some drivers set this because the cards can't handle the bigger MTU.
[FIXME: Those cases could be fixed in VLAN code by allowing only reduced-MTU
VLANs.  This may not be useful, though.]

 * rx-fcs

This requests that the NIC append the Ethernet Frame Check Sequence (FCS)
to the end of the skb data.  This allows sniffers and other tools to
read the CRC recorded by the NIC on receipt of the packet.

 * rx-all

This requests that the NIC receive all possible frames, including errored
frames (such as bad FCS, etc).  This can be helpful when sniffing a link with
bad packets on it.  Some NICs may receive more packets if also put into normal
PROMISC mode.

 * rx-gro-hw

This requests that the NIC enable Hardware GRO (generic receive offload).
Hardware GRO is basically the exact reverse of TSO, and is generally
stricter than Hardware LRO.  A packet stream merged by Hardware GRO must
be re-segmentable by GSO or TSO back to the exact original packet stream.
Hardware GRO is dependent on RXCSUM since every packet successfully merged
by hardware must also have the checksum verified by hardware.

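The re-segmentation requirement implies at least a constraint on payload
lengths, which the following userspace sketch checks.  It is a necessary
condition only (headers would have to match as well), and the function name
is hypothetical: splitting the merged payload at the first segment's length
must reproduce the original sequence of lengths.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Necessary (not sufficient) condition on payload lengths: splitting the
 * merged payload at lens[0] boundaries must give back exactly lens[].
 */
bool ex_lengths_resegmentable(const size_t *lens, size_t n)
{
	size_t i;

	if (n == 0 || lens[0] == 0)
		return false;
	for (i = 1; i + 1 < n; i++)
		if (lens[i] != lens[0])
			return false;
	/* Only the final segment may be shorter. */
	return lens[n - 1] > 0 && lens[n - 1] <= lens[0];
}
```

A stream of 1400, 1400 and 200 bytes passes; one of 1400, 200 and 1400 does
not, because a short segment in the middle cannot be recreated by splitting
on a fixed size.
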
 * hsr-tag-ins-offload

This should be set for devices which insert an HSR (High-availability Seamless
Redundancy) or PRP (Parallel Redundancy Protocol) tag automatically.

 * hsr-tag-rm-offload

This should be set for devices which remove HSR (High-availability Seamless
Redundancy) or PRP (Parallel Redundancy Protocol) tags automatically.

 * hsr-fwd-offload

This should be set for devices which forward HSR (High-availability Seamless
Redundancy) frames from one port to another in hardware.

 * hsr-dup-offload

This should be set for devices which automatically duplicate outgoing HSR
(High-availability Seamless Redundancy) or PRP (Parallel Redundancy Protocol)
frames in hardware.