cachepc-linux

Fork of AMDESE/linux with modifications for CachePC side-channel attack
git clone https://git.sinitax.com/sinitax/cachepc-linux

prog_flow_dissector.rst (5361B)


.. SPDX-License-Identifier: GPL-2.0

============================
BPF_PROG_TYPE_FLOW_DISSECTOR
============================

Overview
========

The flow dissector is a routine that parses metadata out of packets. It is
used in various places in the networking subsystem (RFS, flow hash, etc.).

The BPF flow dissector is an attempt to reimplement the C-based flow dissector
logic in BPF to gain all the benefits of the BPF verifier (namely, limits on
the number of instructions and tail calls).

API
===

BPF flow dissector programs operate on an ``__sk_buff``. However, only a
limited set of fields is allowed: ``data``, ``data_end`` and ``flow_keys``.
``flow_keys`` is a ``struct bpf_flow_keys`` and contains the flow dissector
input and output arguments.

The inputs are:
  * ``nhoff`` - initial offset of the networking header
  * ``thoff`` - initial offset of the transport header, initialized to nhoff
  * ``n_proto`` - L3 protocol type, parsed out of L2 header
  * ``flags`` - optional flags

A flow dissector BPF program should fill out the rest of the ``struct
bpf_flow_keys`` fields. The input arguments ``nhoff/thoff/n_proto`` should
also be adjusted accordingly.

The return code of the BPF program is either BPF_OK to indicate successful
dissection, or BPF_DROP to indicate a parsing error.
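
As a rough illustration of this contract, the following user-space sketch
models a dissector that consumes the inputs, bounds-checks every access,
adjusts ``thoff`` and reports success or failure. It is not a loadable BPF
program: ``struct model_flow_keys``, ``model_dissect_ipv4`` and the
``MODEL_OK``/``MODEL_DROP`` codes are hypothetical stand-ins for ``struct
bpf_flow_keys``, the program entry point and ``BPF_OK``/``BPF_DROP``. Only
IPv4 is handled, and ``n_proto`` is kept in host byte order for readability.

.. code:: c

  #include <stddef.h>
  #include <stdint.h>

  /* Hypothetical stand-ins for BPF_OK / BPF_DROP. */
  enum { MODEL_OK = 0, MODEL_DROP = 2 };

  /* Hypothetical, reduced model of struct bpf_flow_keys. */
  struct model_flow_keys {
      uint16_t nhoff;    /* in/out: offset of the network header */
      uint16_t thoff;    /* in/out: offset of the transport header */
      uint16_t n_proto;  /* in/out: L3 protocol (host byte order here) */
      uint8_t  ip_proto; /* out: L4 protocol number */
      uint32_t flags;    /* in: optional flags (unused in this sketch) */
  };

  /*
   * Dissect an IPv4 header located at data + keys->nhoff.
   * Every read is preceded by a length check against data_end.
   */
  static int model_dissect_ipv4(const uint8_t *data, const uint8_t *data_end,
                                struct model_flow_keys *keys)
  {
      size_t len = (size_t)(data_end - data);
      size_t nhoff = keys->nhoff;
      const uint8_t *iph;
      size_t ihl;

      if (keys->n_proto != 0x0800)          /* only IPv4 in this sketch */
          return MODEL_DROP;
      if (nhoff > len || len - nhoff < 20)  /* minimal IPv4 header: 20 bytes */
          return MODEL_DROP;

      iph = data + nhoff;
      ihl = (size_t)(iph[0] & 0x0f) * 4;    /* header length in bytes */
      if (ihl < 20 || len - nhoff < ihl)
          return MODEL_DROP;

      keys->ip_proto = iph[9];              /* IPv4 protocol field */
      keys->thoff = (uint16_t)(nhoff + ihl);
      return MODEL_OK;
  }

The length checks come first on purpose: a truncated packet yields the drop
code instead of an out-of-bounds read.
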

__sk_buff->data
===============

In the VLAN-less case, this is what the initial state of the BPF flow
dissector looks like::

  +------+------+------------+-----------+
  | DMAC | SMAC | ETHER_TYPE | L3_HEADER |
  +------+------+------------+-----------+
                              ^
                              |
                              +-- flow dissector starts here


.. code:: c

  skb->data + flow_keys->nhoff points to the first byte of L3_HEADER
  flow_keys->thoff = nhoff
  flow_keys->n_proto = ETHER_TYPE

When a VLAN is present, the flow dissector can be called in two different
states.

Pre-VLAN parsing::

  +------+------+------+-----+-----------+-----------+
  | DMAC | SMAC | TPID | TCI |ETHER_TYPE | L3_HEADER |
  +------+------+------+-----+-----------+-----------+
                        ^
                        |
                        +-- flow dissector starts here

.. code:: c

  skb->data + flow_keys->nhoff points to the first byte of TCI
  flow_keys->thoff = nhoff
  flow_keys->n_proto = TPID

Note that TPID can be 802.1AD; hence, a BPF program has to parse VLAN
information twice for double-tagged packets.


Post-VLAN parsing::

  +------+------+------+-----+-----------+-----------+
  | DMAC | SMAC | TPID | TCI |ETHER_TYPE | L3_HEADER |
  +------+------+------+-----+-----------+-----------+
                                          ^
                                          |
                                          +-- flow dissector starts here

.. code:: c

  skb->data + flow_keys->nhoff points to the first byte of L3_HEADER
  flow_keys->thoff = nhoff
  flow_keys->n_proto = ETHER_TYPE

In this case, VLAN information has been processed before the flow dissector
is called, and the BPF flow dissector is not required to handle it.


The takeaway here is as follows: a BPF flow dissector program can be called
with an optional VLAN header and should gracefully handle both cases: when a
single or double VLAN tag is present and when it is not. The same program can
be called for both cases and has to be written carefully to handle both.

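
One way to cope with both entry states is to peel VLAN tags in a bounded loop
before looking at L3. The user-space sketch below assumes a hypothetical
``struct model_flow_keys`` and a hypothetical helper ``model_skip_vlan``. It
keeps ``n_proto`` in host byte order for readability, whereas a real BPF
program works on ``struct bpf_flow_keys`` with network byte order values.

.. code:: c

  #include <stddef.h>
  #include <stdint.h>

  #define MODEL_ETH_P_8021Q  0x8100 /* single VLAN tag (802.1Q) */
  #define MODEL_ETH_P_8021AD 0x88A8 /* outer tag of a double-tagged frame */

  /* Hypothetical, reduced model of struct bpf_flow_keys. */
  struct model_flow_keys {
      uint16_t nhoff;
      uint16_t thoff;
      uint16_t n_proto;
  };

  /*
   * Skip up to two VLAN tags. On entry nhoff points at the TCI when
   * n_proto is a VLAN TPID (pre-VLAN state); otherwise nothing is done
   * (VLAN-less or post-VLAN state). Returns 0 on success, -1 on error.
   */
  static int model_skip_vlan(const uint8_t *data, size_t len,
                             struct model_flow_keys *keys)
  {
      for (int depth = 0; depth < 2; depth++) {
          const uint8_t *tag;

          if (keys->n_proto != MODEL_ETH_P_8021Q &&
              keys->n_proto != MODEL_ETH_P_8021AD)
              return 0;                 /* no (more) VLAN tags */
          if (keys->nhoff > len || len - keys->nhoff < 4)
              return -1;                /* truncated tag */

          tag = data + keys->nhoff;     /* 2 bytes TCI, 2 bytes EtherType */
          keys->n_proto = (uint16_t)((tag[2] << 8) | tag[3]);
          keys->nhoff += 4;
          keys->thoff = keys->nhoff;
      }
      /* A third tag is not expected here; reject rather than guess. */
      if (keys->n_proto == MODEL_ETH_P_8021Q ||
          keys->n_proto == MODEL_ETH_P_8021AD)
          return -1;
      return 0;
  }

Because the helper returns immediately when ``n_proto`` is not a VLAN TPID,
the same code path serves the VLAN-less, pre-VLAN and post-VLAN states.
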

Flags
=====

``flow_keys->flags`` might contain optional input flags that work as follows:

* ``BPF_FLOW_DISSECTOR_F_PARSE_1ST_FRAG`` - tells the BPF flow dissector to
  continue parsing the first fragment; the default expected behavior is that
  the flow dissector returns as soon as it finds out that the packet is
  fragmented; used by ``eth_get_headlen`` to estimate the length of all
  headers for GRO.
* ``BPF_FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL`` - tells the BPF flow dissector
  to stop parsing as soon as it reaches the IPv6 flow label; used by
  ``___skb_get_hash`` and ``__skb_get_hash_symmetric`` to get the flow hash.
* ``BPF_FLOW_DISSECTOR_F_STOP_AT_ENCAP`` - tells the BPF flow dissector to
  stop parsing as soon as it reaches encapsulated headers; used by the routing
  infrastructure.

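
To make the fragment flag concrete, here is a hedged user-space sketch of the
decision a dissector might take on an IPv4 fragment. ``model_frag_action`` and
the ``MODEL_*`` names are hypothetical, and the bit value assigned to
``MODEL_F_PARSE_1ST_FRAG`` is an assumption made for this example; real
programs should use the ``BPF_FLOW_DISSECTOR_F_*`` constants from the UAPI
header instead.

.. code:: c

  #include <stdbool.h>
  #include <stdint.h>

  /* Assumed bit, for illustration only; use the UAPI constant in real code. */
  #define MODEL_F_PARSE_1ST_FRAG (1U << 0)

  enum model_action {
      MODEL_STOP,     /* return what has been dissected so far */
      MODEL_CONTINUE, /* keep parsing into the transport header */
  };

  /*
   * frag_off is the IPv4 "flags + fragment offset" field in host byte
   * order: bit 0x2000 is More Fragments, the low 13 bits are the offset.
   */
  static enum model_action model_frag_action(uint16_t frag_off, uint32_t flags)
  {
      bool more_frags = (frag_off & 0x2000) != 0;
      bool first_frag = (frag_off & 0x1fff) == 0;

      if (!more_frags && first_frag)
          return MODEL_CONTINUE;  /* not fragmented at all */
      if (!first_frag)
          return MODEL_STOP;      /* later fragment: no L4 header here */
      /* First fragment: continue only when the caller asked for it. */
      return (flags & MODEL_F_PARSE_1ST_FRAG) ? MODEL_CONTINUE : MODEL_STOP;
  }

Later fragments always stop, since they carry no transport header to parse;
the flag only changes the outcome for the first fragment.
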

Reference Implementation
========================

See ``tools/testing/selftests/bpf/progs/bpf_flow.c`` for the reference
implementation and ``tools/testing/selftests/bpf/flow_dissector_load.[hc]``
for the loader. bpftool can be used to load a BPF flow dissector program as
well.

The reference implementation is organized as follows:
  * ``jmp_table`` map that contains sub-programs for each supported L3 protocol
  * ``_dissect`` routine - entry point; it parses the input ``n_proto`` and
    does ``bpf_tail_call`` to the appropriate L3 handler

Since BPF at this point doesn't support looping (or any jumping back),
jmp_table is used instead to handle multiple levels of encapsulation (and
IPv6 options).

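
The dispatch pattern can be modelled in plain C with a table of function
pointers and a bounded number of hops. This is only an analogy for
``bpf_tail_call`` through a program array: ``model_jmp_table``, the handler
names and the ``MODEL_MAX_HOPS`` limit are hypothetical and are not taken from
``bpf_flow.c``. The ``while`` loop stands in for the chain of tail calls; in
BPF each handler would tail-call the next one rather than return to a loop.

.. code:: c

  enum model_proto { MODEL_IPV4, MODEL_IPIP, MODEL_TCP, MODEL_PROTO_MAX };

  #define MODEL_DONE     (-1) /* dissection finished */
  #define MODEL_MAX_HOPS 8    /* hypothetical bound on chained handlers */

  struct model_ctx {
      int depth; /* how many handlers have run so far */
  };

  /* Each handler returns the next table index, or MODEL_DONE. */
  typedef int (*model_handler)(struct model_ctx *ctx);

  static int model_ipv4(struct model_ctx *ctx)
  {
      /* Pretend the outer IPv4 header carries IP-in-IP, the inner one TCP. */
      return ctx->depth == 1 ? MODEL_IPIP : MODEL_TCP;
  }

  static int model_ipip(struct model_ctx *ctx) { (void)ctx; return MODEL_IPV4; }
  static int model_tcp(struct model_ctx *ctx)  { (void)ctx; return MODEL_DONE; }

  static const model_handler model_jmp_table[MODEL_PROTO_MAX] = {
      [MODEL_IPV4] = model_ipv4,
      [MODEL_IPIP] = model_ipip,
      [MODEL_TCP]  = model_tcp,
  };

  /* Returns 0 on success, -1 on an unknown protocol or too many hops. */
  static int model_dissect(int proto, struct model_ctx *ctx)
  {
      while (proto != MODEL_DONE) {
          if (proto < 0 || proto >= MODEL_PROTO_MAX || !model_jmp_table[proto])
              return -1;
          if (ctx->depth >= MODEL_MAX_HOPS)
              return -1;
          ctx->depth++;
          proto = model_jmp_table[proto](ctx);
      }
      return 0;
  }

Re-entering the IPv4 handler through the table is what lets one handler serve
both the outer and the inner header of an encapsulated packet.
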

Current Limitations
===================
The BPF flow dissector doesn't support exporting all the metadata that the
in-kernel C-based implementation can export. A notable example is single VLAN
(802.1Q) and double VLAN (802.1AD) tags. Please refer to ``struct
bpf_flow_keys`` for the set of information that can currently be exported from
the BPF context.

When a BPF flow dissector is attached to the root network namespace
(machine-wide policy), users can't override it in their child network
namespaces.