cachepc-linux

Fork of AMDESE/linux with modifications for CachePC side-channel attack
git clone https://git.sinitax.com/sinitax/cachepc-linux

bpf_prog_run.rst


.. SPDX-License-Identifier: GPL-2.0

===================================
Running BPF programs from userspace
===================================

This document describes the ``BPF_PROG_RUN`` facility for running BPF programs
from userspace.

.. contents::
    :local:
    :depth: 2


Overview
--------

The ``BPF_PROG_RUN`` command can be used through the ``bpf()`` syscall to
execute a BPF program in the kernel and return the results to userspace. This
can be used to unit test BPF programs against user-supplied context objects,
and as a way to explicitly execute programs in the kernel for their side
effects. The command was previously named ``BPF_PROG_TEST_RUN``, and both
constants continue to be defined in the UAPI header, aliased to the same
value.
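
As an illustration, a minimal wrapper around the raw ``bpf()`` syscall might
look like the following sketch (``prog_fd`` is assumed to refer to an
already-loaded program, and error handling is elided). Because the two command
names alias to the same value, either constant can be passed as the command
number:

.. code-block:: c

    #include <linux/bpf.h>
    #include <string.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Sketch: execute prog_fd once against the supplied packet data and
     * report the program's return code through *retval. */
    static int prog_run(int prog_fd, void *data, __u32 size, __u32 *retval)
    {
        union bpf_attr attr;

        memset(&attr, 0, sizeof(attr));
        attr.test.prog_fd = prog_fd;
        attr.test.data_in = (__u64)(unsigned long)data;
        attr.test.data_size_in = size;

        /* BPF_PROG_RUN and BPF_PROG_TEST_RUN name the same command */
        if (syscall(__NR_bpf, BPF_PROG_RUN, &attr, sizeof(attr)))
            return -1;

        *retval = attr.test.retval;
        return 0;
    }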

The ``BPF_PROG_RUN`` command can be used to execute BPF programs of the
following types:

- ``BPF_PROG_TYPE_SOCKET_FILTER``
- ``BPF_PROG_TYPE_SCHED_CLS``
- ``BPF_PROG_TYPE_SCHED_ACT``
- ``BPF_PROG_TYPE_XDP``
- ``BPF_PROG_TYPE_SK_LOOKUP``
- ``BPF_PROG_TYPE_CGROUP_SKB``
- ``BPF_PROG_TYPE_LWT_IN``
- ``BPF_PROG_TYPE_LWT_OUT``
- ``BPF_PROG_TYPE_LWT_XMIT``
- ``BPF_PROG_TYPE_LWT_SEG6LOCAL``
- ``BPF_PROG_TYPE_FLOW_DISSECTOR``
- ``BPF_PROG_TYPE_STRUCT_OPS``
- ``BPF_PROG_TYPE_RAW_TRACEPOINT``
- ``BPF_PROG_TYPE_SYSCALL``

When using the ``BPF_PROG_RUN`` command, userspace supplies an input context
object and (for program types operating on network packets) a buffer containing
the packet data that the BPF program will operate on. The kernel will then
execute the program and return the results to userspace. Note that programs
will not have any side effects while being run in this mode; in particular,
packets will not actually be redirected or dropped; the program return code is
simply returned to userspace. A separate mode for live execution of XDP
programs is documented below.
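
For instance, using libbpf's ``bpf_prog_test_run_opts()`` wrapper, a test run
of an already-loaded XDP program against a canned packet might look like the
following sketch (the buffer sizes are arbitrary assumptions):

.. code-block:: c

    #include <bpf/bpf.h>

    /* Sketch: run an XDP program once against a canned packet. No packet
     * is actually transmitted or dropped; the verdict and the (possibly
     * rewritten) packet data are simply copied back to userspace. */
    static int test_xdp_prog(int prog_fd)
    {
        char pkt_in[64] = {};   /* packet bytes to test against */
        char pkt_out[128];      /* room for a rewritten packet */
        LIBBPF_OPTS(bpf_test_run_opts, opts,
            .data_in = pkt_in,
            .data_size_in = sizeof(pkt_in),
            .data_out = pkt_out,
            .data_size_out = sizeof(pkt_out),
            .repeat = 1,
        );
        int err;

        err = bpf_prog_test_run_opts(prog_fd, &opts);
        if (err)
            return err;

        return opts.retval;     /* e.g. XDP_PASS */
    }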


Running XDP programs in "live frame mode"
-----------------------------------------

The ``BPF_PROG_RUN`` command has a separate mode for running live XDP programs,
in which packets are actually processed by the kernel after the execution of
the XDP program, as if they had arrived on a physical interface. This mode is
activated by setting the ``BPF_F_TEST_XDP_LIVE_FRAMES`` flag when supplying an
XDP program to ``BPF_PROG_RUN``.
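
From userspace, enabling this mode might look like the following sketch, again
using libbpf (the ``repeat`` and ``batch_size`` values are arbitrary
assumptions; their semantics, and those of the context object, are described
in the list below):

.. code-block:: c

    #include <bpf/bpf.h>
    #include <linux/bpf.h>

    /* Sketch: use live frame mode as a simple traffic generator. Each
     * repetition runs prog_fd on a copy of the frame as if it had been
     * received on ifindex, and the kernel acts on the verdict. */
    static int send_frames(int prog_fd, int ifindex, void *frame, __u32 len)
    {
        struct xdp_md ctx_in = {
            .data_end = len,    /* must match data_size_in */
            .ingress_ifindex = ifindex,
        };
        LIBBPF_OPTS(bpf_test_run_opts, opts,
            .data_in = frame,
            .data_size_in = len,
            .ctx_in = &ctx_in,
            .ctx_size_in = sizeof(ctx_in),
            .repeat = 1 << 20,      /* number of executions */
            .batch_size = 64,
            .flags = BPF_F_TEST_XDP_LIVE_FRAMES,
        );

        /* data_out/ctx_out must not be set in live frame mode */
        return bpf_prog_test_run_opts(prog_fd, &opts);
    }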

The live packet mode is optimised for high-performance execution of the
supplied XDP program many times (suitable for, e.g., running as a traffic
generator), which means the semantics are not quite as straightforward as in
the regular test run mode. Specifically:

- When executing an XDP program in live frame mode, the result of the execution
  will not be returned to userspace; instead, the kernel will perform the
  operation indicated by the program's return code (drop the packet, redirect
  it, etc.). For this reason, setting the ``data_out`` or ``ctx_out`` attributes
  in the syscall parameters when running in this mode will be rejected. In
  addition, not all failures will be reported back to userspace directly;
  specifically, only fatal errors in setup or during execution (like memory
  allocation errors) will halt execution and return an error. If an error occurs
  in packet processing, like a failure to redirect to a given interface,
  execution will continue with the next repetition; these errors can be detected
  via the same tracepoints as for regular XDP programs.

- Userspace can supply an ifindex as part of the context object, just like in
  the regular (non-live) mode. The XDP program will be executed as though the
  packet arrived on this interface; i.e., the ``ingress_ifindex`` of the context
  object will point to that interface. Furthermore, if the XDP program returns
  ``XDP_PASS``, the packet will be injected into the kernel networking stack as
  though it arrived on that ifindex, and if it returns ``XDP_TX``, the packet
  will be transmitted *out* of that same interface. Do note, though, that
  because the program execution is not happening in driver context, an
  ``XDP_TX`` is actually turned into the same action as an ``XDP_REDIRECT`` to
  that same interface (i.e., it will only work if the driver has support for the
  ``ndo_xdp_xmit`` driver op).

- When running the program with multiple repetitions, the execution will happen
  in batches. The batch size defaults to 64 packets (which is the same as the
  maximum NAPI receive batch size), but can be specified by userspace through
  the ``batch_size`` parameter, up to a maximum of 256 packets. For each batch,
  the kernel executes the XDP program repeatedly, each invocation getting a
  separate copy of the packet data. For each repetition, if the program drops
  the packet, the data page is immediately recycled (see below). Otherwise, the
  packet is buffered until the end of the batch, at which point all packets
  buffered this way during the batch are transmitted at once.

- When setting up the test run, the kernel will initialise a pool of memory
  pages, containing as many pages as the batch size. Each memory page will be
  initialised with the initial packet data supplied by userspace at
  ``BPF_PROG_RUN`` invocation. When possible, the pages will be recycled on
  future program invocations, to improve performance. Pages will generally be
  recycled a full batch at a time, except when a packet is dropped (by return
  code or because of, say, a redirection error), in which case that page will
  be recycled immediately. If a packet ends up being passed to the regular
  networking stack (because the XDP program returns ``XDP_PASS``, or because it
  ends up being redirected to an interface that injects it into the stack), the
  page will be released and a new one will be allocated when the pool is empty.

  When recycling, the page content is not rewritten; only the packet boundary
  pointers (``data``, ``data_end`` and ``data_meta``) in the context object will
  be reset to the original values. This means that if a program rewrites the
  packet contents, it has to be prepared to see either the original content or
  the modified version on subsequent invocations.
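
As a hypothetical illustration of this caveat, an XDP program used in live
frame mode should keep any packet rewrite idempotent, so that it behaves the
same whether it runs on a freshly initialised page or on one recycled from an
earlier repetition:

.. code-block:: c

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    SEC("xdp")
    int xdp_rewrite(struct xdp_md *ctx)
    {
        __u8 *data = (void *)(long)ctx->data;
        __u8 *data_end = (void *)(long)ctx->data_end;

        if (data + 1 > data_end)
            return XDP_DROP;

        /* Idempotent rewrite: the result is the same whether the page
         * still holds the original byte or the 0xab written by an
         * earlier repetition. Incrementing the byte instead would emit
         * different packets depending on page recycling. */
        data[0] = 0xab;

        return XDP_TX;  /* transmitted out of the ingress interface */
    }

    char _license[] SEC("license") = "GPL";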