.. SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)

=====================
BPF sk_lookup program
=====================

BPF sk_lookup program type (``BPF_PROG_TYPE_SK_LOOKUP``) introduces programmability
into the socket lookup performed by the transport layer when a packet is to be
delivered locally.

When invoked, a BPF sk_lookup program can select a socket that will receive the
incoming packet by calling the ``bpf_sk_assign()`` BPF helper function.

Hooks for a common attach point (``BPF_SK_LOOKUP``) exist for both TCP and UDP.

Motivation
==========

BPF sk_lookup program type was introduced to address setup scenarios where
binding sockets to an address with the ``bind()`` socket call is impractical, such
as:

1. receiving connections on a range of IP addresses, e.g. 192.0.2.0/24, when
   binding to a wildcard address ``INADDR_ANY`` is not possible due to a port
   conflict,
2. receiving connections on all or a wide range of ports, i.e. an L7 proxy use
   case.

Such setups would require creating and ``bind()``'ing one socket for each
IP address/port in the range, leading to resource consumption and potential
latency spikes during socket lookup.

Attachment
==========

A BPF sk_lookup program can be attached to a network namespace with the
``bpf(BPF_LINK_CREATE, ...)`` syscall using the ``BPF_SK_LOOKUP`` attach type and a
netns FD as attachment ``target_fd``.

Multiple programs can be attached to one network namespace. Programs will be
invoked in the same order as they were attached.

Hooks
=====

The attached BPF sk_lookup programs run whenever the transport layer needs to
find a listening (TCP) or an unconnected (UDP) socket for an incoming packet.

Incoming traffic to established (TCP) and connected (UDP) sockets is delivered
as usual without triggering the BPF sk_lookup hook.

The attached BPF programs must return with either ``SK_PASS`` or ``SK_DROP``
verdict code.
As for other BPF program types that are network filters,
``SK_PASS`` signifies that the socket lookup should continue on to regular
hashtable-based lookup, while ``SK_DROP`` causes the transport layer to drop the
packet.

A BPF sk_lookup program can also select a socket to receive the packet by
calling the ``bpf_sk_assign()`` BPF helper. Typically, the program looks up a socket
in a map holding sockets, such as ``SOCKMAP`` or ``SOCKHASH``, and passes a
``struct bpf_sock *`` to the ``bpf_sk_assign()`` helper to record the
selection. Selecting a socket only takes effect if the program has terminated
with ``SK_PASS`` code.

When multiple programs are attached, the end result is determined from return
codes of all the programs according to the following rules:

1. If any program returned ``SK_PASS`` and selected a valid socket, the socket
   is used as the result of the socket lookup.
2. If more than one program returned ``SK_PASS`` and selected a socket, the last
   selection takes effect.
3. If any program returned ``SK_DROP``, and no program returned ``SK_PASS`` and
   selected a socket, socket lookup fails.
4. If all programs returned ``SK_PASS`` and none of them selected a socket,
   socket lookup continues on.

API
===

In its context, an instance of ``struct bpf_sk_lookup``, a BPF sk_lookup program
receives information about the packet that triggered the socket lookup. Namely:

* IP version (``AF_INET`` or ``AF_INET6``),
* L4 protocol identifier (``IPPROTO_TCP`` or ``IPPROTO_UDP``),
* source and destination IP address,
* source and destination L4 port,
* the socket that has been selected with ``bpf_sk_assign()``.

Refer to the ``struct bpf_sk_lookup`` declaration in the ``linux/bpf.h`` user API
header, and the `bpf-helpers(7)
<https://man7.org/linux/man-pages/man7/bpf-helpers.7.html>`_ man-page section
for ``bpf_sk_assign()``, for details.
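
As a rough illustration of how the context fields listed above might drive a
selection, the following userspace C sketch models the match a program could
perform for the scenarios from the Motivation section: TCP traffic to
192.0.2.0/24 on a range of ports. It is not BPF code. ``struct lookup_ctx``,
``wants_packet()``, and the port range are hypothetical, and the model keeps
all fields in host byte order for simplicity, whereas the real
``struct bpf_sk_lookup`` stores ``local_ip4`` in network byte order and
``local_port`` in host byte order.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/socket.h>   /* AF_INET */
#include <netinet/in.h>   /* IPPROTO_TCP */

/* Hypothetical userspace stand-in for a subset of struct bpf_sk_lookup.
 * All fields are in host byte order in this model. */
struct lookup_ctx {
	uint32_t family;     /* AF_INET or AF_INET6 */
	uint32_t protocol;   /* IPPROTO_TCP or IPPROTO_UDP */
	uint32_t local_ip4;  /* destination IPv4 address */
	uint32_t local_port; /* destination L4 port */
};

/* 192.0.2.0/24, the example range from the Motivation section. */
#define NET_ADDR 0xC0000200u
#define NET_MASK 0xFFFFFF00u

/* Port range assumed purely for this illustration. */
#define PORT_MIN 1000u
#define PORT_MAX 1999u

/* Return true if the modeled program would select a socket for the packet. */
static bool wants_packet(const struct lookup_ctx *ctx)
{
	if (ctx->family != AF_INET || ctx->protocol != IPPROTO_TCP)
		return false;
	if ((ctx->local_ip4 & NET_MASK) != NET_ADDR)
		return false;
	return ctx->local_port >= PORT_MIN && ctx->local_port <= PORT_MAX;
}
```

In an actual program, equivalent checks would run against the
``struct bpf_sk_lookup`` context, followed by a socket map lookup and a call to
``bpf_sk_assign()`` before returning ``SK_PASS``.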

Example
=======

See ``tools/testing/selftests/bpf/prog_tests/sk_lookup.c`` for the reference
implementation.
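
The rules for combining verdicts from multiple attached programs, described in
the Hooks section, can also be modeled in plain userspace C. The sketch below
is an illustration under stated assumptions, not kernel code:
``struct prog_result``, ``resolve()``, and the ``LOOKUP_*`` outcome names are
hypothetical, and a socket is represented by a non-zero integer ID.

```c
#include <stddef.h>

/* Verdict codes; the values mirror enum sk_action in linux/bpf.h. */
enum { SK_DROP = 0, SK_PASS = 1 };

/* Hypothetical overall outcome of the socket lookup. */
enum lookup_outcome {
	LOOKUP_SELECTED, /* a program-selected socket is used */
	LOOKUP_FAILED,   /* socket lookup fails */
	LOOKUP_CONTINUE, /* regular lookup continues */
};

/* Hypothetical record of what one attached program did. */
struct prog_result {
	int verdict;  /* SK_PASS or SK_DROP */
	int selected; /* socket ID recorded via bpf_sk_assign(), 0 if none */
};

/* Combine per-program results, given in attach order, into one outcome.
 * On LOOKUP_SELECTED, *sk_out holds the chosen socket ID; otherwise 0. */
static enum lookup_outcome resolve(const struct prog_result *res, size_t n,
				   int *sk_out)
{
	int any_drop = 0;

	*sk_out = 0;
	for (size_t i = 0; i < n; i++) {
		if (res[i].verdict == SK_PASS && res[i].selected)
			*sk_out = res[i].selected; /* rules 1 and 2: last wins */
		else if (res[i].verdict == SK_DROP)
			any_drop = 1; /* a selection made before SK_DROP is ignored */
	}

	if (*sk_out)
		return LOOKUP_SELECTED;
	if (any_drop)
		return LOOKUP_FAILED;   /* rule 3 */
	return LOOKUP_CONTINUE;         /* rule 4 */
}
```

For instance, results of ``SK_PASS`` without a selection, ``SK_PASS`` with
socket 7, ``SK_DROP``, and ``SK_PASS`` with socket 9 resolve to socket 9 in
this model, because a later selection overrides an earlier one and a drop
verdict only matters when no program selected a socket.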