cachepc-linux

Fork of AMDESE/linux with modifications for CachePC side-channel attack
git clone https://git.sinitax.com/sinitax/cachepc-linux

kcm.rst (10917B)


.. SPDX-License-Identifier: GPL-2.0

=============================
Kernel Connection Multiplexor
=============================

Kernel Connection Multiplexor (KCM) is a mechanism that provides a message based
interface over TCP for generic application protocols. With KCM an application
can efficiently send and receive application protocol messages over TCP using
datagram sockets.

KCM implements an NxM multiplexor in the kernel as diagrammed below::

    +------------+   +------------+   +------------+   +------------+
    | KCM socket |   | KCM socket |   | KCM socket |   | KCM socket |
    +------------+   +------------+   +------------+   +------------+
        |                 |               |                |
        +-----------+     |               |     +----------+
                    |     |               |     |
                +----------------------------------+
                |           Multiplexor            |
                +----------------------------------+
                    |   |           |           |  |
        +---------+   |           |           |  ------------+
        |             |           |           |              |
    +----------+  +----------+  +----------+  +----------+ +----------+
    |  Psock   |  |  Psock   |  |  Psock   |  |  Psock   | |  Psock   |
    +----------+  +----------+  +----------+  +----------+ +----------+
        |              |           |            |             |
    +----------+  +----------+  +----------+  +----------+ +----------+
    | TCP sock |  | TCP sock |  | TCP sock |  | TCP sock | | TCP sock |
    +----------+  +----------+  +----------+  +----------+ +----------+

KCM sockets
===========

The KCM sockets provide the user interface to the multiplexor. All the KCM sockets
bound to a multiplexor are considered to have equivalent function, and I/O
operations in different sockets may be done in parallel without the need for
synchronization between threads in userspace.

Multiplexor
===========

The multiplexor provides the message steering. In the transmit path, messages
written on a KCM socket are sent atomically on an appropriate TCP socket.
Similarly, in the receive path, messages are constructed on each TCP socket
(Psock) and complete messages are steered to a KCM socket.

TCP sockets & Psocks
====================

TCP sockets may be bound to a KCM multiplexor. A Psock structure is allocated
for each bound TCP socket; this structure holds the state for constructing
messages on receive as well as other connection specific information for KCM.

Connected mode semantics
========================

Each multiplexor assumes that all attached TCP connections are to the same
destination and can use the different connections for load balancing when
transmitting. The normal send and recv calls (including sendmmsg and recvmmsg)
can be used to send and receive messages on the KCM socket.

Socket types
============

KCM supports SOCK_DGRAM and SOCK_SEQPACKET socket types.

Message delineation
-------------------

Messages are sent over a TCP stream with some application protocol message
format that typically includes a header which frames the messages. The length
of a received message can be deduced from the application protocol header
(often just a simple length field).

A TCP stream must be parsed to determine message boundaries. Berkeley Packet
Filter (BPF) is used for this. When attaching a TCP socket to a multiplexor a
BPF program must be specified. The program is called at the start of receiving
a new message and is given an skbuff that contains the bytes received so far.
It parses the message header and returns the length of the message. Given this
information, KCM will construct the message of the stated length and deliver it
to a KCM socket.
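The framing arithmetic a parser program performs can be modelled in plain
userspace C, which is convenient for checking it before writing the BPF
version. The sketch below assumes a hypothetical protocol whose header is a
4-byte big-endian payload length. The name ``frame_length`` and the
"return 0 when the header is incomplete" convention belong to this sketch
only and are not part of KCM:

```c
#include <stddef.h>
#include <stdint.h>

#define HDR_LEN 4u /* assumed header: 4-byte big-endian payload length */

/*
 * Userspace model of a message-length parser. Returns the total message
 * length (header + payload), or 0 if the header is not complete yet.
 * Returning 0 for "incomplete" is a convention of this sketch only.
 */
uint64_t frame_length(const unsigned char *buf, size_t avail)
{
        uint32_t payload;

        if (buf == NULL || avail < HDR_LEN)
                return 0;

        /* Assemble the big-endian length field byte by byte. */
        payload = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                  ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];

        return (uint64_t)payload + HDR_LEN;
}
```

For instance, a stream starting with the bytes ``00 00 01 02`` announces a
258-byte payload, so the function reports a message of 262 bytes.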

TCP socket management
---------------------

When a TCP socket is attached to a KCM multiplexor, data ready (POLLIN) and
write space available (POLLOUT) events are handled by the multiplexor. If there
is a state change (disconnection) or other error on a TCP socket, an error is
posted on the TCP socket so that a POLLERR event happens and KCM discontinues
using the socket. When the application gets the error notification for a
TCP socket, it should unattach the socket from KCM and then handle the error
condition (the typical response is to close the socket and create a new
connection if necessary).

KCM limits the maximum receive message size to be the size of the receive
socket buffer on the attached TCP socket (the socket buffer size can be set by
SO_RCVBUF). If the length of a new message reported by the BPF program is
greater than this limit, a corresponding error (EMSGSIZE) is posted on the TCP
socket. The BPF program may also enforce a maximum message size and report an
error when it is exceeded.

A timeout may be set for assembling messages on a receive socket. The timeout
value is taken from the receive timeout of the attached TCP socket (this is set
by SO_RCVTIMEO). If the timer expires before assembly is complete, an error
(ETIMEDOUT) is posted on the socket.
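Since both limits are read from the TCP socket, they are configured with
ordinary socket options. A minimal sketch, assuming the options are set
before the socket is attached; the helper name ``set_kcm_rx_limits`` is
made up for illustration:

```c
#include <sys/socket.h>
#include <sys/time.h>

/*
 * Set the receive buffer size and the receive timeout on a socket that
 * is about to be attached to KCM. Returns 0 on success, or -1 with errno
 * set by the failing setsockopt call.
 */
int set_kcm_rx_limits(int tcpfd, int rcvbuf_bytes, long timeout_sec)
{
        struct timeval tv = { .tv_sec = timeout_sec, .tv_usec = 0 };

        /* Bounds the largest message KCM will assemble on this socket. */
        if (setsockopt(tcpfd, SOL_SOCKET, SO_RCVBUF,
                       &rcvbuf_bytes, sizeof(rcvbuf_bytes)) < 0)
                return -1;

        /* Bounds how long assembly of one message may take. */
        if (setsockopt(tcpfd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0)
                return -1;

        return 0;
}
```

A call such as ``set_kcm_rx_limits(tcpfd, 65536, 5)`` would request a 64 KiB
receive buffer and a five second assembly timeout.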

User interface
==============

Creating a multiplexor
----------------------

A new multiplexor and its initial KCM socket are created by a socket call::

  socket(AF_KCM, type, protocol)

- type is either SOCK_DGRAM or SOCK_SEQPACKET
- protocol is KCMPROTO_CONNECTED
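The call can be wrapped so that an unsupported socket type is rejected
before reaching the kernel. This is only a sketch: ``kcm_open`` is a
hypothetical helper, and the numeric fallbacks are assumed to mirror the
Linux UAPI headers, so they should be checked against the headers of the
target kernel:

```c
#include <errno.h>
#include <sys/socket.h>

/* Fallbacks for older libc headers (assumed values, see lead-in). */
#ifndef AF_KCM
#define AF_KCM 41
#endif
#ifndef KCMPROTO_CONNECTED
#define KCMPROTO_CONNECTED 0
#endif

/*
 * Create a new multiplexor and return its first KCM socket.
 * Returns the descriptor, or -1 with errno set, for example when the
 * running kernel has no KCM support.
 */
int kcm_open(int type)
{
        /* KCM only accepts the two socket types listed above. */
        if (type != SOCK_DGRAM && type != SOCK_SEQPACKET) {
                errno = EINVAL;
                return -1;
        }

        return socket(AF_KCM, type, KCMPROTO_CONNECTED);
}
```

Callers should treat a return value of -1 as "KCM unavailable or request
invalid" and inspect errno for the reason.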

Cloning KCM sockets
-------------------

After the first KCM socket is created using the socket call as described
above, additional sockets for the multiplexor can be created by cloning
a KCM socket. This is accomplished by an ioctl on a KCM socket::

  /* From linux/kcm.h */
  struct kcm_clone {
        int fd;
  };

  struct kcm_clone info;

  memset(&info, 0, sizeof(info));

  err = ioctl(kcmfd, SIOCKCMCLONE, &info);

  if (!err)
    newkcmfd = info.fd;

Attach transport sockets
------------------------

Transport sockets are attached to a multiplexor by calling an ioctl on a
KCM socket of that multiplexor, e.g.::

  /* From linux/kcm.h */
  struct kcm_attach {
        int fd;
        int bpf_fd;
  };

  struct kcm_attach info;

  memset(&info, 0, sizeof(info));

  info.fd = tcpfd;
  info.bpf_fd = bpf_prog_fd;

  ioctl(kcmfd, SIOCKCMATTACH, &info);

The kcm_attach structure contains:

  - fd: file descriptor of the TCP socket being attached
  - bpf_fd: file descriptor of the compiled BPF program loaded into the kernel

Unattach transport sockets
--------------------------

Unattaching a transport socket from a multiplexor is straightforward. An
"unattach" ioctl is done with the kcm_unattach structure as the argument::

  /* From linux/kcm.h */
  struct kcm_unattach {
        int fd;
  };

  struct kcm_unattach info;

  memset(&info, 0, sizeof(info));

  info.fd = tcpfd;  /* TCP socket to unattach */

  ioctl(kcmfd, SIOCKCMUNATTACH, &info);

Disabling receive on KCM socket
-------------------------------

A setsockopt is used to disable or enable receiving on a KCM socket.
When receive is disabled, any pending messages in the socket's
receive buffer are moved to other sockets. This feature is useful
if an application thread knows that it will be doing a lot of
work on a request and won't be able to service new messages for a
while. Example use::

  int val = 1;

  setsockopt(kcmfd, SOL_KCM, KCM_RECV_DISABLE, &val, sizeof(val));

BPF programs for message delineation
------------------------------------

BPF programs can be compiled using the BPF LLVM backend. For example,
the BPF program for parsing Thrift is::

  #include "bpf.h" /* for __sk_buff */
  #include "bpf_helpers.h" /* for load_word intrinsic */

  SEC("socket_kcm")
  int bpf_prog1(struct __sk_buff *skb)
  {
        return load_word(skb, 0) + 4;
  }

  char _license[] SEC("license") = "GPL";

Use in applications
===================

KCM accelerates application layer protocols. Specifically, it allows
applications to use a message based interface for sending and receiving
messages. The kernel provides necessary assurances that messages are sent
and received atomically. This relieves much of the burden applications have
in mapping a message based protocol onto the TCP stream. KCM also makes
application layer messages a unit of work in the kernel for the purposes of
steering and scheduling, which in turn allows a simpler networking model in
multithreaded applications.

Configurations
--------------

In an Nx1 configuration, KCM logically provides multiple socket handles
to the same TCP connection. This allows parallelism between I/O
operations on the TCP socket (for instance copyin and copyout of data is
parallelized). In an application, a KCM socket can be opened for each
processing thread and inserted into the epoll (similar to how SO_REUSEPORT
is used to allow multiple listener sockets on the same port).

In an MxN configuration, multiple connections are established to the
same destination. These are used for simple load balancing.

Message batching
----------------

The primary purpose of KCM is load balancing between KCM sockets and hence
threads in a nominal use case. Perfect load balancing, that is steering
each received message to a different KCM socket or steering each sent
message to a different TCP socket, can negatively impact performance
since this doesn't allow for affinities to be established. Balancing
based on groups, or batches of messages, can be beneficial for performance.

On transmit, there are three ways an application can batch (pipeline)
messages on a KCM socket.

  1) Send multiple messages in a single sendmmsg.
  2) Send a group of messages each with a sendmsg call, where all messages
     except the last have MSG_BATCH in the flags of the sendmsg call.
  3) Create a "super message" composed of multiple messages and send this
     with a single sendmsg.
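The third option can be sketched with scatter/gather I/O, so that the frames
never have to be copied into one contiguous buffer. The example assumes the
same 4-byte big-endian length framing that the Thrift parser above expects;
``send_super_message`` and ``MAX_BATCH`` are names invented for this sketch:

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

#define MAX_BATCH 8 /* arbitrary cap chosen for this sketch */

/*
 * Send up to MAX_BATCH strings as one "super message": each string is
 * framed with a 4-byte big-endian length and all frames go out in a
 * single sendmsg call. Returns the byte count from sendmsg, or -1.
 */
ssize_t send_super_message(int fd, const char *const msgs[], size_t count)
{
        uint32_t hdrs[MAX_BATCH];
        struct iovec iov[2 * MAX_BATCH];
        struct msghdr mh;
        size_t i;

        if (count == 0 || count > MAX_BATCH)
                return -1;

        /* Two iovec entries per message: length header, then payload. */
        for (i = 0; i < count; i++) {
                size_t len = strlen(msgs[i]);

                hdrs[i] = htonl((uint32_t)len);
                iov[2 * i].iov_base = &hdrs[i];
                iov[2 * i].iov_len = sizeof(hdrs[i]);
                iov[2 * i + 1].iov_base = (void *)msgs[i];
                iov[2 * i + 1].iov_len = len;
        }

        memset(&mh, 0, sizeof(mh));
        mh.msg_iov = iov;
        mh.msg_iovlen = 2 * count;

        return sendmsg(fd, &mh, 0);
}
```

The function is not tied to KCM and works on any socket descriptor, so the
framing can be exercised over a socketpair before it is pointed at a KCM
socket.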

On receive, the KCM module attempts to queue messages received on the
same KCM socket during each TCP ready callback. The targeted KCM socket
changes at each receive ready callback on the KCM socket. The application
does not need to configure this.

Error handling
--------------

An application should include a thread to monitor errors raised on
the TCP connection. Normally, this will be done by placing each
TCP socket attached to a KCM multiplexor in an epoll set for the POLLERR
event. If an error occurs on an attached TCP socket, KCM sets EPIPE
on the socket, thus waking up the application thread. When the application
sees the error (which may just be a disconnect) it should unattach the
socket from KCM and then close it. It is assumed that once an error is
posted on the TCP socket the data stream is unrecoverable (i.e. an error
may have occurred in the middle of receiving a message).
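A monitoring thread along these lines could be built from three small
helpers. This is a sketch under stated assumptions: the helper names are
hypothetical, a hang-up (EPOLLHUP) is treated the same as an error, and the
unattach ioctl is only indicated by a comment so that the code stays
independent of KCM support:

```c
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Register an attached TCP socket for error notification only. epoll
 * always reports EPOLLERR and EPOLLHUP, so no data events are requested.
 */
int watch_tcp_errors(int epfd, int tcpfd)
{
        struct epoll_event ev;

        memset(&ev, 0, sizeof(ev));
        ev.events = EPOLLERR;
        ev.data.fd = tcpfd;

        return epoll_ctl(epfd, EPOLL_CTL_ADD, tcpfd, &ev);
}

/*
 * Wait up to timeout_ms for one failed socket. Returns its descriptor and
 * stores the pending socket error (SO_ERROR) in *sock_err, or returns -1
 * if no socket failed in time.
 */
int wait_failed_tcp(int epfd, int timeout_ms, int *sock_err)
{
        struct epoll_event ev;
        socklen_t len = sizeof(*sock_err);

        if (epoll_wait(epfd, &ev, 1, timeout_ms) != 1)
                return -1;
        if (!(ev.events & (EPOLLERR | EPOLLHUP)))
                return -1;

        *sock_err = 0;
        (void)getsockopt(ev.data.fd, SOL_SOCKET, SO_ERROR, sock_err, &len);

        return ev.data.fd;
}

/*
 * Retire a failed socket. The SIOCKCMUNATTACH ioctl described earlier
 * belongs before the close; it is left out to keep the sketch standalone.
 */
void retire_tcp(int epfd, int tcpfd)
{
        epoll_ctl(epfd, EPOLL_CTL_DEL, tcpfd, NULL);
        close(tcpfd);
}
```

The monitoring thread would call ``wait_failed_tcp`` in a loop and pass each
returned descriptor to ``retire_tcp``, opening a replacement connection if
one is needed.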

TCP connection monitoring
-------------------------

In KCM there is no means to correlate a message to the TCP socket that
was used to send or receive the message (except in the case there is
only one attached TCP socket). However, the application does retain
an open file descriptor to the socket so it will be able to get statistics
from the socket which can be used in detecting issues (such as high
retransmissions on the socket).