====================
eBPF Instruction Set
====================

Registers and calling convention
================================

eBPF has 10 general purpose registers and a read-only frame pointer register,
all of which are 64 bits wide.

The eBPF calling convention is defined as:

 * R0: return value from function calls, and exit value for eBPF programs
 * R1 - R5: arguments for function calls
 * R6 - R9: callee saved registers that function calls will preserve
 * R10: read-only frame pointer to access stack

R0 - R5 are scratch registers and eBPF programs need to spill/fill them if
necessary across calls.

Instruction encoding
====================

eBPF has two instruction encodings:

 * the basic instruction encoding, which uses 64 bits to encode an instruction
 * the wide instruction encoding, which appends a second 64-bit immediate value
   (imm64) after the basic instruction for a total of 128 bits.

The basic instruction encoding looks as follows:

 =============  =======  ===============  ====================  ============
 32 bits (MSB)  16 bits  4 bits           4 bits                8 bits (LSB)
 =============  =======  ===============  ====================  ============
 immediate      offset   source register  destination register  opcode
 =============  =======  ===============  ====================  ============

Note that most instructions do not use all of the fields.
Unused fields shall be cleared to zero.

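The field layout in the table above can be sketched as a small decoder. This is
an illustrative sketch only: ``struct insn_fields`` and ``decode_insn`` are
hypothetical names, the instruction is assumed to be held in a single 64-bit
integer laid out exactly as in the table, and treating the offset and immediate
as signed values is an assumption that this section does not state.

```c
#include <stdint.h>

/* Hypothetical decoded view of one basic (64-bit) instruction. */
struct insn_fields {
    uint8_t opcode;   /* bits 0-7   (LSB) */
    uint8_t dst_reg;  /* bits 8-11        */
    uint8_t src_reg;  /* bits 12-15       */
    int16_t off;      /* bits 16-31, assumed signed */
    int32_t imm;      /* bits 32-63 (MSB), assumed signed */
};

/* Split a raw 64-bit value into the five fields of the basic encoding. */
static struct insn_fields decode_insn(uint64_t raw)
{
    struct insn_fields f;

    f.opcode  = (uint8_t)(raw & 0xff);
    f.dst_reg = (uint8_t)((raw >> 8) & 0x0f);
    f.src_reg = (uint8_t)((raw >> 12) & 0x0f);
    f.off     = (int16_t)(uint16_t)((raw >> 16) & 0xffff);
    f.imm     = (int32_t)(uint32_t)(raw >> 32);
    return f;
}
```

For example, ``decode_insn(0xFFFFFFFF00102107ULL)`` yields opcode 0x07,
destination register 1, source register 2, offset 16 and immediate -1.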

Instruction classes
-------------------

The three LSB bits of the 'opcode' field store the instruction class:

 =========  =====  ===============================
 class      value  description
 =========  =====  ===============================
 BPF_LD     0x00   non-standard load operations
 BPF_LDX    0x01   load into register operations
 BPF_ST     0x02   store from immediate operations
 BPF_STX    0x03   store from register operations
 BPF_ALU    0x04   32-bit arithmetic operations
 BPF_JMP    0x05   64-bit jump operations
 BPF_JMP32  0x06   32-bit jump operations
 BPF_ALU64  0x07   64-bit arithmetic operations
 =========  =====  ===============================

Arithmetic and jump instructions
================================

For arithmetic and jump instructions (BPF_ALU, BPF_ALU64, BPF_JMP and
BPF_JMP32), the 8-bit 'opcode' field is divided into three parts:

 ==============  ======  =================
 4 bits (MSB)    1 bit   3 bits (LSB)
 ==============  ======  =================
 operation code  source  instruction class
 ==============  ======  =================

The 4th bit encodes the source operand:

 ======  =====  ========================================
 source  value  description
 ======  =====  ========================================
 BPF_K   0x00   use 32-bit immediate as source operand
 BPF_X   0x08   use 'src_reg' register as source operand
 ======  =====  ========================================

The four MSB bits store the operation code.


Arithmetic instructions
-----------------------

BPF_ALU uses 32-bit wide operands while BPF_ALU64 uses 64-bit wide operands for
otherwise identical operations.
The code field encodes the operation as below:

 ========  =====  =================================================
 code      value  description
 ========  =====  =================================================
 BPF_ADD   0x00   dst += src
 BPF_SUB   0x10   dst -= src
 BPF_MUL   0x20   dst \*= src
 BPF_DIV   0x30   dst /= src
 BPF_OR    0x40   dst \|= src
 BPF_AND   0x50   dst &= src
 BPF_LSH   0x60   dst <<= src
 BPF_RSH   0x70   dst >>= src
 BPF_NEG   0x80   dst = -dst
 BPF_MOD   0x90   dst %= src
 BPF_XOR   0xa0   dst ^= src
 BPF_MOV   0xb0   dst = src
 BPF_ARSH  0xc0   sign extending shift right
 BPF_END   0xd0   byte swap operations (see separate section below)
 ========  =====  =================================================

``BPF_ADD | BPF_X | BPF_ALU`` means::

  dst_reg = (u32) dst_reg + (u32) src_reg

``BPF_ADD | BPF_X | BPF_ALU64`` means::

  dst_reg = dst_reg + src_reg

``BPF_XOR | BPF_K | BPF_ALU`` means::

  dst_reg = (u32) dst_reg ^ (u32) imm32

``BPF_XOR | BPF_K | BPF_ALU64`` means::

  dst_reg = dst_reg ^ imm32


Byte swap instructions
----------------------

The byte swap instructions use an instruction class of ``BPF_ALU`` and a 4-bit
code field of ``BPF_END``.

The byte swap instructions operate on the destination register
only and do not use a separate source register or immediate value.

The 1-bit source operand field in the opcode is used to select the byte
order that the operation converts from or to:

 =========  =====  =================================================
 source     value  description
 =========  =====  =================================================
 BPF_TO_LE  0x00   convert between host byte order and little endian
 BPF_TO_BE  0x08   convert between host byte order and big endian
 =========  =====  =================================================

The imm field encodes the width of the swap operations.  The following widths
are supported: 16, 32 and 64.

Examples:

``BPF_ALU | BPF_TO_LE | BPF_END`` with imm = 16 means::

  dst_reg = htole16(dst_reg)

``BPF_ALU | BPF_TO_BE | BPF_END`` with imm = 64 means::

  dst_reg = htobe64(dst_reg)

``BPF_FROM_LE`` and ``BPF_FROM_BE`` exist as aliases for ``BPF_TO_LE`` and
``BPF_TO_BE`` respectively.


Jump instructions
-----------------

BPF_JMP32 uses 32-bit wide operands while BPF_JMP uses 64-bit wide operands for
otherwise identical operations.
The code field encodes the operation as below:

 ========  =====  =========================  ============
 code      value  description                notes
 ========  =====  =========================  ============
 BPF_JA    0x00   PC += off                  BPF_JMP only
 BPF_JEQ   0x10   PC += off if dst == src
 BPF_JGT   0x20   PC += off if dst > src     unsigned
 BPF_JGE   0x30   PC += off if dst >= src    unsigned
 BPF_JSET  0x40   PC += off if dst & src
 BPF_JNE   0x50   PC += off if dst != src
 BPF_JSGT  0x60   PC += off if dst > src     signed
 BPF_JSGE  0x70   PC += off if dst >= src    signed
 BPF_CALL  0x80   function call
 BPF_EXIT  0x90   function / program return  BPF_JMP only
 BPF_JLT   0xa0   PC += off if dst < src     unsigned
 BPF_JLE   0xb0   PC += off if dst <= src    unsigned
 BPF_JSLT  0xc0   PC += off if dst < src     signed
 BPF_JSLE  0xd0   PC += off if dst <= src    signed
 ========  =====  =========================  ============

The eBPF program needs to store the return value into register R0 before doing a
BPF_EXIT.

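As a rough illustration of the arithmetic and jump opcode layout (4-bit
operation code, 1-bit source, 3-bit class), the sketch below composes and splits
opcodes using the values from the tables in this document. The helper names
(``opcode_class``, ``opcode_source``, ``opcode_code``, ``alu32_add``,
``alu64_add``) are hypothetical and not part of any kernel API. The 32-bit add
follows the ``(u32) dst_reg + (u32) src_reg`` form shown earlier, so in this C
model the upper 32 bits of the result end up as zero.

```c
#include <stdint.h>

/* Values copied from the tables in this document. */
#define BPF_ALU    0x04
#define BPF_JMP    0x05
#define BPF_ALU64  0x07
#define BPF_K      0x00
#define BPF_X      0x08
#define BPF_ADD    0x00
#define BPF_JEQ    0x10

/* Hypothetical helpers: split an ALU/JMP opcode into its three parts. */
static uint8_t opcode_class(uint8_t opcode)  { return opcode & 0x07; }
static uint8_t opcode_source(uint8_t opcode) { return opcode & 0x08; }
static uint8_t opcode_code(uint8_t opcode)   { return opcode & 0xf0; }

/* Model of BPF_ADD | BPF_X | BPF_ALU: operands truncated to 32 bits. */
static uint64_t alu32_add(uint64_t dst, uint64_t src)
{
    return (uint32_t)((uint32_t)dst + (uint32_t)src);
}

/* Model of BPF_ADD | BPF_X | BPF_ALU64: full 64-bit operands. */
static uint64_t alu64_add(uint64_t dst, uint64_t src)
{
    return dst + src;
}
```

With these definitions, ``BPF_ADD | BPF_X | BPF_ALU64`` evaluates to 0x0f and
``BPF_JEQ | BPF_K | BPF_JMP`` to 0x15; the helpers recover the individual parts
from those bytes.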


Load and store instructions
===========================

For load and store instructions (BPF_LD, BPF_LDX, BPF_ST and BPF_STX), the
8-bit 'opcode' field is divided as:

 ============  ======  =================
 3 bits (MSB)  2 bits  3 bits (LSB)
 ============  ======  =================
 mode          size    instruction class
 ============  ======  =================

The size modifier is one of:

 =============  =====  =====================
 size modifier  value  description
 =============  =====  =====================
 BPF_W          0x00   word (4 bytes)
 BPF_H          0x08   half word (2 bytes)
 BPF_B          0x10   byte
 BPF_DW         0x18   double word (8 bytes)
 =============  =====  =====================

The mode modifier is one of:

 =============  =====  ====================================
 mode modifier  value  description
 =============  =====  ====================================
 BPF_IMM        0x00   64-bit immediate instructions
 BPF_ABS        0x20   legacy BPF packet access (absolute)
 BPF_IND        0x40   legacy BPF packet access (indirect)
 BPF_MEM        0x60   regular load and store operations
 BPF_ATOMIC     0xc0   atomic operations
 =============  =====  ====================================


Regular load and store operations
---------------------------------

The ``BPF_MEM`` mode modifier is used to encode regular load and store
instructions that transfer data between a register and memory.

``BPF_MEM | <size> | BPF_STX`` means::

  *(size *) (dst_reg + off) = src_reg

``BPF_MEM | <size> | BPF_ST`` means::

  *(size *) (dst_reg + off) = imm32

``BPF_MEM | <size> | BPF_LDX`` means::

  dst_reg = *(size *) (src_reg + off)

Where size is one of: ``BPF_B``, ``BPF_H``, ``BPF_W``, or ``BPF_DW``.

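The mode/size/class split for load and store opcodes can be sketched in the same
way. The helper names below (``ldst_mode``, ``ldst_size``, ``ldst_class``,
``size_in_bytes``) are hypothetical; the constants are taken from the size and
mode tables above, and the masks follow directly from the 3/2/3 bit layout.

```c
#include <stdint.h>

/* Values copied from the tables in this document. */
#define BPF_LDX  0x01
#define BPF_STX  0x03
#define BPF_W    0x00
#define BPF_H    0x08
#define BPF_B    0x10
#define BPF_DW   0x18
#define BPF_MEM  0x60

/* Hypothetical helpers: split a load/store opcode into its three parts. */
static uint8_t ldst_mode(uint8_t opcode)  { return opcode & 0xe0; }
static uint8_t ldst_size(uint8_t opcode)  { return opcode & 0x18; }
static uint8_t ldst_class(uint8_t opcode) { return opcode & 0x07; }

/* Map a size modifier to the access width in bytes, or -1 if invalid. */
static int size_in_bytes(uint8_t size)
{
    switch (size) {
    case BPF_W:  return 4;
    case BPF_H:  return 2;
    case BPF_B:  return 1;
    case BPF_DW: return 8;
    default:     return -1;
    }
}
```

For instance, ``BPF_MEM | BPF_DW | BPF_STX`` is 0x7b, which splits back into
mode 0x60, size 0x18 (8 bytes) and class 0x03.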

Atomic operations
-----------------

Atomic operations are operations that operate on memory and cannot be
interrupted or corrupted by other access to the same memory region
by other eBPF programs or means outside of this specification.

All atomic operations supported by eBPF are encoded as store operations
that use the ``BPF_ATOMIC`` mode modifier as follows:

 * ``BPF_ATOMIC | BPF_W | BPF_STX`` for 32-bit operations
 * ``BPF_ATOMIC | BPF_DW | BPF_STX`` for 64-bit operations
 * 8-bit and 16-bit wide atomic operations are not supported.

The imm field is used to encode the actual atomic operation.
Simple atomic operations use a subset of the values defined to encode
arithmetic operations in the imm field to encode the atomic operation:

 ========  =====  ===========
 imm       value  description
 ========  =====  ===========
 BPF_ADD   0x00   atomic add
 BPF_OR    0x40   atomic or
 BPF_AND   0x50   atomic and
 BPF_XOR   0xa0   atomic xor
 ========  =====  ===========


``BPF_ATOMIC | BPF_W | BPF_STX`` with imm = BPF_ADD means::

  *(u32 *)(dst_reg + off16) += src_reg

``BPF_ATOMIC | BPF_DW | BPF_STX`` with imm = BPF_ADD means::

  *(u64 *)(dst_reg + off16) += src_reg

``BPF_XADD`` is a deprecated name for ``BPF_ATOMIC | BPF_ADD``.

In addition to the simple atomic operations, there is also a modifier and
two complex atomic operations:

 ===========  ================  ===========================
 imm          value             description
 ===========  ================  ===========================
 BPF_FETCH    0x01              modifier: return old value
 BPF_XCHG     0xe0 | BPF_FETCH  atomic exchange
 BPF_CMPXCHG  0xf0 | BPF_FETCH  atomic compare and exchange
 ===========  ================  ===========================

The ``BPF_FETCH`` modifier is optional for simple atomic operations, and
always set for the complex atomic operations.
If the ``BPF_FETCH`` flag
is set, then the operation also overwrites ``src_reg`` with the value that
was in memory before it was modified.

The ``BPF_XCHG`` operation atomically exchanges ``src_reg`` with the value
addressed by ``dst_reg + off``.

The ``BPF_CMPXCHG`` operation atomically compares the value addressed by
``dst_reg + off`` with ``R0``. If they match, the value addressed by
``dst_reg + off`` is replaced with ``src_reg``. In either case, the
value that was at ``dst_reg + off`` before the operation is zero-extended
and loaded back to ``R0``.

Clang can generate atomic instructions by default when ``-mcpu=v3`` is
enabled. If a lower version for ``-mcpu`` is set, the only atomic instruction
Clang can generate is ``BPF_ADD`` *without* ``BPF_FETCH``. If you need to enable
the atomics features, while keeping a lower ``-mcpu`` version, you can use
``-Xclang -target-feature -Xclang +alu32``.

64-bit immediate instructions
-----------------------------

Instructions with the ``BPF_IMM`` mode modifier use the wide instruction
encoding for an extra imm64 value.

There is currently only one such instruction.

``BPF_LD | BPF_DW | BPF_IMM`` means::

  dst_reg = imm64


Legacy BPF Packet access instructions
-------------------------------------

eBPF has special instructions for access to packet data that have been
carried over from classic BPF to retain the performance of legacy socket
filters running in the eBPF interpreter.

The instructions come in two forms: ``BPF_ABS | <size> | BPF_LD`` and
``BPF_IND | <size> | BPF_LD``.

These instructions are used to access packet data and can only be used when
the program context is a pointer to a networking packet.
``BPF_ABS``
accesses packet data at an absolute offset specified by the immediate data
and ``BPF_IND`` accesses packet data at an offset that includes the value of
a register in addition to the immediate data.

These instructions have seven implicit operands:

 * Register R6 is an implicit input that must contain a pointer to a
   struct sk_buff.
 * Register R0 is an implicit output which contains the data fetched from
   the packet.
 * Registers R1-R5 are scratch registers that are clobbered after a call to
   ``BPF_ABS | BPF_LD`` or ``BPF_IND | BPF_LD`` instructions.

These instructions have an implicit program exit condition as well. When an
eBPF program is trying to access the data beyond the packet boundary, the
program execution will be aborted.

``BPF_ABS | BPF_W | BPF_LD`` means::

  R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + imm32))

``BPF_IND | BPF_W | BPF_LD`` means::

  R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + src_reg + imm32))
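
The combination of a network-order load and the implicit exit condition can be
modelled in plain C. The sketch below is not kernel code: ``ld_abs_w`` is a
hypothetical function, the ``data``/``len`` pair stands in for the packet
reachable through R6, the immediate is treated as an unsigned offset (an
assumption), and a ``false`` return marks the point at which a real program
would be aborted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of BPF_ABS | BPF_W | BPF_LD.
 * Returns false if the 4-byte access would cross the packet boundary,
 * otherwise stores the loaded word (converted to host order) in *r0.
 */
static bool ld_abs_w(const uint8_t *data, size_t len, uint32_t imm,
                     uint32_t *r0)
{
    /* Bounds check written to avoid overflow in imm + 4. */
    if (len < 4 || imm > len - 4)
        return false;

    /* Assemble the bytes in network (big endian) order, as ntohl() would. */
    *r0 = ((uint32_t)data[imm] << 24) |
          ((uint32_t)data[imm + 1] << 16) |
          ((uint32_t)data[imm + 2] << 8) |
          (uint32_t)data[imm + 3];
    return true;
}
```

A ``BPF_IND`` variant of this sketch would add the value of ``src_reg`` to the
offset before the same bounds check.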