Assembler Annotations
=====================

Copyright (c) 2017-2019 Jiri Slaby

This document describes the new macros for annotation of data and code in
assembly. In particular, it contains information about ``SYM_FUNC_START``,
``SYM_FUNC_END``, ``SYM_CODE_START``, and similar.

Rationale
---------
Some code like entries, trampolines, or boot code needs to be written in
assembly. As in C, such code is grouped into functions and accompanied by
data. Standard assemblers do not force users to mark these pieces precisely
as code or data, or even to specify their length. Nevertheless, assemblers
provide developers with such annotations to aid debuggers throughout
assembly. On top of that, developers also want to mark some functions as
*global* in order to be visible outside of their translation units.

Over time, the Linux kernel has adopted macros from various projects (like
``binutils``) to facilitate such annotations. So for historic reasons,
developers have been using ``ENTRY``, ``END``, ``ENDPROC``, and other
annotations in assembly. Because these macros were never documented, they
are used in the wrong contexts at some locations. Clearly, ``ENTRY`` was
intended to denote the beginning of global symbols (be it data or code).
``END`` used to mark the end of data or end of special functions with
*non-standard* calling convention. In contrast, ``ENDPROC`` should annotate
only ends of *standard* functions.

When these macros are used correctly, they help assemblers generate a nice
object with both sizes and types set correctly.
For example, the result of
``arch/x86/lib/putuser.S``::

   Num:    Value          Size Type    Bind   Vis      Ndx Name
    25: 0000000000000000    33 FUNC    GLOBAL DEFAULT    1 __put_user_1
    29: 0000000000000030    37 FUNC    GLOBAL DEFAULT    1 __put_user_2
    32: 0000000000000060    36 FUNC    GLOBAL DEFAULT    1 __put_user_4
    35: 0000000000000090    37 FUNC    GLOBAL DEFAULT    1 __put_user_8

This is not only important for debugging purposes. When there are properly
annotated objects like this, tools can be run on them to generate more useful
information. In particular, on properly annotated objects, ``objtool`` can be
run to check and fix the object if needed. Currently, ``objtool`` can report
missing frame pointer setup/destruction in functions. It can also
automatically generate annotations for :doc:`ORC unwinder <x86/orc-unwinder>`
for most code. Both of these are especially important to support reliable
stack traces which are in turn necessary for :doc:`Kernel live patching
<livepatch/livepatch>`.

Caveat and Discussion
---------------------
As one might realize, there were only three macros previously. That is indeed
insufficient to cover all the combinations of cases:

* standard/non-standard function
* code/data
* global/local symbol

There was a discussion_ and instead of extending the current ``ENTRY/END*``
macros, it was decided that brand new macros should be introduced::

    So how about using macro names that actually show the purpose, instead
    of importing all the crappy, historic, essentially randomly chosen
    debug symbol macro names from the binutils and older kernels?

.. _discussion: https://lore.kernel.org/r/20170217104757.28588-1-jslaby@suse.cz

Macros Description
------------------

The new macros are prefixed with the ``SYM_`` prefix and can be divided into
three main groups:

1. ``SYM_FUNC_*`` -- to annotate C-like functions. This means functions with
   standard C calling conventions.
   For example, on x86, this means that the
   stack contains a return address at the predefined place and a return from
   the function can happen in a standard way. When frame pointers are enabled,
   save/restore of the frame pointer shall happen at the start/end of a
   function, respectively, too.

   Checking tools like ``objtool`` should ensure such marked functions conform
   to these rules. The tools can also easily annotate these functions with
   debugging information (like *ORC data*) automatically.

2. ``SYM_CODE_*`` -- special functions called with a special stack, be it
   interrupt handlers with special stack content, trampolines, or startup
   functions.

   Checking tools mostly skip these functions, but some debug information
   can still be generated automatically. For correct debug data, this code
   needs hints like ``UNWIND_HINT_REGS`` provided by developers.

3. ``SYM_DATA*`` -- obviously data belonging to ``.data`` sections and not to
   ``.text``. Data do not contain instructions, so they have to be treated
   specially by the tools: they should not treat the bytes as instructions,
   nor assign any debug information to them.

Instruction Macros
~~~~~~~~~~~~~~~~~~
This section covers ``SYM_FUNC_*`` and ``SYM_CODE_*`` enumerated above.

``objtool`` requires that all code must be contained in an ELF symbol. Symbol
names that have a ``.L`` prefix do not emit symbol table entries. ``.L``
prefixed symbols can be used within a code region, but should be avoided for
denoting a range of code via ``SYM_*_START/END`` annotations.

* ``SYM_FUNC_START`` and ``SYM_FUNC_START_LOCAL`` are supposed to be **the
  most frequent markings**. They are used for functions with standard calling
  conventions -- global and local. Like in C, they both align the functions to
  architecture specific ``__ALIGN`` bytes.
  There are also ``_NOALIGN`` variants
  for special cases where developers do not want this implicit alignment.

  ``SYM_FUNC_START_WEAK`` and ``SYM_FUNC_START_WEAK_NOALIGN`` markings are
  also offered as an assembler counterpart to the *weak* attribute known from
  C.

  All of these **shall** be coupled with ``SYM_FUNC_END``. First, it marks
  the sequence of instructions as a function and records its size in the
  generated object file. Second, it also eases checking and processing such
  object files as the tools can trivially find exact function boundaries.

  So in most cases, developers should write something like in the following
  example, having some asm instructions in between the macros, of course::

    SYM_FUNC_START(memset)
        ... asm insns ...
    SYM_FUNC_END(memset)

  In fact, this kind of annotation corresponds to the now deprecated ``ENTRY``
  and ``ENDPROC`` macros.

* ``SYM_FUNC_ALIAS``, ``SYM_FUNC_ALIAS_LOCAL``, and ``SYM_FUNC_ALIAS_WEAK`` can
  be used to define multiple names for a function. The typical use is::

    SYM_FUNC_START(__memset)
        ... asm insns ...
    SYM_FUNC_END(__memset)
    SYM_FUNC_ALIAS(memset, __memset)

  In this example, one can call ``__memset`` or ``memset`` with the same
  result, except the debug information for the instructions is emitted into
  the object file only once -- for the non-``ALIAS`` case.

* ``SYM_CODE_START`` and ``SYM_CODE_START_LOCAL`` should be used only in
  special cases -- if you know what you are doing. They are used exclusively
  for interrupt handlers and similar where the calling convention is not the C
  one. ``_NOALIGN`` variants exist too. The use is the same as for the ``FUNC``
  category above::

    SYM_CODE_START_LOCAL(bad_put_user)
        ... asm insns ...
    SYM_CODE_END(bad_put_user)

  Again, every ``SYM_CODE_START*`` **shall** be coupled with ``SYM_CODE_END``.

  To some extent, this category corresponds to the deprecated ``ENTRY`` and
  ``END``, except that ``END`` had several other meanings too.

* ``SYM_INNER_LABEL*`` is used to denote a label inside some
  ``SYM_{CODE,FUNC}_START`` and ``SYM_{CODE,FUNC}_END``. They are very similar
  to C labels, except they can be made global. An example of use::

    SYM_CODE_START(ftrace_caller)
        /* save_mcount_regs fills in first two parameters */
        ...

    SYM_INNER_LABEL(ftrace_caller_op_ptr, SYM_L_GLOBAL)
        /* Load the ftrace_ops into the 3rd parameter */
        ...

    SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
        call ftrace_stub
        ...
        retq
    SYM_CODE_END(ftrace_caller)

Data Macros
~~~~~~~~~~~
Similar to instructions, there are a couple of macros to describe data in the
assembly.

* ``SYM_DATA_START`` and ``SYM_DATA_START_LOCAL`` mark the start of some data
  and shall be used in conjunction with either ``SYM_DATA_END``, or
  ``SYM_DATA_END_LABEL``. The latter also adds a label to the end, so that
  people can use ``lstack`` and (local) ``lstack_end`` in the following
  example::

    SYM_DATA_START_LOCAL(lstack)
        .skip 4096
    SYM_DATA_END_LABEL(lstack, SYM_L_LOCAL, lstack_end)

* ``SYM_DATA`` and ``SYM_DATA_LOCAL`` are variants for simple, mostly one-line
  data::

    SYM_DATA(HEAP,     .long rm_heap)
    SYM_DATA(heap_end, .long rm_stack)

  In the end, they expand to ``SYM_DATA_START`` with ``SYM_DATA_END``
  internally.

Support Macros
~~~~~~~~~~~~~~
All of the above eventually reduce to some invocation of ``SYM_START``,
``SYM_END``, or ``SYM_ENTRY``. Normally, developers should avoid using
these.

Further, in the above examples, one could see ``SYM_L_LOCAL``. There are also
``SYM_L_GLOBAL`` and ``SYM_L_WEAK``. All are intended to denote linkage of a
symbol marked by them.
They are used either in ``_LABEL`` variants of the
earlier macros, or in ``SYM_START``.


Overriding Macros
~~~~~~~~~~~~~~~~~
Architectures can also override any of the macros in their own
``asm/linkage.h``, including macros specifying the type of a symbol
(``SYM_T_FUNC``, ``SYM_T_OBJECT``, and ``SYM_T_NONE``). As every macro
described in this file is surrounded by ``#ifndef`` + ``#endif``, it is enough
to define the macros differently in the aforementioned architecture-dependent
header.
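
As a rough illustration of the override mechanism, the fragment below
sketches how a guarded default and an architecture override might fit
together. The ``%function`` value and the placement of the two fragments are
assumptions made for illustration only; the real definitions live in the
generic linkage header and in each architecture's ``asm/linkage.h``::

    /* Hypothetical override in an architecture's asm/linkage.h */
    #define SYM_T_FUNC	%function

    /* Sketch of a guarded default in the generic header, seen afterwards */
    #ifndef SYM_T_FUNC
    #define SYM_T_FUNC	STT_FUNC	/* skipped: already defined above */
    #endif

Because the guarded default is skipped once the macro exists, an override of
this kind should need no changes outside the architecture header.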