Assembler Annotations
=====================

Copyright (c) 2017-2019 Jiri Slaby

This document describes the new macros for annotation of data and code in
assembly. In particular, it contains information about ``SYM_FUNC_START``,
``SYM_FUNC_END``, ``SYM_CODE_START``, and similar.

Rationale
---------
Some code like entries, trampolines, or boot code needs to be written in
assembly. As in C, such code is grouped into functions and accompanied by
data. Standard assemblers do not force users to mark these pieces precisely
as code or data, or even to specify their length. Nevertheless, assemblers
provide such annotations to help debuggers work with assembly code. On top
of that, developers also want to mark some functions as *global* so that they
are visible outside of their translation units.

Over time, the Linux kernel has adopted macros from various projects (like
``binutils``) to facilitate such annotations. So for historic reasons,
developers have been using ``ENTRY``, ``END``, ``ENDPROC``, and other
annotations in assembly. Because these macros were never documented, they
are used in the wrong context in some places. Clearly, ``ENTRY`` was
intended to denote the beginning of global symbols (be it data or code).
``END`` used to mark the end of data or the end of special functions with a
*non-standard* calling convention. In contrast, ``ENDPROC`` should annotate
only the ends of *standard* functions.

When these macros are used correctly, they help assemblers generate a nice
object with both sizes and types set correctly. For example, the symbol table
of the object built from ``arch/x86/lib/putuser.S``::

   Num:    Value          Size Type    Bind   Vis      Ndx Name
    25: 0000000000000000    33 FUNC    GLOBAL DEFAULT    1 __put_user_1
    29: 0000000000000030    37 FUNC    GLOBAL DEFAULT    1 __put_user_2
    32: 0000000000000060    36 FUNC    GLOBAL DEFAULT    1 __put_user_4
    35: 0000000000000090    37 FUNC    GLOBAL DEFAULT    1 __put_user_8
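
The ``Type``, ``Bind``, and ``Size`` columns in such a listing come from plain
assembler directives. As a minimal sketch in raw GNU ``as`` syntax for x86-64
(``my_func`` is a hypothetical name, not taken from ``putuser.S``), a symbol
reported as ``FUNC GLOBAL`` with a non-zero size needs roughly the following::

        .text
        .globl  my_func                 /* Bind: GLOBAL */
        .p2align 4                      /* align the entry point to 16 bytes */
    my_func:
        xorl    %eax, %eax              /* return 0 */
        ret
        .type   my_func, @function      /* Type: FUNC */
        .size   my_func, . - my_func    /* Size: bytes from the label to here */

Without the ``.type`` and ``.size`` directives, the symbol would be emitted
with type ``NOTYPE`` and size 0. The annotation macros exist so that developers
do not have to write these directives by hand.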

This is not only important for debugging purposes. When there are properly
annotated objects like this, tools can be run on them to generate more useful
information. In particular, on properly annotated objects, ``objtool`` can be
run to check and fix the object if needed. Currently, ``objtool`` can report
missing frame pointer setup/destruction in functions. It can also
automatically generate annotations for the :doc:`ORC unwinder <x86/orc-unwinder>`
for most code. Both of these are especially important to support reliable
stack traces, which are in turn necessary for :doc:`Kernel live patching
<livepatch/livepatch>`.

Caveat and Discussion
---------------------
As one might realize, there were only three macros previously. That is indeed
insufficient to cover all the combinations of cases:

* standard/non-standard function
* code/data
* global/local symbol

There was a discussion_ and, instead of extending the current ``ENTRY/END*``
macros, it was decided that brand new macros should be introduced::

    So how about using macro names that actually show the purpose, instead
    of importing all the crappy, historic, essentially randomly chosen
    debug symbol macro names from the binutils and older kernels?

.. _discussion: https://lore.kernel.org/r/20170217104757.28588-1-jslaby@suse.cz

Macros Description
------------------

The new macros are prefixed with the ``SYM_`` prefix and can be divided into
three main groups:

1. ``SYM_FUNC_*`` -- to annotate C-like functions. This means functions with
   standard C calling conventions. For example, on x86, this means that the
   stack contains a return address at the predefined place and a return from
   the function can happen in a standard way. When frame pointers are enabled,
   save/restore of the frame pointer shall happen at the start/end of a
   function, respectively, too.

   Checking tools like ``objtool`` should ensure such marked functions conform
   to these rules. The tools can also easily annotate these functions with
   debugging information (like *ORC data*) automatically.

2. ``SYM_CODE_*`` -- special functions called with a special stack, be it
   interrupt handlers with special stack content, trampolines, or startup
   functions.

   Checking tools mostly skip these functions, but some debug information can
   still be generated automatically. For correct debug data, this code needs
   hints like ``UNWIND_HINT_REGS`` provided by developers.

3. ``SYM_DATA*`` -- data belonging to ``.data`` sections and not to
   ``.text``. Data do not contain instructions, so the tools have to treat
   them specially: they should not treat the bytes as instructions, nor
   assign any debug information to them.
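
To make the three groups concrete, here is a schematic x86-64 sketch with one
symbol from each group. All symbol names (``my_add``, ``my_trampoline``,
``my_target``, ``my_table``) are hypothetical, and the bodies are placeholders
rather than real kernel code::

    #include <linux/linkage.h>

        .text
    /* 1. C-like function: callable from C, can be checked by objtool. */
    SYM_FUNC_START(my_add)
        leal    (%rdi, %rsi), %eax      /* return a + b */
        ret
    SYM_FUNC_END(my_add)

    /* 2. Special code: entered with a non-C stack layout. Real code of this
     *    kind would typically also need unwind hints. */
    SYM_CODE_START(my_trampoline)
        jmp     my_target               /* hypothetical external target */
    SYM_CODE_END(my_trampoline)

        .data
    /* 3. Data: bytes that tools must not decode as instructions. */
    SYM_DATA_START(my_table)
        .long   1, 2, 3
    SYM_DATA_END(my_table)

The choice of group tells the tools how much they may assume about the
annotated bytes, so it should follow what the code actually does rather than
where it happens to live.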

Instruction Macros
~~~~~~~~~~~~~~~~~~
This section covers ``SYM_FUNC_*`` and ``SYM_CODE_*`` enumerated above.

``objtool`` requires that all code must be contained in an ELF symbol. Symbol
names that have a ``.L`` prefix do not emit symbol table entries. ``.L``
prefixed symbols can be used within a code region, but should be avoided for
denoting a range of code via ``SYM_*_START/END`` annotations.
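
As an illustration of the ``.L`` rule, the following sketch keeps the whole
loop inside one annotated function and uses a ``.L`` label only as a branch
target. The function ``my_clear_words`` and its argument layout are
hypothetical::

    /* %rdi: buffer, %rsi: number of 8-byte words (assumed non-zero) */
    SYM_FUNC_START(my_clear_words)
    .Lclear_loop:                       /* no symbol table entry is emitted */
        movq    $0, (%rdi)
        addq    $8, %rdi
        decq    %rsi
        jnz     .Lclear_loop
        ret
    SYM_FUNC_END(my_clear_words)

Every instruction here still belongs to the ELF symbol ``my_clear_words``.
Passing a ``.L`` name to ``SYM_FUNC_START`` instead would leave the code
without a symbol table entry, which is what the paragraph above warns against.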

* ``SYM_FUNC_START`` and ``SYM_FUNC_START_LOCAL`` are supposed to be **the
  most frequent markings**. They are used for functions with standard calling
  conventions -- global and local. Like in C, they both align the functions to
  architecture specific ``__ALIGN`` bytes. There are also ``_NOALIGN`` variants
  for special cases where developers do not want this implicit alignment.

  ``SYM_FUNC_START_WEAK`` and ``SYM_FUNC_START_WEAK_NOALIGN`` markings are
  also offered as an assembler counterpart to the *weak* attribute known from
  C.

  All of these **shall** be coupled with ``SYM_FUNC_END``. First, it marks
  the sequence of instructions as a function and records its size in the
  generated object file. Second, it also eases checking and processing of such
  object files, as the tools can trivially find exact function boundaries.

  So in most cases, developers should write something like the following
  example, with some asm instructions in between the macros, of course::

    SYM_FUNC_START(memset)
        ... asm insns ...
    SYM_FUNC_END(memset)

  In fact, this kind of annotation corresponds to the now deprecated ``ENTRY``
  and ``ENDPROC`` macros.

* ``SYM_FUNC_ALIAS``, ``SYM_FUNC_ALIAS_LOCAL``, and ``SYM_FUNC_ALIAS_WEAK`` can
  be used to define multiple names for a function. The typical use is::

    SYM_FUNC_START(__memset)
        ... asm insns ...
    SYM_FUNC_END(__memset)
    SYM_FUNC_ALIAS(memset, __memset)

  In this example, one can call ``__memset`` or ``memset`` with the same
  result, except the debug information for the instructions is generated in
  the object file only once -- for the non-``ALIAS`` case.

* ``SYM_CODE_START`` and ``SYM_CODE_START_LOCAL`` should be used only in
  special cases -- if you know what you are doing. They are used exclusively
  for interrupt handlers and similar code where the calling convention is not
  the C one. ``_NOALIGN`` variants exist too. The use is the same as for the
  ``FUNC`` category above::

    SYM_CODE_START_LOCAL(bad_put_user)
        ... asm insns ...
    SYM_CODE_END(bad_put_user)

  Again, every ``SYM_CODE_START*`` **shall** be coupled with ``SYM_CODE_END``.

  To some extent, this category corresponds to the deprecated ``ENTRY`` and
  ``END``, except that ``END`` had several other meanings too.

* ``SYM_INNER_LABEL*`` is used to denote a label inside some
  ``SYM_{CODE,FUNC}_START`` and ``SYM_{CODE,FUNC}_END``. Such labels are very
  similar to C labels, except they can be made global. An example of use::

    SYM_CODE_START(ftrace_caller)
        /* save_mcount_regs fills in first two parameters */
        ...

    SYM_INNER_LABEL(ftrace_caller_op_ptr, SYM_L_GLOBAL)
        /* Load the ftrace_ops into the 3rd parameter */
        ...

    SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
        call ftrace_stub
        ...
        retq
    SYM_CODE_END(ftrace_caller)
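
The linkage variants from the list above can be combined in one file. The
following is a sketch only: ``my_helper``, ``my_hook``, and ``my_entry`` are
hypothetical names, and the bodies are trivial placeholders::

    /* Local helper: not visible outside this translation unit. */
    SYM_FUNC_START_LOCAL(my_helper)
        movl    $1, %eax
        ret
    SYM_FUNC_END(my_helper)

    /* Weak default: a strong my_hook defined in another object wins. */
    SYM_FUNC_START_WEAK(my_hook)
        xorl    %eax, %eax
        ret
    SYM_FUNC_END(my_hook)

    /* Global entry point that tail-calls the local helper. */
    SYM_FUNC_START(my_entry)
        jmp     my_helper
    SYM_FUNC_END(my_entry)

Note that all three start macros are closed by the same ``SYM_FUNC_END``; only
the start macro encodes the linkage.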

Data Macros
~~~~~~~~~~~
Similar to instructions, there are a couple of macros to describe data in
assembly.

* ``SYM_DATA_START`` and ``SYM_DATA_START_LOCAL`` mark the start of some data
  and shall be used in conjunction with either ``SYM_DATA_END`` or
  ``SYM_DATA_END_LABEL``. The latter also adds a label at the end, so that
  people can use ``lstack`` and (local) ``lstack_end`` in the following
  example::

    SYM_DATA_START_LOCAL(lstack)
        .skip 4096
    SYM_DATA_END_LABEL(lstack, SYM_L_LOCAL, lstack_end)

* ``SYM_DATA`` and ``SYM_DATA_LOCAL`` are variants for simple, mostly one-line
  data::

    SYM_DATA(HEAP,     .long rm_heap)
    SYM_DATA(heap_end, .long rm_stack)

  In the end, they expand to ``SYM_DATA_START`` with ``SYM_DATA_END``
  internally.
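
A typical reason for wanting the end label is a stack, which grows downwards
and therefore starts at its highest address. The sketch below shows how the
annotated data might be consumed from code. The names ``my_stack``,
``my_stack_end``, ``my_magic``, ``my_early_entry``, and ``my_next_stage`` are
hypothetical::

        .data
    SYM_DATA_START_LOCAL(my_stack)
        .skip 4096
    SYM_DATA_END_LABEL(my_stack, SYM_L_LOCAL, my_stack_end)

    SYM_DATA(my_magic, .long 0x12345678)

        .text
    SYM_CODE_START(my_early_entry)
        leaq    my_stack_end(%rip), %rsp    /* point %rsp at the top of the stack */
        movl    my_magic(%rip), %eax        /* load the one-line datum */
        jmp     my_next_stage               /* hypothetical continuation */
    SYM_CODE_END(my_early_entry)

The entry point is marked with ``SYM_CODE_*`` rather than ``SYM_FUNC_*``
because it installs its own stack and so does not follow the C calling
convention.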

Support Macros
~~~~~~~~~~~~~~
All of the above ultimately reduce to some invocation of ``SYM_START``,
``SYM_END``, or ``SYM_ENTRY``. Normally, developers should avoid using
these directly.

Further, in the above examples, one could see ``SYM_L_LOCAL``. There are also
``SYM_L_GLOBAL`` and ``SYM_L_WEAK``. All are intended to denote the linkage of
a symbol marked by them. They are used either in ``_LABEL`` variants of the
earlier macros, or in ``SYM_START``.
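
As a rough sketch of how the layers fit together, the pair ``SYM_FUNC_START``
and ``SYM_FUNC_END`` can be read as the following support macro invocations.
This is based on the generic definitions in ``include/linux/linkage.h`` and
should be taken as an approximation: the exact expansion varies between kernel
versions and architectures, ``SYM_A_ALIGN`` is an alignment helper from that
header which is not described in this document, and ``my_func`` is a
hypothetical name::

    /* SYM_FUNC_START(my_func) is approximately: */
    SYM_START(my_func, SYM_L_GLOBAL, SYM_A_ALIGN)
        /* i.e. ".globl my_func", the alignment, then the "my_func:" label */

        ret

    /* SYM_FUNC_END(my_func) is approximately: */
    SYM_END(my_func, SYM_T_FUNC)
        /* i.e. ".type my_func, <function type>" and ".size my_func, . - my_func" */

Writing the higher-level macros instead keeps the linkage, alignment, and
symbol type consistent for a given kind of symbol, which is why direct use of
the support macros is discouraged.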


Overriding Macros
~~~~~~~~~~~~~~~~~
Architectures can also override any of the macros in their own
``asm/linkage.h``, including macros specifying the type of a symbol
(``SYM_T_FUNC``, ``SYM_T_OBJECT``, and ``SYM_T_NONE``). As every macro
described in this file is surrounded by ``#ifndef`` + ``#endif``, it is enough
to define the macros differently in the aforementioned architecture-dependent
header.
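
A minimal sketch of such an override, for a hypothetical architecture that
wants a marker instruction at every function entry, could look as follows.
``MY_ARCH_ENTRY_MARKER`` is an invented name, and the use of ``SYM_START`` with
``SYM_A_ALIGN`` assumes the generic definitions from
``include/linux/linkage.h``::

    /* arch/<arch>/include/asm/linkage.h (hypothetical architecture) */

    /* Hypothetical marker instruction placed at every function entry. */
    #define MY_ARCH_ENTRY_MARKER    nop

    /* Defined here first, so the guarded generic default is skipped. */
    #define SYM_FUNC_START(name)                        \
        SYM_START(name, SYM_L_GLOBAL, SYM_A_ALIGN)      \
        MY_ARCH_ENTRY_MARKER

Assembly files keep using ``SYM_FUNC_START`` and ``SYM_FUNC_END`` unchanged and
pick up the architecture-specific behaviour automatically.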