========================
Decodetree Specification
========================

A *decodetree* is built from instruction *patterns*.  A pattern may
represent a single architectural instruction or a group of such
instructions, depending on what is convenient for further processing.

Each pattern has both *fixedbits* and *fixedmask*, the combination of which
describes the condition under which the pattern is matched::

  (insn & fixedmask) == fixedbits
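
The match condition can be illustrated with a minimal Python sketch.  It is
not part of the decodetree generator: the helper names ``fixed_from_spec``
and ``matches`` are hypothetical, and the bit string is borrowed from the
``addl_r`` pattern in the Patterns section.

```python
def fixed_from_spec(spec):
    """Return (fixedbits, fixedmask) for a string of '0', '1', '.', '-'.

    Hypothetical helper: the leftmost character is the most significant bit.
    """
    fixedbits = 0
    fixedmask = 0
    for ch in spec.replace(" ", ""):
        fixedbits <<= 1
        fixedmask <<= 1
        if ch in "01":
            # A literal 0 or 1 is a fixed bit and takes part in the match.
            fixedmask |= 1
            fixedbits |= int(ch)
        elif ch not in ".-":
            raise ValueError("unexpected character %r" % ch)
    return fixedbits, fixedmask


def matches(insn, fixedbits, fixedmask):
    """The match condition from the text above."""
    return (insn & fixedmask) == fixedbits


bits, mask = fixed_from_spec("010000 ..... ..... .... 0000000 .....")
```

Only the bits written as ``0`` or ``1`` end up in ``mask``; every other bit
of the instruction is free to vary without affecting the match.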

Each pattern may have *fields*, which are extracted from the insn and
passed along to the translator.  Examples of such are registers,
immediates, and sub-opcodes.

In support of patterns, one may declare *fields*, *argument sets*, and
*formats*, each of which may be re-used to simplify further definitions.

Fields
======

Syntax::

  field_def     := '%' identifier ( unnamed_field )* ( !function=identifier )?
  unnamed_field := number ':' ( 's' )? number

For *unnamed_field*, the first number is the least-significant bit position
of the field and the second number is the length of the field.  If the 's' is
present, the field is considered signed.  If multiple ``unnamed_fields`` are
present, they are concatenated.  In this way one can define disjoint fields.

If ``!function`` is specified, the concatenated result is passed through the
named function, taking and returning an integral value.

One may use ``!function`` with zero ``unnamed_fields``.  This case is called
a *parameter*, and the named function is only passed the ``DisasContext``
and returns an integral value extracted from there.

A field with no ``unnamed_fields`` and no ``!function`` is an error.

Field examples:

+---------------------------+---------------------------------------------+
| Input                     | Generated code                              |
+===========================+=============================================+
| %disp   0:s16             | sextract(i, 0, 16)                          |
+---------------------------+---------------------------------------------+
| %imm9   16:6 10:3         | extract(i, 16, 6) << 3 | extract(i, 10, 3)  |
+---------------------------+---------------------------------------------+
| %disp12 0:s1 1:1 2:10     | sextract(i, 0, 1) << 11 |                   |
|                           |    extract(i, 1, 1) << 10 |                 |
|                           |    extract(i, 2, 10)                        |
+---------------------------+---------------------------------------------+
| %shimm8 5:s8 13:1         | expand_shimm8(sextract(i, 5, 8) << 1 |      |
|   !function=expand_shimm8 |               extract(i, 13, 1))            |
+---------------------------+---------------------------------------------+
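
The concatenation rule in the table can be tried out with a short Python
sketch.  The functions ``extract`` and ``sextract`` below are stand-ins
written for this example, modelled on the names in the "Generated code"
column; ``concat_field`` is a hypothetical helper and not something the
generator emits.

```python
def extract(insn, pos, length):
    """Unsigned bit field of `length` bits starting at bit `pos`."""
    return (insn >> pos) & ((1 << length) - 1)


def sextract(insn, pos, length):
    """Same bit field, sign-extended from its top bit."""
    value = extract(insn, pos, length)
    sign = 1 << (length - 1)
    return (value ^ sign) - sign


def concat_field(insn, parts):
    """Concatenate (pos, length, signed) parts, most significant part first."""
    value = 0
    for pos, length, signed in parts:
        piece = sextract(insn, pos, length) if signed else extract(insn, pos, length)
        value = (value << length) | piece
    return value


def imm9(insn):
    # %imm9 16:6 10:3
    return extract(insn, 16, 6) << 3 | extract(insn, 10, 3)


def disp12(insn):
    # %disp12 0:s1 1:1 2:10
    return (sextract(insn, 0, 1) << 11
            | extract(insn, 1, 1) << 10
            | extract(insn, 2, 10))
```

In this sketch only a signed first part affects the sign of the result,
because the later parts are shifted in below it.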

Argument Sets
=============

Syntax::

  args_def    := '&' identifier ( args_elt )+ ( !extern )?
  args_elt    := identifier (':' identifier)?

Each *args_elt* defines an argument within the argument set.
If the form of the *args_elt* contains a colon, the first
identifier is the argument name and the second identifier is
the argument type.  If the colon is missing, the argument
type will be ``int``.

Each argument set will be rendered as a C structure "arg_$name"
with each of the fields being one of the member arguments.

If ``!extern`` is specified, the backing structure is assumed
to have been already declared, typically via a second decoder.

Argument sets are useful when one wants to define helper functions
for the translator functions that can perform operations on a common
set of arguments.  This can ensure, for instance, that the ``AND``
pattern and the ``OR`` pattern put their operands into the same named
structure, so that a common ``gen_logic_insn`` may be able to handle
the operations common between the two.

Argument set examples::

  &reg3       ra rb rc
  &loadstore  reg base offset
  &longldst   reg base offset:int64_t
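
The benefit of a shared argument set can be sketched in Python.  This is an
analogy only: the real translator functions are C and operate on a
``DisasContext``, whereas here the context is assumed to be a plain list of
register values, and ``ArgReg3``, ``trans_and`` and ``trans_or`` are
hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ArgReg3:
    # Python counterpart of "&reg3 ra rb rc"; every member is a plain int.
    ra: int
    rb: int
    rc: int


def gen_logic_insn(regs, a, op):
    """Shared helper: both patterns deliver their operands in the same shape."""
    regs[a.rc] = op(regs[a.ra], regs[a.rb])
    return True


def trans_and(regs, a):
    return gen_logic_insn(regs, a, lambda x, y: x & y)


def trans_or(regs, a):
    return gen_logic_insn(regs, a, lambda x, y: x | y)
```

Because both translators receive the same structure, the helper needs no
knowledge of which pattern matched.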

Formats
=======

Syntax::

  fmt_def      := '@' identifier ( fmt_elt )+
  fmt_elt      := fixedbit_elt | field_elt | field_ref | args_ref
  fixedbit_elt := [01.-]+
  field_elt    := identifier ':' 's'? number
  field_ref    := '%' identifier | identifier '=' '%' identifier
  args_ref     := '&' identifier

Defining a format is a handy way to avoid replicating groups of fields
across many instruction patterns.

A *fixedbit_elt* describes a contiguous sequence of bits that must
be 1, 0, or don't care.  The difference between '.' and '-'
is that '.' means that the bit will be covered with a field or a
final 0 or 1 from the pattern, and '-' means that the bit is really
ignored by the cpu and will not be specified.

A *field_elt* describes a simple field only given a width; the position of
the field is implied by its position with respect to other *fixedbit_elt*
and *field_elt*.

If any *fixedbit_elt* or *field_elt* appear, then all bits must be defined.
Padding with a *fixedbit_elt* of all '.' is an easy way to accomplish that.

A *field_ref* incorporates a field by reference.  This is the only way to
add a complex field to a format.  A field may be renamed in the process
via assignment to another identifier.  This is intended to allow the
same argument set to be used with disjoint named fields.

A single *args_ref* may specify an argument set to use for the format.
The set of fields in the format must be a subset of the arguments in
the argument set.  If an argument set is not specified, one will be
inferred from the set of fields.

It is recommended, but not required, that all *field_ref* and *args_ref*
appear at the end of the line, not interleaving with *fixedbit_elt* or
*field_elt*.

Format examples::

  @opr    ...... ra:5 rb:5 ... 0 ....... rc:5
  @opi    ...... ra:5 lit:8    1 ....... rc:5
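
How a *field_elt* obtains its position from its neighbours can be sketched
in Python.  This is a simplified illustration, not the generator's parser:
``parse_format`` is a hypothetical function, it handles only *fixedbit_elt*
and *field_elt*, and it assumes a 32-bit instruction width.

```python
def parse_format(elts, insnwidth=32):
    """Map each field_elt name to (pos, length, signed).

    Elements are listed from the most significant bit downwards, so the
    position of each one follows from the widths of those after it.
    """
    parsed = []
    for elt in elts:
        if ":" in elt:
            # field_elt: name ':' optional 's' then the width
            name, width = elt.split(":")
            signed = width.startswith("s")
            parsed.append((name, int(width[1:] if signed else width), signed))
        else:
            # fixedbit_elt: one character per bit
            parsed.append((None, len(elt), False))

    total = sum(length for _, length, _ in parsed)
    if total != insnwidth:
        raise ValueError("format defines %d bits, expected %d" % (total, insnwidth))

    fields = {}
    pos = insnwidth
    for name, length, signed in parsed:
        pos -= length
        if name is not None:
            fields[name] = (pos, length, signed)
    return fields


opr = parse_format("...... ra:5 rb:5 ... 0 ....... rc:5".split())
opi = parse_format("...... ra:5 lit:8 1 ....... rc:5".split())
```

The width check mirrors the rule that all bits must be defined once any
*fixedbit_elt* or *field_elt* appears.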

Patterns
========

Syntax::

  pat_def      := identifier ( pat_elt )+
  pat_elt      := fixedbit_elt | field_elt | field_ref | args_ref | fmt_ref | const_elt
  fmt_ref      := '@' identifier
  const_elt    := identifier '=' number

The *fixedbit_elt* and *field_elt* specifiers are unchanged from formats.
A pattern that does not specify a named format will have one inferred
from a referenced argument set (if present) and the set of fields.

A *const_elt* allows an argument to be set to a constant value.  This may
come in handy when fields overlap between patterns and one has to
include the values in the *fixedbit_elt* instead.

The decoder will call a translator function for each pattern matched.

Pattern examples::

  addl_r   010000 ..... ..... .... 0000000 ..... @opr
  addl_i   010000 ..... ..... .... 0000000 ..... @opi

which will, in part, invoke::

  trans_addl_r(ctx, &arg_opr, insn)

and::

  trans_addl_i(ctx, &arg_opi, insn)
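
The dispatch for these two patterns can be approximated in Python.  This is
a sketch under stated assumptions rather than the generator's output: the
mask ``0xfc001fe0`` is derived here by hand from the fixed bits of the
patterns combined with bit 12 of the ``@opr`` and ``@opi`` formats, the
argument sets are modelled as dictionaries, and the translator bodies are
placeholders that merely record what they were given.

```python
def extract(insn, pos, length):
    """Unsigned bit field of `length` bits starting at bit `pos`."""
    return (insn >> pos) & ((1 << length) - 1)


def trans_addl_r(ctx, a, insn):
    # Placeholder translator: record the decoded operands.
    ctx.append(("addl_r", a["ra"], a["rb"], a["rc"]))
    return True


def trans_addl_i(ctx, a, insn):
    # Placeholder translator: record the decoded operands.
    ctx.append(("addl_i", a["ra"], a["lit"], a["rc"]))
    return True


def decode(ctx, insn):
    """Return True if a pattern matched and its translator accepted it."""
    masked = insn & 0xfc001fe0
    if masked == 0x40000000:
        # addl_r with @opr: bit 12 is 0
        arg_opr = {"ra": extract(insn, 21, 5),
                   "rb": extract(insn, 16, 5),
                   "rc": extract(insn, 0, 5)}
        return trans_addl_r(ctx, arg_opr, insn)
    if masked == 0x40001000:
        # addl_i with @opi: bit 12 is 1
        arg_opi = {"ra": extract(insn, 21, 5),
                   "lit": extract(insn, 13, 8),
                   "rc": extract(insn, 0, 5)}
        return trans_addl_i(ctx, arg_opi, insn)
    return False
```

The two patterns share every fixed bit except bit 12, which is why a single
masked value suffices to tell them apart in this sketch.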

Pattern Groups
==============

Syntax::

  group            := overlap_group | no_overlap_group
  overlap_group    := '{' ( pat_def | group )+ '}'
  no_overlap_group := '[' ( pat_def | group )+ ']'

A *group* begins with a lone open-brace or open-bracket, with all
subsequent lines indented two spaces, and ending with a lone
close-brace or close-bracket.  Groups may be nested, increasing the
required indentation of the lines within the nested group to two
spaces per nesting level.

Patterns within overlap groups are allowed to overlap.  Conflicts are
resolved by selecting the patterns in order.  If all of the fixedbits
for a pattern match, its translate function will be called.  If the
translate function returns false, then subsequent patterns within the
group will be matched.

Patterns within no-overlap groups are not allowed to overlap, just
the same as ungrouped patterns.  Thus no-overlap groups are intended
to be nested inside overlap groups.

The following example from PA-RISC shows specialization of the *or*
instruction::

  {
    {
      nop   000010 ----- ----- 0000 001001 0 00000
      copy  000010 00000 r1:5  0000 001001 0 rt:5
    }
    or      000010 rt2:5 r1:5  cf:4 001001 0 rt:5
  }

When the *cf* field is zero, the instruction has no side effects,
and may be specialized.  When the *rt* field is zero, the output
is discarded and so the instruction has no effect.  When the *rt2*
field is zero, the operation is ``reg[r1] | 0`` and so encodes
the canonical register copy operation.

The output from the generator might look like::

  switch (insn & 0xfc000fe0) {
  case 0x08000240:
    /* 000010.. ........ ....0010 010..... */
    if ((insn & 0x0000f000) == 0x00000000) {
        /* 000010.. ........ 00000010 010..... */
        if ((insn & 0x0000001f) == 0x00000000) {
            /* 000010.. ........ 00000010 01000000 */
            extract_decode_Fmt_0(&u.f_decode0, insn);
            if (trans_nop(ctx, &u.f_decode0)) return true;
        }
        if ((insn & 0x03e00000) == 0x00000000) {
            /* 00001000 000..... 00000010 010..... */
            extract_decode_Fmt_1(&u.f_decode1, insn);
            if (trans_copy(ctx, &u.f_decode1)) return true;
        }
    }
    extract_decode_Fmt_2(&u.f_decode2, insn);
    if (trans_or(ctx, &u.f_decode2)) return true;
    return false;
  }
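
The ordering rule for overlap groups can be modelled in a few lines of
Python.  This is a behavioural sketch only: ``decode_overlap`` and the
``PA_RISC_OR`` table are hypothetical, the masks are worked out by hand
from the three patterns of the PA-RISC example, and the nested group is
flattened, which preserves the order in which its patterns are tried.

```python
# (name, fixedmask, fixedbits), listed in the order the patterns appear.
PA_RISC_OR = [
    ("nop",  0xfc00ffff, 0x08000240),  # cf == 0 and rt == 0
    ("copy", 0xffe0ffe0, 0x08000240),  # rt2 == 0 and cf == 0
    ("or",   0xfc000fe0, 0x08000240),  # general case
]


def decode_overlap(insn, group, trans):
    """Try each pattern in order; fall through when a translator declines.

    `trans` maps a pattern name to a callable taking the instruction and
    returning True (handled) or False (try the next pattern).
    """
    for name, mask, bits in group:
        if (insn & mask) == bits and trans[name](insn):
            return name
    return None


accept_all = {name: (lambda insn: True) for name, _, _ in PA_RISC_OR}
```

Swapping the ``nop`` entry of ``accept_all`` for a function that returns
``False`` makes an instruction that would otherwise decode as ``nop`` fall
through to ``copy`` or ``or``, as described for overlap groups above.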