..
  Copyright (c) 2020, Linaro Limited
  Written by Alex Bennée


========================
TCG Instruction Counting
========================

TCG has long supported a feature known as icount which allows for
instruction counting during execution. This should not be confused
with cycle-accurate emulation - QEMU does not attempt to emulate how
long an instruction would take on real hardware. That is a job for
other more detailed (and slower) tools that simulate the rest of a
micro-architecture.

This feature is only available for system emulation and is
incompatible with multi-threaded TCG. It can be used to better align
execution time with wall-clock time so a "slow" device doesn't run too
fast on modern hardware. It can also provide a degree of
deterministic execution and is an essential part of the record/replay
support in QEMU.

Core Concepts
=============

At its heart icount is simply a count of executed instructions which
is stored in the TimersState of QEMU's timer sub-system. The number of
executed instructions can then be used to calculate QEMU_CLOCK_VIRTUAL
which represents the amount of elapsed time in the system since
execution started. Depending on the icount mode this may either be a
fixed number of ns per instruction or adjusted as execution continues
to keep wall-clock time and virtual time in sync.
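
As a rough illustration of the fixed mode, the sketch below converts
an instruction count into virtual nanoseconds. It assumes every
instruction costs a power-of-two number of ns, in the spirit of the
shift parameter of the -icount option. The name toy_icount_to_ns is
hypothetical and not a QEMU function, and the adaptive mode is not
modelled.

.. code:: c

    #include <stdint.h>

    /*
     * Toy model only: virtual time for a fixed cost of 2^shift ns
     * per executed instruction. Assumes icount >= 0 and a small shift.
     */
    int64_t toy_icount_to_ns(int64_t icount, unsigned shift)
    {
        return icount * ((int64_t)1 << shift);
    }

With a shift of 3, for example, 1000 executed instructions correspond
to 8000 ns of virtual time in this model.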

To be able to calculate the number of executed instructions the
translator starts by allocating a budget of instructions to be
executed. The budget of instructions is limited by how long it will be
until the next timer will expire. We store this budget as part of a
vCPU icount_decr field which is shared with the machinery for handling
cpu_exit(). The whole field is checked at the start of every
translated block and will cause a return to the outer loop to deal
with whatever caused the exit.
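
The relationship between the next timer deadline and the budget can be
sketched as follows, again assuming a fixed cost of 2^shift ns per
instruction. The helper name toy_budget_for_deadline is hypothetical,
and rounding up is a choice made for this sketch rather than a
statement about the exact arithmetic QEMU uses.

.. code:: c

    #include <stdint.h>

    /*
     * Toy model only: number of instructions that fit before a timer
     * that expires deadline_ns from now, at 2^shift ns per instruction.
     * Assumes deadline_ns >= 0. Partial instructions are rounded up.
     */
    int64_t toy_budget_for_deadline(int64_t deadline_ns, unsigned shift)
    {
        int64_t ns_per_insn = (int64_t)1 << shift;

        return (deadline_ns + ns_per_insn - 1) / ns_per_insn;
    }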

In the case of icount, before the flag is checked we subtract the
number of instructions the translation block would execute. If this
would cause the instruction budget to go negative we exit the main
loop and regenerate a new translation block with exactly the right
number of instructions to take the budget to 0, meaning whatever timer
was due to expire will expire exactly when we exit the main run loop.
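
The budget check described above can be modelled in plain C. This is a
simplified sketch, not the code QEMU generates: struct toy_vcpu and
toy_enter_tb are hypothetical names, and the regeneration of a shorter
translation block is represented only by the smaller instruction count
that is returned.

.. code:: c

    #include <stdint.h>

    /* Hypothetical stand-in for the per-vCPU instruction budget. */
    struct toy_vcpu {
        int32_t budget;   /* instructions left before the next timer */
    };

    /*
     * Toy model of the check at the start of a translated block (TB).
     * Returns how many instructions may run: the whole TB if the
     * budget allows it, otherwise exactly the remaining budget, which
     * is the size the regenerated TB would have.
     */
    int32_t toy_enter_tb(struct toy_vcpu *cpu, int32_t tb_insns)
    {
        int32_t remainder;

        if (cpu->budget - tb_insns >= 0) {
            cpu->budget -= tb_insns;
            return tb_insns;
        }

        /* Budget would go negative: run only what is left. */
        remainder = cpu->budget;
        cpu->budget = 0;
        return remainder;
    }

Starting from a budget of 10, a first TB of 6 instructions runs in
full and leaves 4. A second TB of 6 instructions is cut down to 4, so
the budget reaches exactly 0.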

Dealing with MMIO
-----------------

While we can adjust the instruction budget for known events like timer
expiry we cannot do the same for MMIO. Every load/store we execute
might potentially trigger an I/O event, at which point we will need an
up-to-date and accurate reading of the icount number.

To deal with this case, when an I/O access is made we:

  - restore un-executed instructions to the icount budget
  - re-compile a single [1]_ instruction block for the current PC
  - exit the cpu loop and execute the re-compiled block

The new block is created with the CF_LAST_IO compile flag which
ensures the final instruction translation starts with a call to
gen_io_start() so we don't enter a perpetual loop constantly
recompiling a single instruction block. For translators using the
common translator_loop this is done automatically.

.. [1] sometimes two instructions if dealing with delay slots
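
The accounting behind the three steps above can be sketched with a toy
model. All names here (struct toy_vcpu_io, toy_charge_tb,
toy_handle_io) are hypothetical and the real bookkeeping in QEMU is
more involved. The sketch assumes a TB is charged in full when it is
entered and that the I/O access is detected at a known instruction
index within that TB.

.. code:: c

    #include <stdint.h>

    /* Hypothetical per-vCPU state for this illustration only. */
    struct toy_vcpu_io {
        int32_t budget;     /* instructions allowed before the timer */
        int64_t executed;   /* instructions accounted for so far */
    };

    /* Entering a TB charges all of its instructions up front. */
    void toy_charge_tb(struct toy_vcpu_io *cpu, int32_t tb_insns)
    {
        cpu->budget -= tb_insns;
        cpu->executed += tb_insns;
    }

    /*
     * An I/O access is hit at instruction io_index (0-based) of a TB
     * holding tb_insns instructions. Instructions from io_index
     * onwards have not completed, so they are refunded, and then a
     * single-instruction block is charged for the access itself.
     * Returns the instruction count visible to the I/O access.
     */
    int64_t toy_handle_io(struct toy_vcpu_io *cpu, int32_t tb_insns,
                          int32_t io_index)
    {
        int32_t unexecuted = tb_insns - io_index;

        cpu->budget += unexecuted;      /* restore to the budget */
        cpu->executed -= unexecuted;
        toy_charge_tb(cpu, 1);          /* the re-compiled block */
        return cpu->executed;
    }

In this model the count reported at the time of the access includes
the I/O instruction itself, which is a simplification chosen for the
sketch.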

Other I/O operations
--------------------

MMIO isn't the only type of operation for which we might need a
correct and accurate clock. IO port instructions and accesses to
system registers are the common examples here. These instructions have
to be handled by the individual translators which have the knowledge
of which operations are I/O operations.

When the translator is handling an instruction of this kind:

* it must call gen_io_start() if icount is enabled, at some
  point before the generation of the code which actually does
  the I/O, using a code fragment similar to:

.. code:: c

    if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) {
        gen_io_start();
    }

* it must end the TB immediately after this instruction