cachepc-linux

Fork of AMDESE/linux with modifications for CachePC side-channel attack
git clone https://git.sinitax.com/sinitax/cachepc-linux

vcpu.rst (10317B)


.. SPDX-License-Identifier: GPL-2.0

======================
Generic vcpu interface
======================

The virtual cpu "device" also accepts the ioctls KVM_SET_DEVICE_ATTR,
KVM_GET_DEVICE_ATTR, and KVM_HAS_DEVICE_ATTR. The interface uses the same struct
kvm_device_attr as other devices, but targets VCPU-wide settings and controls.

The groups and attributes per virtual cpu, if any, are architecture specific.
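
As a minimal sketch of how userspace might drive this interface, assuming a
VCPU file descriptor obtained earlier (for example through KVM_CREATE_VCPU),
the helpers below fill a struct kvm_device_attr and probe the attribute with
KVM_HAS_DEVICE_ATTR before setting it. The names vcpu_attr() and
vcpu_set_attr() are hypothetical and not part of the KVM API::

    #include <errno.h>
    #include <stdint.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    /* Build the attribute descriptor; payload is what .addr points to. */
    struct kvm_device_attr vcpu_attr(uint32_t group, uint64_t attr,
                                     void *payload)
    {
        struct kvm_device_attr a;

        memset(&a, 0, sizeof(a));
        a.group = group;
        a.attr = attr;
        a.addr = (uint64_t)(uintptr_t)payload;
        return a;
    }

    /* Returns 0 on success or a negative errno value on failure. */
    int vcpu_set_attr(int vcpu_fd, uint32_t group, uint64_t attr,
                      void *payload)
    {
        struct kvm_device_attr a = vcpu_attr(group, attr, payload);

        /* Probe first so an unsupported attribute is reported early. */
        if (ioctl(vcpu_fd, KVM_HAS_DEVICE_ATTR, &a) < 0)
            return -errno;
        if (ioctl(vcpu_fd, KVM_SET_DEVICE_ATTR, &a) < 0)
            return -errno;
        return 0;
    }

A caller would pass one of the architecture-specific group and attribute
pairs described in the following sections, together with a pointer to the
value the attribute expects.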

1. GROUP: KVM_ARM_VCPU_PMU_V3_CTRL
==================================

:Architectures: ARM64

1.1. ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_IRQ
---------------------------------------

:Parameters: in kvm_device_attr.addr the address for PMU overflow interrupt is a
	     pointer to an int

Returns:

	 =======  ========================================================
	 -EBUSY   The PMU overflow interrupt is already set
	 -EFAULT  Error reading interrupt number
	 -ENXIO   PMUv3 not supported or the overflow interrupt not set
		  when attempting to get it
	 -ENODEV  KVM_ARM_VCPU_PMU_V3 feature missing from VCPU
	 -EINVAL  Invalid PMU overflow interrupt number supplied or
		  trying to set the IRQ number without using an in-kernel
		  irqchip.
	 =======  ========================================================

A value describing the PMUv3 (Performance Monitor Unit v3) overflow interrupt
number for this vcpu. This interrupt could be a PPI or SPI, but the interrupt
type must be the same for each vcpu. As a PPI, the interrupt number is the same
for all vcpus, while as an SPI it must be a separate number per vcpu.
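
The PPI/SPI rule above can be checked in userspace before the attribute is
set. The following is a sketch only: the function names are hypothetical, and
the SGI (0-15) and SPI (32-1019) interrupt ID ranges come from the GIC
architecture rather than from this document::

    #include <stdbool.h>
    #include <stddef.h>

    enum intid_kind { INTID_SGI, INTID_PPI, INTID_SPI, INTID_OTHER };

    enum intid_kind intid_classify(int intid)
    {
        if (intid >= 0 && intid < 16)
            return INTID_SGI;
        if (intid >= 16 && intid < 32)
            return INTID_PPI;
        if (intid >= 32 && intid < 1020)
            return INTID_SPI;
        return INTID_OTHER;
    }

    /* irq[i] is the overflow interrupt planned for vcpu i. */
    bool pmu_irqs_consistent(const int *irq, size_t n)
    {
        enum intid_kind kind;
        size_t i, j;

        if (n == 0)
            return true;
        kind = intid_classify(irq[0]);
        if (kind != INTID_PPI && kind != INTID_SPI)
            return false;
        for (i = 1; i < n; i++) {
            /* All vcpus must use the same interrupt type. */
            if (intid_classify(irq[i]) != kind)
                return false;
            /* A PPI must be the same number on every vcpu. */
            if (kind == INTID_PPI && irq[i] != irq[0])
                return false;
            /* An SPI must be unique per vcpu. */
            for (j = 0; kind == INTID_SPI && j < i; j++)
                if (irq[j] == irq[i])
                    return false;
        }
        return true;
    }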

1.2 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_INIT
---------------------------------------

:Parameters: no additional parameter in kvm_device_attr.addr

Returns:

	 =======  ======================================================
	 -EEXIST  Interrupt number already used
	 -ENODEV  PMUv3 not supported or GIC not initialized
	 -ENXIO   PMUv3 not supported, missing VCPU feature or interrupt
		  number not set
	 -EBUSY   PMUv3 already initialized
	 =======  ======================================================

Request the initialization of the PMUv3.  If using the PMUv3 with an in-kernel
virtual GIC implementation, this must be done after initializing the in-kernel
irqchip.

1.3 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_FILTER
-----------------------------------------

:Parameters: in kvm_device_attr.addr the address for a PMU event filter is a
             pointer to a struct kvm_pmu_event_filter

:Returns:

	 =======  ======================================================
	 -ENODEV  PMUv3 not supported or GIC not initialized
	 -ENXIO   PMUv3 not properly configured or in-kernel irqchip not
		  configured as required prior to calling this attribute
	 -EBUSY   PMUv3 already initialized or a VCPU has already run
	 -EINVAL  Invalid filter range
	 =======  ======================================================

Request the installation of a PMU event filter described as follows::

    struct kvm_pmu_event_filter {
	    __u16	base_event;
	    __u16	nevents;

    #define KVM_PMU_EVENT_ALLOW	0
    #define KVM_PMU_EVENT_DENY	1

	    __u8	action;
	    __u8	pad[3];
    };

A filter range is defined as the range [@base_event, @base_event + @nevents),
together with an @action (KVM_PMU_EVENT_ALLOW or KVM_PMU_EVENT_DENY). The
first registered range defines the global policy (global ALLOW if the first
@action is DENY, global DENY if the first @action is ALLOW). Multiple ranges
can be programmed, and must fit within the event space defined by the PMU
architecture (10 bits on ARMv8.0, 16 bits from ARMv8.1 onwards).

Note: "Cancelling" a filter by registering the opposite action for the same
range doesn't change the default action. For example, installing an ALLOW
filter for event range [0:10) as the first filter and then applying a DENY
action for the same range will leave the whole range as disabled.

Restrictions: Event 0 (SW_INCR) is never filtered, as it doesn't count a
hardware event. Filtering event 0x1E (CHAIN) has no effect either, as it
isn't strictly speaking an event. Filtering the cycle counter is possible
using event 0x11 (CPU_CYCLES).
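
To illustrate how the first range fixes the global policy and why
"cancelling" does not restore it, here is a small userspace model of the
filter semantics. It is a sketch under stated assumptions, not the kernel
implementation: struct filter_range, struct filter_state and filter_apply()
are hypothetical names, and the SW_INCR and CHAIN restrictions are not
modelled::

    #include <errno.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdlib.h>

    #define FILTER_ALLOW 0
    #define FILTER_DENY  1

    struct filter_range {
        uint16_t base_event;
        uint16_t nevents;
        uint8_t action;
    };

    struct filter_state {
        bool *allowed;      /* NULL until the first range is applied */
        size_t nr_events;   /* 1024 for a 10-bit event space */
    };

    int filter_apply(struct filter_state *s, const struct filter_range *r)
    {
        size_t end = (size_t)r->base_event + r->nevents;
        size_t i;

        if (end > s->nr_events ||
            (r->action != FILTER_ALLOW && r->action != FILTER_DENY))
            return -EINVAL;

        if (!s->allowed) {
            /* First range: the default is the opposite of its action. */
            bool dflt = (r->action == FILTER_DENY);

            s->allowed = malloc(s->nr_events * sizeof(*s->allowed));
            if (!s->allowed)
                return -ENOMEM;
            for (i = 0; i < s->nr_events; i++)
                s->allowed[i] = dflt;
        }

        for (i = r->base_event; i < end; i++)
            s->allowed[i] = (r->action == FILTER_ALLOW);
        return 0;
    }

Applying ALLOW for [0, 10) and then DENY for [0, 10) to a fresh state leaves
every event denied in this model, because the default set by the first range
stays DENY.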

1.4 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_SET_PMU
------------------------------------------

:Parameters: in kvm_device_attr.addr the address to an int representing the PMU
             identifier.

:Returns:

	 =======  ====================================================
	 -EBUSY   PMUv3 already initialized, a VCPU has already run or
		  an event filter has already been set
	 -EFAULT  Error accessing the PMU identifier
	 -ENXIO   PMU not found
	 -ENODEV  PMUv3 not supported or GIC not initialized
	 -ENOMEM  Could not allocate memory
	 =======  ====================================================

Request that the VCPU uses the specified hardware PMU when creating guest events
for the purpose of PMU emulation. The PMU identifier can be read from the "type"
file for the desired PMU instance under /sys/devices (or, equivalently,
/sys/bus/event_source). This attribute is particularly useful on heterogeneous
systems where there are at least two CPU PMUs on the system. The PMU that is set
for one VCPU will be used by all the other VCPUs. It isn't possible to set a PMU
if a PMU event filter is already present.

Note that KVM will not make any attempts to run the VCPU on the physical CPUs
associated with the PMU specified by this attribute. This is entirely left to
userspace. However, attempting to run the VCPU on a physical CPU not supported
by the PMU will fail and KVM_RUN will return with
exit_reason = KVM_EXIT_FAIL_ENTRY and populate the fail_entry struct by setting
the hardware_entry_failure_reason field to KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED
and the cpu field to the processor id.
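
Reading the PMU identifier from its "type" file could look like the sketch
below. The helper name read_pmu_type() is hypothetical, and the caller is
assumed to supply the full path of the "type" file for the PMU instance it
wants; instance names differ between systems::

    #include <stdio.h>

    /* Returns 0 and stores the identifier in *pmu_id, or -1 on failure. */
    int read_pmu_type(const char *type_path, int *pmu_id)
    {
        FILE *f = fopen(type_path, "r");
        int ok;

        if (!f)
            return -1;
        /* The file holds a single decimal integer. */
        ok = (fscanf(f, "%d", pmu_id) == 1);
        fclose(f);
        return ok ? 0 : -1;
    }

The resulting int is the value whose address is placed in
kvm_device_attr.addr for KVM_ARM_VCPU_PMU_V3_SET_PMU.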

2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
=================================

:Architectures: ARM64

2.1. ATTRIBUTES: KVM_ARM_VCPU_TIMER_IRQ_VTIMER, KVM_ARM_VCPU_TIMER_IRQ_PTIMER
-----------------------------------------------------------------------------

:Parameters: in kvm_device_attr.addr the address for the timer interrupt is a
	     pointer to an int

Returns:

	 =======  =================================
	 -EINVAL  Invalid timer interrupt number
	 -EBUSY   One or more VCPUs has already run
	 =======  =================================

A value describing the architected timer interrupt number when connected to an
in-kernel virtual GIC.  These must be a PPI (16 <= intid < 32).  Setting the
attribute overrides the default values (see below).

=============================  ==========================================
KVM_ARM_VCPU_TIMER_IRQ_VTIMER  The EL1 virtual timer intid (default: 27)
KVM_ARM_VCPU_TIMER_IRQ_PTIMER  The EL1 physical timer intid (default: 30)
=============================  ==========================================

Setting the same PPI for different timers will prevent the VCPUs from running.
Setting the interrupt number on a VCPU configures all VCPUs created at that
time to use the number provided for a given timer, overwriting any previously
configured values on other VCPUs.  Userspace should configure the interrupt
numbers on at least one VCPU after creating all VCPUs and before running any
VCPUs.
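
A VMM may want to sanity-check its chosen timer interrupts before writing
them. The sketch below encodes only the two constraints stated above (both
must be PPIs, and they must differ); the macro and function names are
hypothetical::

    #include <stdbool.h>

    #define DEFAULT_VTIMER_INTID 27
    #define DEFAULT_PTIMER_INTID 30

    /* A PPI is an interrupt ID in the range 16 <= intid < 32. */
    bool timer_intid_is_ppi(int intid)
    {
        return intid >= 16 && intid < 32;
    }

    /* Sharing one PPI between timers would prevent the VCPUs from running. */
    bool timer_irqs_usable(int vtimer, int ptimer)
    {
        return timer_intid_is_ppi(vtimer) && timer_intid_is_ppi(ptimer) &&
               vtimer != ptimer;
    }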

3. GROUP: KVM_ARM_VCPU_PVTIME_CTRL
==================================

:Architectures: ARM64

3.1 ATTRIBUTE: KVM_ARM_VCPU_PVTIME_IPA
--------------------------------------

:Parameters: 64-bit base address

Returns:

	 =======  ======================================
	 -ENXIO   Stolen time not implemented
	 -EEXIST  Base address already set for this VCPU
	 -EINVAL  Base address not 64 byte aligned
	 =======  ======================================

Specifies the base address of the stolen time structure for this VCPU. The
base address must be 64 byte aligned and exist within a valid guest memory
region. See Documentation/virt/kvm/arm/pvtime.rst for more information
including the layout of the stolen time structure.

4. GROUP: KVM_VCPU_TSC_CTRL
===========================

:Architectures: x86

4.1 ATTRIBUTE: KVM_VCPU_TSC_OFFSET
----------------------------------

:Parameters: 64-bit unsigned TSC offset

Returns:

	 ======= ======================================
	 -EFAULT Error reading/writing the provided
		 parameter address.
	 -ENXIO  Attribute not supported
	 ======= ======================================

Specifies the guest's TSC offset relative to the host's TSC. The guest's
TSC is then derived by the following equation:

  guest_tsc = host_tsc + KVM_VCPU_TSC_OFFSET

This attribute is useful to adjust the guest's TSC on live migration,
so that the TSC counts the time during which the VM was paused. The
following describes a possible algorithm to use for this purpose.

From the source VMM process:

1. Invoke the KVM_GET_CLOCK ioctl to record the host TSC (tsc_src),
   kvmclock nanoseconds (guest_src), and host CLOCK_REALTIME nanoseconds
   (host_src).

2. Read the KVM_VCPU_TSC_OFFSET attribute for every vCPU to record the
   guest TSC offset (ofs_src[i]).

3. Invoke the KVM_GET_TSC_KHZ ioctl to record the frequency of the
   guest's TSC (freq).

From the destination VMM process:

4. Invoke the KVM_SET_CLOCK ioctl, providing the source nanoseconds from
   kvmclock (guest_src) and CLOCK_REALTIME (host_src) in their respective
   fields.  Ensure that the KVM_CLOCK_REALTIME flag is set in the provided
   structure.

   KVM will advance the VM's kvmclock to account for elapsed time since
   recording the clock values.  Note that this will cause problems in
   the guest (e.g., timeouts) unless CLOCK_REALTIME is synchronized
   between the source and destination, and a reasonably short time passes
   between the source pausing the VMs and the destination executing
   steps 4-7.

5. Invoke the KVM_GET_CLOCK ioctl to record the host TSC (tsc_dest) and
   kvmclock nanoseconds (guest_dest).

6. Adjust the guest TSC offsets for every vCPU to account for (1) time
   elapsed since recording state and (2) difference in TSCs between the
   source and destination machine:

   ofs_dst[i] = ofs_src[i] -
     (guest_src - guest_dest) * freq +
     (tsc_src - tsc_dest)

   ("ofs[i] + tsc - guest * freq" is the guest TSC value corresponding to
   a time of 0 in kvmclock.  The above formula ensures that it is the
   same on the destination as it was on the source).

7. Write the KVM_VCPU_TSC_OFFSET attribute for every vCPU with the
   respective value derived in the previous step.
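
The arithmetic of step 6 can be sketched as a pure function. This is an
illustration under stated assumptions rather than a reference
implementation: the function names are hypothetical, kvmclock values are
taken to be in nanoseconds and the frequency in kHz (as returned by
KVM_GET_TSC_KHZ), so the "* freq" term is read as ns * kHz / 1,000,000
ticks. Offsets and TSC values are treated as unsigned 64-bit quantities
that wrap::

    #include <stdint.h>

    /* Convert a signed nanosecond interval to TSC ticks at tsc_khz. */
    int64_t ns_to_tsc_ticks(int64_t ns, uint64_t tsc_khz)
    {
        int64_t khz = (int64_t)tsc_khz;

        /* Split the interval so the product stays within 64 bits. */
        return (ns / 1000000) * khz + (ns % 1000000) * khz / 1000000;
    }

    /* ofs_dst = ofs_src - (guest_src - guest_dest) * freq
     *           + (tsc_src - tsc_dest)
     */
    uint64_t tsc_offset_for_dest(uint64_t ofs_src,
                                 int64_t guest_src, int64_t guest_dest,
                                 uint64_t tsc_src, uint64_t tsc_dest,
                                 uint64_t tsc_khz)
    {
        int64_t guest_ticks = ns_to_tsc_ticks(guest_src - guest_dest,
                                              tsc_khz);

        return ofs_src - (uint64_t)guest_ticks + (tsc_src - tsc_dest);
    }

A destination VMM would call tsc_offset_for_dest() once per vCPU with that
vCPU's ofs_src[i] and write the result back in step 7.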