.. SPDX-License-Identifier: GPL-2.0

===========================
The KVM halt polling system
===========================

The KVM halt polling system provides a feature within KVM whereby the latency
of a guest can, under some circumstances, be reduced by polling in the host
for some time period after the guest has elected to no longer run by ceding.
That is, when a guest vcpu has ceded, or in the case of powerpc when all of the
vcpus of a single vcore have ceded, the host kernel polls for wakeup conditions
before giving up the cpu to the scheduler in order to let something else run.

Polling provides a latency advantage in cases where the guest can be run again
very quickly, because it saves at least a trip through the scheduler, which
normally costs on the order of a few microseconds, although performance
benefits are workload dependent. In the event that no wakeup source arrives
during the polling interval, or some other task on the runqueue is runnable,
the scheduler is invoked. Thus halt polling is especially useful on workloads
with very short wakeup periods, where the time spent halt polling is minimised
and the time savings of not invoking the scheduler are distinguishable.

The generic halt polling code is implemented in:

	virt/kvm/kvm_main.c: kvm_vcpu_block()

The powerpc kvm-hv specific case is implemented in:

	arch/powerpc/kvm/book3s_hv.c: kvmppc_vcore_blocked()

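As a rough illustration of the behaviour described above, the following
user-space C sketch models a single halt. It is not the code in
kvm_vcpu_block(); struct block_result and simulate_block() are hypothetical
names introduced here, and the wakeup time is passed in as a plain number
rather than being observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Outcome of one simulated halt (hypothetical type, for illustration). */
struct block_result {
	bool     woke_while_polling; /* wakeup arrived inside the polling window */
	uint64_t polled_ns;          /* time the host spent busy-polling */
	uint64_t block_ns;           /* total time until the wakeup arrived */
};

/*
 * Model one halt: poll for up to poll_ns, and fall back to the scheduler
 * if the wakeup has not arrived by then. wakeup_ns is the time at which
 * the wakeup source arrives, measured from the start of the halt.
 */
static struct block_result simulate_block(uint64_t wakeup_ns, uint64_t poll_ns)
{
	struct block_result r;

	r.block_ns = wakeup_ns;
	if (wakeup_ns <= poll_ns) {
		/* The wakeup is seen while polling: no trip through the scheduler. */
		r.woke_while_polling = true;
		r.polled_ns = wakeup_ns;
	} else {
		/* Polling window exhausted: the scheduler would be invoked here. */
		r.woke_while_polling = false;
		r.polled_ns = poll_ns;
	}

	/* Polling can never last longer than the whole block. */
	assert(r.polled_ns <= r.block_ns);
	return r;
}
```

With a 10000 ns polling interval, a wakeup after 5000 ns is caught while
polling, whereas a wakeup after 50000 ns is not, and the model falls back to
the scheduler after polling for the full 10000 ns.
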
Halt Polling Interval
=====================

The maximum time for which to poll before invoking the scheduler, referred to
as the halt polling interval, is increased and decreased based on the perceived
effectiveness of the polling in an attempt to limit pointless polling.
This value is stored in either the vcpu struct:

	kvm_vcpu->halt_poll_ns

or in the case of powerpc kvm-hv, in the vcore struct:

	kvmppc_vcore->halt_poll_ns

Thus this is a per-vcpu (or per-vcore) value.

If a wakeup source is received within the halt polling interval, the interval
is left unchanged. If a wakeup source isn't received during the polling
interval (and thus schedule is invoked) there are two cases: either the polling
interval and total block time[0] were both less than the global max polling
interval (see module params below), or the total block time was greater than
the global max polling interval.

In the event that both the polling interval and total block time were less than
the global max polling interval then the polling interval can be increased in
the hope that next time, during the longer polling interval, the wakeup source
will be received while the host is polling and the latency benefits will be
realised. The polling interval is grown in the function grow_halt_poll_ns(): it
is multiplied by the module parameter halt_poll_ns_grow, or set to
halt_poll_ns_grow_start when growing from zero.

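The growth rule can be sketched in user-space C as follows. This is an
illustration and not the kernel's grow_halt_poll_ns(); the function name
grow_interval() is hypothetical, the cap at the global max is written out
explicitly, and treating a multiplier of 0 as "growth disabled" is an
assumption of this sketch.

```c
#include <assert.h>

/*
 * Illustrative growth step.
 * cur_ns:     current per-vcpu polling interval
 * grow:       multiplier, as halt_poll_ns_grow (0 assumed to disable growth)
 * grow_start: first non-zero value, as halt_poll_ns_grow_start
 * max_ns:     global ceiling, as halt_poll_ns
 */
static unsigned int grow_interval(unsigned int cur_ns, unsigned int grow,
				  unsigned int grow_start, unsigned int max_ns)
{
	unsigned long long val = cur_ns;

	if (grow == 0)
		return cur_ns;

	val *= grow;			/* wide product avoids overflow */
	if (val < grow_start)
		val = grow_start;	/* growing from zero */
	if (val > max_ns)
		val = max_ns;		/* never exceed the global ceiling */

	assert(val <= max_ns);
	return (unsigned int)val;
}
```

With the defaults from the table below (multiplier 2, start value 10000) and
an example ceiling of 200000 ns, an interval of 0 grows to 10000, then 20000,
40000 and so on until it reaches the ceiling.
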
In the event that the total block time was greater than the global max polling
interval then the host will never poll for long enough (limited by the global
max) to wake up during the polling interval, so the interval may as well be
shrunk in order to avoid pointless polling. The polling interval is shrunk in
the function shrink_halt_poll_ns() and is divided by the module parameter
halt_poll_ns_shrink, or set to 0 iff halt_poll_ns_shrink == 0.

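The decision between leaving, growing and shrinking the interval, together
with the shrink step, can be summarised in the following user-space C sketch.
It follows the prose in this section rather than the kernel source; enum
poll_action, choose_action() and shrink_interval() are hypothetical names, and
the behaviour when the block time exactly equals the global max is an
assumption (the interval is left alone).

```c
#include <assert.h>
#include <stdbool.h>

enum poll_action { POLL_KEEP, POLL_GROW, POLL_SHRINK };

/*
 * Decide what to do with the polling interval after one halt.
 * block_ns: total block time, poll_ns: current interval,
 * max_ns:   global max polling interval (halt_poll_ns).
 */
static enum poll_action choose_action(bool woke_while_polling,
				      unsigned long long block_ns,
				      unsigned int poll_ns, unsigned int max_ns)
{
	if (woke_while_polling)
		return POLL_KEEP;	/* polling paid off: leave it alone */
	if (block_ns > max_ns)
		return POLL_SHRINK;	/* could never have polled long enough */
	if (poll_ns < max_ns && block_ns < max_ns)
		return POLL_GROW;	/* a longer poll might catch the wakeup */
	return POLL_KEEP;
}

/* Divide by the shrink factor, or drop to 0 when the factor is 0. */
static unsigned int shrink_interval(unsigned int cur_ns, unsigned int shrink)
{
	unsigned int val = shrink ? cur_ns / shrink : 0;

	assert(val <= cur_ns);
	return val;
}
```

Note that with the default halt_poll_ns_shrink of 0, a single shrink resets
the interval to 0, so the next growth step starts again from
halt_poll_ns_grow_start.
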
It is worth noting that this adjustment process attempts to home in on some
steady state polling interval, but will only really do a good job for wakeups
which come at an approximately constant rate; otherwise there will be constant
adjustment of the polling interval.

[0] total block time:
		      the time between when the halt polling function is
		      invoked and a wakeup source received (irrespective of
		      whether the scheduler is invoked within that function).

Module Parameters
=================

The kvm module has 4 tunable module parameters to adjust the global max
polling interval, the value from which the polling interval starts to grow,
and the rate at which the polling interval is grown and shrunk. These
variables are defined in include/linux/kvm_host.h and as module parameters in
virt/kvm/kvm_main.c, or arch/powerpc/kvm/book3s_hv.c in the powerpc kvm-hv
case.

+-----------------------+---------------------------+-------------------------+
|Module Parameter       | Description               | Default Value           |
+-----------------------+---------------------------+-------------------------+
|halt_poll_ns           | The global max polling    | KVM_HALT_POLL_NS_DEFAULT|
|                       | interval which defines    |                         |
|                       | the ceiling value of the  |                         |
|                       | polling interval for      | (per arch value)        |
|                       | each vcpu.                |                         |
+-----------------------+---------------------------+-------------------------+
|halt_poll_ns_grow      | The value by which the    | 2                       |
|                       | halt polling interval is  |                         |
|                       | multiplied in the         |                         |
|                       | grow_halt_poll_ns()       |                         |
|                       | function.                 |                         |
+-----------------------+---------------------------+-------------------------+
|halt_poll_ns_grow_start| The initial value to grow | 10000                   |
|                       | to from zero in the       |                         |
|                       | grow_halt_poll_ns()       |                         |
|                       | function.                 |                         |
+-----------------------+---------------------------+-------------------------+
|halt_poll_ns_shrink    | The value by which the    | 0                       |
|                       | halt polling interval is  |                         |
|                       | divided in the            |                         |
|                       | shrink_halt_poll_ns()     |                         |
|                       | function.                 |                         |
+-----------------------+---------------------------+-------------------------+

These module parameters can be set from the sysfs files in:

	/sys/module/kvm/parameters/

Note: these module parameters are system-wide values and cannot be tuned on a
      per-vm basis.

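For inspecting the current values programmatically, a small user-space C
helper along the following lines may be useful. It is only a sketch:
read_param() and the KVM_PARAM_DIR macro are hypothetical names introduced
here, the parameters are assumed to be readable as unsigned decimal integers,
and the files only exist when the kvm module is loaded.

```c
#include <assert.h>
#include <stdio.h>

/* Directory named in the text above. */
#define KVM_PARAM_DIR "/sys/module/kvm/parameters"

/*
 * Read one unsigned integer parameter from dir/name into *out.
 * Returns 0 on success, or -1 if the file cannot be opened or parsed
 * (for example when the kvm module is not loaded).
 */
static int read_param(const char *dir, const char *name, unsigned int *out)
{
	char path[256];
	FILE *f;
	int n, ok;

	assert(dir && name && out);

	n = snprintf(path, sizeof(path), "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;		/* path would have been truncated */

	f = fopen(path, "r");
	if (!f)
		return -1;

	ok = (fscanf(f, "%u", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```

A caller would typically use read_param(KVM_PARAM_DIR, "halt_poll_ns", &val)
and treat a return value of -1 as "parameter not available". Changing a value
is done by writing to the same file, which normally requires root privileges.
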
Further Notes
=============

- Care should be taken when setting the halt_poll_ns module parameter, as a large
  value has the potential to drive the cpu usage to 100% on a machine which would
  be almost entirely idle otherwise. This is because even if a guest has wakeups
  during which very little work is done and which are quite far apart, if the
  period is shorter than the global max polling interval (halt_poll_ns) then the
  host will always poll for the entire block time and thus cpu utilisation will go
  to 100%.

- Halt polling essentially presents a trade-off between power usage and latency,
  and the module parameters should be used to tune the balance between the two.
  Idle cpu time is essentially converted to host kernel time with the aim of
  decreasing latency when entering the guest.

- Halt polling will only be conducted by the host when no other tasks are runnable
  on that cpu; otherwise the polling will cease immediately and schedule will be
  invoked to allow that other task to run. Thus halt polling doesn't allow a guest
  to cause a denial of service on the cpu.
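
The first note above can be made concrete with a simplified model, sketched
here in user-space C. The function busy_fraction() is a hypothetical helper,
not kernel code. It assumes strictly periodic wakeups, a polling interval that
has already settled at poll_ns, no other runnable tasks, and it ignores
scheduler and context switch costs.

```c
#include <assert.h>

/*
 * Fraction of cpu time that is busy for a guest whose wakeups arrive
 * every period_ns and each need work_ns of run time, when the host
 * polls for up to poll_ns after every halt.
 */
static double busy_fraction(double period_ns, double work_ns, double poll_ns)
{
	double idle_ns, polled_ns;

	assert(period_ns > 0 && work_ns >= 0 && work_ns <= period_ns);
	assert(poll_ns >= 0);

	idle_ns = period_ns - work_ns;		/* time the guest is halted */
	polled_ns = idle_ns < poll_ns ? idle_ns : poll_ns;
	return (work_ns + polled_ns) / period_ns;
}
```

For a guest that does 1000 ns of work every 100000 ns, the model gives a busy
fraction of 0.01 with polling disabled (poll_ns of 0), but 1.0 once the
polling interval is at least as long as the idle gap, for example with a
poll_ns of 200000.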