
      1===============================
      2Adjunct Processor (AP) facility
      3===============================
      4
      5
      6Introduction
      7============
      8The Adjunct Processor (AP) facility is an IBM Z cryptographic facility comprised
      9of three AP instructions and from 1 up to 256 PCIe cryptographic adapter cards.
     10The AP devices provide cryptographic functions to all CPUs assigned to a
     11linux system running in an IBM Z system LPAR.
     12
     13The AP adapter cards are exposed via the AP bus. The motivation for vfio-ap
     14is to make AP cards available to KVM guests using the VFIO mediated device
     15framework. This implementation relies considerably on the s390 virtualization
     16facilities which do most of the hard work of providing direct access to AP
     17devices.
     18
     19AP Architectural Overview
     20=========================
     21To facilitate the comprehension of the design, let's start with some
     22definitions:
     23
     24* AP adapter
     25
     26  An AP adapter is an IBM Z adapter card that can perform cryptographic
     27  functions. There can be from 0 to 256 adapters assigned to an LPAR. Adapters
     28  assigned to the LPAR in which a linux host is running will be available to
     29  the linux host. Each adapter is identified by a number from 0 to 255; however,
     30  the maximum adapter number is determined by machine model and/or adapter type.
     31  When installed, an AP adapter is accessed by AP instructions executed by any
     32  CPU.
     33
     34  The AP adapter cards are assigned to a given LPAR via the system's Activation
     35  Profile which can be edited via the HMC. When the linux host system is IPL'd
     36  in the LPAR, the AP bus detects the AP adapter cards assigned to the LPAR and
     37  creates a sysfs device for each assigned adapter. For example, if AP adapters
     38  4 and 10 (0x0a) are assigned to the LPAR, the AP bus will create the following
     39  sysfs device entries::
     40
     41    /sys/devices/ap/card04
     42    /sys/devices/ap/card0a
     43
     44  Symbolic links to these devices will also be created in the AP bus devices
     45  sub-directory::
     46
     47    /sys/bus/ap/devices/[card04]
    /sys/bus/ap/devices/[card0a]
     49
     50* AP domain
     51
     52  An adapter is partitioned into domains. An adapter can hold up to 256 domains
     53  depending upon the adapter type and hardware configuration. A domain is
     54  identified by a number from 0 to 255; however, the maximum domain number is
  determined by machine model and/or adapter type. A domain can be thought of
     56  as a set of hardware registers and memory used for processing AP commands. A
     57  domain can be configured with a secure private key used for clear key
     58  encryption. A domain is classified in one of two ways depending upon how it
     59  may be accessed:
     60
     61    * Usage domains are domains that are targeted by an AP instruction to
     62      process an AP command.
     63
     64    * Control domains are domains that are changed by an AP command sent to a
     65      usage domain; for example, to set the secure private key for the control
     66      domain.
     67
     68  The AP usage and control domains are assigned to a given LPAR via the system's
     69  Activation Profile which can be edited via the HMC. When a linux host system
     70  is IPL'd in the LPAR, the AP bus module detects the AP usage and control
     71  domains assigned to the LPAR. The domain number of each usage domain and
     72  adapter number of each AP adapter are combined to create AP queue devices
     73  (see AP Queue section below). The domain number of each control domain will be
     74  represented in a bitmask and stored in a sysfs file
     75  /sys/bus/ap/ap_control_domain_mask. The bits in the mask, from most to least
     76  significant bit, correspond to domains 0-255.
     77
     78* AP Queue
     79
     80  An AP queue is the means by which an AP command is sent to a usage domain
     81  inside a specific adapter. An AP queue is identified by a tuple
     82  comprised of an AP adapter ID (APID) and an AP queue index (APQI). The
     83  APQI corresponds to a given usage domain number within the adapter. This tuple
     84  forms an AP Queue Number (APQN) uniquely identifying an AP queue. AP
     85  instructions include a field containing the APQN to identify the AP queue to
     86  which the AP command is to be sent for processing.
     87
     88  The AP bus will create a sysfs device for each APQN that can be derived from
     89  the cross product of the AP adapter and usage domain numbers detected when the
     90  AP bus module is loaded. For example, if adapters 4 and 10 (0x0a) and usage
     91  domains 6 and 71 (0x47) are assigned to the LPAR, the AP bus will create the
     92  following sysfs entries::
     93
     94    /sys/devices/ap/card04/04.0006
     95    /sys/devices/ap/card04/04.0047
     96    /sys/devices/ap/card0a/0a.0006
     97    /sys/devices/ap/card0a/0a.0047
     98
     99  The following symbolic links to these devices will be created in the AP bus
    100  devices subdirectory::
    101
    102    /sys/bus/ap/devices/[04.0006]
    103    /sys/bus/ap/devices/[04.0047]
    104    /sys/bus/ap/devices/[0a.0006]
    105    /sys/bus/ap/devices/[0a.0047]
    106
    107* AP Instructions:
    108
    109  There are three AP instructions:
    110
    111  * NQAP: to enqueue an AP command-request message to a queue
    112  * DQAP: to dequeue an AP command-reply message from a queue
    113  * PQAP: to administer the queues
    114
    115  AP instructions identify the domain that is targeted to process the AP
    116  command; this must be one of the usage domains. An AP command may modify a
    117  domain that is not one of the usage domains, but the modified domain
    118  must be one of the control domains.
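To make the naming scheme concrete, here is a minimal sketch of how the AP queue device names follow from the cross product of adapter and usage domain numbers. The function name ``ap_queue_names`` is hypothetical and used only for illustration; it is not part of the AP bus code.

```python
from itertools import product


def ap_queue_names(adapters, usage_domains):
    """Return queue device names in the 'xx.yyyy' form used under sysfs.

    Hypothetical helper: xx is the APID and yyyy the APQI, both in hex.
    """
    names = []
    for apid, apqi in product(sorted(adapters), sorted(usage_domains)):
        if not (0 <= apid <= 255 and 0 <= apqi <= 255):
            raise ValueError(f"APID/APQI out of range: {apid}, {apqi}")
        names.append(f"{apid:02x}.{apqi:04x}")
    return names


# Adapters 4 and 10 (0x0a) with usage domains 6 and 71 (0x47)
names = ap_queue_names([4, 10], [6, 71])
print(names)
```

With the adapters and usage domains from the AP Queue example, this yields the same four queue names shown in the sysfs listings.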
    119
    120AP and SIE
    121==========
    122Let's now take a look at how AP instructions executed on a guest are interpreted
    123by the hardware.
    124
    125A satellite control block called the Crypto Control Block (CRYCB) is attached to
    126our main hardware virtualization control block. The CRYCB contains three fields
    127to identify the adapters, usage domains and control domains assigned to the KVM
    128guest:
    129
    130* The AP Mask (APM) field is a bit mask that identifies the AP adapters assigned
    131  to the KVM guest. Each bit in the mask, from left to right (i.e. from most
    132  significant to least significant bit in big endian order), corresponds to
    133  an APID from 0-255. If a bit is set, the corresponding adapter is valid for
    134  use by the KVM guest.
    135
    136* The AP Queue Mask (AQM) field is a bit mask identifying the AP usage domains
    137  assigned to the KVM guest. Each bit in the mask, from left to right (i.e. from
    138  most significant to least significant bit in big endian order), corresponds to
    139  an AP queue index (APQI) from 0-255. If a bit is set, the corresponding queue
    140  is valid for use by the KVM guest.
    141
* The AP Domain Mask (ADM) field is a bit mask that identifies the AP control
  domains assigned to the KVM guest. The ADM bit mask controls which domains can
  be changed by an AP command-request message sent to a usage domain from the
    145  guest. Each bit in the mask, from left to right (i.e. from most significant to
    146  least significant bit in big endian order), corresponds to a domain from
    147  0-255. If a bit is set, the corresponding domain can be modified by an AP
    148  command-request message sent to a usage domain.
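The left-to-right bit numbering shared by the three masks can be sketched as follows. This is only an illustrative model of a 256-bit mask held as 32 bytes; it does not reproduce the actual CRYCB layout, and the helper names are hypothetical.

```python
MASK_BITS = 256


def mask_from_ids(ids):
    """Build a 256-bit mask in which bit 0 is the most significant bit."""
    mask = bytearray(MASK_BITS // 8)
    for n in ids:
        if not 0 <= n < MASK_BITS:
            raise ValueError(f"ID out of range: {n}")
        # ID n lives in byte n // 8, counted from the left within that byte
        mask[n // 8] |= 0x80 >> (n % 8)
    return bytes(mask)


def ids_from_mask(mask):
    """Return the IDs whose bits are set, in ascending order."""
    return [n for n in range(len(mask) * 8) if mask[n // 8] & (0x80 >> (n % 8))]


apm = mask_from_ids([1, 2])   # adapters 1 and 2
aqm = mask_from_ids([5, 6])   # usage domains 5 and 6
print(apm.hex())
print(ids_from_mask(aqm))
```

Note that ID 0 maps to the value 0x80 in the first byte and ID 255 maps to the value 0x01 in the last byte, which is the opposite of the usual "bit 0 is the least significant bit" convention.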
    149
    150If you recall from the description of an AP Queue, AP instructions include
    151an APQN to identify the AP queue to which an AP command-request message is to be
    152sent (NQAP and PQAP instructions), or from which a command-reply message is to
    153be received (DQAP instruction). The validity of an APQN is defined by the matrix
    154calculated from the APM and AQM; it is the cross product of all assigned adapter
    155numbers (APM) with all assigned queue indexes (AQM). For example, if adapters 1
    156and 2 and usage domains 5 and 6 are assigned to a guest, the APQNs (1,5), (1,6),
    157(2,5) and (2,6) will be valid for the guest.
    158
    159The APQNs can provide secure key functionality - i.e., a private key is stored
    160on the adapter card for each of its domains - so each APQN must be assigned to
    161at most one guest or to the linux host::
    162
    163   Example 1: Valid configuration:
    164   ------------------------------
    165   Guest1: adapters 1,2  domains 5,6
   Guest2: adapters 1,2  domain 7
    167
    168   This is valid because both guests have a unique set of APQNs:
    169      Guest1 has APQNs (1,5), (1,6), (2,5), (2,6);
    170      Guest2 has APQNs (1,7), (2,7)
    171
    172   Example 2: Valid configuration:
    173   ------------------------------
    174   Guest1: adapters 1,2 domains 5,6
    175   Guest2: adapters 3,4 domains 5,6
    176
    177   This is also valid because both guests have a unique set of APQNs:
    178      Guest1 has APQNs (1,5), (1,6), (2,5), (2,6);
    179      Guest2 has APQNs (3,5), (3,6), (4,5), (4,6)
    180
    181   Example 3: Invalid configuration:
    182   --------------------------------
    183   Guest1: adapters 1,2  domains 5,6
    184   Guest2: adapter  1    domains 6,7
    185
    186   This is an invalid configuration because both guests have access to
    187   APQN (1,6).
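The three examples can be checked mechanically. The sketch below assumes each guest is described by a pair of adapter and usage domain lists; the function names are hypothetical and this is not how the vfio_ap driver itself performs the check.

```python
from itertools import product


def apqns(adapters, domains):
    """Return the set of (APID, APQI) tuples for one guest."""
    return set(product(adapters, domains))


def shared_apqns(guest_a, guest_b):
    """Return the APQNs that two guest configurations have in common."""
    return apqns(*guest_a) & apqns(*guest_b)


# Each guest is (adapters, usage domains)
example1 = shared_apqns(([1, 2], [5, 6]), ([1, 2], [7]))
example2 = shared_apqns(([1, 2], [5, 6]), ([3, 4], [5, 6]))
example3 = shared_apqns(([1, 2], [5, 6]), ([1], [6, 7]))

for name, shared in (("1", example1), ("2", example2), ("3", example3)):
    verdict = "valid" if not shared else f"invalid, shared: {sorted(shared)}"
    print(f"Example {name}: {verdict}")
```

A configuration is valid only when the intersection is empty, so sharing an adapter (Example 1) or sharing domains (Example 2) is fine as long as no complete (adapter, domain) pair is shared.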
    188
    189The Design
    190==========
    191The design introduces three new objects:
    192
    1931. AP matrix device
    1942. VFIO AP device driver (vfio_ap.ko)
    1953. VFIO AP mediated matrix pass-through device
    196
    197The VFIO AP device driver
    198-------------------------
    199The VFIO AP (vfio_ap) device driver serves the following purposes:
    200
    2011. Provides the interfaces to secure APQNs for exclusive use of KVM guests.
    202
    2032. Sets up the VFIO mediated device interfaces to manage a mediated matrix
    204   device and creates the sysfs interfaces for assigning adapters, usage
    205   domains, and control domains comprising the matrix for a KVM guest.
    206
3. Configures the APM, AQM and ADM in the CRYCB referenced by a KVM guest's
   SIE state description to grant the guest access to a matrix of AP devices.
    209
    210Reserve APQNs for exclusive use of KVM guests
    211---------------------------------------------
    212The following block diagram illustrates the mechanism by which APQNs are
    213reserved::
    214
    215				+------------------+
    216		 7 remove       |                  |
    217	   +--------------------> cex4queue driver |
    218	   |                    |                  |
    219	   |                    +------------------+
    220	   |
    221	   |
    222	   |                    +------------------+          +----------------+
    223	   |  5 register driver |                  | 3 create |                |
    224	   |   +---------------->   Device core    +---------->  matrix device |
    225	   |   |                |                  |          |                |
    226	   |   |                +--------^---------+          +----------------+
    227	   |   |                         |
    228	   |   |                         +-------------------+
    229	   |   | +-----------------------------------+       |
    230	   |   | |      4 register AP driver         |       | 2 register device
    231	   |   | |                                   |       |
    232  +--------+---+-v---+                      +--------+-------+-+
    233  |                  |                      |                  |
    234  |      ap_bus      +--------------------- >  vfio_ap driver  |
    235  |                  |       8 probe        |                  |
    236  +--------^---------+                      +--^--^------------+
    237  6 edit   |                                   |  |
    238    apmask |     +-----------------------------+  | 9 mdev create
    239    aqmask |     |           1 modprobe           |
    240  +--------+-----+---+           +----------------+-+         +----------------+
    241  |                  |           |                  |8 create |     mediated   |
    242  |      admin       |           | VFIO device core |--------->     matrix     |
    243  |                  +           |                  |         |     device     |
    244  +------+-+---------+           +--------^---------+         +--------^-------+
    245	 | |                              |                            |
    246	 | | 9 create vfio_ap-passthrough |                            |
    247	 | +------------------------------+                            |
    248	 +-------------------------------------------------------------+
    249		     10  assign adapter/domain/control domain
    250
    251The process for reserving an AP queue for use by a KVM guest is:
    252
    2531. The administrator loads the vfio_ap device driver
2. During its initialization, the vfio_ap driver registers a single 'matrix'
   device with the device core. This serves as the parent device for
   all mediated matrix devices used to configure an AP matrix for a guest.
3. The /sys/devices/vfio_ap/matrix device is created by the device core.
4. The vfio_ap device driver registers with the AP bus for AP queue devices
   of type 10 and higher (CEX4 and newer) and provides its probe and remove
   callback interfaces. Devices older than CEX4 are not supported; this keeps
   the design simple, since the older devices will go out of service in the
   relatively near future and few systems remain on which to test them.
    2655. The AP bus registers the vfio_ap device driver with the device core
    2666. The administrator edits the AP adapter and queue masks to reserve AP queues
    267   for use by the vfio_ap device driver.
    2687. The AP bus removes the AP queues reserved for the vfio_ap driver from the
    269   default zcrypt cex4queue driver.
    2708. The AP bus probes the vfio_ap device driver to bind the queues reserved for
    271   it.
    2729. The administrator creates a passthrough type mediated matrix device to be
    273   used by a guest
    27410. The administrator assigns the adapters, usage domains and control domains
    275    to be exclusively used by a guest.
    276
    277Set up the VFIO mediated device interfaces
    278------------------------------------------
    279The VFIO AP device driver utilizes the common interface of the VFIO mediated
    280device core driver to:
    281
    282* Register an AP mediated bus driver to add a mediated matrix device to and
    283  remove it from a VFIO group.
    284* Create and destroy a mediated matrix device
    285* Add a mediated matrix device to and remove it from the AP mediated bus driver
    286* Add a mediated matrix device to and remove it from an IOMMU group
    287
    288The following high-level block diagram shows the main components and interfaces
    289of the VFIO AP mediated matrix device driver::
    290
    291   +-------------+
    292   |             |
    293   | +---------+ | mdev_register_driver() +--------------+
    294   | |  Mdev   | +<-----------------------+              |
    295   | |  bus    | |                        | vfio_mdev.ko |
    296   | | driver  | +----------------------->+              |<-> VFIO user
    297   | +---------+ |    probe()/remove()    +--------------+    APIs
    298   |             |
    299   |  MDEV CORE  |
    300   |   MODULE    |
    301   |   mdev.ko   |
    302   | +---------+ | mdev_register_device() +--------------+
    303   | |Physical | +<-----------------------+              |
    304   | | device  | |                        |  vfio_ap.ko  |<-> matrix
    305   | |interface| +----------------------->+              |    device
    306   | +---------+ |       callback         +--------------+
    307   +-------------+
    308
    309During initialization of the vfio_ap module, the matrix device is registered
    310with an 'mdev_parent_ops' structure that provides the sysfs attribute
    311structures, mdev functions and callback interfaces for managing the mediated
    312matrix device.
    313
    314* sysfs attribute structures:
    315
    316  supported_type_groups
    317    The VFIO mediated device framework supports creation of user-defined
    318    mediated device types. These mediated device types are specified
    319    via the 'supported_type_groups' structure when a device is registered
    320    with the mediated device framework. The registration process creates the
    321    sysfs structures for each mediated device type specified in the
    322    'mdev_supported_types' sub-directory of the device being registered. Along
    323    with the device type, the sysfs attributes of the mediated device type are
    324    provided.
    325
    326    The VFIO AP device driver will register one mediated device type for
    passthrough devices::
    328
    329      /sys/devices/vfio_ap/matrix/mdev_supported_types/vfio_ap-passthrough
    330
    331    Only the read-only attributes required by the VFIO mdev framework will
    332    be provided::
    333
        ... name
        ... device_api
        ... available_instances

    Where:

        * name:
            specifies the name of the mediated device type
        * device_api:
            specifies the mediated device type's API (the VFIO API)
        * available_instances:
            the number of mediated matrix passthrough devices
            that can be created

  mdev_attr_groups
    351    This attribute group identifies the user-defined sysfs attributes of the
    352    mediated device. When a device is registered with the VFIO mediated device
    353    framework, the sysfs attribute files identified in the 'mdev_attr_groups'
    354    structure will be created in the mediated matrix device's directory. The
    355    sysfs attributes for a mediated matrix device are:
    356
    357    assign_adapter / unassign_adapter:
    358      Write-only attributes for assigning/unassigning an AP adapter to/from the
    359      mediated matrix device. To assign/unassign an adapter, the APID of the
    360      adapter is echoed to the respective attribute file.
    361    assign_domain / unassign_domain:
    362      Write-only attributes for assigning/unassigning an AP usage domain to/from
    363      the mediated matrix device. To assign/unassign a domain, the domain
    364      number of the usage domain is echoed to the respective attribute
    365      file.
    366    matrix:
    367      A read-only file for displaying the APQNs derived from the cross product
    368      of the adapter and domain numbers assigned to the mediated matrix device.
    369    assign_control_domain / unassign_control_domain:
    370      Write-only attributes for assigning/unassigning an AP control domain
    371      to/from the mediated matrix device. To assign/unassign a control domain,
    372      the ID of the domain to be assigned/unassigned is echoed to the respective
    373      attribute file.
    374    control_domains:
    375      A read-only file for displaying the control domain numbers assigned to the
    376      mediated matrix device.
    377
    378* functions:
    379
    380  create:
    381    allocates the ap_matrix_mdev structure used by the vfio_ap driver to:
    382
    383    * Store the reference to the KVM structure for the guest using the mdev
    384    * Store the AP matrix configuration for the adapters, domains, and control
    385      domains assigned via the corresponding sysfs attributes files
    386
    387  remove:
    388    deallocates the mediated matrix device's ap_matrix_mdev structure. This will
    389    be allowed only if a running guest is not using the mdev.
    390
    391* callback interfaces
    392
    393  open:
    394    The vfio_ap driver uses this callback to register a
    395    VFIO_GROUP_NOTIFY_SET_KVM notifier callback function for the mdev matrix
    396    device. The open is invoked when QEMU connects the VFIO iommu group
    397    for the mdev matrix device to the MDEV bus. Access to the KVM structure used
    to configure the KVM guest is provided via this callback. The KVM structure
    399    is used to configure the guest's access to the AP matrix defined via the
    400    mediated matrix device's sysfs attribute files.
    401  release:
    402    unregisters the VFIO_GROUP_NOTIFY_SET_KVM notifier callback function for the
    403    mdev matrix device and deconfigures the guest's AP matrix.
    404
    405Configure the APM, AQM and ADM in the CRYCB
    406-------------------------------------------
    407Configuring the AP matrix for a KVM guest will be performed when the
    408VFIO_GROUP_NOTIFY_SET_KVM notifier callback is invoked. The notifier
    409function is called when QEMU connects to KVM. The guest's AP matrix is
configured via its CRYCB by:
    411
    412* Setting the bits in the APM corresponding to the APIDs assigned to the
    413  mediated matrix device via its 'assign_adapter' interface.
    414* Setting the bits in the AQM corresponding to the domains assigned to the
    415  mediated matrix device via its 'assign_domain' interface.
* Setting the bits in the ADM corresponding to the domain IDs assigned to the
  mediated matrix device via its 'assign_control_domain' interface.
    418
    419The CPU model features for AP
    420-----------------------------
    421The AP stack relies on the presence of the AP instructions as well as two
    422facilities: The AP Facilities Test (APFT) facility; and the AP Query
    423Configuration Information (QCI) facility. These features/facilities are made
    424available to a KVM guest via the following CPU model features:
    425
    4261. ap: Indicates whether the AP instructions are installed on the guest. This
    427   feature will be enabled by KVM only if the AP instructions are installed
    428   on the host.
    429
    4302. apft: Indicates the APFT facility is available on the guest. This facility
    431   can be made available to the guest only if it is available on the host (i.e.,
    432   facility bit 15 is set).
    433
    4343. apqci: Indicates the AP QCI facility is available on the guest. This facility
    435   can be made available to the guest only if it is available on the host (i.e.,
    436   facility bit 12 is set).
    437
    438Note: If the user chooses to specify a CPU model different than the 'host'
    439model to QEMU, the CPU model features and facilities need to be turned on
    440explicitly; for example::
    441
    442     /usr/bin/qemu-system-s390x ... -cpu z13,ap=on,apqci=on,apft=on
    443
    444A guest can be precluded from using AP features/facilities by turning them off
    445explicitly; for example::
    446
    447     /usr/bin/qemu-system-s390x ... -cpu host,ap=off,apqci=off,apft=off
    448
    449Note: If the APFT facility is turned off (apft=off) for the guest, the guest
    450will not see any AP devices. The zcrypt device drivers that register for type 10
    451and newer AP devices - i.e., the cex4card and cex4queue device drivers - need
    452the APFT facility to ascertain the facilities installed on a given AP device. If
    453the APFT facility is not installed on the guest, then the probe of device
    454drivers will fail since only type 10 and newer devices can be configured for
    455guest use.
    456
    457Example
    458=======
    459Let's now provide an example to illustrate how KVM guests may be given
    460access to AP facilities. For this example, we will show how to configure
    461three guests such that executing the lszcrypt command on the guests would
    462look like this:
    463
    464Guest1
    465------
    466=========== ===== ============
    467CARD.DOMAIN TYPE  MODE
    468=========== ===== ============
    46905          CEX5C CCA-Coproc
    47005.0004     CEX5C CCA-Coproc
    47105.00ab     CEX5C CCA-Coproc
    47206          CEX5A Accelerator
    47306.0004     CEX5A Accelerator
06.00ab     CEX5A Accelerator
    475=========== ===== ============
    476
    477Guest2
    478------
    479=========== ===== ============
    480CARD.DOMAIN TYPE  MODE
    481=========== ===== ============
    48205          CEX5A Accelerator
    48305.0047     CEX5A Accelerator
    48405.00ff     CEX5A Accelerator
    485=========== ===== ============
    486
    487Guest3
    488------
    489=========== ===== ============
    490CARD.DOMAIN TYPE  MODE
    491=========== ===== ============
    49206          CEX5A Accelerator
    49306.0047     CEX5A Accelerator
    49406.00ff     CEX5A Accelerator
    495=========== ===== ============
    496
    497These are the steps:
    498
1. Install the vfio_ap module on the linux host. The dependency chain for the
   vfio_ap module is:

   * iommu
   * s390
   * zcrypt
   * vfio
   * vfio_mdev
   * vfio_mdev_device
   * KVM

   To build the vfio_ap module, the kernel build must be configured with the
   following Kconfig elements selected:

   * IOMMU_SUPPORT
   * S390
   * ZCRYPT
   * S390_AP_IOMMU
   * VFIO
   * VFIO_MDEV
   * KVM
    518
    519   If using make menuconfig select the following to build the vfio_ap module::
    520
    521     -> Device Drivers
    522	-> IOMMU Hardware Support
    523	   select S390 AP IOMMU Support
    524	-> VFIO Non-Privileged userspace driver framework
           -> Mediated device driver framework
    526	      -> VFIO driver for Mediated devices
    527     -> I/O subsystem
    528	-> VFIO support for AP devices
    529
2. Secure the AP queues to be used by the three guests so that the host cannot
   access them. To secure them, there are two sysfs files that specify
   bitmasks marking a subset of the APQN range as either 'usable only by the
   default AP queue device drivers' or 'not usable by the default device
   drivers and thus available for use by the vfio_ap device driver'. The
   sysfs files containing the masks are located at::
    536
    537     /sys/bus/ap/apmask
    538     /sys/bus/ap/aqmask
    539
    540   The 'apmask' is a 256-bit mask that identifies a set of AP adapter IDs
    541   (APID). Each bit in the mask, from left to right (i.e., from most significant
    542   to least significant bit in big endian order), corresponds to an APID from
    543   0-255. If a bit is set, the APID is marked as usable only by the default AP
    544   queue device drivers; otherwise, the APID is usable by the vfio_ap
    545   device driver.
    546
    547   The 'aqmask' is a 256-bit mask that identifies a set of AP queue indexes
    548   (APQI). Each bit in the mask, from left to right (i.e., from most significant
    549   to least significant bit in big endian order), corresponds to an APQI from
    550   0-255. If a bit is set, the APQI is marked as usable only by the default AP
    551   queue device drivers; otherwise, the APQI is usable by the vfio_ap device
    552   driver.
    553
    554   Take, for example, the following mask::
    555
    556      0x7dffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
    557
   It indicates that 1-5 and 7-255 belong to the default drivers' pool, and that
   0 and 6 belong to the vfio_ap device driver's pool (0x7d is 01111101 in
   binary).
    562
    563   The APQN of each AP queue device assigned to the linux host is checked by the
    564   AP bus against the set of APQNs derived from the cross product of APIDs
    565   and APQIs marked as usable only by the default AP queue device drivers. If a
   match is detected, only the default AP queue device drivers will be probed;
    567   otherwise, the vfio_ap device driver will be probed.
    568
    569   By default, the two masks are set to reserve all APQNs for use by the default
    570   AP queue device drivers. There are two ways the default masks can be changed:
    571
    572   1. The sysfs mask files can be edited by echoing a string into the
    573      respective sysfs mask file in one of two formats:
    574
    575      * An absolute hex string starting with 0x - like "0x12345678" - sets
    576	the mask. If the given string is shorter than the mask, it is padded
    577	with 0s on the right; for example, specifying a mask value of 0x41 is
    578	the same as specifying::
    579
    580	   0x4100000000000000000000000000000000000000000000000000000000000000
    581
    582	Keep in mind that the mask reads from left to right (i.e., most
    583	significant to least significant bit in big endian order), so the mask
    584	above identifies device numbers 1 and 7 (01000001).
    585
    586	If the string is longer than the mask, the operation is terminated with
    587	an error (EINVAL).
    588
      * Individual bits in the mask can be switched on and off by specifying
        each bit number to be switched in a comma separated list. Each bit
        number string must be prepended with a plus ('+') or minus ('-') to
        indicate the corresponding bit is to be switched on ('+') or off ('-').
        Some valid values are:

           - "+0"    switches bit 0 on
           - "-13"   switches bit 13 off
           - "+0x41" switches bit 65 on
           - "-0xff" switches bit 255 off

        The following example::

              +0,-6,+0x47,-0xf0

        switches bits 0 and 71 (0x47) on and switches bits 6 and 240 (0xf0) off.

        Note that the bits not specified in the list remain as they were before
        the operation.
    610
   2. The masks can also be changed at boot time via parameters on the kernel
      command line like this::

         ap.apmask=0xffff ap.aqmask=0x40

      This would create the following masks::

         apmask:
         0xffff000000000000000000000000000000000000000000000000000000000000

         aqmask:
         0x4000000000000000000000000000000000000000000000000000000000000000

      Resulting in these two pools::

         default drivers pool:    adapter 0-15, domain 1
         alternate drivers pool:  adapter 16-255, domains 0, 2-255
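The two mask formats described in this step can be modelled in a few lines. What follows is a rough sketch of the rules as stated above (right padding with 0s, left-to-right bit numbering, '+'/'-' switches), not the AP bus parser itself; the names ``apply_mask_spec`` and ``set_bits`` are hypothetical.

```python
MASK_BITS = 256
HEX_DIGITS = MASK_BITS // 4


def bit(n):
    """Integer value of mask bit n, where bit 0 is the leftmost bit."""
    return 1 << (MASK_BITS - 1 - n)


def apply_mask_spec(current, spec):
    """Return the mask that results from writing spec over the current mask."""
    spec = spec.strip()
    if not spec:
        raise ValueError("empty mask specification")
    if spec[0] in "+-":
        # Relative format: only the listed bits change
        mask = current
        for item in spec.split(","):
            item = item.strip()
            if len(item) < 2 or item[0] not in "+-":
                raise ValueError(f"bad bit specification: {item!r}")
            n = int(item[1:], 0)  # accepts decimal and 0x-prefixed hex
            if not 0 <= n < MASK_BITS:
                raise ValueError(f"bit number out of range: {n}")
            mask = (mask | bit(n)) if item[0] == "+" else (mask & ~bit(n))
        return mask
    if not spec.lower().startswith("0x"):
        raise ValueError("absolute masks must start with 0x")
    digits = spec[2:]
    if not digits or len(digits) > HEX_DIGITS:
        raise ValueError("mask is empty or longer than 256 bits")
    # Absolute format: short strings are padded with 0s on the right
    return int(digits.ljust(HEX_DIGITS, "0"), 16)


def set_bits(mask):
    """List the bit numbers that are switched on."""
    return [n for n in range(MASK_BITS) if mask & bit(n)]


print(set_bits(apply_mask_spec(0, "0x41")))               # bits 1 and 7
print(set_bits(apply_mask_spec(0, "+0,-6,+0x47,-0xf0")))  # bits 0 and 71
print(set_bits(apply_mask_spec(0, "0xffff")))             # bits 0 through 15
```

Applying the relative example to an all-ones mask instead of an empty one leaves every bit on except 6 and 240, which shows that unlisted bits keep their previous value.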
    628
    629Securing the APQNs for our example
    630----------------------------------
    631   To secure the AP queues 05.0004, 05.0047, 05.00ab, 05.00ff, 06.0004, 06.0047,
    632   06.00ab, and 06.00ff for use by the vfio_ap device driver, the corresponding
    633   APQNs can either be removed from the default masks::
    634
    635      echo -5,-6 > /sys/bus/ap/apmask
    636
    637      echo -4,-0x47,-0xab,-0xff > /sys/bus/ap/aqmask
    638
    639   Or the masks can be set as follows::
    640
    641      echo 0xf9ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff \
      > /sys/bus/ap/apmask
    643
    644      echo 0xf7fffffffffffffffeffffffffffffffffffffffffeffffffffffffffffffffe \
    645      > aqmask
    646
    647   This will result in AP queues 05.0004, 05.0047, 05.00ab, 05.00ff, 06.0004,
    648   06.0047, 06.00ab, and 06.00ff getting bound to the vfio_ap device driver. The
    649   sysfs directory for the vfio_ap device driver will now contain symbolic links
    650   to the AP queue devices bound to it::
    651
    652     /sys/bus/ap
    653     ... [drivers]
    654     ...... [vfio_ap]
    655     ......... [05.0004]
    656     ......... [05.0047]
    657     ......... [05.00ab]
    658     ......... [05.00ff]
    659     ......... [06.0004]
    660     ......... [06.0047]
    661     ......... [06.00ab]
    662     ......... [06.00ff]
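
   A quick way to confirm the result is to derive the expected queue names
   and look for them in the driver directory. The bash sketch below does
   that; list_apqns and missing_queues are hypothetical helper names, and
   the directory to inspect is passed in rather than hard-coded::

      # list_apqns "ADAPTERS" "DOMAINS": print every adapter/domain
      # combination in the xx.yyyy form used for AP queue devices.
      list_apqns() {
          local a d
          for a in $1; do
              for d in $2; do
                  printf '%02x.%04x\n' "$a" "$d"
              done
          done
      }

      # missing_queues DIR APQN...: print each APQN without an entry in DIR.
      # Returns 0 only if all of the given APQNs are present.
      missing_queues() {
          local dir=$1 q rc=0
          shift
          for q in "$@"; do
              if [ ! -e "$dir/$q" ]; then
                  echo "$q"
                  rc=1
              fi
          done
          return $rc
      }

   For the example configuration this could be invoked as
   missing_queues /sys/bus/ap/drivers/vfio_ap $(list_apqns "5 6" "4 0x47 0xab 0xff"),
   which prints nothing when all eight queues are bound.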

   Keep in mind that only type 10 and newer adapters (i.e., CEX4 and later)
   can be bound to the vfio_ap device driver. This restriction keeps the
   implementation simple: older devices will go out of service in the
   relatively near future, and few older systems remain on which to test them.

   The administrator, therefore, must take care to secure only AP queues that
   can be bound to the vfio_ap device driver. The device type for a given AP
   queue device can be read from the parent card's sysfs directory. For example,
   to see the hardware type of the queue 05.0004::

     cat /sys/bus/ap/devices/card05/hwtype

   The hwtype must be 10 or higher (CEX4 or newer) in order to be bound to the
   vfio_ap device driver.
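
   The check can be scripted for a list of queues. The following bash sketch
   assumes that the hwtype attribute contains a plain decimal number; the
   function name queue_is_bindable is hypothetical, and the devices directory
   is passed in as an argument::

      # queue_is_bindable DEVICES_DIR APQN
      # Returns 0 if the parent card of APQN (for example 05.0004 -> card05)
      # reports a hardware type of 10 or higher, 1 if the type is lower,
      # and 2 if the hwtype attribute cannot be read.
      queue_is_bindable() {
          local devices=$1 apqn=$2 hwtype
          hwtype=$(cat "$devices/card${apqn%%.*}/hwtype" 2>/dev/null) || return 2
          [ "$hwtype" -ge 10 ]
      }

   For example, queue_is_bindable /sys/bus/ap/devices 05.0004 succeeds only
   if card05 is a CEX4 or newer adapter.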

3. Create the mediated devices needed to configure the AP matrixes for the
   three guests and to provide an interface to the vfio_ap driver for
   use by the guests::

     /sys/devices/vfio_ap/matrix/
     --- [mdev_supported_types]
     ------ [vfio_ap-passthrough] (passthrough mediated matrix device type)
     --------- create
     --------- [devices]

   To create the mediated devices for the three guests::

        uuidgen > create
        uuidgen > create
        uuidgen > create

   or::

        echo $uuid1 > create
        echo $uuid2 > create
        echo $uuid3 > create

   This will create three mediated devices in the [devices] subdirectory, each
   named after the UUID written to the create attribute file. We call them
   $uuid1, $uuid2 and $uuid3, and this is the sysfs directory structure after
   creation::

     /sys/devices/vfio_ap/matrix/
     --- [mdev_supported_types]
     ------ [vfio_ap-passthrough]
     --------- [devices]
     ------------ [$uuid1]
     --------------- assign_adapter
     --------------- assign_control_domain
     --------------- assign_domain
     --------------- matrix
     --------------- unassign_adapter
     --------------- unassign_control_domain
     --------------- unassign_domain

     ------------ [$uuid2]
     --------------- assign_adapter
     --------------- assign_control_domain
     --------------- assign_domain
     --------------- matrix
     --------------- unassign_adapter
     --------------- unassign_control_domain
     --------------- unassign_domain

     ------------ [$uuid3]
     --------------- assign_adapter
     --------------- assign_control_domain
     --------------- assign_domain
     --------------- matrix
     --------------- unassign_adapter
     --------------- unassign_control_domain
     --------------- unassign_domain

4. The administrator now needs to configure the matrixes for the mediated
   devices $uuid1 (for Guest1), $uuid2 (for Guest2) and $uuid3 (for Guest3).

   This is how the matrix is configured for Guest1::

      echo 5 > assign_adapter
      echo 6 > assign_adapter
      echo 4 > assign_domain
      echo 0xab > assign_domain

   Control domains can similarly be assigned using the assign_control_domain
   sysfs file.

   If a mistake is made configuring an adapter, domain or control domain,
   you can use the unassign_xxx files to unassign the adapter, domain or
   control domain.

   To display the matrix configuration for Guest1::

      cat matrix

   This is how the matrix is configured for Guest2::

      echo 5 > assign_adapter
      echo 0x47 > assign_domain
      echo 0xff > assign_domain

   This is how the matrix is configured for Guest3::

      echo 6 > assign_adapter
      echo 0x47 > assign_domain
      echo 0xff > assign_domain
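
   The three configurations differ only in their adapter and domain lists,
   so the echo commands can be wrapped in a small helper. This is a bash
   sketch; configure_matrix is a hypothetical name, and the mediated device
   directory is passed as the first argument::

      # configure_matrix MDEV_DIR "ADAPTERS" "DOMAINS"
      # Writes each adapter number to assign_adapter and each domain number
      # to assign_domain, stopping at the first write that fails.
      configure_matrix() {
          local mdev=$1 id
          for id in $2; do
              echo "$id" > "$mdev/assign_adapter" || return 1
          done
          for id in $3; do
              echo "$id" > "$mdev/assign_domain" || return 1
          done
      }

   Guest1 from the example would then be configured with
   configure_matrix "$mdev" "5 6" "4 0xab", where $mdev stands for the
   sysfs directory of the mediated device $uuid1.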

   In order to successfully assign an adapter:

   * The adapter number specified must represent a value from 0 up to the
     maximum adapter number configured for the system. If an adapter number
     higher than the maximum is specified, the operation will terminate with
     an error (ENODEV).

   * All APQNs that can be derived from the adapter ID and the IDs of
     the previously assigned domains must be bound to the vfio_ap device
     driver. If no domains have yet been assigned, then there must be at least
     one APQN with the specified APID bound to the vfio_ap driver. If no such
     APQNs are bound to the driver, the operation will terminate with an
     error (EADDRNOTAVAIL).

   * No APQN that can be derived from the adapter ID and the IDs of the
     previously assigned domains can be assigned to another mediated matrix
     device. If an APQN is assigned to another mediated matrix device, the
     operation will terminate with an error (EADDRINUSE).

   In order to successfully assign a domain:

   * The domain number specified must represent a value from 0 up to the
     maximum domain number configured for the system. If a domain number
     higher than the maximum is specified, the operation will terminate with
     an error (ENODEV).

   * All APQNs that can be derived from the domain ID and the IDs of
     the previously assigned adapters must be bound to the vfio_ap device
     driver. If no adapters have yet been assigned, then there must be at least
     one APQN with the specified APQI bound to the vfio_ap driver. If no such
     APQNs are bound to the driver, the operation will terminate with an
     error (EADDRNOTAVAIL).

   * No APQN that can be derived from the domain ID and the IDs of the
     previously assigned adapters can be assigned to another mediated matrix
     device. If an APQN is assigned to another mediated matrix device, the
     operation will terminate with an error (EADDRINUSE).

   In order to successfully assign a control domain, the domain number
   specified must represent a value from 0 up to the maximum domain number
   configured for the system. If a control domain number higher than the maximum
   is specified, the operation will terminate with an error (ENODEV).
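
   The adapter rules can be modelled outside the kernel to predict which
   error a given assignment would produce. The bash sketch below is only
   such a model: the function name check_assign_adapter, its argument
   layout and the order in which the checks run are assumptions, not the
   driver's implementation::

      # check_assign_adapter APID MAX_APID "DOMAINS" "BOUND" "IN_USE"
      #   DOMAINS: domain numbers already assigned to this mediated device
      #   BOUND:   APQNs (xx.yyyy) bound to the vfio_ap driver
      #   IN_USE:  APQNs assigned to other mediated matrix devices
      # Prints OK or the name of the error the rules call for.
      check_assign_adapter() {
          local apid=$(( $1 )) max_apid=$2 domains=$3
          local bound=" $(echo $4) " in_use=" $(echo $5) " d apqn prefix
          if (( apid < 0 || apid > max_apid )); then
              echo ENODEV; return 1
          fi
          prefix=$(printf '%02x' "$apid")
          if [ -z "$domains" ]; then
              # No domains yet: at least one queue of this adapter must be bound.
              case $bound in
                  *" $prefix."*) ;;
                  *) echo EADDRNOTAVAIL; return 1 ;;
              esac
          fi
          for d in $domains; do       # every derived APQN must be bound
              apqn=$(printf '%02x.%04x' "$apid" "$d")
              case $bound in
                  *" $apqn "*) ;;
                  *) echo EADDRNOTAVAIL; return 1 ;;
              esac
          done
          for d in $domains; do       # and must not belong to another device
              apqn=$(printf '%02x.%04x' "$apid" "$d")
              case $in_use in
                  *" $apqn "*) echo EADDRINUSE; return 1 ;;
              esac
          done
          echo OK
      }

   The same structure applies to domain assignment with the roles of the
   adapter and domain numbers exchanged.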

5. Start Guest1::

     /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on \
        -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid1 ...

6. Start Guest2::

     /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on \
        -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid2 ...

7. Start Guest3::

     /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on \
        -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid3 ...

When a guest is shut down, its mediated matrix device may be removed.

Using our example again, to remove the mediated matrix device $uuid1::

   /sys/devices/vfio_ap/matrix/
      --- [mdev_supported_types]
      ------ [vfio_ap-passthrough]
      --------- [devices]
      ------------ [$uuid1]
      --------------- remove

::

   echo 1 > remove

This will remove all of the mdev matrix device's sysfs structures including
the mdev device itself. To recreate and reconfigure the mdev matrix device,
all of the steps starting with step 3 will have to be performed again. Note
that the removal will fail if a guest using the mdev is still running.
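
Because the removal can fail, a script should check the result of the write
rather than assume success. A minimal bash sketch, with remove_mdev as a
hypothetical helper name and the mediated device directory passed as its
argument::

   # remove_mdev MDEV_DIR: write 1 to the remove attribute and report the
   # outcome. Returns 1 if the write is rejected.
   remove_mdev() {
       local mdev=$1
       if echo 1 > "$mdev/remove"; then
           echo "removed $(basename "$mdev")"
       else
           echo "could not remove $(basename "$mdev"); is a guest still using it?" >&2
           return 1
       fi
   }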

It is not necessary to remove an mdev matrix device, but one may want to
remove it if no guest will use it during the remaining lifetime of the linux
host. If the mdev matrix device is removed, one may want to also reconfigure
the pool of adapters and queues reserved for use by the default drivers.

Limitations
===========
* The KVM/kernel interfaces do not provide a way to prevent an APQN from being
  restored to the default drivers pool while its queue is still assigned to a
  mediated device in use by a guest. It is incumbent upon the administrator to
  ensure that no mediated device to which the APQN is assigned is in use by a
  guest, lest the host be given access to the private data of the AP queue
  device, such as a private key configured specifically for the guest.

* Dynamically modifying the AP matrix for a running guest (which would amount to
  hot(un)plug of AP devices for the guest) is currently not supported.

* Live guest migration is not supported for guests using AP devices.
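
Given the first limitation, an administrator may want to check whether an
APQN is still assigned to any mediated device before returning it to the
default drivers pool. The bash sketch below assumes that each matrix
attribute lists one APQN per line in the xx.yyyy form; mdevs_with_apqn is a
hypothetical helper name. It cannot tell whether a guest is currently using
the device, so any match should be treated as a reason to stop::

   # mdevs_with_apqn DEVICES_DIR APQN
   # Prints the name of every mediated device directory under DEVICES_DIR
   # whose matrix attribute contains APQN. Returns 0 if one was found.
   mdevs_with_apqn() {
       local devices=$1 apqn=$2 m found=1
       for m in "$devices"/*/; do
           [ -r "${m}matrix" ] || continue
           if grep -qxF "$apqn" "${m}matrix"; then
               basename "$m"
               found=0
           fi
       done
       return $found
   }

The directory to pass is the [devices] directory shown in the sysfs tree for
the vfio_ap-passthrough type.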