      1Adjunct Processor (AP) Device
      2=============================
      3
      4.. contents::
      5
      6Introduction
      7------------
      8
The IBM Adjunct Processor (AP) Cryptographic Facility comprises
three AP instructions and from 1 to 256 PCIe cryptographic adapter cards.
These AP devices provide cryptographic functions to all CPUs assigned to a
Linux system running in an IBM Z system LPAR.
     13
     14On s390x, AP adapter cards are exposed via the AP bus. This document
     15describes how those cards may be made available to KVM guests using the
     16VFIO mediated device framework.
     17
     18AP Architectural Overview
     19-------------------------
     20
To understand the terminology used in the rest of this document, let's
start with some definitions:
     23
     24* AP adapter
     25
     26  An AP adapter is an IBM Z adapter card that can perform cryptographic
     27  functions. There can be from 0 to 256 adapters assigned to an LPAR depending
     28  on the machine model. Adapters assigned to the LPAR in which a linux host is
     29  running will be available to the linux host. Each adapter is identified by a
     30  number from 0 to 255; however, the maximum adapter number allowed is
     31  determined by machine model. When installed, an AP adapter is accessed by
     32  AP instructions executed by any CPU.
     33
     34* AP domain
     35
     36  An adapter is partitioned into domains. Each domain can be thought of as
     37  a set of hardware registers for processing AP instructions. An adapter can
     38  hold up to 256 domains; however, the maximum domain number allowed is
     39  determined by machine model. Each domain is identified by a number from 0 to
     40  255. Domains can be further classified into two types:
     41
     42    * Usage domains are domains that can be accessed directly to process AP
     43      commands
     44
     45    * Control domains are domains that are accessed indirectly by AP
     46      commands sent to a usage domain to control or change the domain; for
     47      example, to set a secure private key for the domain.
     48
     49* AP Queue
     50
  An AP queue is the means by which an AP command-request message is sent to an
  AP usage domain inside a specific AP adapter. An AP queue is identified by a
  tuple consisting of an AP adapter ID (APID) and an AP queue index (APQI). The
  APQI corresponds to a given usage domain number within the adapter. This tuple
  forms an AP Queue Number (APQN) uniquely identifying an AP queue. AP
  instructions include a field containing the APQN to identify the AP queue to
  which the AP command-request message is to be sent for processing.
     58
     59* AP Instructions:
     60
     61  There are three AP instructions:
     62
     63  * NQAP: to enqueue an AP command-request message to a queue
     64  * DQAP: to dequeue an AP command-reply message from a queue
     65  * PQAP: to administer the queues
     66
     67  AP instructions identify the domain that is targeted to process the AP
     68  command; this must be one of the usage domains. An AP command may modify a
     69  domain that is not one of the usage domains, but the modified domain
     70  must be one of the control domains.
     71
     72Start Interpretive Execution (SIE) Instruction
     73----------------------------------------------
     74
     75A KVM guest is started by executing the Start Interpretive Execution (SIE)
     76instruction. The SIE state description is a control block that contains the
     77state information for a KVM guest and is supplied as input to the SIE
     78instruction. The SIE state description contains a satellite control block called
     79the Crypto Control Block (CRYCB). The CRYCB contains three fields to identify
     80the adapters, usage domains and control domains assigned to the KVM guest:
     81
     82* The AP Mask (APM) field is a bit mask that identifies the AP adapters assigned
     83  to the KVM guest. Each bit in the mask, from left to right, corresponds to
     84  an APID from 0-255. If a bit is set, the corresponding adapter is valid for
     85  use by the KVM guest.
     86
* The AP Queue Mask (AQM) field is a bit mask identifying the AP usage domains
  assigned to the KVM guest. Each bit in the mask, from left to right,
  corresponds to an AP queue index (APQI) from 0-255. If a bit is set, the
  corresponding queue is valid for use by the KVM guest.
     91
* The AP Domain Mask (ADM) field is a bit mask that identifies the AP control
  domains assigned to the KVM guest. The ADM bit mask controls which domains
  can be changed by an AP command-request message sent to a usage domain from
  the guest. Each bit in the mask, from left to right, corresponds to a domain
  from 0-255. If a bit is set, the corresponding domain can be modified by an
  AP command-request message sent to a usage domain.
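
The bit numbering shared by the three masks can be illustrated with a short
Python sketch. It assumes only what is stated above: each mask is 256 bits wide
and bit 0 is the leftmost (most significant) bit. The helper name
``mask_from_ids`` is invented for this illustration and does not correspond to
any kernel or QEMU interface.

.. code-block:: python

   MASK_BITS = 256  # width of the APM, AQM and ADM as described above


   def mask_from_ids(ids):
       """Return a 256-bit integer with one bit set per ID (bit 0 is leftmost)."""
       mask = 0
       for i in ids:
           if not 0 <= i < MASK_BITS:
               raise ValueError(f"ID {i} is outside the range 0-255")
           # ID 0 maps to the most significant bit, ID 255 to the least.
           mask |= 1 << (MASK_BITS - 1 - i)
       return mask


   # Adapters 1 and 2 expressed as an APM-style mask.
   apm = mask_from_ids({1, 2})
   print(f"0x{apm:064x}")

For adapters 1 and 2 the first byte of the printed mask is ``0x60`` (binary
``01100000``), and all remaining bytes are zero.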
     98
Recall from the description of an AP queue that AP instructions include
an APQN to identify the AP adapter and AP queue to which an AP command-request
message is to be sent (NQAP and PQAP instructions), or from which a
command-reply message is to be received (DQAP instruction). The set of valid
APQNs is defined by the matrix calculated from the APM and AQM; it is the
cross product of all assigned adapter numbers (APM) with all assigned queue
indexes (AQM). For example, if adapters 1 and 2 and usage domains 5 and 6 are
assigned to a guest, the APQNs (1,5), (1,6), (2,5) and (2,6) will be valid for
the guest.
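
The cross product described above can be written out in a few lines of Python.
This is only a sketch of the arithmetic; the function name ``valid_apqns`` is
made up for the example.

.. code-block:: python

   from itertools import product


   def valid_apqns(adapters, domains):
       """Return every (APID, APQI) pair formed from the assigned IDs."""
       return sorted(product(adapters, domains))


   # Adapters 1 and 2 with usage domains 5 and 6, as in the text above.
   print(valid_apqns({1, 2}, {5, 6}))

Note that the matrix is empty if either the adapter set or the domain set is
empty, since a cross product with an empty set has no elements.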
    108
The APQNs can provide secure key functionality - i.e., a private key is stored
on the adapter card for each of its domains - so each APQN must be assigned
either to at most one guest or to the Linux host, never to both.
    112
    113Example 1: Valid configuration
    114~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    115
    116+----------+--------+--------+
    117|          | Guest1 | Guest2 |
    118+==========+========+========+
    119| adapters |  1, 2  |  1, 2  |
    120+----------+--------+--------+
    121| domains  |  5, 6  |  7     |
    122+----------+--------+--------+
    123
    124This is valid because both guests have a unique set of APQNs:
    125
    126* Guest1 has APQNs (1,5), (1,6), (2,5) and (2,6);
    127* Guest2 has APQNs (1,7) and (2,7).
    128
    129Example 2: Valid configuration
    130~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    131
    132+----------+--------+--------+
    133|          | Guest1 | Guest2 |
    134+==========+========+========+
    135| adapters |  1, 2  |  3, 4  |
    136+----------+--------+--------+
    137| domains  |  5, 6  |  5, 6  |
    138+----------+--------+--------+
    139
    140This is also valid because both guests have a unique set of APQNs:
    141
* Guest1 has APQNs (1,5), (1,6), (2,5) and (2,6);
* Guest2 has APQNs (3,5), (3,6), (4,5) and (4,6).
    144
    145Example 3: Invalid configuration
    146~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    147
    148+----------+--------+--------+
    149|          | Guest1 | Guest2 |
    150+==========+========+========+
    151| adapters |  1, 2  |  1     |
    152+----------+--------+--------+
    153| domains  |  5, 6  |  6, 7  |
    154+----------+--------+--------+
    155
    156This is an invalid configuration because both guests have access to
    157APQN (1,6).
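
The three examples can be checked mechanically by intersecting the APQN sets of
the two guests. The following Python sketch does that; ``shared_apqns`` is a
name chosen for this illustration only.

.. code-block:: python

   from itertools import product


   def apqns(adapters, domains):
       """Return the set of (APID, APQI) pairs for one guest."""
       return set(product(adapters, domains))


   def shared_apqns(guest_a, guest_b):
       """Return the APQNs that both guests would have access to."""
       return apqns(*guest_a) & apqns(*guest_b)


   # Each guest is given as (adapters, domains), taken from the tables above.
   example1 = shared_apqns(({1, 2}, {5, 6}), ({1, 2}, {7}))
   example2 = shared_apqns(({1, 2}, {5, 6}), ({3, 4}, {5, 6}))
   example3 = shared_apqns(({1, 2}, {5, 6}), ({1}, {6, 7}))
   print(example1, example2, example3)

A configuration is valid when the intersection is empty. Only the third
example produces a non-empty result, which contains the APQN (1,6).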
    158
    159AP Matrix Configuration on Linux Host
    160-------------------------------------
    161
A Linux system is a guest of the LPAR in which it is running and has access to
the AP resources configured for the LPAR. The LPAR's AP matrix is
configured via its Activation Profile, which can be edited on the HMC. When the
Linux system is started, the AP bus will detect the AP devices assigned to the
LPAR and create the following in sysfs::
    167
    168  /sys/bus/ap
    169  ... [devices]
    170  ...... xx.yyyy
    171  ...... ...
    172  ...... cardxx
    173  ...... ...
    174
    175Where:
    176
    177``cardxx``
    178  is AP adapter number xx (in hex)
    179
    180``xx.yyyy``
    181  is an APQN with xx specifying the APID and yyyy specifying the APQI
    182
    183For example, if AP adapters 5 and 6 and domains 4, 71 (0x47), 171 (0xab) and
    184255 (0xff) are configured for the LPAR, the sysfs representation on the linux
    185host system would look like this::
    186
    187  /sys/bus/ap
    188  ... [devices]
    189  ...... 05.0004
    190  ...... 05.0047
    191  ...... 05.00ab
    192  ...... 05.00ff
    193  ...... 06.0004
    194  ...... 06.0047
    195  ...... 06.00ab
    196  ...... 06.00ff
    197  ...... card05
    198  ...... card06
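
The naming scheme can be reproduced with ordinary string formatting. The
following Python sketch assumes only the format described above (a two-digit
hex APID and a four-digit hex APQI); the function names are illustrative.

.. code-block:: python

   def queue_name(apid, apqi):
       """Format an APQN the way it appears under /sys/bus/ap/devices."""
       return f"{apid:02x}.{apqi:04x}"


   def card_name(apid):
       """Format the sysfs name of an AP adapter card."""
       return f"card{apid:02x}"


   adapters = (5, 6)
   domains = (4, 71, 171, 255)
   names = [queue_name(a, d) for a in adapters for d in domains]
   names += [card_name(a) for a in adapters]
   print("\n".join(names))

The printed names match the entries in the listing above.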
    199
A set of default device drivers is also created to control each type of AP
device that can be assigned to the LPAR on which a Linux host is running::
    202
    203  /sys/bus/ap
    204  ... [drivers]
    205  ...... [cex2acard]        for Crypto Express 2/3 accelerator cards
    206  ...... [cex2aqueue]       for AP queues served by Crypto Express 2/3
    207                            accelerator cards
    208  ...... [cex4card]         for Crypto Express 4/5/6 accelerator and coprocessor
    209                            cards
    210  ...... [cex4queue]        for AP queues served by Crypto Express 4/5/6
    211                            accelerator and coprocessor cards
    212  ...... [pcixcccard]       for Crypto Express 2/3 coprocessor cards
    213  ...... [pcixccqueue]      for AP queues served by Crypto Express 2/3
    214                            coprocessor cards
    215
    216Binding AP devices to device drivers
    217~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    218
    219There are two sysfs files that specify bitmasks marking a subset of the APQN
    220range as 'usable by the default AP queue device drivers' or 'not usable by the
    221default device drivers' and thus available for use by the alternate device
    222driver(s). The sysfs locations of the masks are::
    223
    224   /sys/bus/ap/apmask
    225   /sys/bus/ap/aqmask
    226
    227The ``apmask`` is a 256-bit mask that identifies a set of AP adapter IDs
    228(APID). Each bit in the mask, from left to right (i.e., from most significant
    229to least significant bit in big endian order), corresponds to an APID from
    2300-255. If a bit is set, the APID is marked as usable only by the default AP
    231queue device drivers; otherwise, the APID is usable by the vfio_ap
    232device driver.
    233
    234The ``aqmask`` is a 256-bit mask that identifies a set of AP queue indexes
    235(APQI). Each bit in the mask, from left to right (i.e., from most significant
    236to least significant bit in big endian order), corresponds to an APQI from
    2370-255. If a bit is set, the APQI is marked as usable only by the default AP
    238queue device drivers; otherwise, the APQI is usable by the vfio_ap device
    239driver.
    240
    241Take, for example, the following mask::
    242
    243      0x7dffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
    244
    245It indicates:
    246
    247      1, 2, 3, 4, 5, and 7-255 belong to the default drivers' pool, and 0 and 6
    248      belong to the vfio_ap device driver's pool.
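
To double-check a mask by hand, it can help to list its set bits. The Python
sketch below does this for the example mask, reading bit 0 as the leftmost bit.
The helper ``set_bits`` is a name invented for this example.

.. code-block:: python

   def set_bits(mask_hex):
       """Return the bit numbers set in a hex mask string (bit 0 is leftmost)."""
       digits = mask_hex[2:] if mask_hex.lower().startswith("0x") else mask_hex
       value = int(digits, 16)
       width = 4 * len(digits)  # each hex digit carries four bits
       return [bit for bit in range(width) if (value >> (width - 1 - bit)) & 1]


   mask = "0x7d" + "ff" * 31  # the 256-bit example mask from the text
   default_pool = set_bits(mask)
   vfio_pool = [bit for bit in range(256) if bit not in default_pool]
   print(vfio_pool)

The leading byte ``0x7d`` is ``01111101`` in binary, so bits 0 and 6 are the
only ones left clear.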
    249
The APQN of each AP queue device assigned to the Linux host is checked by the
AP bus against the set of APQNs derived from the cross product of APIDs
and APQIs marked as usable only by the default AP queue device drivers. If a
match is detected, only the default AP queue device drivers will be probed;
otherwise, the vfio_ap device driver will be probed.
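
The probing rule can be modelled as a simple membership test. The following
Python sketch is a model of the rule as worded above, not the AP bus code; the
function name and the example pools are assumptions made for illustration.

.. code-block:: python

   def probed_driver(apid, apqi, default_apids, default_apqis):
       """Return which driver pool is probed for the queue (apid, apqi)."""
       # A queue matches only if BOTH of its IDs are marked for the defaults.
       if apid in default_apids and apqi in default_apqis:
           return "default"
       return "vfio_ap"


   # Hypothetical masks: APIDs 5 and 6 and APQI 4 were cleared from the
   # default pools; every other bit is still set.
   default_apids = set(range(256)) - {5, 6}
   default_apqis = set(range(256)) - {4}

   print(probed_driver(1, 7, default_apids, default_apqis))
   print(probed_driver(5, 7, default_apids, default_apqis))

Because the match is made against a cross product, clearing a single APID is
enough to move every queue on that adapter out of the default pool.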
    255
    256By default, the two masks are set to reserve all APQNs for use by the default
    257AP queue device drivers. There are two ways the default masks can be changed:
    258
    259 1. The sysfs mask files can be edited by echoing a string into the
    260    respective sysfs mask file in one of two formats:
    261
    262    * An absolute hex string starting with 0x - like "0x12345678" - sets
    263      the mask. If the given string is shorter than the mask, it is padded
    264      with 0s on the right; for example, specifying a mask value of 0x41 is
    265      the same as specifying::
    266
    267           0x4100000000000000000000000000000000000000000000000000000000000000
    268
    269      Keep in mind that the mask reads from left to right (i.e., most
    270      significant to least significant bit in big endian order), so the mask
    271      above identifies device numbers 1 and 7 (``01000001``).
    272
    273      If the string is longer than the mask, the operation is terminated with
    274      an error (EINVAL).
    275
    * Individual bits in the mask can be switched on and off by specifying
      each bit number to be switched in a comma separated list. Each bit
      number string must be prepended with a plus (``+``) or minus (``-``) to
      indicate the corresponding bit is to be switched on (``+``) or off
      (``-``). Some valid values are::
    281
    282           "+0"    switches bit 0 on
    283           "-13"   switches bit 13 off
    284           "+0x41" switches bit 65 on
    285           "-0xff" switches bit 255 off
    286
    287      The following example::
    288
    289              +0,-6,+0x47,-0xf0
    290
      switches bits 0 and 71 (0x47) on, and bits 6 and 240 (0xf0) off.
    293
    294      Note that the bits not specified in the list remain as they were before
    295      the operation.
    296
    297 2. The masks can also be changed at boot time via parameters on the kernel
    298    command line like this::
    299
    300         ap.apmask=0xffff ap.aqmask=0x40
    301
    302    This would create the following masks:
    303
    304    apmask::
    305
    306            0xffff000000000000000000000000000000000000000000000000000000000000
    307
    308    aqmask::
    309
    310            0x4000000000000000000000000000000000000000000000000000000000000000
    311
    312    Resulting in these two pools::
    313
    314            default drivers pool:    adapter 0-15, domain 1
    315            alternate drivers pool:  adapter 16-255, domains 0, 2-255
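
Both mask formats can be modelled in Python to predict the effect of a write
before touching sysfs. The sketch below is an approximation built from the
rules above (right-padding with zeros, rejection of overlong strings, and
``+``/``-`` bit switching); it is not the kernel's parser, and the function
names are invented for the example.

.. code-block:: python

   MASK_BITS = 256


   def apply_mask_update(mask, update):
       """Return the new mask after writing `update` (bit 0 is leftmost)."""
       update = update.strip()
       if update.lower().startswith("0x"):
           digits = update[2:]
           if len(digits) > MASK_BITS // 4:
               # The kernel reports EINVAL for this case.
               raise ValueError("mask string is longer than the mask")
           # Shorter strings are padded with zeros on the right.
           return int(digits.ljust(MASK_BITS // 4, "0"), 16)
       for item in update.split(","):
           if len(item) < 2 or item[0] not in "+-":
               raise ValueError(f"malformed entry: {item!r}")
           number = int(item[1:], 0)  # accepts decimal and 0x-prefixed hex
           if not 0 <= number < MASK_BITS:
               raise ValueError(f"bit {number} is outside 0-255")
           bit = 1 << (MASK_BITS - 1 - number)
           mask = mask | bit if item[0] == "+" else mask & ~bit
       return mask


   def set_bits(mask):
       """Return the list of bit numbers that are set in the mask."""
       return [b for b in range(MASK_BITS) if (mask >> (MASK_BITS - 1 - b)) & 1]


   all_set = (1 << MASK_BITS) - 1  # the default: everything reserved
   print(set_bits(apply_mask_update(all_set, "0x41")))

Applied to the boot-time example, ``0xffff`` leaves bits 0-15 set and ``0x40``
leaves only bit 1 set, which matches the two pools listed above.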
    316
    317Configuring an AP matrix for a linux guest
    318~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    319
    320The sysfs interfaces for configuring an AP matrix for a guest are built on the
    321VFIO mediated device framework. To configure an AP matrix for a guest, a
    322mediated matrix device must first be created for the ``/sys/devices/vfio_ap/matrix``
    323device. When the vfio_ap device driver is loaded, it registers with the VFIO
mediated device framework. When the driver registers, the sysfs interfaces for
creating mediated matrix devices are created::
    326
    327  /sys/devices
    328  ... [vfio_ap]
    329  ......[matrix]
    330  ......... [mdev_supported_types]
    331  ............ [vfio_ap-passthrough]
    332  ............... create
    333  ............... [devices]
    334
    335A mediated AP matrix device is created by writing a UUID to the attribute file
    336named ``create``, for example::
    337
    338   uuidgen > create
    339
    340or
    341
    342::
    343
    344   echo $uuid > create
    345
    346When a mediated AP matrix device is created, a sysfs directory named after
    347the UUID is created in the ``devices`` subdirectory::
    348
    349  /sys/devices
    350  ... [vfio_ap]
    351  ......[matrix]
    352  ......... [mdev_supported_types]
    353  ............ [vfio_ap-passthrough]
    354  ............... create
    355  ............... [devices]
    356  .................. [$uuid]
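
The same step can be scripted. The following Python sketch writes a freshly
generated UUID to the ``create`` attribute; the default path is assembled from
the sysfs tree shown above, and the function name ``create_matrix_mdev`` is an
assumption made for this example. Writing to sysfs normally requires root
privileges.

.. code-block:: python

   import uuid
   from pathlib import Path

   # Path assembled from the sysfs layout shown above.
   CREATE_FILE = Path(
       "/sys/devices/vfio_ap/matrix/mdev_supported_types"
       "/vfio_ap-passthrough/create"
   )


   def create_matrix_mdev(create_file=CREATE_FILE, mdev_uuid=None):
       """Write a UUID to the create attribute and return the UUID used."""
       mdev_uuid = mdev_uuid or str(uuid.uuid4())
       Path(create_file).write_text(mdev_uuid + "\n")
       return mdev_uuid

Calling ``create_matrix_mdev()`` on a host with the vfio_ap driver loaded
should have the same effect as ``uuidgen > create``. The returned UUID is the
name of the new directory under ``devices``.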
    357
    358There will also be three sets of attribute files created in the mediated
    359matrix device's sysfs directory to configure an AP matrix for the
    360KVM guest::
    361
    362  /sys/devices
    363  ... [vfio_ap]
    364  ......[matrix]
    365  ......... [mdev_supported_types]
    366  ............ [vfio_ap-passthrough]
    367  ............... create
    368  ............... [devices]
    369  .................. [$uuid]
    370  ..................... assign_adapter
    371  ..................... assign_control_domain
    372  ..................... assign_domain
    373  ..................... matrix
    374  ..................... unassign_adapter
    375  ..................... unassign_control_domain
    376  ..................... unassign_domain
    377
    378``assign_adapter``
    379   To assign an AP adapter to the mediated matrix device, its APID is written
    380   to the ``assign_adapter`` file. This may be done multiple times to assign more
    381   than one adapter. The APID may be specified using conventional semantics
    382   as a decimal, hexadecimal, or octal number. For example, to assign adapters
    383   4, 5 and 16 to a mediated matrix device in decimal, hexadecimal and octal
    384   respectively::
    385
    386       echo 4 > assign_adapter
    387       echo 0x5 > assign_adapter
    388       echo 020 > assign_adapter
    389
    390   In order to successfully assign an adapter:
    391
    392   * The adapter number specified must represent a value from 0 up to the
    393     maximum adapter number allowed by the machine model. If an adapter number
    394     higher than the maximum is specified, the operation will terminate with
    395     an error (ENODEV).
    396
    397   * All APQNs that can be derived from the adapter ID being assigned and the
    398     IDs of the previously assigned domains must be bound to the vfio_ap device
    399     driver. If no domains have yet been assigned, then there must be at least
    400     one APQN with the specified APID bound to the vfio_ap driver. If no such
    401     APQNs are bound to the driver, the operation will terminate with an
    402     error (EADDRNOTAVAIL).
    403
    404   * No APQN that can be derived from the adapter ID and the IDs of the
    405     previously assigned domains can be assigned to another mediated matrix
    406     device. If an APQN is assigned to another mediated matrix device, the
    407     operation will terminate with an error (EADDRINUSE).
    408
    409``unassign_adapter``
    410   To unassign an AP adapter, its APID is written to the ``unassign_adapter``
    411   file. This may also be done multiple times to unassign more than one adapter.
    412
    413``assign_domain``
    414   To assign a usage domain, the domain number is written into the
    415   ``assign_domain`` file. This may be done multiple times to assign more than one
    416   usage domain. The domain number is specified using conventional semantics as
    417   a decimal, hexadecimal, or octal number. For example, to assign usage domains
    418   4, 8, and 71 to a mediated matrix device in decimal, hexadecimal and octal
    419   respectively::
    420
    421      echo 4 > assign_domain
    422      echo 0x8 > assign_domain
    423      echo 0107 > assign_domain
    424
    425   In order to successfully assign a domain:
    426
    427   * The domain number specified must represent a value from 0 up to the
    428     maximum domain number allowed by the machine model. If a domain number
    429     higher than the maximum is specified, the operation will terminate with
    430     an error (ENODEV).
    431
   * All APQNs that can be derived from the domain ID being assigned and the IDs
     of the previously assigned adapters must be bound to the vfio_ap device
     driver. If no adapters have yet been assigned, then there must be at least
     one APQN with the specified APQI bound to the vfio_ap driver. If no such
     APQNs are bound to the driver, the operation will terminate with an
     error (EADDRNOTAVAIL).
    438
    439   * No APQN that can be derived from the domain ID being assigned and the IDs
    440     of the previously assigned adapters can be assigned to another mediated
    441     matrix device. If an APQN is assigned to another mediated matrix device,
    442     the operation will terminate with an error (EADDRINUSE).
    443
    444``unassign_domain``
    445   To unassign a usage domain, the domain number is written into the
    446   ``unassign_domain`` file. This may be done multiple times to unassign more than
    447   one usage domain.
    448
    449``assign_control_domain``
   To assign a control domain, the domain number is written into the
   ``assign_control_domain`` file. This may be done multiple times to
   assign more than one control domain. The domain number may be specified using
   conventional semantics as a decimal, hexadecimal, or octal number. For
   example, to assign control domains 4, 8, and 71 to a mediated matrix device
   in decimal, hexadecimal and octal respectively::

      echo 4 > assign_control_domain
      echo 0x8 > assign_control_domain
      echo 0107 > assign_control_domain
    460
    461   In order to successfully assign a control domain, the domain number
    462   specified must represent a value from 0 up to the maximum domain number
    463   allowed by the machine model. If a control domain number higher than the
    464   maximum is specified, the operation will terminate with an error (ENODEV).
    465
    466``unassign_control_domain``
   To unassign a control domain, the domain number is written into the
   ``unassign_control_domain`` file. This may be done multiple times to unassign
   more than one control domain.
    470
Note: No changes to the AP matrix will be allowed while a guest using
    472the mediated matrix device is running. Attempts to assign an adapter,
    473domain or control domain will be rejected and an error (EBUSY) returned.
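
The number formats and the three failure conditions for ``assign_adapter`` can
be summarized in a small Python model. This is a sketch of the rules as
described above, not the vfio_ap driver's implementation; all function and
parameter names are hypothetical. ``bound_apqns`` stands for the APQNs bound to
the vfio_ap driver and ``apqns_in_use`` for those assigned to other mediated
matrix devices.

.. code-block:: python

   import errno


   def parse_id(text):
       """Parse decimal, 0x-prefixed hex, or 0-prefixed octal (e.g. 020)."""
       text = text.strip().lower()
       if text.startswith("0x"):
           return int(text[2:], 16)
       if len(text) > 1 and text.startswith("0"):
           return int(text[1:], 8)
       return int(text, 10)


   def check_assign_adapter(apid, max_apid, assigned_domains,
                            bound_apqns, apqns_in_use):
       """Return 0 on success or the errno value described in the text."""
       if not 0 <= apid <= max_apid:
           return errno.ENODEV
       needed = {(apid, domain) for domain in assigned_domains}
       if assigned_domains:
           # Every derived APQN must be bound to the vfio_ap driver.
           if not needed <= bound_apqns:
               return errno.EADDRNOTAVAIL
       elif not any(a == apid for a, _ in bound_apqns):
           # No domains yet: at least one queue on this adapter must be bound.
           return errno.EADDRNOTAVAIL
       if needed & apqns_in_use:
           return errno.EADDRINUSE
       return 0

The custom parser is needed because Python's ``int(text, 0)`` rejects the
traditional leading-zero octal form such as ``020``. The checks for
``assign_domain`` follow the same pattern with the roles of adapter and domain
swapped.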
    474
    475Starting a Linux Guest Configured with an AP Matrix
    476~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    477
    478To provide a mediated matrix device for use by a guest, the following option
    479must be specified on the QEMU command line::
    480
   -device vfio-ap,sysfsdev=$path-to-mdev
    482
    483The sysfsdev parameter specifies the path to the mediated matrix device.
    484There are a number of ways to specify this path::
    485
    486  /sys/devices/vfio_ap/matrix/$uuid
    487  /sys/bus/mdev/devices/$uuid
    488  /sys/bus/mdev/drivers/vfio_mdev/$uuid
    489  /sys/devices/vfio_ap/matrix/mdev_supported_types/vfio_ap-passthrough/devices/$uuid
    490
    491When the linux guest is started, the guest will open the mediated
    492matrix device's file descriptor to get information about the mediated matrix
    493device. The ``vfio_ap`` device driver will update the APM, AQM, and ADM fields in
    494the guest's CRYCB with the adapter, usage domain and control domains assigned
    495via the mediated matrix device's sysfs attribute files. Programs running on the
    496linux guest will then:
    497
1. Have direct access to the APQNs derived from the cross product of the AP
   adapter numbers (APID) and queue indexes (APQI) specified in the APM and AQM
   fields of the guest's CRYCB respectively. These APQNs identify the AP queues
   that are valid for use by the guest; that is, AP commands can be sent by the
   guest to any of these queues for processing.
    503
    5042. Have authorization to process AP commands to change a control domain
    505   identified in the ADM field of the guest's CRYCB. The AP command must be sent
    506   to a valid APQN (see 1 above).
    507
    508CPU model features:
    509
    510Three CPU model features are available for controlling guest access to AP
    511facilities:
    512
    5131. AP facilities feature
    514
    515   The AP facilities feature indicates that AP facilities are installed on the
    516   guest. This feature will be exposed for use only if the AP facilities
    517   are installed on the host system. The feature is s390-specific and is
    518   represented as a parameter of the -cpu option on the QEMU command line::
    519
    520      qemu-system-s390x -cpu $model,ap=on|off
    521
    522   Where:
    523
    524      ``$model``
    525        is the CPU model defined for the guest (defaults to the model of
    526        the host system if not specified).
    527
    528      ``ap=on|off``
    529        indicates whether AP facilities are installed (on) or not
    530        (off). The default for CPU models zEC12 or newer
    531        is ``ap=on``. AP facilities must be installed on the guest if a
    532        vfio-ap device (``-device vfio-ap,sysfsdev=$path``) is configured
    533        for the guest, or the guest will fail to start.
    534
    5352. Query Configuration Information (QCI) facility
    536
    537   The QCI facility is used by the AP bus running on the guest to query the
    538   configuration of the AP facilities. This facility will be available
    539   only if the QCI facility is installed on the host system. The feature is
    540   s390-specific and is represented as a parameter of the -cpu option on the
    541   QEMU command line::
    542
    543      qemu-system-s390x -cpu $model,apqci=on|off
    544
    545   Where:
    546
    547      ``$model``
    548        is the CPU model defined for the guest
    549
    550      ``apqci=on|off``
    551        indicates whether the QCI facility is installed (on) or
    552        not (off). The default for CPU models zEC12 or newer
    553        is ``apqci=on``; for older models, QCI will not be installed.
    554
    555        If QCI is installed (``apqci=on``) but AP facilities are not
    556        (``ap=off``), an error message will be logged, but the guest
    557        will be allowed to start. It makes no sense to have QCI
    558        installed if the AP facilities are not; this is considered
    559        an invalid configuration.
    560
    561        If the QCI facility is not installed, APQNs with an APQI
    562        greater than 15 will not be detected by the AP bus
    563        running on the guest.
    564
3. Adjunct Processor Facilities Test (APFT) facility
    566
    567   The APFT facility is used by the AP bus running on the guest to test the
    568   AP facilities available for a given AP queue. This facility will be available
    569   only if the APFT facility is installed on the host system. The feature is
    570   s390-specific and is represented as a parameter of the -cpu option on the
    571   QEMU command line::
    572
    573      qemu-system-s390x -cpu $model,apft=on|off
    574
    575   Where:
    576
    577      ``$model``
    578        is the CPU model defined for the guest (defaults to the model of
    579        the host system if not specified).
    580
      ``apft=on|off``
        indicates whether the APFT facility is installed (on) or
        not (off). The default for CPU models zEC12 and
        newer is ``apft=on``; for older models, APFT will not be
        installed.
    586
    587        If APFT is installed (``apft=on``) but AP facilities are not
    588        (``ap=off``), an error message will be logged, but the guest
    589        will be allowed to start. It makes no sense to have APFT
    590        installed if the AP facilities are not; this is considered
    591        an invalid configuration.
    592
    593        It also makes no sense to turn APFT off because the AP bus
    594        running on the guest will not detect CEX4 and newer devices
    595        without it. Since only CEX4 and newer devices are supported
    596        for guest usage, no AP devices can be made accessible to a
    597        guest started without APFT installed.
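
The interaction between the three CPU model features and the ``vfio-ap``
device can be captured in a small helper that assembles the relevant QEMU
arguments. The following Python sketch is illustrative only: the function name
is invented, the CPU model ``z14`` is an arbitrary choice, and the checks
merely restate the rules given above.

.. code-block:: python

   def qemu_ap_args(model, mdev_path=None, ap=True, apqci=True, apft=True):
       """Return (args, warnings) for the AP-related QEMU options."""
       if mdev_path is not None and not ap:
           # The text above states that the guest fails to start in this case.
           raise ValueError("a vfio-ap device requires ap=on")
       warnings = []
       for name, enabled in (("apqci", apqci), ("apft", apft)):
           if enabled and not ap:
               # Logged as an error, but the guest is still allowed to start.
               warnings.append(f"{name}=on with ap=off is an invalid configuration")

       def onoff(flag):
           return "on" if flag else "off"

       cpu = f"{model},ap={onoff(ap)},apqci={onoff(apqci)},apft={onoff(apft)}"
       args = ["-cpu", cpu]
       if mdev_path is not None:
           args += ["-device", f"vfio-ap,sysfsdev={mdev_path}"]
       return args, warnings


   args, warnings = qemu_ap_args("z14", "/sys/bus/mdev/devices/$uuid")
   print(" ".join(args))

The ``$uuid`` placeholder is kept as in the rest of this document and has to
be replaced with the UUID of an existing mediated matrix device.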
    598
    599Hot plug a vfio-ap device into a running guest
    600~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    601
    602Only one vfio-ap device can be attached to the virtual machine's ap-bus, so a
    603vfio-ap device can be hot plugged if and only if no vfio-ap device is attached
    604to the bus already, whether via the QEMU command line or a prior hot plug
    605action.
    606
    607To hot plug a vfio-ap device, use the QEMU ``device_add`` command::
    608
    609    (qemu) device_add vfio-ap,sysfsdev="$path-to-mdev",id="$id"
    610
    611Where the ``$path-to-mdev`` value specifies the absolute path to a mediated
    612device to which AP resources to be used by the guest have been assigned.
    613``$id`` is the name value for the optional id parameter.
    614
    615Note that on Linux guests, the AP devices will be created in the
    616``/sys/bus/ap/devices`` directory when the AP bus subsequently performs its periodic
    617scan, so there may be a short delay before the AP devices are accessible on the
    618guest.
    619
    620The command will fail if:
    621
    622* A vfio-ap device has already been attached to the virtual machine's ap-bus.
    623
    624* The CPU model features for controlling guest access to AP facilities are not
    625  enabled (see 'CPU model features' subsection in the previous section).
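
When the guest is managed through QMP rather than the human monitor, the same
request can be sent as a JSON command. The Python sketch below only builds the
message and does not open a monitor connection. It assumes the generic QMP
``device_add`` command, which takes the device type as ``driver`` and passes
the remaining arguments as device properties; the function name is invented
for the example.

.. code-block:: python

   import json


   def device_add_command(sysfsdev, dev_id=None):
       """Build a QMP device_add message for a vfio-ap device."""
       arguments = {"driver": "vfio-ap", "sysfsdev": sysfsdev}
       if dev_id is not None:
           arguments["id"] = dev_id  # the id parameter is optional
       return json.dumps({"execute": "device_add", "arguments": arguments})


   print(device_add_command("/sys/bus/mdev/devices/$uuid", "ap0"))

The id ``ap0`` is an arbitrary example value. Supplying an id is advisable
because it is the handle used later to unplug the device.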
    626
    627Hot unplug a vfio-ap device from a running guest
    628~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    629
    630A vfio-ap device can be unplugged from a running KVM guest if a vfio-ap device
    631has been attached to the virtual machine's ap-bus via the QEMU command line
    632or a prior hot plug action.
    633
    634To hot unplug a vfio-ap device, use the QEMU ``device_del`` command::
    635
    636    (qemu) device_del "$id"
    637
    638Where ``$id`` is the same id that was specified at device creation.
    639
    640On a Linux guest, the AP devices will be removed from the ``/sys/bus/ap/devices``
    641directory on the guest when the AP bus subsequently performs its periodic scan,
    642so there may be a short delay before the AP devices are no longer accessible by
    643the guest.
    644
The command will fail if the ``$id`` specified on the ``device_del`` command
does not match the id specified when the vfio-ap device was attached to
the virtual machine's ap-bus.
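
The hot plug and hot unplug rules amount to a bus with a single slot. The
following Python class is a toy model of that behaviour, written under the
assumption that an unplug request naming a device that is not attached is
rejected; it is not derived from the QEMU source, and the class name is
hypothetical.

.. code-block:: python

   class ApBusModel:
       """Toy model of an ap-bus that holds at most one vfio-ap device."""

       def __init__(self):
           self.device_id = None  # id of the attached device, if any

       def plug(self, dev_id):
           if self.device_id is not None:
               raise RuntimeError("a vfio-ap device is already attached")
           self.device_id = dev_id

       def unplug(self, dev_id):
           if self.device_id is None or self.device_id != dev_id:
               raise RuntimeError(f"no attached vfio-ap device with id {dev_id!r}")
           self.device_id = None


   bus = ApBusModel()
   bus.plug("ap0")
   bus.unplug("ap0")
   bus.plug("ap1")  # allowed again once the slot is free

A second ``plug`` call without an intervening ``unplug`` raises an error in
this model, mirroring the first failure condition of ``device_add``.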
    648
    649Example: Configure AP Matrices for Three Linux Guests
    650-----------------------------------------------------
    651
    652Let's now provide an example to illustrate how KVM guests may be given
    653access to AP facilities. For this example, we will show how to configure
    654three guests such that executing the lszcrypt command on the guests would
    655look like this:
    656
    657Guest1::
    658
    659  CARD.DOMAIN TYPE  MODE
    660  ------------------------------
    661  05          CEX5C CCA-Coproc
    662  05.0004     CEX5C CCA-Coproc
    663  05.00ab     CEX5C CCA-Coproc
    664  06          CEX5A Accelerator
    665  06.0004     CEX5A Accelerator
  06.00ab     CEX5A Accelerator
    667
    668Guest2::
    669
    670  CARD.DOMAIN TYPE  MODE
    671  ------------------------------
    672  05          CEX5A Accelerator
    673  05.0047     CEX5A Accelerator
    674  05.00ff     CEX5A Accelerator
    675
    676Guest3::
    677
    678  CARD.DOMAIN TYPE  MODE
    679  ------------------------------
    680  06          CEX5A Accelerator
    681  06.0047     CEX5A Accelerator
    682  06.00ff     CEX5A Accelerator
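
The adapters and domains that each guest needs can be read off the queue
entries in these listings. The Python sketch below derives them from the
``xx.yyyy`` names; the function name ``matrix_from_queues`` is invented for
the example.

.. code-block:: python

   def matrix_from_queues(queue_names):
       """Return (adapters, domains) needed to expose the given AP queues."""
       adapters, domains = set(), set()
       for name in queue_names:
           apid, apqi = name.split(".")
           adapters.add(int(apid, 16))
           domains.add(int(apqi, 16))
       return sorted(adapters), sorted(domains)


   # Queue entries copied from the three listings above.
   guests = {
       "Guest1": ["05.0004", "05.00ab", "06.0004", "06.00ab"],
       "Guest2": ["05.0047", "05.00ff"],
       "Guest3": ["06.0047", "06.00ff"],
   }
   for guest, queues in guests.items():
       print(guest, matrix_from_queues(queues))

Guest1 therefore needs adapters 5 and 6 with domains 4 and 171 (0xab), while
Guest2 and Guest3 each need a single adapter with domains 71 (0x47) and
255 (0xff).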
    683
    684These are the steps:
    685
    6861. Install the vfio_ap module on the linux host. The dependency chain for the
    687   vfio_ap module is:
    688
    689   * iommu
    690   * s390
    691   * zcrypt
    692   * vfio
    693   * vfio_mdev
    694   * vfio_mdev_device
    695   * KVM
    696
    697   To build the vfio_ap module, the kernel build must be configured with the
    698   following Kconfig elements selected:
    699
    700   * IOMMU_SUPPORT
    701   * S390
    702   * ZCRYPT
    703   * S390_AP_IOMMU
   * VFIO
   * VFIO_MDEV
   * VFIO_MDEV_DEVICE
   * KVM

   If using make menuconfig, select the following to build the vfio_ap module::

     -> Device Drivers
        -> IOMMU Hardware Support
           select S390 AP IOMMU Support
        -> VFIO Non-Privileged userspace driver framework
           -> Mediated device driver framework
              -> VFIO driver for Mediated devices
     -> I/O subsystem
        -> VFIO support for AP devices

2. Secure the AP queues to be used by the three guests so that the host cannot
   access them. To secure the AP queues 05.0004, 05.0047, 05.00ab, 05.00ff,
   06.0004, 06.0047, 06.00ab, and 06.00ff for use by the vfio_ap device driver,
   the corresponding APQNs must be removed from the default queue drivers pool
   as follows::

      echo -5,-6 > /sys/bus/ap/apmask

      echo -4,-0x47,-0xab,-0xff > /sys/bus/ap/aqmask

   This will result in AP queues 05.0004, 05.0047, 05.00ab, 05.00ff, 06.0004,
   06.0047, 06.00ab, and 06.00ff getting bound to the vfio_ap device driver. The
   sysfs directory for the vfio_ap device driver will now contain symbolic links
   to the AP queue devices bound to it::

     /sys/bus/ap
     ... [drivers]
     ...... [vfio_ap]
     ......... [05.0004]
     ......... [05.0047]
     ......... [05.00ab]
     ......... [05.00ff]
     ......... [06.0004]
     ......... [06.0047]
     ......... [06.00ab]
     ......... [06.00ff]

   Keep in mind that only type 10 and newer adapters (i.e., CEX4 and later)
   can be bound to the vfio_ap device driver. This restriction keeps the
   implementation simple by avoiding support for older devices that will go
   out of service in the relatively near future and for which few systems
   remain available for testing.

   The administrator, therefore, must take care to secure only AP queues that
   can be bound to the vfio_ap device driver. The device type for a given AP
   queue device can be read from the parent card's sysfs directory. For example,
   to see the hardware type of the queue 05.0004::

     cat /sys/bus/ap/devices/card05/hwtype

   The hwtype must be 10 or higher (CEX4 or newer) in order to be bound to the
   vfio_ap device driver.

3. Create the mediated devices needed to configure the AP matrices for the
   three guests and to provide an interface to the vfio_ap driver for
   use by the guests::

     /sys/devices/vfio_ap/matrix/
     ... [mdev_supported_types]
     ...... [vfio_ap-passthrough] (passthrough mediated matrix device type)
     ......... create
     ......... [devices]

   To create the mediated devices for the three guests::

       uuidgen > create
       uuidgen > create
       uuidgen > create

   or

   ::

       echo $uuid1 > create
       echo $uuid2 > create
       echo $uuid3 > create

   This will create three mediated devices in the [devices] subdirectory, each
   named after the UUID used to create it. We'll call them $uuid1, $uuid2 and
   $uuid3. This is the sysfs directory structure after creation::

     /sys/devices/vfio_ap/matrix/
     ... [mdev_supported_types]
     ...... [vfio_ap-passthrough]
     ......... [devices]
     ............ [$uuid1]
     ............... assign_adapter
     ............... assign_control_domain
     ............... assign_domain
     ............... matrix
     ............... unassign_adapter
     ............... unassign_control_domain
     ............... unassign_domain

     ............ [$uuid2]
     ............... assign_adapter
     ............... assign_control_domain
     ............... assign_domain
     ............... matrix
     ............... unassign_adapter
     ............... unassign_control_domain
     ............... unassign_domain

     ............ [$uuid3]
     ............... assign_adapter
     ............... assign_control_domain
     ............... assign_domain
     ............... matrix
     ............... unassign_adapter
     ............... unassign_control_domain
     ............... unassign_domain

4. The administrator now needs to configure the matrices for the mediated
   devices $uuid1 (for Guest1), $uuid2 (for Guest2) and $uuid3 (for Guest3).

   This is how the matrix is configured for Guest1::

      echo 5 > assign_adapter
      echo 6 > assign_adapter
      echo 4 > assign_domain
      echo 0xab > assign_domain

   Control domains can similarly be assigned using the assign_control_domain
   sysfs file.

   If a mistake is made configuring an adapter, domain or control domain,
   you can use the ``unassign_xxx`` interfaces to unassign the adapter, domain or
   control domain.

   To display the matrix configuration for Guest1::

         cat matrix

   The output will display the APQNs in the format ``xx.yyyy``, where xx is
   the adapter number and yyyy is the domain number. The output for Guest1
   will look like this::

         05.0004
         05.00ab
         06.0004
         06.00ab

   This is how the matrix is configured for Guest2::

      echo 5 > assign_adapter
      echo 0x47 > assign_domain
      echo 0xff > assign_domain

   This is how the matrix is configured for Guest3::

      echo 6 > assign_adapter
      echo 0x47 > assign_domain
      echo 0xff > assign_domain

5. Start Guest1::

      /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid1 ...

6. Start Guest2::

      /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid2 ...

7. Start Guest3::

      /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid3 ...
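
As the Guest1 output in step 4 shows, a matrix contains every combination of an
assigned adapter with an assigned domain. Before starting a guest it can be
useful to cross-check the configuration against that expectation. The sketch
below is illustrative only: ``expected_apqns`` is a hypothetical helper, not an
interface of the vfio_ap driver, and it assumes adapter and domain numbers are
given in a form the shell's ``printf`` accepts (decimal or ``0x`` hexadecimal):

```shell
# Hypothetical helper: print the APQNs that result from assigning the given
# adapters and domains. Every adapter is combined with every domain and
# formatted as xx.yyyy (two and four hexadecimal digits).
# Arguments: two space-separated lists, e.g. "5 6" and "4 0xab".
expected_apqns() {
    for adapter in $1; do
        for domain in $2; do
            printf '%02x.%04x\n' "$adapter" "$domain"
        done
    done
}
```

For Guest1, ``expected_apqns "5 6" "4 0xab"`` prints the same four APQNs shown
in step 4. One way to compare them with the live configuration, assuming the
``matrix`` file lists the APQNs in the same order, is
``expected_apqns "5 6" "4 0xab" | diff - /sys/devices/vfio_ap/matrix/$uuid1/matrix``.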

When a guest is shut down, its mediated matrix device may be removed.

Using our example again, to remove the mediated matrix device $uuid1::

   /sys/devices/vfio_ap/matrix/
   ... [mdev_supported_types]
   ...... [vfio_ap-passthrough]
   ......... [devices]
   ............ [$uuid1]
   ............... remove


   echo 1 > remove

This will remove all of the mdev matrix device's sysfs structures, including
the mdev itself. To recreate and reconfigure the mdev matrix device, all of
the steps starting with step 3 will have to be performed again. Note that the
removal will fail if a guest using the mdev is still running.

It is not necessary to remove an mdev matrix device, but one may want to
remove it if no guest will use it during the remaining lifetime of the Linux
host. If the mdev matrix device is removed, one may want to also reconfigure
the pool of adapters and queues reserved for use by the default drivers.
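
The removal described above can be wrapped in a small function that reports a
failure instead of ignoring it. This is a sketch under assumptions:
``remove_mdev`` is a hypothetical name, and the default parent directory is
taken from the ``sysfsdev`` paths used in the QEMU command lines above:

```shell
# Hypothetical helper: remove a mediated matrix device by writing 1 to its
# "remove" attribute.
# Arguments: mdev UUID, parent directory (default /sys/devices/vfio_ap/matrix).
remove_mdev() {
    uuid=$1
    parent=${2:-/sys/devices/vfio_ap/matrix}
    attr="$parent/$uuid/remove"
    if [ ! -e "$attr" ]; then
        echo "no such mediated device: $uuid" >&2
        return 1
    fi
    # The write is expected to fail while a guest is still using the mdev.
    if ! echo 1 > "$attr"; then
        echo "remove failed for $uuid (is a guest still using it?)" >&2
        return 1
    fi
}
```

For example, ``remove_mdev "$uuid1"`` removes the device configured for Guest1
and returns a non-zero status if the device does not exist or the write fails.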
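
Limitations
-----------

* The KVM/kernel interfaces do not provide a way to prevent an APQN from being
  restored to the default drivers pool while its queue is still assigned to a
  mediated device in use by a guest. It is incumbent upon the administrator to
  ensure that no mediated device to which the APQN is assigned is in use by a
  guest; otherwise the host is given access to the private data of the AP queue
  device, such as a private key configured specifically for the guest.

* Dynamically assigning AP resources to or unassigning AP resources from a
  mediated matrix device while a running guest is using it is currently not
  supported (see the `Configuring an AP matrix for a linux guest`_ section
  above).

* Live guest migration is not supported for guests using AP devices. If a guest
  is using AP devices, the vfio-ap device configured for the guest must be
  unplugged before migrating the guest (see the `Hot unplug a vfio-ap device from a
  running guest`_ section above).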
    908
    909* Dynamically assigning AP resources to or unassigning AP resources from a
    910  mediated matrix device - see `Configuring an AP matrix for a linux guest`_
    911  section above - while a running guest is using it is currently not supported.
    912
    913* Live guest migration is not supported for guests using AP devices. If a guest
    914  is using AP devices, the vfio-ap device configured for the guest must be
    915  unplugged before migrating the guest (see `Hot unplug a vfio-ap device from a
    916  running guest`_ section above.)