Adjunct Processor (AP) Device
=============================

.. contents::

Introduction
------------

The IBM Adjunct Processor (AP) Cryptographic Facility is comprised
of three AP instructions and from 1 to 256 PCIe cryptographic adapter cards.
These AP devices provide cryptographic functions to all CPUs assigned to a
linux system running in an IBM Z system LPAR.

On s390x, AP adapter cards are exposed via the AP bus. This document
describes how those cards may be made available to KVM guests using the
VFIO mediated device framework.

AP Architectural Overview
-------------------------

In order to understand the terminology used in the rest of this document,
let's start with some definitions:

* AP adapter

  An AP adapter is an IBM Z adapter card that can perform cryptographic
  functions. There can be from 0 to 256 adapters assigned to an LPAR depending
  on the machine model. Adapters assigned to the LPAR in which a linux host is
  running will be available to the linux host. Each adapter is identified by a
  number from 0 to 255; however, the maximum adapter number allowed is
  determined by machine model. When installed, an AP adapter is accessed by
  AP instructions executed by any CPU.

* AP domain

  An adapter is partitioned into domains. Each domain can be thought of as
  a set of hardware registers for processing AP instructions. An adapter can
  hold up to 256 domains; however, the maximum domain number allowed is
  determined by machine model. Each domain is identified by a number from 0 to
  255. Domains can be further classified into two types:

  * Usage domains are domains that can be accessed directly to process AP
    commands

  * Control domains are domains that are accessed indirectly by AP
    commands sent to a usage domain to control or change the domain; for
    example, to set a secure private key for the domain.

* AP Queue

  An AP queue is the means by which an AP command-request message is sent to an
  AP usage domain inside a specific AP. An AP queue is identified by a tuple
  comprised of an AP adapter ID (APID) and an AP queue index (APQI). The
  APQI corresponds to a given usage domain number within the adapter. This tuple
  forms an AP Queue Number (APQN) uniquely identifying an AP queue. AP
  instructions include a field containing the APQN to identify the AP queue to
  which the AP command-request message is to be sent for processing.

* AP Instructions:

  There are three AP instructions:

  * NQAP: to enqueue an AP command-request message to a queue
  * DQAP: to dequeue an AP command-reply message from a queue
  * PQAP: to administer the queues

  AP instructions identify the domain that is targeted to process the AP
  command; this must be one of the usage domains. An AP command may modify a
  domain that is not one of the usage domains, but the modified domain
  must be one of the control domains.

Start Interpretive Execution (SIE) Instruction
----------------------------------------------

A KVM guest is started by executing the Start Interpretive Execution (SIE)
instruction. The SIE state description is a control block that contains the
state information for a KVM guest and is supplied as input to the SIE
instruction. The SIE state description contains a satellite control block called
the Crypto Control Block (CRYCB). The CRYCB contains three fields to identify
the adapters, usage domains and control domains assigned to the KVM guest:

* The AP Mask (APM) field is a bit mask that identifies the AP adapters assigned
  to the KVM guest. Each bit in the mask, from left to right, corresponds to
  an APID from 0-255. If a bit is set, the corresponding adapter is valid for
  use by the KVM guest.

* The AP Queue Mask (AQM) field is a bit mask identifying the AP usage domains
  assigned to the KVM guest. Each bit in the mask, from left to right,
  corresponds to an AP queue index (APQI) from 0-255. If a bit is set, the
  corresponding queue is valid for use by the KVM guest.

* The AP Domain Mask (ADM) field is a bit mask that identifies the AP control
  domains assigned to the KVM guest. The ADM bit mask controls which domains
  can be changed by an AP command-request message sent to a usage domain from
  the guest. Each bit in the mask, from left to right, corresponds to a domain
  from 0-255. If a bit is set, the corresponding domain can be modified by an
  AP command-request message sent to a usage domain.

As described in the definition of an AP queue, AP instructions include
an APQN to identify the AP adapter and AP queue to which an AP command-request
message is to be sent (NQAP and PQAP instructions), or from which a
command-reply message is to be received (DQAP instruction). The validity of an
APQN is defined by the matrix calculated from the APM and AQM; it is the
cross product of all assigned adapter numbers (APM) with all assigned queue
indexes (AQM). For example, if adapters 1 and 2 and usage domains 5 and 6 are
assigned to a guest, the APQNs (1,5), (1,6), (2,5) and (2,6) will be valid for
the guest.

The APQNs can provide secure key functionality - i.e., a private key is stored
on the adapter card for each of its domains - so each APQN must be assigned to
at most one guest or the linux host.
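
The cross-product rule and the exclusivity requirement described above can be
sketched in a few lines of Python. This is only an illustration, not code from
QEMU or the kernel: the function names ``apqns`` and ``shared_apqns`` are
hypothetical, and the second guest's assignment is made up to show a conflict.

```python
from itertools import product


def apqns(adapters, domains):
    """Return the set of (APID, APQI) pairs formed by the cross product."""
    return set(product(adapters, domains))


def shared_apqns(guest_a, guest_b):
    """Return the APQNs that two guest matrices have in common.

    Each argument is an (adapters, domains) pair. A non-empty result
    means the two matrices must not be in use at the same time.
    """
    return apqns(*guest_a) & apqns(*guest_b)


# Adapters 1 and 2 with usage domains 5 and 6, as in the text above.
guest1 = ([1, 2], [5, 6])
# Hypothetical second guest that reuses adapter 1 and domain 6.
guest2 = ([1], [6, 7])

print(sorted(apqns(*guest1)))                # [(1, 5), (1, 6), (2, 5), (2, 6)]
print(sorted(shared_apqns(guest1, guest2)))  # [(1, 6)]
```

An empty intersection for every pair of matrices (and for the host's own
queues) is the condition that the configurations must satisfy.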

Example 1: Valid configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+----------+--------+--------+
|          | Guest1 | Guest2 |
+==========+========+========+
| adapters |  1, 2  |  1, 2  |
+----------+--------+--------+
| domains  |  5, 6  |  7     |
+----------+--------+--------+

This is valid because both guests have a unique set of APQNs:

* Guest1 has APQNs (1,5), (1,6), (2,5) and (2,6);
* Guest2 has APQNs (1,7) and (2,7).

Example 2: Valid configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+----------+--------+--------+
|          | Guest1 | Guest2 |
+==========+========+========+
| adapters |  1, 2  |  3, 4  |
+----------+--------+--------+
| domains  |  5, 6  |  5, 6  |
+----------+--------+--------+

This is also valid because both guests have a unique set of APQNs:

* Guest1 has APQNs (1,5), (1,6), (2,5), (2,6);
* Guest2 has APQNs (3,5), (3,6), (4,5), (4,6)

Example 3: Invalid configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+----------+--------+--------+
|          | Guest1 | Guest2 |
+==========+========+========+
| adapters |  1, 2  |  1     |
+----------+--------+--------+
| domains  |  5, 6  |  6, 7  |
+----------+--------+--------+

This is an invalid configuration because both guests have access to
APQN (1,6).

AP Matrix Configuration on Linux Host
-------------------------------------

A linux system is a guest of the LPAR in which it is running and has access to
the AP resources configured for the LPAR. The LPAR's AP matrix is
configured via its Activation Profile which can be edited on the HMC. When the
linux system is started, the AP bus will detect the AP devices assigned to the
LPAR and create the following in sysfs::

  /sys/bus/ap
  ... [devices]
  ...... xx.yyyy
  ...... ...
  ...... cardxx
  ...... ...

Where:

``cardxx``
  is AP adapter number xx (in hex)

``xx.yyyy``
  is an APQN with xx specifying the APID and yyyy specifying the APQI

For example, if AP adapters 5 and 6 and domains 4, 71 (0x47), 171 (0xab) and
255 (0xff) are configured for the LPAR, the sysfs representation on the linux
host system would look like this::

  /sys/bus/ap
  ... [devices]
  ...... 05.0004
  ...... 05.0047
  ...... 05.00ab
  ...... 05.00ff
  ...... 06.0004
  ...... 06.0047
  ...... 06.00ab
  ...... 06.00ff
  ...... card05
  ...... card06

A set of default device drivers are also created to control each type of AP
device that can be assigned to the LPAR on which a linux host is running::

  /sys/bus/ap
  ... [drivers]
  ...... [cex2acard]        for Crypto Express 2/3 accelerator cards
  ...... [cex2aqueue]       for AP queues served by Crypto Express 2/3
                            accelerator cards
  ...... [cex4card]         for Crypto Express 4/5/6 accelerator and coprocessor
                            cards
  ...... [cex4queue]        for AP queues served by Crypto Express 4/5/6
                            accelerator and coprocessor cards
  ...... [pcixcccard]       for Crypto Express 2/3 coprocessor cards
  ...... [pcixccqueue]      for AP queues served by Crypto Express 2/3
                            coprocessor cards

Binding AP devices to device drivers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are two sysfs files that specify bitmasks marking a subset of the APQN
range as 'usable by the default AP queue device drivers' or 'not usable by the
default device drivers' and thus available for use by the alternate device
driver(s). The sysfs locations of the masks are::

  /sys/bus/ap/apmask
  /sys/bus/ap/aqmask

The ``apmask`` is a 256-bit mask that identifies a set of AP adapter IDs
(APID). Each bit in the mask, from left to right (i.e., from most significant
to least significant bit in big endian order), corresponds to an APID from
0-255.
If a bit is set, the APID is marked as usable only by the default AP
queue device drivers; otherwise, the APID is usable by the vfio_ap
device driver.

The ``aqmask`` is a 256-bit mask that identifies a set of AP queue indexes
(APQI). Each bit in the mask, from left to right (i.e., from most significant
to least significant bit in big endian order), corresponds to an APQI from
0-255. If a bit is set, the APQI is marked as usable only by the default AP
queue device drivers; otherwise, the APQI is usable by the vfio_ap device
driver.

Take, for example, the following mask::

  0x7dffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff

It indicates:

  1, 2, 3, 4, 5, and 7-255 belong to the default drivers' pool, and 0 and 6
  belong to the vfio_ap device driver's pool.

The APQN of each AP queue device assigned to the linux host is checked by the
AP bus against the set of APQNs derived from the cross product of APIDs
and APQIs marked as usable only by the default AP queue device drivers. If a
match is detected, only the default AP queue device drivers will be probed;
otherwise, the vfio_ap device driver will be probed.

By default, the two masks are set to reserve all APQNs for use by the default
AP queue device drivers. There are two ways the default masks can be changed:

1. The sysfs mask files can be edited by echoing a string into the
   respective sysfs mask file in one of two formats:

   * An absolute hex string starting with 0x - like "0x12345678" - sets
     the mask. If the given string is shorter than the mask, it is padded
     with 0s on the right; for example, specifying a mask value of 0x41 is
     the same as specifying::

       0x4100000000000000000000000000000000000000000000000000000000000000

     Keep in mind that the mask reads from left to right (i.e., most
     significant to least significant bit in big endian order), so the mask
     above identifies device numbers 1 and 7 (``01000001``).

     If the string is longer than the mask, the operation is terminated with
     an error (EINVAL).

   * Individual bits in the mask can be switched on and off by specifying
     each bit number to be switched in a comma separated list. Each bit
     number string must be prepended with a plus (``+``) or minus (``-``) to
     indicate the corresponding bit is to be switched on (``+``) or off
     (``-``). Some valid values are::

       "+0"    switches bit 0 on
       "-13"   switches bit 13 off
       "+0x41" switches bit 65 on
       "-0xff" switches bit 255 off

     The following example::

       +0,-6,+0x47,-0xf0

     switches bits 0 and 71 (0x47) on and bits 6 and 240 (0xf0) off.

     Note that the bits not specified in the list remain as they were before
     the operation.

2. The masks can also be changed at boot time via parameters on the kernel
   command line like this::

     ap.apmask=0xffff ap.aqmask=0x40

   This would create the following masks:

   apmask::

     0xffff000000000000000000000000000000000000000000000000000000000000

   aqmask::

     0x4000000000000000000000000000000000000000000000000000000000000000

   Resulting in these two pools::

     default drivers pool:    adapter 0-15, domain 1
     alternate drivers pool:  adapter 16-255, domains 0, 2-255

Configuring an AP matrix for a linux guest
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The sysfs interfaces for configuring an AP matrix for a guest are built on the
VFIO mediated device framework.
To configure an AP matrix for a guest, a
mediated matrix device must first be created for the ``/sys/devices/vfio_ap/matrix``
device. When the vfio_ap device driver is loaded, it registers with the VFIO
mediated device framework. When the driver registers, the sysfs interfaces for
creating mediated matrix devices are created::

  /sys/devices
  ... [vfio_ap]
  ......[matrix]
  ......... [mdev_supported_types]
  ............ [vfio_ap-passthrough]
  ............... create
  ............... [devices]

A mediated AP matrix device is created by writing a UUID to the attribute file
named ``create``, for example::

  uuidgen > create

or

::

  echo $uuid > create

When a mediated AP matrix device is created, a sysfs directory named after
the UUID is created in the ``devices`` subdirectory::

  /sys/devices
  ... [vfio_ap]
  ......[matrix]
  ......... [mdev_supported_types]
  ............ [vfio_ap-passthrough]
  ............... create
  ............... [devices]
  .................. [$uuid]

There will also be three sets of attribute files created in the mediated
matrix device's sysfs directory to configure an AP matrix for the
KVM guest::

  /sys/devices
  ... [vfio_ap]
  ......[matrix]
  ......... [mdev_supported_types]
  ............ [vfio_ap-passthrough]
  ............... create
  ............... [devices]
  .................. [$uuid]
  ..................... assign_adapter
  ..................... assign_control_domain
  ..................... assign_domain
  ..................... matrix
  ..................... unassign_adapter
  ..................... unassign_control_domain
  ..................... unassign_domain

``assign_adapter``
  To assign an AP adapter to the mediated matrix device, its APID is written
  to the ``assign_adapter`` file. This may be done multiple times to assign more
  than one adapter. The APID may be specified using conventional semantics
  as a decimal, hexadecimal, or octal number. For example, to assign adapters
  4, 5 and 16 to a mediated matrix device in decimal, hexadecimal and octal
  respectively::

    echo 4 > assign_adapter
    echo 0x5 > assign_adapter
    echo 020 > assign_adapter

  In order to successfully assign an adapter:

  * The adapter number specified must represent a value from 0 up to the
    maximum adapter number allowed by the machine model. If an adapter number
    higher than the maximum is specified, the operation will terminate with
    an error (ENODEV).

  * All APQNs that can be derived from the adapter ID being assigned and the
    IDs of the previously assigned domains must be bound to the vfio_ap device
    driver. If no domains have yet been assigned, then there must be at least
    one APQN with the specified APID bound to the vfio_ap driver. If no such
    APQNs are bound to the driver, the operation will terminate with an
    error (EADDRNOTAVAIL).

  * No APQN that can be derived from the adapter ID and the IDs of the
    previously assigned domains can be assigned to another mediated matrix
    device. If an APQN is assigned to another mediated matrix device, the
    operation will terminate with an error (EADDRINUSE).

``unassign_adapter``
  To unassign an AP adapter, its APID is written to the ``unassign_adapter``
  file. This may also be done multiple times to unassign more than one adapter.

``assign_domain``
  To assign a usage domain, the domain number is written into the
  ``assign_domain`` file. This may be done multiple times to assign more than one
  usage domain. The domain number is specified using conventional semantics as
  a decimal, hexadecimal, or octal number. For example, to assign usage domains
  4, 8, and 71 to a mediated matrix device in decimal, hexadecimal and octal
  respectively::

    echo 4 > assign_domain
    echo 0x8 > assign_domain
    echo 0107 > assign_domain

  In order to successfully assign a domain:

  * The domain number specified must represent a value from 0 up to the
    maximum domain number allowed by the machine model. If a domain number
    higher than the maximum is specified, the operation will terminate with
    an error (ENODEV).

  * All APQNs that can be derived from the domain ID being assigned and the IDs
    of the previously assigned adapters must be bound to the vfio_ap device
    driver. If no adapters have yet been assigned, then there must be at least
    one APQN with the specified APQI bound to the vfio_ap driver. If no such
    APQNs are bound to the driver, the operation will terminate with an
    error (EADDRNOTAVAIL).

  * No APQN that can be derived from the domain ID being assigned and the IDs
    of the previously assigned adapters can be assigned to another mediated
    matrix device. If an APQN is assigned to another mediated matrix device,
    the operation will terminate with an error (EADDRINUSE).

``unassign_domain``
  To unassign a usage domain, the domain number is written into the
  ``unassign_domain`` file. This may be done multiple times to unassign more than
  one usage domain.

``assign_control_domain``
  To assign a control domain, the domain number is written into the
  ``assign_control_domain`` file. This may be done multiple times to
  assign more than one control domain. The domain number may be specified using
  conventional semantics as a decimal, hexadecimal, or octal number. For
  example, to assign control domains 4, 8, and 71 to a mediated matrix device
  in decimal, hexadecimal and octal respectively::

    echo 4 > assign_control_domain
    echo 0x8 > assign_control_domain
    echo 0107 > assign_control_domain

  In order to successfully assign a control domain, the domain number
  specified must represent a value from 0 up to the maximum domain number
  allowed by the machine model. If a control domain number higher than the
  maximum is specified, the operation will terminate with an error (ENODEV).

``unassign_control_domain``
  To unassign a control domain, the domain number is written into the
  ``unassign_control_domain`` file. This may be done multiple times to unassign
  more than one control domain.

Note: No changes to the AP matrix will be allowed while a guest using
the mediated matrix device is running. Attempts to assign an adapter,
domain or control domain will be rejected and an error (EBUSY) returned.

Starting a Linux Guest Configured with an AP Matrix
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To provide a mediated matrix device for use by a guest, the following option
must be specified on the QEMU command line::

  -device vfio-ap,sysfsdev=$path-to-mdev

The sysfsdev parameter specifies the path to the mediated matrix device.
There are a number of ways to specify this path::

  /sys/devices/vfio_ap/matrix/$uuid
  /sys/bus/mdev/devices/$uuid
  /sys/bus/mdev/drivers/vfio_mdev/$uuid
  /sys/devices/vfio_ap/matrix/mdev_supported_types/vfio_ap-passthrough/devices/$uuid

When the linux guest is started, the guest will open the mediated
matrix device's file descriptor to get information about the mediated matrix
device. The ``vfio_ap`` device driver will update the APM, AQM, and ADM fields in
the guest's CRYCB with the adapter, usage domain and control domains assigned
via the mediated matrix device's sysfs attribute files.
Programs running on the
linux guest will then:

1. Have direct access to the APQNs derived from the cross product of the AP
   adapter numbers (APID) and queue indexes (APQI) specified in the APM and AQM
   fields of the guest's CRYCB respectively. These APQNs identify the AP queues
   that are valid for use by the guest; meaning, AP commands can be sent by the
   guest to any of these queues for processing.

2. Have authorization to process AP commands to change a control domain
   identified in the ADM field of the guest's CRYCB. The AP command must be sent
   to a valid APQN (see 1 above).

CPU model features:

Three CPU model features are available for controlling guest access to AP
facilities:

1. AP facilities feature

   The AP facilities feature indicates that AP facilities are installed on the
   guest. This feature will be exposed for use only if the AP facilities
   are installed on the host system. The feature is s390-specific and is
   represented as a parameter of the -cpu option on the QEMU command line::

     qemu-system-s390x -cpu $model,ap=on|off

   Where:

   ``$model``
     is the CPU model defined for the guest (defaults to the model of
     the host system if not specified).

   ``ap=on|off``
     indicates whether AP facilities are installed (on) or not
     (off). The default for CPU models zEC12 or newer
     is ``ap=on``. AP facilities must be installed on the guest if a
     vfio-ap device (``-device vfio-ap,sysfsdev=$path``) is configured
     for the guest, or the guest will fail to start.

2. Query Configuration Information (QCI) facility

   The QCI facility is used by the AP bus running on the guest to query the
   configuration of the AP facilities. This facility will be available
   only if the QCI facility is installed on the host system. The feature is
   s390-specific and is represented as a parameter of the -cpu option on the
   QEMU command line::

     qemu-system-s390x -cpu $model,apqci=on|off

   Where:

   ``$model``
     is the CPU model defined for the guest

   ``apqci=on|off``
     indicates whether the QCI facility is installed (on) or
     not (off). The default for CPU models zEC12 or newer
     is ``apqci=on``; for older models, QCI will not be installed.

     If QCI is installed (``apqci=on``) but AP facilities are not
     (``ap=off``), an error message will be logged, but the guest
     will be allowed to start. It makes no sense to have QCI
     installed if the AP facilities are not; this is considered
     an invalid configuration.

     If the QCI facility is not installed, APQNs with an APQI
     greater than 15 will not be detected by the AP bus
     running on the guest.

3. Adjunct Processor Facility Test (APFT) facility

   The APFT facility is used by the AP bus running on the guest to test the
   AP facilities available for a given AP queue. This facility will be available
   only if the APFT facility is installed on the host system. The feature is
   s390-specific and is represented as a parameter of the -cpu option on the
   QEMU command line::

     qemu-system-s390x -cpu $model,apft=on|off

   Where:

   ``$model``
     is the CPU model defined for the guest (defaults to the model of
     the host system if not specified).

   ``apft=on|off``
     indicates whether the APFT facility is installed (on) or
     not (off). The default for CPU models zEC12 and
     newer is ``apft=on``; for older models, APFT will not be
     installed.

     If APFT is installed (``apft=on``) but AP facilities are not
     (``ap=off``), an error message will be logged, but the guest
     will be allowed to start. It makes no sense to have APFT
     installed if the AP facilities are not; this is considered
     an invalid configuration.

     It also makes no sense to turn APFT off because the AP bus
     running on the guest will not detect CEX4 and newer devices
     without it. Since only CEX4 and newer devices are supported
     for guest usage, no AP devices can be made accessible to a
     guest started without APFT installed.

Hot plug a vfio-ap device into a running guest
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Only one vfio-ap device can be attached to the virtual machine's ap-bus, so a
vfio-ap device can be hot plugged if and only if no vfio-ap device is attached
to the bus already, whether via the QEMU command line or a prior hot plug
action.

To hot plug a vfio-ap device, use the QEMU ``device_add`` command::

  (qemu) device_add vfio-ap,sysfsdev="$path-to-mdev",id="$id"

Where the ``$path-to-mdev`` value specifies the absolute path to a mediated
device to which AP resources to be used by the guest have been assigned.
``$id`` is the name value for the optional id parameter.

Note that on Linux guests, the AP devices will be created in the
``/sys/bus/ap/devices`` directory when the AP bus subsequently performs its periodic
scan, so there may be a short delay before the AP devices are accessible on the
guest.

The command will fail if:

* A vfio-ap device has already been attached to the virtual machine's ap-bus.

* The CPU model features for controlling guest access to AP facilities are not
  enabled (see 'CPU model features' subsection in the previous section).

Hot unplug a vfio-ap device from a running guest
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A vfio-ap device can be unplugged from a running KVM guest if a vfio-ap device
has been attached to the virtual machine's ap-bus via the QEMU command line
or a prior hot plug action.

To hot unplug a vfio-ap device, use the QEMU ``device_del`` command::

  (qemu) device_del "$id"

Where ``$id`` is the same id that was specified at device creation.

On a Linux guest, the AP devices will be removed from the ``/sys/bus/ap/devices``
directory on the guest when the AP bus subsequently performs its periodic scan,
so there may be a short delay before the AP devices are no longer accessible by
the guest.

The command will fail if the ``$path-to-mdev`` specified on the ``device_del`` command
does not match the value specified when the vfio-ap device was attached to
the virtual machine's ap-bus.

Example: Configure AP Matrices for Three Linux Guests
-----------------------------------------------------

Let's now provide an example to illustrate how KVM guests may be given
access to AP facilities. For this example, we will show how to configure
three guests such that executing the lszcrypt command on the guests would
look like this:

Guest1::

  CARD.DOMAIN TYPE  MODE
  ------------------------------
  05          CEX5C CCA-Coproc
  05.0004     CEX5C CCA-Coproc
  05.00ab     CEX5C CCA-Coproc
  06          CEX5A Accelerator
  06.0004     CEX5A Accelerator
  06.00ab     CEX5A Accelerator

Guest2::

  CARD.DOMAIN TYPE  MODE
  ------------------------------
  05          CEX5A Accelerator
  05.0047     CEX5A Accelerator
  05.00ff     CEX5A Accelerator

Guest3::

  CARD.DOMAIN TYPE  MODE
  ------------------------------
  06          CEX5A Accelerator
  06.0047     CEX5A Accelerator
  06.00ff     CEX5A Accelerator

These are the steps:

1. Install the vfio_ap module on the linux host. The dependency chain for the
   vfio_ap module is:

   * iommu
   * s390
   * zcrypt
   * vfio
   * vfio_mdev
   * vfio_mdev_device
   * KVM

   To build the vfio_ap module, the kernel build must be configured with the
   following Kconfig elements selected:

   * IOMMU_SUPPORT
   * S390
   * ZCRYPT
   * S390_AP_IOMMU
   * VFIO
   * VFIO_MDEV
   * VFIO_MDEV_DEVICE
   * KVM

   If using make menuconfig, select the following to build the vfio_ap module::

     -> Device Drivers
        -> IOMMU Hardware Support
           select S390 AP IOMMU Support
        -> VFIO Non-Privileged userspace driver framework
           -> Mediated device driver framework
              -> VFIO driver for Mediated devices
     -> I/O subsystem
        -> VFIO support for AP devices

2. Secure the AP queues to be used by the three guests so that the host can not
   access them. To secure the AP queues 05.0004, 05.0047, 05.00ab, 05.00ff,
   06.0004, 06.0047, 06.00ab, and 06.00ff for use by the vfio_ap device driver,
   the corresponding APQNs must be removed from the default queue drivers pool
   as follows::

     echo -5,-6 > /sys/bus/ap/apmask

     echo -4,-0x47,-0xab,-0xff > /sys/bus/ap/aqmask

   This will result in AP queues 05.0004, 05.0047, 05.00ab, 05.00ff, 06.0004,
   06.0047, 06.00ab, and 06.00ff getting bound to the vfio_ap device driver. The
   sysfs directory for the vfio_ap device driver will now contain symbolic links
   to the AP queue devices bound to it::

     /sys/bus/ap
     ... [drivers]
     ...... [vfio_ap]
     ......... [05.0004]
     ......... [05.0047]
     ......... [05.00ab]
     ......... [05.00ff]
     ......... [06.0004]
     ......... [06.0047]
     ......... [06.00ab]
     ......... [06.00ff]

   Keep in mind that only type 10 and newer adapters (i.e., CEX4 and later)
   can be bound to the vfio_ap device driver. This restriction keeps the
   implementation simple by not supporting older devices that will go out of
   service in the relatively near future, and for which there are few older
   systems on which to test.

   The administrator, therefore, must take care to secure only AP queues that
   can be bound to the vfio_ap device driver. The device type for a given AP
   queue device can be read from the parent card's sysfs directory. For example,
   to see the hardware type of the queue 05.0004::

     cat /sys/bus/ap/devices/card05/hwtype

   The hwtype must be 10 or higher (CEX4 or newer) in order to be bound to the
   vfio_ap device driver.

3. Create the mediated devices needed to configure the AP matrices for the
   three guests and to provide an interface to the vfio_ap driver for
   use by the guests::

     /sys/devices/vfio_ap/matrix/
     ... [mdev_supported_types]
     ...... [vfio_ap-passthrough] (passthrough mediated matrix device type)
     ......... create
     ......... [devices]

   To create the mediated devices for the three guests::

     uuidgen > create
     uuidgen > create
     uuidgen > create

   or

   ::

     echo $uuid1 > create
     echo $uuid2 > create
     echo $uuid3 > create

   This will create three mediated devices in the [devices] subdirectory named
   after the UUID used to create the mediated device. We'll call them $uuid1,
   $uuid2 and $uuid3 and this is the sysfs directory structure after creation::

     /sys/devices/vfio_ap/matrix/
     ... [mdev_supported_types]
     ...... [vfio_ap-passthrough]
     ......... [devices]
     ............ [$uuid1]
     ............... assign_adapter
     ............... assign_control_domain
     ............... assign_domain
     ............... matrix
     ............... unassign_adapter
     ............... unassign_control_domain
     ............... unassign_domain

     ............ [$uuid2]
     ............... assign_adapter
     ............... assign_control_domain
     ............... assign_domain
     ............... matrix
     ............... unassign_adapter
     ............... unassign_control_domain
     ............... unassign_domain

     ............ [$uuid3]
     ............... assign_adapter
     ............... assign_control_domain
     ............... assign_domain
     ............... matrix
     ............... unassign_adapter
     ............... unassign_control_domain
     ............... unassign_domain

4. The administrator now needs to configure the matrices for the mediated
   devices $uuid1 (for Guest1), $uuid2 (for Guest2) and $uuid3 (for Guest3).

   This is how the matrix is configured for Guest1::

     echo 5 > assign_adapter
     echo 6 > assign_adapter
     echo 4 > assign_domain
     echo 0xab > assign_domain

   Control domains can similarly be assigned using the assign_control_domain
   sysfs file.

   If a mistake is made configuring an adapter, domain or control domain,
   you can use the ``unassign_xxx`` interfaces to unassign the adapter, domain or
   control domain.

   To display the matrix configuration for Guest1::

     cat matrix

   The output will display the APQNs in the format ``xx.yyyy``, where xx is
   the adapter number and yyyy is the domain number. The output for Guest1
   will look like this::

     05.0004
     05.00ab
     06.0004
     06.00ab

   This is how the matrix is configured for Guest2::

     echo 5 > assign_adapter
     echo 0x47 > assign_domain
     echo 0xff > assign_domain

   This is how the matrix is configured for Guest3::

     echo 6 > assign_adapter
     echo 0x47 > assign_domain
     echo 0xff > assign_domain

5. Start Guest1::

     /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid1 ...

6. Start Guest2::

     /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid2 ...

7. Start Guest3::

     /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid3 ...

When the guest is shut down, the mediated matrix devices may be removed.

Using our example again, to remove the mediated matrix device $uuid1::

  /sys/devices/vfio_ap/matrix/
  ... [mdev_supported_types]
  ...... [vfio_ap-passthrough]
  ......... [devices]
  ............ [$uuid1]
  ............... remove


  echo 1 > remove

This will remove all of the mdev matrix device's sysfs structures including
the mdev device itself. To recreate and reconfigure the mdev matrix device,
all of the steps starting with step 3 will have to be performed again. Note
that the remove will fail if a guest using the mdev is still running.

It is not necessary to remove an mdev matrix device, but one may want to
remove it if no guest will use it during the remaining lifetime of the linux
host. If the mdev matrix device is removed, one may want to also reconfigure
the pool of adapters and queues reserved for use by the default drivers.

Limitations
-----------

* The KVM/kernel interfaces do not provide a way to prevent restoring an APQN
  to the default drivers pool of a queue that is still assigned to a mediated
  device in use by a guest. It is incumbent upon the administrator to
  ensure there is no mediated device in use by a guest to which the APQN is
  assigned lest the host be given access to the private data of the AP queue
  device, such as a private key configured specifically for the guest.

* Dynamically assigning AP resources to or unassigning AP resources from a
  mediated matrix device - see `Configuring an AP matrix for a linux guest`_
  section above - while a running guest is using it is currently not supported.

* Live guest migration is not supported for guests using AP devices. If a guest
  is using AP devices, the vfio-ap device configured for the guest must be
  unplugged before migrating the guest (see `Hot unplug a vfio-ap device from a
  running guest`_ section above.)
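
The first limitation above leaves it to the administrator to check that no APQN
handed back to the default drivers is still assigned to a mediated device. A
rough sketch of such a check follows. It is an illustration under stated
assumptions, not an existing tool: it assumes the masks are given in the
absolute hex format described in the section on binding AP devices to device
drivers (bit 0 is the leftmost bit, short strings are padded with zeros on the
right), and that the APQNs are in the ``xx.yyyy`` format shown by the ``matrix``
attribute. The function names and the sample values are hypothetical.

```python
MASK_BITS = 256


def mask_to_ids(mask_hex):
    """Decode an absolute hex mask into the set of bit numbers that are set.

    Bit 0 is the most significant (leftmost) bit; a string shorter than
    256 bits is padded with zeros on the right.
    """
    digits = mask_hex[2:] if mask_hex.lower().startswith("0x") else mask_hex
    if len(digits) * 4 > MASK_BITS:
        raise ValueError("mask is longer than 256 bits")
    digits = digits.ljust(MASK_BITS // 4, "0")
    value = int(digits, 16)
    return {bit for bit in range(MASK_BITS)
            if (value >> (MASK_BITS - 1 - bit)) & 1}


def conflicting_apqns(apmask_hex, aqmask_hex, mdev_apqns):
    """Return the mdev APQNs that the masks reserve for the default drivers.

    An APQN belongs to the default drivers only if both its APID bit in
    apmask and its APQI bit in aqmask are set.
    """
    apids = mask_to_ids(apmask_hex)
    apqis = mask_to_ids(aqmask_hex)
    conflicts = []
    for apqn in mdev_apqns:
        apid, apqi = (int(part, 16) for part in apqn.split("."))
        if apid in apids and apqi in apqis:
            conflicts.append(apqn)
    return conflicts


print(sorted(mask_to_ids("0x41")))  # [1, 7]
# Hypothetical masks: adapters 0-15 and domain 1 go to the default drivers.
print(conflicting_apqns("0xffff", "0x40", ["05.0001", "05.0004", "10.0001"]))
```

Any APQN reported by such a check would be visible to the host's default
drivers while still assigned to a mediated device, which is the situation the
first limitation warns about.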