.. SPDX-License-Identifier: GPL-2.0+

======================================================
IBM Virtual Management Channel Kernel Driver (IBMVMC)
======================================================

:Authors:
    Dave Engebretsen <engebret@us.ibm.com>,
    Adam Reznechek <adreznec@linux.vnet.ibm.com>,
    Steven Royer <seroyer@linux.vnet.ibm.com>,
    Bryant G. Ly <bryantly@linux.vnet.ibm.com>

Introduction
============

Note: Knowledge of virtualization technology is required to understand
this document.

A good reference document is:

https://openpowerfoundation.org/wp-content/uploads/2016/05/LoPAPR_DRAFT_v11_24March2016_cmt1.pdf

The Virtual Management Channel (VMC) is a logical device which provides an
interface between the hypervisor and a management partition. This interface
is like a message passing interface. The management partition is intended
to provide an alternative to Hardware Management Console (HMC)-based
system management.

The primary hardware management solution developed by IBM relies on an
appliance server named the Hardware Management Console (HMC), packaged as
an external tower or rack-mounted personal computer. In a Power Systems
environment, a single HMC can manage multiple POWER processor-based
systems.

Management Application
----------------------

In the management partition, a management application exists which enables
a system administrator to configure the system’s partitioning
characteristics via a command line interface (CLI) or Representational
State Transfer (REST) APIs.

The management application runs on a Linux logical partition on a
POWER8 or newer processor-based server that is virtualized by PowerVM.
System configuration, maintenance, and control functions which
traditionally require an HMC can be implemented in the management
application using a combination of HMC-to-hypervisor interfaces and
existing operating system methods. This tool provides a subset of the
functions implemented by the HMC and enables basic partition configuration.
The set of HMC-to-hypervisor messages supported by the management
application component is passed to the hypervisor over a VMC interface,
which is defined below.

The VMC enables the management partition to provide basic partitioning
functions:

- Logical Partitioning Configuration
- Start and stop actions for individual partitions
- Display of partition status
- Management of virtual Ethernet
- Management of virtual Storage
- Basic system management

Virtual Management Channel (VMC)
--------------------------------

A logical device, called the Virtual Management Channel (VMC), is defined
for communicating between the management application and the hypervisor.
In essence, it creates the pipes through which the virtualization
management software communicates. This device is presented to a designated
management partition as a virtual device.

This communication device uses Command/Response Queue (CRQ) and Remote
Direct Memory Access (RDMA) interfaces. A three-way handshake is defined
that must take place to establish that both the hypervisor and management
partition sides of the channel are running prior to sending/receiving any
of the protocol messages.

This driver also utilizes Transport Event CRQs. CRQ messages are sent when
the hypervisor detects that one of the peer partitions has abnormally
terminated, or when one side has called H_FREE_CRQ to close its CRQ. Two
new classes of CRQ messages are introduced for the VMC device. VMC
Administrative messages are used by each partition using the VMC to
communicate capabilities to its partner. HMC Interface messages are used
for the actual flow of HMC messages between the management partition and
the hypervisor. As most HMC messages far exceed the size of a CRQ buffer,
a virtual DMA (RDMA) of the HMC message data is done prior to each HMC
Interface CRQ message. Only the management partition drives RDMA
operations; hypervisors never directly cause the movement of message data.

Terminology
-----------

RDMA
        Remote Direct Memory Access is a DMA transfer from the server to
        its client or from the server to its partner partition. DMA refers
        both to physical I/O operations to and from memory and to
        memory-to-memory move operations.
CRQ
        Command/Response Queue, a facility which is used to communicate
        between partner partitions. Transport events which are signaled
        from the hypervisor to the partition are also reported in this
        queue.

Example Management Partition VMC Driver Interface
=================================================

This section provides an example of the management application
implementation, where a device driver is used to interface to the VMC
device. This driver consists of a new device, for example /dev/ibmvmc,
which provides interfaces to open, close, read, write, and perform
ioctls against the VMC device.

VMC Interface Initialization
----------------------------

The device driver is responsible for initializing the VMC when the driver
is loaded. It first creates and initializes the CRQ. Next, an exchange of
VMC capabilities is performed to indicate the code version and number of
resources available in both the management partition and the hypervisor.
Finally, the hypervisor requests that the management partition create an
initial pool of VMC buffers, one buffer for each possible HMC connection,
which will be used for management application session initialization.
Prior to completion of this initialization sequence, the device returns
EBUSY to open() calls. EIO is returned for all other open() failures.

::

        Management Partition                    Hypervisor
                        CRQ INIT
        ---------------------------------------->
                   CRQ INIT COMPLETE
        <----------------------------------------
                      CAPABILITIES
        ---------------------------------------->
                 CAPABILITIES RESPONSE
        <----------------------------------------
              ADD BUFFER (HMC IDX=0,1,..)          _
        <----------------------------------------   |
                  ADD BUFFER RESPONSE               | - Repeat for each HMC
        ---------------------------------------->  -

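A minimal sketch of how a management application might open the device,
assuming a simple retry policy (the loop and the one-second delay are
illustrative, not part of the driver specification)::

        /* Sketch: open /dev/ibmvmc, retrying while the CRQ handshake and
         * capability exchange above are still in progress (EBUSY). Any
         * other failure (e.g. EIO) is treated as fatal.
         */
        #include <errno.h>
        #include <fcntl.h>
        #include <unistd.h>

        static int vmc_open(void)
        {
                int fd;

                for (;;) {
                        fd = open("/dev/ibmvmc", O_RDWR);
                        if (fd >= 0)
                                return fd;      /* channel initialized */
                        if (errno != EBUSY)
                                return -1;      /* hard failure */
                        sleep(1);               /* init not finished yet */
                }
        }
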
VMC Interface Open
------------------

After the basic VMC channel has been initialized, an HMC session level
connection can be established. The application layer performs an open() to
the VMC device and executes an ioctl() against it, indicating the HMC ID
(32 bytes of data) for this session. If the VMC device is in an invalid
state, EIO will be returned for the ioctl(). The device driver creates a
new HMC session value (ranging from 1 to 255) and HMC index value (starting
at index 0 and ranging to 254) for this HMC ID. The driver then does an
RDMA of the HMC ID to the hypervisor, and then sends an Interface Open
message to the hypervisor to establish the session over the VMC. After the
hypervisor receives this information, it sends Add Buffer messages to the
management partition to seed an initial pool of buffers for the new HMC
connection. Finally, the hypervisor sends an Interface Open Response
message to indicate that it is ready for normal runtime messaging. The
following illustrates this VMC flow:

::

        Management Partition                    Hypervisor
                      RDMA HMC ID
        ---------------------------------------->
                    Interface Open
        ---------------------------------------->
                      Add Buffer                   _
        <----------------------------------------   |
                  Add Buffer Response               | - Perform N Iterations
        ---------------------------------------->  -
                Interface Open Response
        <----------------------------------------

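In code, establishing a session could look roughly as follows. The ioctl
command name VMC_IOCTL_SETHMCID and the helper below are assumptions for
illustration; the authoritative definitions are in the driver's header::

        /* Sketch: bind an open VMC file descriptor to an HMC session by
         * passing the 32-byte HMC ID.  On success the driver RDMAs the ID
         * to the hypervisor and sends Interface Open, as shown above.
         */
        #include <stdio.h>
        #include <string.h>
        #include <sys/ioctl.h>

        #define HMC_ID_LEN 32   /* per the interface description above */

        static int vmc_set_hmc_id(int fd, const char *id)
        {
                unsigned char hmc_id[HMC_ID_LEN] = { 0 };

                strncpy((char *)hmc_id, id, HMC_ID_LEN);
                /* VMC_IOCTL_SETHMCID is assumed here; see the driver
                 * header for the real command definition.
                 */
                if (ioctl(fd, VMC_IOCTL_SETHMCID, hmc_id) < 0) {
                        perror("VMC_IOCTL_SETHMCID"); /* EIO: bad state */
                        return -1;
                }
                return 0;
        }
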
VMC Interface Runtime
---------------------

During normal runtime, the management application and the hypervisor
exchange HMC messages via the Signal VMC message and RDMA operations. When
sending data to the hypervisor, the management application performs a
write() to the VMC device; the driver RDMAs the data to the hypervisor and
then sends a Signal Message. If a write() is attempted before VMC device
buffers have been made available by the hypervisor, or no buffers are
currently available, EBUSY is returned in response to the write(). A
write() returns EIO for all other errors, such as an invalid device state.
When the hypervisor sends a message to the management partition, the data
is put into a VMC buffer and a Signal Message is sent to the VMC driver in
the management partition. The driver RDMAs the buffer into the partition
and passes the data up to the appropriate management application via a
read() to the VMC device. The read() request blocks if there is no buffer
available to read. The management application may use select() to wait for
the VMC device to become ready with data to read.

::

        Management Partition                    Hypervisor
                        MSG RDMA
        ---------------------------------------->
                       SIGNAL MSG
        ---------------------------------------->
                       SIGNAL MSG
        <----------------------------------------
                        MSG RDMA
        <----------------------------------------

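A hypothetical runtime helper, under the same assumptions as the sketches
above (the buffer size and retry delay are illustrative only)::

        /* Sketch: send one HMC message, then wait with select() until the
         * device is readable and read the reply.  write() fails with
         * EBUSY until the hypervisor has supplied buffers for this
         * connection.
         */
        #include <errno.h>
        #include <sys/select.h>
        #include <unistd.h>

        #define VMC_BUF_SIZE 4096       /* assumed; real limit is the
                                         * negotiated VMC buffer size */

        static ssize_t vmc_xfer(int fd, const void *out, size_t out_len,
                                void *in)
        {
                fd_set rfds;

                while (write(fd, out, out_len) < 0) {
                        if (errno != EBUSY)
                                return -1;      /* e.g. EIO: bad state */
                        usleep(10000);          /* no buffer yet; retry */
                }

                FD_ZERO(&rfds);
                FD_SET(fd, &rfds);
                if (select(fd + 1, &rfds, NULL, NULL, NULL) < 0)
                        return -1;
                return read(fd, in, VMC_BUF_SIZE);
        }
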
VMC Interface Close
-------------------

HMC session level connections are closed by the management partition when
the application layer performs a close() against the device. This action
results in an Interface Close message flowing to the hypervisor, which
causes the session to be terminated. The device driver must free any
storage allocated for buffers for this HMC connection.

::

        Management Partition                    Hypervisor
                    INTERFACE CLOSE
        ---------------------------------------->
                INTERFACE CLOSE RESPONSE
        <----------------------------------------

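Continuing the sketches above, teardown is then just a close() on the
descriptor::

        /* Sketch: end the HMC session; per the description above, the
         * driver sends Interface Close and frees the buffers allocated
         * for this connection.
         */
        if (close(fd) < 0)
                perror("close");
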
    222Additional Information
    223======================
    224
    225For more information on the documentation for CRQ Messages, VMC Messages,
    226HMC interface Buffers, and signal messages please refer to the Linux on
    227Power Architecture Platform Reference. Section F.