.. SPDX-License-Identifier: GPL-2.0

============================
PCI Peer-to-Peer DMA Support
============================

The PCI bus has pretty decent support for performing DMA transfers
between two devices on the bus. This type of transaction is henceforth
called Peer-to-Peer (or P2P). However, there are a number of issues that
make P2P transactions tricky to do in a perfectly safe way.

One of the biggest issues is that PCI doesn't require forwarding
transactions between hierarchy domains, and in PCIe, each Root Port
defines a separate hierarchy domain. To make things worse, there is no
simple way to determine if a given Root Complex supports forwarding or
not. (See PCIe r4.0, sec 1.3.1). Therefore, as of this writing, the kernel
only supports doing P2P when the endpoints involved are all behind the
same PCI bridge. Such devices are all in the same PCI hierarchy domain,
and the spec guarantees that all transactions within a hierarchy are
routable, whereas it does not require routing between hierarchies.
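
The "same bridge" rule can be pictured with a small userspace model. This
is only a sketch under stated assumptions: ``struct node`` and
``common_upstream_bridge()`` are hypothetical names invented for
illustration, not kernel interfaces, and the topology is reduced to a
chain of upstream pointers.

.. code-block:: c

    #include <stddef.h>

    /* Hypothetical stand-in for a device or bridge in a PCI topology. */
    struct node {
        struct node *upstream;  /* bridge directly above, NULL at the top */
    };

    /*
     * Return the closest bridge that sits above both a and b, or NULL
     * when they share none. In this model a NULL result means the two
     * endpoints are in different hierarchy domains, so P2P between
     * them would not be supported.
     */
    static struct node *common_upstream_bridge(struct node *a,
                                               struct node *b)
    {
        for (struct node *x = a->upstream; x; x = x->upstream)
            for (struct node *y = b->upstream; y; y = y->upstream)
                if (x == y)
                    return x;
        return NULL;
    }

Two endpoints below the same switch share that switch as their common
bridge, while endpoints below different Root Ports share nothing in this
model.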

The second issue is that to make use of existing interfaces in Linux,
memory that is used for P2P transactions needs to be backed by struct
pages. However, PCI BARs are not typically cache coherent, so these pages
come with a few corner-case gotchas and developers need to be careful
about what they do with them.


Driver Writer's Guide
=====================

In a given P2P implementation there may be three or more different
types of kernel drivers in play:

* Provider - A driver which provides or publishes P2P resources like
  memory or doorbell registers to other drivers.
* Client - A driver which makes use of a resource by setting up a
  DMA transaction to or from it.
* Orchestrator - A driver which orchestrates the flow of data between
  clients and providers.

In many cases there could be overlap between these three types (e.g.,
it may be typical for a driver to be both a provider and a client).

For example, in the NVMe Target Copy Offload implementation:

* The NVMe PCI driver is a client, provider and orchestrator all at
  once: it exposes any CMB (Controller Memory Buffer) as a P2P memory
  resource (provider), it accepts P2P memory pages as buffers in requests
  to be used directly (client) and it can also use the CMB to hold
  submission queue entries (orchestrator).
* The RDMA driver is a client in this arrangement so that an RNIC
  can DMA directly to the memory exposed by the NVMe device.
* The NVMe Target driver (nvmet) can orchestrate the flow of data from
  the RNIC to the P2P memory (CMB) and then to the NVMe device (and vice
  versa).

This is currently the only arrangement supported by the kernel, but
one could imagine slight tweaks to this that would allow for the same
functionality. For example, if a specific RNIC added a BAR with some
memory behind it, its driver could add support as a P2P provider and
then the NVMe Target could use the RNIC's memory instead of the CMB
in cases where the NVMe cards in use do not have CMB support.


Provider Drivers
----------------

A provider simply needs to register a BAR (or a portion of a BAR)
as a P2P DMA resource using :c:func:`pci_p2pdma_add_resource()`.
This will register struct pages for all the specified memory.

After that it may optionally publish all of its resources as
P2P memory using :c:func:`pci_p2pmem_publish()`. This will allow
any orchestrator drivers to find and use the memory. When marked in
this way, the resource must be regular memory with no side effects.
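
A provider's setup path might look like the sketch below. So that the
snippet builds outside the kernel, ``struct pci_dev`` and the two
``pci_*`` functions are replaced by simplified userspace stand-ins; these
stand-ins are assumptions, and a real driver would get the declarations
from ``<linux/pci-p2pdma.h>``. ``example_provider_setup()`` is a
hypothetical name.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* --- Userspace stand-ins (assumptions, not the kernel code) --- */
    struct pci_dev {
        bool has_p2pmem;
        bool published;
        size_t p2p_size;
    };

    static int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar,
                                       size_t size, uint64_t offset)
    {
        (void)bar;
        (void)offset;
        pdev->has_p2pmem = true;
        pdev->p2p_size = size;
        return 0;
    }

    static void pci_p2pmem_publish(struct pci_dev *pdev, bool publish)
    {
        pdev->published = publish;
    }

    /* --- What a provider driver would contain (hypothetical name) --- */
    static int example_provider_setup(struct pci_dev *pdev, int bar,
                                      size_t size)
    {
        /* Register the BAR region, starting at offset 0 within the BAR. */
        int rc = pci_p2pdma_add_resource(pdev, bar, size, 0);

        if (rc)
            return rc;

        /* Optional: let orchestrator drivers find and use this memory. */
        pci_p2pmem_publish(pdev, true);
        return 0;
    }

Publishing only after registration has succeeded keeps orchestrators from
finding a provider that has no memory to offer.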

For the time being this is fairly rudimentary in that all resources
are typically going to be P2P memory. Future work will likely expand
this to include other types of resources like doorbells.


Client Drivers
--------------

A client driver typically only has to conditionally change its DMA map
routine to use the mapping function :c:func:`pci_p2pdma_map_sg()` instead
of the usual :c:func:`dma_map_sg()` function. Memory mapped in this
way does not need to be unmapped.

The client may also, optionally, make use of
:c:func:`is_pci_p2pdma_page()` to determine when to use the P2P mapping
functions and when to use the regular mapping functions. In some
situations, it may be more appropriate to use a flag to indicate that a
given request is P2P memory and map appropriately. It is important to
ensure that struct pages that back P2P memory stay out of code that
does not have support for them, as other code may treat the pages as
regular memory, which may not be appropriate.
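
The conditional mapping described above could be sketched as follows. The
types and the ``sg_page()``, ``is_pci_p2pdma_page()``,
``pci_p2pdma_map_sg()`` and ``dma_map_sg()`` bodies are simplified
userspace stand-ins (assumptions made so the snippet builds outside the
kernel), and ``example_client_map()`` is a hypothetical name. The sketch
also assumes that every entry in a scatterlist is of the same kind, so
only the first entry is inspected.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* --- Userspace stand-ins (assumptions, not the kernel code) --- */
    struct page { bool p2p; };
    struct scatterlist { struct page *page; };
    struct device { int p2p_maps, regular_maps; };
    enum dma_data_direction { DMA_TO_DEVICE = 1, DMA_FROM_DEVICE = 2 };

    static struct page *sg_page(struct scatterlist *sg)
    {
        return sg->page;
    }

    static bool is_pci_p2pdma_page(const struct page *page)
    {
        return page->p2p;
    }

    static int pci_p2pdma_map_sg(struct device *dev, struct scatterlist *sg,
                                 int nents, enum dma_data_direction dir)
    {
        (void)sg;
        (void)dir;
        dev->p2p_maps++;
        return nents;
    }

    static int dma_map_sg(struct device *dev, struct scatterlist *sg,
                          int nents, enum dma_data_direction dir)
    {
        (void)sg;
        (void)dir;
        dev->regular_maps++;
        return nents;
    }

    /* --- What a client driver would contain (hypothetical name) --- */
    static int example_client_map(struct device *dev,
                                  struct scatterlist *sgl, int nents,
                                  enum dma_data_direction dir)
    {
        /* P2P pages take the P2P mapping path, all others the usual one. */
        if (is_pci_p2pdma_page(sg_page(sgl)))
            return pci_p2pdma_map_sg(dev, sgl, nents, dir);

        return dma_map_sg(dev, sgl, nents, dir);
    }

A driver that already tracks the request type could replace the page
check with its own flag, as noted above.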


Orchestrator Drivers
--------------------

The first task of an orchestrator driver is to compile a list of
all client devices that will be involved in a given transaction. For
example, the NVMe Target driver creates a list including the namespace
block device and the RNIC in use. If the orchestrator has access to
a specific P2P provider, it may check compatibility using
:c:func:`pci_p2pdma_distance()`; otherwise it may find a memory provider
that's compatible with all clients using :c:func:`pci_p2pmem_find()`.
If more than one provider is compatible, the one nearest to all the
clients will be chosen first. If more than one provider is an equal
distance away, the one returned will be chosen at random (the choice is
not merely arbitrary but truly random). This function returns the PCI
device to use for the provider with a reference taken, so when it's no
longer needed it should be returned with pci_dev_put().
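
The selection rule (nearest provider wins, ties broken at random) can be
modelled in plain userspace C. This is a sketch, not the kernel's
implementation: ``pick_provider()`` is a hypothetical name, distances are
passed in as a precomputed array, a negative distance is assumed to mean
"not compatible", and ties are broken with reservoir sampling over
``rand()``, which is uniform only up to the usual modulo bias.

.. code-block:: c

    #include <stdlib.h>

    /*
     * dist[i] is the distance from provider i to all clients, or a
     * negative value if provider i is not compatible. Returns the index
     * of the chosen provider, or -1 when none is compatible.
     */
    static int pick_provider(const int *dist, int n)
    {
        int best = -1, ties = 0, chosen = -1;

        for (int i = 0; i < n; i++) {
            if (dist[i] < 0)
                continue;

            if (best < 0 || dist[i] < best) {
                /* Strictly closer: start a new group of candidates. */
                best = dist[i];
                ties = 1;
                chosen = i;
            } else if (dist[i] == best) {
                /* Equal distance: keep i with probability 1/ties. */
                ties++;
                if (rand() % ties == 0)
                    chosen = i;
            }
        }
        return chosen;
    }

With distances ``{ 4, 2, -1, 6 }`` the model always picks index 1, while
``{ 3, 3, 7 }`` yields index 0 or 1 depending on the random draw.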

Once a provider is selected, the orchestrator can then use
:c:func:`pci_alloc_p2pmem()` and :c:func:`pci_free_p2pmem()` to
allocate and free P2P memory from the provider.
:c:func:`pci_p2pmem_alloc_sgl()` and :c:func:`pci_p2pmem_free_sgl()`
are convenience functions for allocating and freeing scatter-gather
lists with P2P memory.
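
Put together, the find, allocate, free and put steps might be paired up
as in the sketch below. The structures and the bodies of
``pci_p2pmem_find()``, ``pci_dev_put()``, ``pci_alloc_p2pmem()`` and
``pci_free_p2pmem()`` are simplified userspace stand-ins (assumptions made
so the snippet builds outside the kernel), and ``example_orchestrate()``
is a hypothetical name.

.. code-block:: c

    #include <errno.h>
    #include <stddef.h>
    #include <stdlib.h>

    /* --- Userspace stand-ins (assumptions, not the kernel code) --- */
    struct device { int unused; };
    struct pci_dev { int refs; size_t in_use; };

    static struct pci_dev the_provider;

    static struct pci_dev *pci_p2pmem_find(struct device *client)
    {
        (void)client;
        the_provider.refs++;        /* returned with a reference taken */
        return &the_provider;
    }

    static void pci_dev_put(struct pci_dev *pdev)
    {
        pdev->refs--;
    }

    static void *pci_alloc_p2pmem(struct pci_dev *pdev, size_t size)
    {
        void *addr = malloc(size);

        if (addr)
            pdev->in_use += size;
        return addr;
    }

    static void pci_free_p2pmem(struct pci_dev *pdev, void *addr,
                                size_t size)
    {
        free(addr);
        pdev->in_use -= size;
    }

    /* --- What an orchestrator would contain (hypothetical name) --- */
    static int example_orchestrate(struct device *client, size_t len)
    {
        struct pci_dev *provider = pci_p2pmem_find(client);
        void *buf;

        if (!provider)
            return -ENODEV;

        buf = pci_alloc_p2pmem(provider, len);
        if (!buf) {
            pci_dev_put(provider);
            return -ENOMEM;
        }

        /* ... hand buf to the clients for their DMA transfers ... */

        pci_free_p2pmem(provider, buf, len);
        pci_dev_put(provider);      /* drop the reference from find */
        return 0;
    }

Every exit path after a successful find drops the provider reference, and
the buffer is freed with the same size it was allocated with.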

Struct Page Caveats
-------------------

Driver writers should be very careful about not passing these special
struct pages to code that isn't prepared for them. At this time, the
kernel interfaces do not have any checks for ensuring this. This obviously
precludes passing these pages to userspace.

P2P memory is also technically IO memory but should never have any side
effects behind it. Thus, the order of loads and stores should not be
important and ioreadX(), iowriteX() and friends should not be necessary.


P2P DMA Support Library
=======================

.. kernel-doc:: drivers/pci/p2pdma.c
   :export: