cachepc-linux

Fork of AMDESE/linux with modifications for CachePC side-channel attack
git clone https://git.sinitax.com/sinitax/cachepc-linux

===========
NTB Drivers
===========

NTB (Non-Transparent Bridge) is a type of PCI-Express bridge chip that connects
the separate memory systems of two or more computers to the same PCI-Express
fabric. Existing NTB hardware supports a common feature set: doorbell
registers and memory translation windows, as well as non-common features like
scratchpad and message registers. Scratchpad registers are readable and writable
registers that are accessible from either side of the device, so that peers can
exchange a small amount of information at a fixed address. Message registers can
be used for the same purpose. Additionally, they provide special status bits
to make sure the information isn't overwritten by another peer. Doorbell
registers provide a way for peers to send interrupt events. Memory windows
allow translated read and write access to the peer memory.

NTB Core Driver (ntb)
=====================

The NTB core driver defines an API wrapping the common feature set, and allows
clients interested in NTB features to discover the NTB devices supported by
hardware drivers.  The term "client" is used here to mean an upper layer
component making use of the NTB API.  The term "driver," or "hardware driver,"
is used here to mean a driver for a specific vendor and model of NTB hardware.

NTB Client Drivers
==================

NTB client drivers should register with the NTB core driver.  After
registering, the client probe and remove functions will be called appropriately
as NTB hardware, or hardware drivers, are inserted and removed.  The
registration uses the Linux Device framework, so it should feel familiar to
anyone who has written a PCI driver.

NTB Typical client driver implementation
----------------------------------------

The primary purpose of NTB is to share a piece of memory between at least two
systems. So the NTB device features like scratchpad and message registers are
mainly used to perform the proper memory window initialization. Typically
there are two types of memory window interfaces supported by the NTB API:
inbound translation configured on the local NTB port and outbound translation
configured by the peer, on the peer NTB port. The first type is
depicted in the following figure::

 Inbound translation:

 Memory:              Local NTB Port:      Peer NTB Port:      Peer MMIO:
  ____________
 | dma-mapped |-ntb_mw_set_trans(addr)  |
 | memory     |        _v____________   |   ______________
 | (addr)     |<======| MW xlat addr |<====| MW base addr |<== memory-mapped IO
 |------------|       |--------------|  |  |--------------|

A typical scenario for initializing a memory window of the first type looks
like this: 1) allocate a memory region, 2) put the translated address into the
NTB config, 3) somehow notify the peer device that initialization is done,
4) the peer device maps the corresponding outbound memory window to gain
access to the shared memory region.

The second type of interface, in which the shared windows are
initialized by a peer device, is depicted in the following figure::

 Outbound translation:

 Memory:        Local NTB Port:    Peer NTB Port:      Peer MMIO:
  ____________                      ______________
 | dma-mapped |                |   | MW base addr |<== memory-mapped IO
 | memory     |                |   |--------------|
 | (addr)     |<===================| MW xlat addr |<-ntb_peer_mw_set_trans(addr)
 |------------|                |   |--------------|

A typical scenario for initializing the second type of interface would be:
1) allocate a memory region, 2) somehow deliver the translated address to the
peer device, 3) the peer puts the translated address into the NTB config,
4) the peer device maps the outbound memory window to gain access to the
shared memory region.

As one can see, the described scenarios can be combined into one portable
algorithm.

 Local device:
  1) Allocate memory for a shared window
  2) Initialize the memory window with the translated address of the
     allocated region (it may fail if local memory window initialization
     is unsupported)
  3) Send the translated address and memory window index to a peer device

 Peer device:
  1) Initialize the memory window with the retrieved address of the memory
     region allocated by the other device (it may fail if peer memory window
     initialization is unsupported)
  2) Map the outbound memory window

In accordance with this scenario, the NTB Memory Window API can be used as
follows:

 Local device:
  1) ntb_mw_count(pidx) - retrieve the number of memory ranges that can
     be allocated for memory windows between the local device and the peer
     device of the port with the specified index.
  2) ntb_mw_get_align(pidx, midx) - retrieve parameters restricting the
     shared memory region alignment and size. Then memory can be properly
     allocated.
  3) Allocate a physically contiguous memory region in compliance with the
     restrictions retrieved in 2).
  4) ntb_mw_set_trans(pidx, midx) - try to set the translation address of
     the memory window with the specified index for the defined peer device
     (it may fail if local translated address setting is not supported)
  5) Send the translated base address (usually together with the memory
     window number) to the peer device using, for instance, scratchpad or
     message registers.

 Peer device:
  1) ntb_peer_mw_set_trans(pidx, midx) - try to set the translated address
     received from the other device (related to pidx) for the specified
     memory window. It may fail if the retrieved address, for instance,
     exceeds the maximum possible address or isn't properly aligned.
  2) ntb_peer_mw_get_addr(widx) - retrieve the MMIO address to map the memory
     window and so gain access to the shared memory.

It is also worth noting that ntb_mw_count(pidx) should return the
same value as ntb_peer_mw_count() on the peer with port index pidx.
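
The combined scenario can be modelled in a few lines of user-space C. This is
only a sketch under stated assumptions: ``struct model_port`` and the
``model_*`` functions are hypothetical stand-ins and not the kernel NTB API,
and the rule that setup succeeds when at least one side accepts the
translation is an assumption drawn from the "it may fail if ... unsupported"
notes in the steps above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one NTB port; not the kernel's struct ntb_dev. */
struct model_port {
    bool can_set_inbound;   /* local (inbound) translation setup supported */
    bool can_set_outbound;  /* peer-side (outbound) translation setup supported */
    uint64_t addr_align;    /* required alignment of the translated address */
    uint64_t xlat_addr;     /* programmed translation address, 0 if unset */
};

/* Round addr up to the next multiple of align (a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Local step: try to program the inbound translation. Returns 0 or -1. */
int model_mw_set_trans(struct model_port *local, uint64_t addr)
{
    if (!local->can_set_inbound || (addr & (local->addr_align - 1)))
        return -1;
    local->xlat_addr = addr;
    return 0;
}

/* Peer step: try to program the outbound translation with the received address. */
int model_peer_mw_set_trans(struct model_port *peer, uint64_t addr)
{
    if (!peer->can_set_outbound || (addr & (peer->addr_align - 1)))
        return -1;
    peer->xlat_addr = addr;
    return 0;
}

/*
 * Portable setup: attempt both steps and report success if at least one
 * side accepted the translation (an assumption of this model).
 */
int model_setup_window(struct model_port *local, struct model_port *peer,
                       uint64_t raw_addr)
{
    uint64_t addr = align_up(raw_addr, local->addr_align);
    int local_rc = model_mw_set_trans(local, addr);
    /* "Send" addr to the peer; a real client would use scratchpad or
     * message registers for this. */
    int peer_rc = model_peer_mw_set_trans(peer, addr);

    return (local_rc == 0 || peer_rc == 0) ? 0 : -1;
}
```

In this model a port that cannot set its own translation simply returns an
error, and the caller does not need to know in advance which of the two
interface types the hardware provides.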

NTB Transport Client (ntb\_transport) and NTB Netdev (ntb\_netdev)
------------------------------------------------------------------

The primary client for NTB is the Transport client, used in tandem with NTB
Netdev.  These drivers function together to create a logical link to the peer,
across the NTB, to exchange packets of network data.  The Transport client
establishes a logical link to the peer, and creates queue pairs to exchange
messages and data.  The NTB Netdev then creates an Ethernet device using a
Transport queue pair.  Network data is copied between socket buffers and the
Transport queue pair buffer.  The Transport client may be used for other things
besides Netdev; however, no other applications have yet been written.

NTB Ping Pong Test Client (ntb\_pingpong)
-----------------------------------------

The Ping Pong test client serves as a demonstration to exercise the doorbell
and scratchpad registers of NTB hardware, and as a simple example of an NTB
client.  Ping Pong enables the link when started, waits for the NTB link to
come up, and then proceeds to read and write the doorbell and scratchpad
registers of the NTB.  The peers interrupt each other using a bit mask of
doorbell bits, which is shifted by one in each round, to test the behavior of
multiple doorbell bits and interrupt vectors.  The Ping Pong driver also reads
the first local scratchpad, and writes the value plus one to the first peer
scratchpad, each round before writing the peer doorbell register.
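
The per-round behaviour described above can be sketched as a small user-space
model. The names ``struct pp_side``, ``pp_next_mask()`` and ``pp_round()`` are
hypothetical, the left shift is an assumption (the text only says the mask is
"shifted by one"), and the restart value mirrors the role of a configurable
initial doorbell mask. This is not the driver's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical view of one side's registers; not the driver's state. */
struct pp_side {
    uint64_t db;     /* doorbell bits pending on this side */
    uint32_t spad0;  /* first scratchpad */
};

/*
 * Doorbell mask for the next round: shift by one and, once every bit has
 * left the valid range, start a new series from init_db.
 */
uint64_t pp_next_mask(uint64_t mask, uint64_t valid, uint64_t init_db)
{
    assert((init_db & valid) != 0);  /* a series must start inside the range */
    mask = (mask << 1) & valid;
    return mask ? mask : init_db;
}

/*
 * One round as seen from "self": read the first local scratchpad, write the
 * value plus one to the first peer scratchpad, then ring the peer doorbell.
 */
void pp_round(const struct pp_side *self, struct pp_side *peer, uint64_t mask)
{
    peer->spad0 = self->spad0 + 1;
    peer->db |= mask;
}
```

With three valid doorbell bits and an initial mask of ``0x1``, the model
cycles through ``0x1``, ``0x2``, ``0x4`` and then starts over.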

Module Parameters:

* unsafe - Some hardware has known issues with scratchpad and doorbell
	registers.  By default, Ping Pong will not attempt to exercise such
	hardware.  You may override this behavior at your own risk by setting
	unsafe=1.
* delay\_ms - Specify the delay between receiving a doorbell
	interrupt event and setting the peer doorbell register for the next
	round.
* init\_db - Specify the doorbell bits to start a new series of rounds.  A new
	series begins once all the doorbell bits have been shifted out of
	range.
* dyndbg - It is suggested to specify dyndbg=+p when loading this module, and
	then to observe debugging output on the console.

NTB Tool Test Client (ntb\_tool)
--------------------------------

The Tool test client is primarily for debugging NTB hardware and drivers.
The Tool provides access through debugfs for reading, setting, and clearing the
NTB doorbell, and reading and writing scratchpads.

The Tool does not currently have any module parameters.

Debugfs Files:

* *debugfs*/ntb\_tool/*hw*/
	A directory in debugfs will be created for each
	NTB device probed by the tool.  This directory is shortened to *hw*
	below.
* *hw*/db
	This file is used to read, set, and clear the local doorbell.  Not
	all operations may be supported by all hardware.  To read the doorbell,
	read the file.  To set the doorbell, write `s` followed by the bits to
	set (e.g. `echo 's 0x0101' > db`).  To clear the doorbell, write `c`
	followed by the bits to clear.
* *hw*/mask
	This file is used to read, set, and clear the local doorbell mask.
	See *db* for details.
* *hw*/peer\_db
	This file is used to read, set, and clear the peer doorbell.
	See *db* for details.
* *hw*/peer\_mask
	This file is used to read, set, and clear the peer doorbell
	mask.  See *db* for details.
* *hw*/spad
	This file is used to read and write local scratchpads.  To read
	the values of all scratchpads, read the file.  To write values, write a
	series of pairs of scratchpad number and value
	(e.g. `echo '4 0x123 7 0xabc' > spad`
	sets scratchpads `4` and `7` to `0x123` and `0xabc`, respectively).
* *hw*/peer\_spad
	This file is used to read and write peer scratchpads.  See
	*spad* for details.
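
To make the two write formats concrete, here is a minimal user-space model of
how such commands could be interpreted. It follows only the formats documented
above (``s``/``c`` plus a bit mask for *db*, index/value pairs for *spad*).
The function names are hypothetical and the Tool driver's real parser may
behave differently, for example on malformed input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Apply a "db"-style command to a modelled doorbell value:
 * "s <bits>" sets bits, "c <bits>" clears them. Returns 0 or -1.
 */
int model_db_write(uint64_t *db, const char *cmd)
{
    char op;
    long long bits;

    assert(db != NULL && cmd != NULL);
    /* %lli accepts decimal, octal (0...) and hex (0x...) input. */
    if (sscanf(cmd, " %c %lli", &op, &bits) != 2)
        return -1;
    if (op == 's')
        *db |= (uint64_t)bits;
    else if (op == 'c')
        *db &= ~(uint64_t)bits;
    else
        return -1;
    return 0;
}

/*
 * Apply a "spad"-style write: a series of "<index> <value>" pairs.
 * Returns the number of pairs written, or -1 on an out-of-range index.
 * A trailing unpaired number is ignored in this model.
 */
int model_spad_write(uint32_t *spad, size_t count, const char *buf)
{
    long long idx, val;
    int used, written = 0;

    assert(spad != NULL && buf != NULL);
    while (sscanf(buf, " %lli %lli%n", &idx, &val, &used) == 2) {
        if (idx < 0 || (size_t)idx >= count)
            return -1;
        spad[idx] = (uint32_t)val;
        buf += used;  /* continue after the pair just consumed */
        written++;
    }
    return written;
}
```

Passing the string from the *spad* example, ``"4 0x123 7 0xabc"``, to
``model_spad_write()`` updates entries 4 and 7 of the modelled scratchpad
array and leaves the others untouched.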

NTB MSI Test Client (ntb\_msi\_test)
------------------------------------

The MSI test client serves to test and debug the MSI library, which
allows for passing MSI interrupts across NTB memory windows. The
test client is interacted with through the debugfs filesystem:

* *debugfs*/ntb\_tool/*hw*/
	A directory in debugfs will be created for each
	NTB device probed by the tool.  This directory is shortened to *hw*
	below.
* *hw*/port
	This file describes the local port number.
* *hw*/irq*_occurrences
	One occurrences file exists for each interrupt and, when read,
	returns the number of times the interrupt has been triggered.
* *hw*/peer*/port
	This file describes the port number for each peer.
* *hw*/peer*/count
	This file describes the number of interrupts that can be
	triggered on each peer.
* *hw*/peer*/trigger
	Writing an interrupt number (any number less than the value
	specified in count) will trigger the interrupt on the
	specified peer. The occurrences file for that interrupt on
	the peer should be incremented.
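
The relationship between *count*, *trigger* and the occurrences files can be
summarised with a tiny model. Everything here is hypothetical
(``struct model_msi_peer``, ``model_trigger()``, the fixed array size). It
only restates the rule above that a valid interrupt number is one less than
*count*, and that a successful trigger bumps the matching counter.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_IRQS 8  /* arbitrary capacity chosen for this model */

/* Hypothetical per-peer state mirroring the debugfs files. */
struct model_msi_peer {
    unsigned int count;                         /* what "count" would report */
    unsigned long occurrences[MODEL_MAX_IRQS];  /* one counter per interrupt */
};

/*
 * Model a write of irq to "trigger": accept only numbers below count and
 * increment the matching occurrences counter. Returns 0 or -1.
 */
int model_trigger(struct model_msi_peer *peer, unsigned int irq)
{
    assert(peer != NULL && peer->count <= MODEL_MAX_IRQS);
    if (irq >= peer->count)
        return -1;
    peer->occurrences[irq]++;
    return 0;
}
```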

NTB Hardware Drivers
====================

NTB hardware drivers should register devices with the NTB core driver.  After
registering, the clients' probe and remove functions will be called.

NTB Intel Hardware Driver (ntb\_hw\_intel)
------------------------------------------

The Intel hardware driver supports NTB on Xeon and Atom CPUs.

Module Parameters:

* b2b\_mw\_idx
	If the peer NTB is to be accessed via a memory window, then use
	this memory window to access the peer NTB.  A value of zero or positive
	starts from the first mw idx, and a negative value starts from the last
	mw idx.  Both sides MUST set the same value here!  The default value is
	`-1`.
* b2b\_mw\_share
	If the peer NTB is to be accessed via a memory window, and if
	the memory window is large enough, still allow the client to use the
	second half of the memory window for address translation to the peer.
* xeon\_b2b\_usd\_bar2\_addr64
	If using B2B topology on Xeon hardware, use
	this 64 bit address on the bus between the NTB devices for the window
	at BAR2, on the upstream side of the link.
* xeon\_b2b\_usd\_bar4\_addr64 - See *xeon\_b2b\_usd\_bar2\_addr64*.
* xeon\_b2b\_usd\_bar4\_addr32 - See *xeon\_b2b\_usd\_bar2\_addr64*.
* xeon\_b2b\_usd\_bar5\_addr32 - See *xeon\_b2b\_usd\_bar2\_addr64*.
* xeon\_b2b\_dsd\_bar2\_addr64 - See *xeon\_b2b\_usd\_bar2\_addr64*.
* xeon\_b2b\_dsd\_bar4\_addr64 - See *xeon\_b2b\_usd\_bar2\_addr64*.
* xeon\_b2b\_dsd\_bar4\_addr32 - See *xeon\_b2b\_usd\_bar2\_addr64*.
* xeon\_b2b\_dsd\_bar5\_addr32 - See *xeon\_b2b\_usd\_bar2\_addr64*.
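
The index convention described for b2b\_mw\_idx (zero or positive counts from
the first memory window, negative counts from the last) can be illustrated
with a short helper. ``resolve_mw_idx()`` is a hypothetical name, and reading
``-1`` as "the last window" is an interpretation of the parameter description,
not code taken from the driver.

```c
#include <assert.h>

/*
 * Map a b2b_mw_idx-style value onto a concrete window index.
 * idx >= 0 counts from the first window, idx < 0 counts back from the
 * last one. Returns -1 if the result falls outside [0, mw_count).
 */
int resolve_mw_idx(int idx, int mw_count)
{
    int resolved;

    assert(mw_count > 0);
    resolved = idx >= 0 ? idx : mw_count + idx;
    if (resolved < 0 || resolved >= mw_count)
        return -1;
    return resolved;
}
```

Under this reading, the default of ``-1`` on a device with four memory windows
selects window 3, which is why both sides need to agree on the value.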