==============================
PXA/MMP - DMA Slave controller
==============================

Constraints
===========

a) Transfers hot queuing
A driver submitting a transfer and issuing it should be guaranteed that the
transfer is queued even on a running DMA channel.
This implies that queuing doesn't wait for the end of the previous transfer,
and that descriptor chaining is not done only in the irq/tasklet code
triggered by the end of a transfer.
A transfer which is submitted and issued on a phy doesn't wait for the phy to
stop and restart, but is submitted on a "running channel". Other
drivers, especially mmp_pdma, waited for the phy to stop before launching
a new transfer.

b) All transfers having asked for confirmation should be signaled
Any issued transfer with DMA_PREP_INTERRUPT should trigger a callback call.
This implies that even if an irq/tasklet is triggered by the end of tx1, but
tx2 has already finished by the time the irq is handled, both
tx1->complete() and tx2->complete() should be called.

c) Channel running state
A driver should be able to query whether a channel is running or not. For the
multimedia case, such as video capture, if a transfer is submitted and then
a check of the DMA channel reports a "stopped channel", the transfer should
not be issued until the next "start of frame interrupt", hence the need to
know whether a channel is in the running or the stopped state.

d) Bandwidth guarantee
The PXA architecture has 4 levels of DMA priority, among them high, normal
and low.
The high priority gets twice as much bandwidth as the normal one, which gets
twice as much as the low one.
A driver should be able to request a priority, especially real-time
drivers such as pxa_camera with (big) throughputs.
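
The ratio above can be turned into numbers. The following is only an
arithmetic sketch: it considers just the three levels named in the text,
assumes all of them are saturated at the same time, and uses hypothetical
names (``enum prio``, ``prio_weight()``, ``prio_share_percent()``) that are
not taken from the driver.

```c
/* Priority levels named in the text; the enum names are hypothetical. */
enum prio { PRIO_HIGH, PRIO_NORMAL, PRIO_LOW, PRIO_COUNT };

/* Relative weight: each level gets twice the bandwidth of the one below. */
unsigned int prio_weight(enum prio p)
{
	return 1u << (PRIO_LOW - p);	/* high = 4, normal = 2, low = 1 */
}

/* Share of the total bandwidth for one level, in percent (rounded down). */
unsigned int prio_share_percent(enum prio p)
{
	unsigned int total = 0;

	for (int i = 0; i < PRIO_COUNT; i++)
		total += prio_weight((enum prio)i);
	return 100u * prio_weight(p) / total;
}
```

Under these assumptions the weights are 4:2:1, so a high priority requestor
would get about 4/7 of the bandwidth and a low priority one about 1/7.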

Design
======
a) Virtual channels
Same concept as in the sa11x0 driver, i.e. a driver is assigned a "virtual
channel" linked to the requestor line, and the physical DMA channel is
assigned on the fly when the transfer is issued.
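
A minimal sketch of that late binding is given below. It is not the driver's
code: ``struct phy``, ``struct vchan``, ``vchan_get_phy()`` and
``vchan_put_phy()`` are hypothetical names, and releasing the phy once the
channel has nothing left to run is an assumption made for the illustration.

```c
#include <stddef.h>

struct phy {
	int busy;		/* non-zero while a virtual channel owns it */
};

struct vchan {
	int requestor;		/* requestor line the client asked for */
	struct phy *phy;	/* NULL until a transfer is issued */
};

/* Called at issue time: bind the virtual channel to a free phy, if any. */
struct phy *vchan_get_phy(struct vchan *vc, struct phy *phys, size_t nb_phys)
{
	if (vc->phy)
		return vc->phy;		/* already bound */
	for (size_t i = 0; i < nb_phys; i++) {
		if (!phys[i].busy) {
			phys[i].busy = 1;
			vc->phy = &phys[i];
			return vc->phy;
		}
	}
	return NULL;			/* every phy is in use */
}

/* Assumed release point: the channel has nothing left to run. */
void vchan_put_phy(struct vchan *vc)
{
	if (vc->phy) {
		vc->phy->busy = 0;
		vc->phy = NULL;
	}
}
```

With two phys and three virtual channels, the third ``vchan_get_phy()`` call
returns NULL until one of the first two channels gives its phy back.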

b) Transfer anatomy for a scatter-gather transfer

::

   +------------+-----+---------------+----------------+-----------------+
   | desc-sg[0] | ... | desc-sg[last] | status updater | finisher/linker |
   +------------+-----+---------------+----------------+-----------------+

This structure is pointed to by dma->sg_cpu.
The descriptors are used as follows:

    - desc-sg[i]: i-th descriptor, transferring the i-th sg
      element to the video buffer scatter gather

    - status updater: transfers a single u32 to a well-known DMA coherent
      memory location to leave a trace that this transfer is done. The
      "well-known" location is unique per physical channel, meaning that a
      read of this value will tell which is the last finished transfer at
      that point in time.

    - finisher: has ddadr=DADDR_STOP, dcmd=ENDIRQEN

    - linker: has ddadr=desc-sg[0] of next transfer, dcmd=0
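
The layout above can be sketched as an array of descriptors. This is a
simplified model, not the driver's structures: ``struct desc`` keeps only the
two fields discussed here, the numeric values given to ``DADDR_STOP`` and
``ENDIRQEN`` are placeholders, bus addresses are faked from a ``base``
argument, and ``build_transfer()`` and ``make_linker()`` are hypothetical
helpers.

```c
#include <stddef.h>
#include <stdint.h>

#define DADDR_STOP	1u		/* placeholder: "no next descriptor" */
#define ENDIRQEN	(1u << 21)	/* placeholder: irq at end of descriptor */

/* Simplified descriptor: only the two fields discussed in the text. */
struct desc {
	uint32_t ddadr;		/* address of the next descriptor, or DADDR_STOP */
	uint32_t dcmd;		/* command bits */
};

/*
 * Lay out one transfer in descs[]: nb_sg sg descriptors, one status
 * updater, then a finisher. base is the (fake) bus address of descs[0].
 * descs[] must hold nb_sg + 2 entries.
 * Returns the index of the last descriptor, the finisher/linker slot.
 */
size_t build_transfer(struct desc *descs, size_t nb_sg, uint32_t base)
{
	size_t last = nb_sg + 1;	/* sg[0..nb_sg-1], updater, finisher */

	for (size_t i = 0; i < last; i++) {
		descs[i].ddadr = base + (uint32_t)((i + 1) * sizeof(*descs));
		descs[i].dcmd = 0;	/* real length and flags omitted */
	}
	descs[last].ddadr = DADDR_STOP;	/* finisher: end of the chain ... */
	descs[last].dcmd = ENDIRQEN;	/* ... with an end irq */
	return last;
}

/* Turn a finisher into a linker pointing at the next transfer. */
void make_linker(struct desc *last, uint32_t next_first)
{
	last->dcmd = 0;
	last->ddadr = next_first;
}
```

The sketch ignores memory ordering and the races that matter when the
hardware is reading the last descriptor while it is being rewritten.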

c) Transfers hot-chaining
Suppose the running chain is:

::

   Buffer 1              Buffer 2
   +---------+----+---+  +----+----+----+---+
   | d0 | .. | dN | l |  | d0 | .. | dN | f |
   +---------+----+-|-+  ^----+----+----+---+
                    |    |
                    +----+

After a call to dmaengine_submit(b3), the chain will look like:

::

   Buffer 1              Buffer 2              Buffer 3
   +---------+----+---+  +----+----+----+---+  +----+----+----+---+
   | d0 | .. | dN | l |  | d0 | .. | dN | l |  | d0 | .. | dN | f |
   +---------+----+-|-+  ^----+----+----+-|-+  ^----+----+----+---+
                    |    |                |    |
                    +----+                +----+
                                         new_link

If the DMA channel stopped while new_link was being created, it is _not_
restarted. Hot-chaining doesn't break the assumption that
dma_async_issue_pending() is to be used to ensure the transfer is actually started.

One exception to this rule:

- if Buffer1 and Buffer2 had all their addresses 8 bytes aligned

- and if Buffer3 has at least one address not 4 bytes aligned

- then hot-chaining cannot happen, as the channel must be stopped, the
  "align bit" must be set, and the channel restarted. As a consequence,
  tx_submit() of such a transfer will put it on the submitted queue,
  not chained, if the DMA is already running in aligned mode.
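
The decision taken at submit time can be sketched as a predicate. This is a
simplified reading of the rules above and of the queue example at the end of
this document, not the driver's code: ``struct chan_state`` and
``can_hot_chain()`` are hypothetical, alignment is passed in as a flag rather
than computed, and the ``issued_unchained`` check is an assumption derived
from the tx5 case of that example.

```c
#include <stdbool.h>

struct chan_state {
	bool running;		/* hardware is walking a descriptor chain */
	bool misaligned_mode;	/* channel was started with the "align bit" set */
	bool issued_unchained;	/* an issued transfer waits outside that chain */
};

/*
 * Decide whether a newly submitted transfer may be linked onto the chain
 * the hardware is currently running.
 */
bool can_hot_chain(const struct chan_state *c, bool tx_misaligned)
{
	if (!c->running)
		return false;	/* nothing to chain onto */
	if (c->issued_unchained)
		return false;	/* must not overtake an issued transfer */
	if (tx_misaligned && !c->misaligned_mode)
		return false;	/* needs stop, align bit set, restart */
	return true;
}
```

When the predicate is false the transfer simply stays on the submitted
queue, and it is started later through the normal issue path.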

d) Transfers completion updater
Each time a transfer is completed on a channel, an interrupt might be
generated or not, depending on the client's request. But in each case, the
last descriptor of a transfer, the "status updater", will write the latest
completed transfer into the physical channel's completion mark.

This will speed up residue calculation for large transfers such as video
buffers, which hold around 6k descriptors or more. It also makes it possible
to find out, without any lock, which is the latest completed transfer in a
running DMA chain.

e) Transfers completion, irq and tasklet
When a transfer flagged as "DMA_PREP_INTERRUPT" is finished, the dma irq
is raised. Upon this interrupt, a tasklet is scheduled for the physical
channel.

The tasklet is responsible for:

- reading the physical channel's last updater mark

- calling all the transfer callbacks of finished transfers, based on
  that mark and on each transfer's flags.

If a transfer is completed while this handling is done, a dma irq will
be raised, and the tasklet will be scheduled once again, having a new
updater mark.
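
The tasklet's work can be sketched as a walk over the issued transfers. This
is a user-space model under stated assumptions, not the driver's code:
``struct tx``, ``complete_up_to_mark()`` and ``count_cb()`` are hypothetical,
each transfer's status updater is assumed to write a unique ``cookie`` value
into the mark, and ``want_irq`` stands in for DMA_PREP_INTERRUPT.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct tx {
	uint32_t cookie;		/* value its status updater writes */
	bool want_irq;			/* stands in for DMA_PREP_INTERRUPT */
	void (*callback)(void *arg);
	void *arg;
	bool done;
};

/* Example callback: counts how many times it was called. */
void count_cb(void *arg)
{
	++*(int *)arg;
}

/*
 * txs[] holds the issued transfers in issue order, mark is the value read
 * from the channel's completion mark. Every transfer up to and including
 * the one that wrote the mark is finished.
 * Returns the number of transfers newly completed.
 */
size_t complete_up_to_mark(struct tx *txs, size_t nb, uint32_t mark)
{
	size_t last = nb;	/* index of the transfer that wrote the mark */
	size_t completed = 0;

	for (size_t i = 0; i < nb; i++)
		if (txs[i].cookie == mark)
			last = i;
	if (last == nb)
		return 0;	/* mark belongs to no issued transfer */

	for (size_t i = 0; i <= last; i++) {
		if (txs[i].done)
			continue;
		txs[i].done = true;
		completed++;
		if (txs[i].want_irq && txs[i].callback)
			txs[i].callback(txs[i].arg);
	}
	return completed;
}
```

Because the walk is driven by the mark and not by the number of interrupts,
a single run signals both tx1 and tx2 when tx2 finished before the irq for
tx1 was handled, which is what constraint b) asks for.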

f) Residue
Residue granularity will be descriptor based. The issued but not completed
transfers will be scanned for all of their descriptors against the
currently running descriptor.
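
A descriptor-granular residue can be sketched as follows. The code is an
illustration only: ``struct sg_desc`` and ``tx_residue()`` are hypothetical,
addresses are fake, and counting the running descriptor as entirely pending
is an assumption about what "descriptor based" means here.

```c
#include <stddef.h>
#include <stdint.h>

struct sg_desc {
	uint32_t addr;	/* (fake) bus address of this descriptor */
	uint32_t len;	/* bytes moved by this descriptor */
};

/*
 * Residue of one issued, not yet completed transfer, given the address of
 * the descriptor the hardware is currently running. Descriptors before it
 * are counted as done, the running one and those after it as still to do.
 * If the running descriptor is not part of this transfer, the transfer has
 * not started and its whole length is pending.
 */
uint32_t tx_residue(const struct sg_desc *d, size_t nb, uint32_t running)
{
	uint32_t total = 0, after = 0;
	int found = 0;

	for (size_t i = 0; i < nb; i++) {
		total += d[i].len;
		if (d[i].addr == running)
			found = 1;
		if (found)
			after += d[i].len;
	}
	return found ? after : total;
}
```

The scan is linear in the number of descriptors, which is why the completion
mark of section d) matters: transfers already known to be complete do not
need to be scanned at all.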

g) Most complicated case of driver's tx queues
The trickiest situation is when:

 - there are transfers that are not "acked" (tx0)

 - a driver submitted an aligned tx1, not chained

 - a driver submitted an aligned tx2 => tx2 is cold chained to tx1

 - a driver issued tx1+tx2 => channel is running in aligned mode

 - a driver submitted an aligned tx3 => tx3 is hot-chained

 - a driver submitted an unaligned tx4 => tx4 is put in submitted queue,
   not chained

 - a driver issued tx4 => tx4 is put in issued queue, not chained

 - a driver submitted an aligned tx5 => tx5 is put in submitted queue, not
   chained

 - a driver submitted an aligned tx6 => tx6 is put in submitted queue,
   cold chained to tx5

 This translates into (after tx4 is issued):

 - issued queue

 ::

      +-----+ +-----+ +-----+ +-----+
      | tx1 | | tx2 | | tx3 | | tx4 |
      +---|-+ ^---|-+ ^-----+ +-----+
          |   |   |   |
          +---+   +---+

 - submitted queue

 ::

      +-----+ +-----+
      | tx5 | | tx6 |
      +---|-+ ^-----+
          |   |
          +---+

 - completed queue: empty

 - allocated queue: tx0

It should be noted that after tx3 is completed, the channel is stopped, and
restarted in "unaligned mode" to handle tx4.

Author: Robert Jarzmik <robert.jarzmik@free.fr>