=========================
ALSA Compress-Offload API
=========================

Pierre-Louis.Bossart <pierre-louis.bossart@linux.intel.com>

Vinod Koul <vinod.koul@linux.intel.com>


Overview
========
Since its early days, the ALSA API was defined with PCM support or
constant-bitrate payloads such as IEC61937 in mind. Arguments and
returned values in frames are the norm, making it a challenge to
extend the existing API to compressed data streams.

In recent years, audio digital signal processors (DSPs) have been
integrated into system-on-chip designs, and DSPs are also integrated
into audio codecs. Processing compressed data on such DSPs results in a
dramatic reduction of power consumption compared to host-based
processing. Support for such hardware has not been very good in Linux,
mostly because of the lack of a generic API available in the mainline
kernel.

Rather than requiring a compatibility break with an API change of the
ALSA PCM interface, a new 'Compressed Data' API is introduced to
provide a control and data-streaming interface for audio DSPs.

The design of this API was inspired by two years of experience with the
Intel Moorestown SOC, with many corrections required to upstream the
API in the mainline kernel instead of the staging tree and make it
usable by others.


Requirements
============
The main requirements are:

- Separation between byte counts and time. Compressed formats may have
  a header per file, per frame, or no header at all. The payload size
  may vary from frame to frame. As a result, it is not possible to
  reliably estimate the duration of audio buffers when handling
  compressed data. Dedicated mechanisms are required to allow for
  reliable audio-video synchronization, which requires precise
  reporting of the number of samples rendered at any given time.

- Handling of multiple formats. PCM data only requires a specification
  of the sampling rate, number of channels and bits per sample. In
  contrast, compressed data comes in a variety of formats. Audio DSPs
  may also provide support for a limited number of audio encoders and
  decoders embedded in firmware, or may support more choices through
  dynamic download of libraries.

- Focus on main formats. This API provides support for the most
  popular formats used for audio and video capture and playback. It is
  likely that as audio compression technology advances, new formats
  will be added.

- Handling of multiple configurations. Even for a given format like
  AAC, some implementations may support multichannel AAC but only
  stereo HE-AAC. Likewise, WMA10 level M3 may require too much memory
  and too many CPU cycles. The new API needs to provide a generic way
  of listing these formats.

- Rendering/Grabbing only. This API does not provide any means of
  hardware acceleration, where PCM samples are provided back to
  user-space for additional processing. This API focuses instead on
  streaming compressed data to a DSP, with the assumption that the
  decoded samples are routed to a physical output or logical back-end.

- Complexity hiding. Existing user-space multimedia frameworks all
  have existing enums/structures for each compressed format. This new
  API assumes the existence of a platform-specific compatibility layer
  to expose, translate and make use of the capabilities of the audio
  DSP, e.g. Android HAL or PulseAudio sinks. By construction, regular
  applications are not supposed to make use of this API.


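The first requirement can be illustrated with a small sketch. The frame
structure and both helpers below are hypothetical and not part of the
API; they only show that two streams with the same byte count can have
different durations when the payload size varies from frame to frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one compressed frame (not an ALSA type). */
struct frame {
	uint32_t payload_bytes;	/* compressed size, may vary per frame */
	uint32_t samples;	/* PCM samples produced once decoded */
};

/* Sum of the compressed payload sizes, in bytes. */
static uint64_t total_bytes(const struct frame *f, size_t n)
{
	uint64_t bytes = 0;

	for (size_t i = 0; i < n; i++)
		bytes += f[i].payload_bytes;
	return bytes;
}

/* Playback duration in microseconds; it depends on sample counts only. */
static uint64_t duration_us(const struct frame *f, size_t n, uint32_t rate)
{
	uint64_t samples = 0;

	assert(rate > 0);
	for (size_t i = 0; i < n; i++)
		samples += f[i].samples;
	return samples * 1000000ULL / rate;
}
```

With two frames of 100 and 900 bytes on one side and a single frame of
1000 bytes on the other, each frame decoding to 1152 samples,
total_bytes is identical while duration_us differs by a factor of two.
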
Design
======
The new API shares a number of concepts with the PCM API for flow
control. Start, pause, resume, drain and stop commands have the same
semantics no matter what the content is.

The concept of a memory ring buffer divided into a set of fragments is
borrowed from the ALSA PCM API. However, only sizes in bytes can be
specified.

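A minimal sketch of such a byte-only ring buffer description follows.
The ring_cfg structure and the helper names are assumptions made for
illustration; they are not the structures exposed by the API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ring buffer description: all sizes are in bytes. */
struct ring_cfg {
	uint32_t fragment_size;	/* bytes per fragment */
	uint32_t fragments;	/* number of fragments in the ring */
};

/* Total ring size in bytes. */
static size_t ring_bytes(const struct ring_cfg *cfg)
{
	return (size_t)cfg->fragment_size * cfg->fragments;
}

/*
 * Bytes that can still be written, given free-running counters of the
 * bytes written by the host and the bytes consumed by the DSP.
 */
static size_t ring_avail(const struct ring_cfg *cfg,
			 uint64_t written, uint64_t consumed)
{
	uint64_t queued;

	assert(written >= consumed);
	queued = written - consumed;
	assert(queued <= ring_bytes(cfg));
	return ring_bytes(cfg) - (size_t)queued;
}
```

Nothing here refers to frames or time: with four fragments of 4096
bytes, the ring holds 16384 bytes regardless of how much audio those
bytes represent.
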
Seeks/trick modes are assumed to be handled by the host.

The notion of rewinds/forwards is not supported. Data committed to the
ring buffer cannot be invalidated, except when dropping all buffers.

The Compressed Data API does not make any assumptions on how the data
is transmitted to the audio DSP. DMA transfers from main memory to an
embedded audio cluster or to a SPI interface for external DSPs are
possible. As in the ALSA PCM case, a core set of routines is exposed;
each driver implementer will have to write support for a set of
mandatory routines and possibly make use of optional ones.

The main additions are:

get_caps
  This routine returns the list of audio formats supported. Querying the
  codecs on a capture stream will return encoders; decoders will be
  listed for playback streams.

get_codec_caps
  For each codec, this routine returns a list of
  capabilities. The intent is to make sure all the capabilities
  correspond to valid settings, and to minimize the risks of
  configuration failures. For example, for a complex codec such as AAC,
  the number of channels supported may depend on a specific profile. If
  the capabilities were exposed with a single descriptor, it may happen
  that a specific combination of profiles/channels/formats may not be
  supported. Likewise, embedded DSPs have limited memory and CPU cycles,
  so it is likely that some implementations make the list of
  capabilities dynamic and dependent on existing workloads. In addition
  to codec settings, this routine returns the minimum buffer size
  handled by the implementation. This information can be a function of
  the DMA buffer sizes, the number of bytes required to synchronize,
  etc., and can be used by userspace to define how much needs to be
  written in the ring buffer before playback can start.

set_params
  This routine sets the configuration chosen for a specific codec. The
  most important field in the parameters is the codec type; in most
  cases decoders will ignore other fields, while encoders will strictly
  comply with the settings.

get_params
  This routine returns the actual settings used by the DSP. Changes to
  the settings should remain the exception.

get_timestamp
  The timestamp becomes a multi-field structure. It lists the number
  of bytes transferred, the number of samples processed and the number
  of samples rendered/grabbed. All these values can be used to determine
  the average bitrate, to figure out whether the ring buffer needs to be
  refilled, or to estimate the delay due to decoding/encoding/IO on the
  DSP.

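How the timestamp fields might be combined can be sketched as follows.
The tstamp_sketch structure is hypothetical: its first three fields
follow the description of get_timestamp above, and the sampling_rate
field is an added assumption needed to convert samples to time. The
results are rough estimates, not values reported by the DSP.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical multi-field timestamp (not the kernel structure). */
struct tstamp_sketch {
	uint64_t bytes_transferred;	/* compressed bytes sent to the DSP */
	uint64_t samples_processed;	/* samples decoded so far */
	uint64_t samples_rendered;	/* samples actually played out */
	uint32_t sampling_rate;		/* Hz, assumed to be known */
};

/* Rough average bitrate in bits per second; 0 if nothing was decoded. */
static uint64_t avg_bitrate_bps(const struct tstamp_sketch *t)
{
	if (t->samples_processed == 0)
		return 0;
	return t->bytes_transferred * 8 * t->sampling_rate /
	       t->samples_processed;
}

/* Delay between decoding and rendering, in milliseconds. */
static uint64_t dsp_delay_ms(const struct tstamp_sketch *t)
{
	assert(t->sampling_rate > 0);
	assert(t->samples_processed >= t->samples_rendered);
	return (t->samples_processed - t->samples_rendered) * 1000 /
	       t->sampling_rate;
}
```

For example, 16000 bytes transferred for 44100 samples processed at
44100 Hz gives an average of 128000 bits per second, and 4410 samples
processed but not yet rendered correspond to a delay of 100 ms.
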
Note that the list of codecs/profiles/modes was derived from the
OpenMAX AL specification instead of reinventing the wheel.
Modifications include:

- Addition of FLAC and IEC formats
- Merge of encoder/decoder capabilities
- Profiles/modes listed as bitmasks to make descriptors more compact
- Addition of set_params for decoders (missing in OpenMAX AL)
- Addition of AMR/AMR-WB encoding modes (missing in OpenMAX AL)
- Addition of format information for WMA
- Addition of encoding options when required (derived from OpenMAX IL)
- Addition of rateControlSupported (missing in OpenMAX AL)

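The combination of bitmask profiles and multiple descriptors per codec
can be sketched as below. The PROFILE_* flags, the codec_desc structure
and config_supported are hypothetical names chosen for illustration;
the real profile constants and descriptor layout are defined by the API
headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical profile flags; one bit per profile keeps descriptors compact. */
#define PROFILE_LC	(1u << 0)
#define PROFILE_HE	(1u << 1)
#define PROFILE_HE_V2	(1u << 2)

/* Hypothetical capability descriptor: each one is a valid combination. */
struct codec_desc {
	uint32_t max_channels;
	uint32_t profiles;	/* bitmask of PROFILE_* values */
};

/* True if some descriptor allows this profile with this channel count. */
static bool config_supported(const struct codec_desc *descs, size_t n,
			     uint32_t profile, uint32_t channels)
{
	assert(profile != 0);
	for (size_t i = 0; i < n; i++) {
		if ((descs[i].profiles & profile) &&
		    channels <= descs[i].max_channels)
			return true;
	}
	return false;
}
```

A codec that handles six channels for one profile but only two for the
others would expose two descriptors, so that no invalid combination of
profile and channel count is ever advertised.
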
State Machine
=============

The compressed audio stream state machine is described below ::

                                        +----------+
                                        |          |
                                        |   OPEN   |
                                        |          |
                                        +----------+
                                             |
                                             |
                                             | compr_set_params()
                                             |
                                             v
         compr_free()                  +----------+
  +------------------------------------|          |
  |                                    |   SETUP  |
  |          +-------------------------|          |<-------------------------+
  |          |       compr_write()     +----------+                          |
  |          |                              ^                                |
  |          |                              | compr_drain_notify()           |
  |          |                              |        or                      |
  |          |                              |     compr_stop()               |
  |          |                              |                                |
  |          |                         +----------+                          |
  |          |                         |          |                          |
  |          |                         |   DRAIN  |                          |
  |          |                         |          |                          |
  |          |                         +----------+                          |
  |          |                              ^                                |
  |          |                              |                                |
  |          |                              | compr_drain()                  |
  |          |                              |                                |
  |          v                              |                                |
  |    +----------+                    +----------+                          |
  |    |          |    compr_start()   |          |        compr_stop()      |
  |    | PREPARE  |------------------->|  RUNNING |--------------------------+
  |    |          |                    |          |                          |
  |    +----------+                    +----------+                          |
  |          |                            |    ^                             |
  |          |compr_free()                |    |                             |
  |          |              compr_pause() |    | compr_resume()              |
  |          |                            |    |                             |
  |          v                            v    |                             |
  |    +----------+                   +----------+                           |
  |    |          |                   |          |         compr_stop()      |
  +--->|   FREE   |                   |  PAUSE   |---------------------------+
       |          |                   |          |
       +----------+                   +----------+


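The diagram above can be transcribed into a transition table. This is
only a sketch of the diagram, not the kernel implementation: the enum
names and next_state are hypothetical, and each event is named after
the compr_*() call that labels the corresponding arrow.

```c
#include <assert.h>
#include <stddef.h>

enum state {
	ST_OPEN, ST_SETUP, ST_PREPARE, ST_RUNNING,
	ST_PAUSE, ST_DRAIN, ST_FREE, ST_INVALID
};

enum event {
	EV_SET_PARAMS, EV_WRITE, EV_START, EV_PAUSE, EV_RESUME,
	EV_DRAIN, EV_DRAIN_NOTIFY, EV_STOP, EV_FREE
};

struct transition {
	enum state from;
	enum event ev;
	enum state to;
};

/* One entry per arrow in the diagram. */
static const struct transition transitions[] = {
	{ ST_OPEN,    EV_SET_PARAMS,   ST_SETUP   },
	{ ST_SETUP,   EV_WRITE,        ST_PREPARE },
	{ ST_SETUP,   EV_FREE,         ST_FREE    },
	{ ST_PREPARE, EV_START,        ST_RUNNING },
	{ ST_PREPARE, EV_FREE,         ST_FREE    },
	{ ST_RUNNING, EV_DRAIN,        ST_DRAIN   },
	{ ST_RUNNING, EV_PAUSE,        ST_PAUSE   },
	{ ST_RUNNING, EV_STOP,         ST_SETUP   },
	{ ST_PAUSE,   EV_RESUME,       ST_RUNNING },
	{ ST_PAUSE,   EV_STOP,         ST_SETUP   },
	{ ST_DRAIN,   EV_DRAIN_NOTIFY, ST_SETUP   },
	{ ST_DRAIN,   EV_STOP,         ST_SETUP   },
};

/* Returns the state reached from s on event e, or ST_INVALID. */
static enum state next_state(enum state s, enum event e)
{
	assert(s != ST_INVALID);
	for (size_t i = 0; i < sizeof(transitions) / sizeof(transitions[0]); i++) {
		if (transitions[i].from == s && transitions[i].ev == e)
			return transitions[i].to;
	}
	return ST_INVALID;
}
```

Any event without an arrow in the diagram, such as a start request
while the stream is still in the OPEN state, yields ST_INVALID in this
sketch.
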
Gapless Playback
================
When playing through an album, the decoders have the ability to skip the
encoder delay and padding and move directly from one track's content to
the next. The end user perceives this as gapless playback, as there is no
silence while switching from one track to another.

Also, there might be low-intensity noises due to encoding. Perfect gapless
playback is difficult to reach with all types of compressed data, but works
fine with most music content. The decoder needs to know the encoder delay
and encoder padding, so this information must be passed to the DSP. This
metadata is extracted from ID3/MP4 headers and is not present by default in
the bitstream, hence the need for a new interface to pass this information
to the DSP. The DSP and userspace also need to switch from one track to
another and start using data for the second track.

The main additions are:

set_metadata
  This routine sets the encoder delay and encoder padding. The decoder can
  use these to strip the silence. This needs to be set before the data in
  the track is written.

set_next_track
  This routine tells the DSP that metadata and write operations sent after
  this call correspond to the subsequent track.

partial drain
  This is called when the end of file is reached. Userspace can inform the
  DSP that EOF is reached, so that the DSP can start skipping padding delay.
  The next data written belongs to the next track.

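The effect of the encoder delay and padding can be sketched with simple
arithmetic. The gapless_sketch structure and playable_samples are
hypothetical names, and the sketch assumes that both values are
expressed in samples and that the decoder drops the delay at the start
of the track and the padding at its end.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-track gapless metadata, in samples. */
struct gapless_sketch {
	uint32_t encoder_delay;		/* silence added at the start */
	uint32_t encoder_padding;	/* silence added at the end */
};

/* Samples left once the delay and padding have been stripped. */
static uint64_t playable_samples(uint64_t decoded,
				 const struct gapless_sketch *g)
{
	uint64_t trim = (uint64_t)g->encoder_delay + g->encoder_padding;

	assert(decoded >= trim);
	return decoded - trim;
}
```

With illustrative values of 576 samples of delay and 1104 samples of
padding, a track that decodes to 115200 samples contributes 113520
samples of actual content.
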
Sequence flow for gapless would be:

- Open
- Get caps / codec caps
- Set params
- Set metadata of the first track
- Fill data of the first track
- Trigger start
- User-space finishes sending all data of the first track
- Indicate next track data by sending set_next_track
- Set metadata of the next track
- Call partial_drain to flush most of the buffer in the DSP
- Fill data of the next track
- DSP switches to second track

(note: order for partial_drain and write for next track can be reversed as well)

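The ordering constraints of the track switch can be sketched as a small
checker. The step names and switch_order_ok are hypothetical, and the
rules encode one reading of the sequence above: set_next_track comes
first, the metadata of the next track follows, and partial_drain and
the write of the next track may then happen in either order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Steps issued once all data of the current track has been written. */
enum step {
	STEP_NEXT_TRACK,
	STEP_SET_METADATA,
	STEP_PARTIAL_DRAIN,
	STEP_WRITE_NEXT,
	STEP_COUNT
};

/* True if every step is present and the order matches the rules above. */
static bool switch_order_ok(const enum step *seq, size_t n)
{
	int first[STEP_COUNT];

	for (int k = 0; k < STEP_COUNT; k++)
		first[k] = -1;
	for (size_t i = 0; i < n; i++) {
		assert(seq[i] < STEP_COUNT);
		if (first[seq[i]] < 0)
			first[seq[i]] = (int)i;
	}
	for (int k = 0; k < STEP_COUNT; k++) {
		if (first[k] < 0)
			return false;	/* a required step is missing */
	}
	return first[STEP_NEXT_TRACK] < first[STEP_SET_METADATA] &&
	       first[STEP_SET_METADATA] < first[STEP_PARTIAL_DRAIN] &&
	       first[STEP_SET_METADATA] < first[STEP_WRITE_NEXT];
}
```

Both the listed order and the variant where the write precedes
partial_drain pass this check, while setting the metadata before
set_next_track does not.
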
Gapless Playback SM
===================

For Gapless, we move from running state to partial drain and back, along
with setting of meta_data and signalling for next track ::


                                        +----------+
                compr_drain_notify()    |          |
              +------------------------>|  RUNNING |
              |                         |          |
              |                         +----------+
              |                              |
              |                              |
              |                              | compr_next_track()
              |                              |
              |                              V
              |                         +----------+
              |                         |          |
              |                         |NEXT_TRACK|
              |                         |          |
              |                         +----------+
              |                              |
              |                              |
              |                              | compr_partial_drain()
              |                              |
              |                              V
              |                         +----------+
              |                         |          |
              +------------------------ | PARTIAL_ |
                                        |  DRAIN   |
                                        +----------+

Not supported
=============
- Support for VoIP/circuit-switched calls is not the target of this
  API. Support for dynamic bit-rate changes would require a tight
  coupling between the DSP and the host stack, limiting power savings.

- Packet-loss concealment is not supported. This would require an
  additional interface to let the decoder synthesize data when frames
  are lost during transmission. This may be added in the future.

- Volume control/routing is not handled by this API. Devices exposing a
  compressed data interface will be considered as regular ALSA devices;
  volume changes and routing information will be provided with regular
  ALSA kcontrols.

- Embedded audio effects. Such effects should be enabled in the same
  manner, no matter if the input was PCM or compressed.

- Multichannel IEC encoding. Unclear if this is required.

- Encoding/decoding acceleration is not supported as mentioned
  above. It is possible to route the output of a decoder to a capture
  stream, or even implement transcoding capabilities. This routing
  would be enabled with ALSA kcontrols.

- Audio policy/resource management. This API does not provide any
  hooks to query the utilization of the audio DSP, nor any preemption
  mechanisms.

- No notion of underrun/overrun. Since the bytes written are compressed
  in nature and data written/read doesn't translate directly to
  rendered output in time, this API does not deal with underrun/overrun;
  these may be dealt with in a user-space library.


Credits
=======
- Mark Brown and Liam Girdwood for discussions on the need for this API
- Harsha Priya for her work on intel_sst compressed API
- Rakesh Ughreja for valuable feedback
- Sing Nallasellan, Sikkandar Madar and Prasanna Samaga for
  demonstrating and quantifying the benefits of audio offload on a
  real platform.