cachepc-linux

Fork of AMDESE/linux with modifications for the CachePC side-channel attack
git clone https://git.sinitax.com/sinitax/cachepc-linux

refcount-vs-atomic.rst


===================================
refcount_t API compared to atomic_t
===================================

.. contents:: :local:

Introduction
============

The goal of the refcount_t API is to provide a minimal API for implementing
an object's reference counters. While the generic architecture-independent
implementation in lib/refcount.c uses atomic operations underneath,
there are a number of differences between some of the ``refcount_*()`` and
``atomic_*()`` functions with regard to their memory ordering guarantees.
This document outlines these differences and provides respective examples
in order to help maintainers validate their code against the change in
these memory ordering guarantees.

The terms used throughout this document try to follow the formal LKMM defined
in tools/memory-model/Documentation/explanation.txt.

memory-barriers.txt and atomic_t.txt provide more background on memory
ordering in general and on atomic operations specifically.

Relevant types of memory ordering
=================================

.. note:: The following section only covers some of the memory
   ordering types that are relevant for atomics and reference
   counters and that are used throughout this document. For a much
   broader picture please consult the memory-barriers.txt document.

In the absence of any memory ordering guarantees (i.e. fully unordered),
atomics and refcounters only provide atomicity and the
program order (po) relation (on the same CPU). This guarantees that
each ``atomic_*()`` and ``refcount_*()`` operation is atomic and that
instructions are executed in program order on a single CPU.
This is implemented using READ_ONCE()/WRITE_ONCE() and
compare-and-swap primitives.

A strong (full) memory ordering guarantees that all prior loads and
stores (all po-earlier instructions) on the same CPU are completed
before any po-later instruction is executed on the same CPU.
It also guarantees that all po-earlier stores on the same CPU
and all propagated stores from other CPUs must propagate to all
other CPUs before any po-later instruction is executed on the original
CPU (A-cumulative property). This is implemented using smp_mb().

A RELEASE memory ordering guarantees that all prior loads and
stores (all po-earlier instructions) on the same CPU are completed
before the operation. It also guarantees that all po-earlier
stores on the same CPU and all propagated stores from other CPUs
must propagate to all other CPUs before the release operation
(A-cumulative property). This is implemented using
smp_store_release().

An ACQUIRE memory ordering guarantees that all later loads and
stores (all po-later instructions) on the same CPU are
completed after the acquire operation. It also guarantees that all
po-later stores on the same CPU must propagate to all other CPUs
after the acquire operation executes. This is implemented using
smp_acquire__after_ctrl_dep().

A control dependency (on success) for refcounters guarantees that
if a reference for an object was successfully obtained (the reference
counter increment or addition happened and the function returned true),
then further stores are ordered against this operation.
Control dependencies on stores are not implemented using any explicit
barriers, but rely on the CPU not speculating on stores. This is only
a single-CPU relation and provides no guarantees for other CPUs.

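As a minimal sketch (the ``struct obj`` type and its ``dirty`` field below
are made up for illustration, not taken from kernel code), a store that
follows a successful increment cannot be speculated ahead of it on the
same CPU::

    #include <linux/refcount.h>

    struct obj {
            refcount_t ref;
            int dirty;
    };

    static bool obj_mark_dirty(struct obj *o)
    {
            if (!refcount_inc_not_zero(&o->ref))
                    return false;

            /*
             * The control dependency on the successful increment keeps
             * this store from being speculated before the increment on
             * this CPU; it gives no ordering guarantee to other CPUs.
             */
            o->dirty = 1;
            return true;
    }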

Comparison of functions
=======================

case 1) - non-"Read/Modify/Write" (RMW) ops
-------------------------------------------

Function changes:

 * atomic_set() --> refcount_set()
 * atomic_read() --> refcount_read()

Memory ordering guarantee changes:

 * none (both fully unordered)

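For instance, a conversion typically only swaps the counter type and the
accessors; the ``struct foo`` used here and below is made up for
illustration::

    #include <linux/refcount.h>

    struct foo {
            refcount_t users;       /* was: atomic_t users; */
    };

    static void foo_init_counter(struct foo *f)
    {
            refcount_set(&f->users, 1);      /* was: atomic_set(&f->users, 1); */
    }

    static unsigned int foo_count_users(const struct foo *f)
    {
            return refcount_read(&f->users); /* was: atomic_read(&f->users); */
    }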

case 2) - increment-based ops that return no value
--------------------------------------------------

Function changes:

 * atomic_inc() --> refcount_inc()
 * atomic_add() --> refcount_add()

Memory ordering guarantee changes:

 * none (both fully unordered)

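An illustrative conversion (hypothetical ``struct foo``; note that both
refcount_inc() and refcount_add() expect the counter to already be
non-zero, i.e. the caller must already hold a reference)::

    #include <linux/refcount.h>

    struct foo {
            refcount_t users;
    };

    /* Take one extra reference; the caller must already hold one. */
    static void foo_get(struct foo *f)
    {
            refcount_inc(&f->users);        /* was: atomic_inc(&f->users); */
    }

    /* Take several extra references at once. */
    static void foo_get_many(struct foo *f, int n)
    {
            refcount_add(n, &f->users);     /* was: atomic_add(n, &f->users); */
    }
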
case 3) - decrement-based RMW ops that return no value
------------------------------------------------------

Function changes:

 * atomic_dec() --> refcount_dec()

Memory ordering guarantee changes:

 * fully unordered --> RELEASE ordering

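For example (hypothetical ``struct foo``), the RELEASE ordering ensures
that this CPU's earlier stores into the object are not reordered past the
decrement, so a later final refcount_dec_and_test() (case 5) on another
CPU observes them before freeing the object::

    #include <linux/refcount.h>

    struct foo {
            refcount_t users;
            unsigned long hits;
    };

    /* Drop a reference that is known not to be the last one. */
    static void foo_put_not_last(struct foo *f)
    {
            f->hits++;               /* ordered before the decrement ... */
            refcount_dec(&f->users); /* ... by its RELEASE semantics     */
    }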

case 4) - increment-based RMW ops that return a value
-----------------------------------------------------

Function changes:

 * atomic_inc_not_zero() --> refcount_inc_not_zero()
 * no atomic counterpart --> refcount_add_not_zero()

Memory ordering guarantee changes:

 * fully ordered --> control dependency on success for stores

.. note:: We really assume here that the necessary ordering is provided as
   a result of obtaining the pointer to the object!

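A sketch of the usual lookup-and-get pattern (``foo_lookup()`` is a
hypothetical helper, e.g. an RCU- or lock-protected table walk, and
``struct foo`` is again made up)::

    #include <linux/refcount.h>

    struct foo {
            refcount_t users;
    };

    /* Hypothetical lookup, e.g. under rcu_read_lock() or a table lock. */
    struct foo *foo_lookup(unsigned long key);

    static struct foo *foo_find_get(unsigned long key)
    {
            struct foo *f = foo_lookup(key);

            /*
             * On success only the control dependency orders later stores;
             * the ordering that makes the lookup itself safe has to come
             * from the lookup scheme (RCU, locking, ...), as the note
             * above points out.
             */
            if (f && !refcount_inc_not_zero(&f->users))
                    f = NULL;       /* already on its way to being freed */

            return f;
    }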

case 5) - generic dec/sub decrement-based RMW ops that return a value
---------------------------------------------------------------------

Function changes:

 * atomic_dec_and_test() --> refcount_dec_and_test()
 * atomic_sub_and_test() --> refcount_sub_and_test()

Memory ordering guarantee changes:

 * fully ordered --> RELEASE ordering + ACQUIRE ordering on success

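The typical put path then looks as follows (hypothetical ``struct foo``);
the RELEASE ordering makes this CPU's earlier accesses to the object
happen before the decrement, and the ACQUIRE ordering on the final
decrement lets the CPU that hits zero also see the accesses other CPUs
performed before dropping their references, so freeing is safe::

    #include <linux/refcount.h>
    #include <linux/slab.h>

    struct foo {
            refcount_t users;
    };

    static void foo_put(struct foo *f)
    {
            /*
             * Only the CPU that performs the 1 -> 0 transition gets
             * true here, and the RELEASE + ACQUIRE pairing guarantees
             * that it sees every earlier access to *f, so kfree() is
             * safe.
             */
            if (refcount_dec_and_test(&f->users))
                    kfree(f);
    }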

case 6) - other decrement-based RMW ops that return a value
-----------------------------------------------------------

Function changes:

 * no atomic counterpart --> refcount_dec_if_one()
 * ``atomic_add_unless(&var, -1, 1)`` --> ``refcount_dec_not_one(&var)``

Memory ordering guarantee changes:

 * fully ordered --> RELEASE ordering + control dependency

.. note:: atomic_add_unless() only provides full ordering on success.

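As an illustration (hypothetical ``struct foo`` and helpers):
refcount_dec_if_one() attempts the 1 -> 0 transition only when the caller
is the sole user, while refcount_dec_not_one() drops a reference only as
long as it is not the last one::

    #include <linux/refcount.h>

    struct foo {
            refcount_t users;
    };

    /* Succeeds (and returns true) only if we were the sole user. */
    static bool foo_try_takedown(struct foo *f)
    {
            return refcount_dec_if_one(&f->users);
    }

    /*
     * Fast-path put: returns false when the caller holds the last
     * reference and has to take a slower teardown path instead (for
     * example the lock-based one in case 7 below).
     */
    static bool foo_put_fast(struct foo *f)
    {
            return refcount_dec_not_one(&f->users);
    }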

case 7) - lock-based RMW
------------------------

Function changes:

 * atomic_dec_and_lock() --> refcount_dec_and_lock()
 * atomic_dec_and_mutex_lock() --> refcount_dec_and_mutex_lock()

Memory ordering guarantee changes:

 * fully ordered --> RELEASE ordering + control dependency + the
   respective lock held on success
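
A common use is tearing an object out of a lock-protected structure on the
final put; the ``struct foo``, its list linkage and ``foo_lock`` below are
illustrative::

    #include <linux/list.h>
    #include <linux/refcount.h>
    #include <linux/slab.h>
    #include <linux/spinlock.h>

    struct foo {
            refcount_t users;
            struct list_head node;  /* protected by foo_lock */
    };

    static DEFINE_SPINLOCK(foo_lock);

    static void foo_put(struct foo *f)
    {
            /*
             * On the final put this returns true with foo_lock held,
             * so the object can be unlinked and freed without racing
             * against a concurrent lookup done under foo_lock.
             */
            if (refcount_dec_and_lock(&f->users, &foo_lock)) {
                    list_del(&f->node);
                    spin_unlock(&foo_lock);
                    kfree(f);
            }
    }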