.. SPDX-License-Identifier: GPL-2.0

==============================
Network Filesystem Caching API
==============================

Fscache provides an API by which a network filesystem can make use of local
caching facilities.  The API is arranged around a number of principles:

 (1) A cache is logically organised into volumes and data storage objects
     within those volumes.

 (2) Volumes and data storage objects are represented by various types of
     cookie.

 (3) Cookies have keys that distinguish them from their peers.

 (4) Cookies have coherency data that allows a cache to determine if the
     cached data is still valid.

 (5) I/O is done asynchronously where possible.

This API is used by::

	#include <linux/fscache.h>

.. This document contains the following sections:

	 (1) Overview
	 (2) Volume registration
	 (3) Data file registration
	 (4) Declaring a cookie to be in use
	 (5) Resizing a data file (truncation)
	 (6) Data I/O API
	 (7) Data file coherency
	 (8) Data file invalidation
	 (9) Write back resource management
	(10) Caching of local modifications
	(11) Page release and invalidation


Overview
========

The fscache hierarchy is organised on two levels from a network filesystem's
point of view.  The upper level represents "volumes" and the lower level
represents "data storage objects".  These are represented by two types of
cookie, hereafter referred to as "volume cookies" and "cookies".

A network filesystem acquires a volume cookie for a volume using a volume key,
which represents all the information that defines that volume (e.g. cell name
or server address, volume ID or share name).  This must be rendered as a
printable string that can be used as a directory name (i.e. it must contain no
'/' characters and shouldn't begin with a '.').  The maximum name length is one
less than the maximum size of a filename component (allowing the cache backend
one char for its own purposes).

A filesystem would typically have a volume cookie for each superblock.

The filesystem then acquires a cookie for each file within that volume using an
object key.  Object keys are binary blobs and only need to be unique within
their parent volume.  The cache backend is responsible for rendering the binary
blob into something it can use and may employ hash tables, trees or whatever to
improve its ability to find an object.  This is transparent to the network
filesystem.

A filesystem would typically have a cookie for each inode, and would acquire it
in iget and relinquish it when evicting the inode.

Once it has a cookie, the filesystem needs to mark the cookie as being in use.
This causes fscache to send the cache backend off to look up/create resources
for the cookie in the background, to check its coherency and, if necessary, to
mark the object as being under modification.

A filesystem would typically "use" the cookie in its file open routine and
unuse it in file release, and it needs to use the cookie around calls to
truncate the cookie locally.  It *also* needs to use the cookie when the
pagecache becomes dirty and unuse it when writeback is complete.  This is
slightly tricky, and provision is made for it.

When performing a read, write or resize on a cookie, the filesystem must first
begin an operation.  This copies the resources into a holding struct and puts
extra pins into the cache to stop cache withdrawal from tearing down the
structures being used.  The actual operation can then be issued and conflicting
invalidations can be detected upon completion.

The filesystem is expected to use netfslib to access the cache, but that's not
actually required and it can use the fscache I/O API directly.


Volume Registration
===================

The first step for a network filesystem is to acquire a volume cookie for the
volume it wants to access::

	struct fscache_volume *
	fscache_acquire_volume(const char *volume_key,
			       const char *cache_name,
			       const void *coherency_data,
			       size_t coherency_len);

This function creates a volume cookie with the specified volume key as its name
and notes the coherency data.

The volume key must be a printable string with no '/' characters in it.  It
should begin with the name of the filesystem and should be no longer than 254
characters.  It should uniquely represent the volume and will be matched with
what's stored in the cache.
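
The rules for the volume key can be checked before the key is handed to
``fscache_acquire_volume()``.  The following is a minimal userspace sketch, not
kernel code.  The helper ``volume_key_ok()`` and the constant
``DEMO_VOLUME_KEY_MAX`` are hypothetical names made up for illustration, and
the convention that the key begins with the filesystem name is not checked.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical limit taken from the text above: at most 254 characters. */
#define DEMO_VOLUME_KEY_MAX 254

/*
 * Hypothetical helper: returns true if key is non-empty, printable, free of
 * '/' characters, does not begin with '.', and is not too long.
 */
static bool volume_key_ok(const char *key)
{
	size_t len, i;

	if (!key)
		return false;
	len = strlen(key);
	if (len == 0 || len > DEMO_VOLUME_KEY_MAX)
		return false;
	if (key[0] == '.')
		return false;
	for (i = 0; i < len; i++) {
		if (key[i] == '/' || !isprint((unsigned char)key[i]))
			return false;
	}
	return true;
}
```

A key such as ``"afs,example.com,100058"`` (an illustrative value, not taken
from any real filesystem) passes this check, whereas ``"nfs/server"`` and
``".hidden"`` do not.
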

The caller may also specify the name of the cache to use.  If specified,
fscache will look up or create a cache cookie of that name and will use a cache
of that name if it is online or comes online.  If no cache name is specified,
it will use the first cache that comes to hand and set the name to that.

The specified coherency data is stored in the cookie and will be matched
against coherency data stored on disk.  The data pointer may be NULL if no data
is provided.  If the coherency data doesn't match, the entire cache volume will
be invalidated.

This function can return errors such as EBUSY if the volume key is already in
use by an acquired volume or ENOMEM if an allocation failure occurred.  It may
also return a NULL volume cookie if fscache is not enabled.  It is safe to
pass a NULL cookie to any function that takes a volume cookie.  This will
cause that function to do nothing.


When the network filesystem has finished with a volume, it should relinquish it
by calling::

	void fscache_relinquish_volume(struct fscache_volume *volume,
				       const void *coherency_data,
				       bool invalidate);

This will cause the volume to be committed or removed, and if sealed the
coherency data will be set to the value supplied.  The amount of coherency data
must match the length specified when the volume was acquired.  Note that all
data cookies obtained in this volume must be relinquished before the volume is
relinquished.


Data File Registration
======================

Once it has a volume cookie, a network filesystem can use it to acquire a
cookie for data storage::

	struct fscache_cookie *
	fscache_acquire_cookie(struct fscache_volume *volume,
			       u8 advice,
			       const void *index_key,
			       size_t index_key_len,
			       const void *aux_data,
			       size_t aux_data_len,
			       loff_t object_size);

This creates the cookie in the volume using the specified index key.  The index
key is a binary blob of the given length and must be unique for the volume.
This is saved into the cookie.  There are no restrictions on the content, but
its length shouldn't exceed about three quarters of the maximum filename length
to allow for encoding.
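
To see where the "three quarters" figure could come from, the following
userspace sketch assumes a base64-style encoding in which every 3 bytes of key
become 4 filename characters, and a maximum filename component of 255
characters.  Both are assumptions made for illustration: this document does not
say which encoding a cache backend uses, and a backend may reserve further
characters for itself.  The ``demo_`` names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: maximum filename component length, as NAME_MAX on Linux. */
#define DEMO_NAME_MAX 255

/* Assumed base64-style expansion: every 3 key bytes become 4 characters. */
static size_t demo_encoded_len(size_t key_len)
{
	return (key_len + 2) / 3 * 4;
}

/* True if the encoded key still fits in one filename component. */
static bool demo_index_key_fits(size_t key_len)
{
	return key_len > 0 && demo_encoded_len(key_len) <= DEMO_NAME_MAX;
}
```

Under these assumptions a 189-byte key encodes to 252 characters and fits,
while a 192-byte key would need 256 characters and does not.
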

The caller should also pass in a piece of coherency data in aux_data.  A buffer
of size aux_data_len will be allocated and the coherency data copied in.  It is
assumed that the size is invariant over time.  The coherency data is used to
check the validity of data in the cache.  Functions are provided by which the
coherency data can be updated.

The file size of the object being cached should also be provided.  This may be
used to trim the data and will be stored with the coherency data.

This function never returns an error, though it may return a NULL cookie on
allocation failure or if fscache is not enabled.  It is safe to pass in a NULL
volume cookie and pass the NULL cookie returned to any function that takes it.
This will cause that function to do nothing.


When the network filesystem has finished with a cookie, it should relinquish it
by calling::

	void fscache_relinquish_cookie(struct fscache_cookie *cookie,
				       bool retire);

This will cause fscache to either commit the storage backing the cookie or
delete it.


Marking A Cookie In-Use
=======================

Once a cookie has been acquired by a network filesystem, the filesystem should
tell fscache when it intends to use the cookie (typically done on file open)
and should say when it has finished with it (typically on file close)::

	void fscache_use_cookie(struct fscache_cookie *cookie,
				bool will_modify);
	void fscache_unuse_cookie(struct fscache_cookie *cookie,
				  const void *aux_data,
				  const loff_t *object_size);

The *use* function tells fscache that the filesystem will use the cookie and,
additionally, indicates whether the user intends to modify the contents
locally.  If not yet done, this will trigger the cache backend to go and gather
the resources it needs to access/store data in the cache.  This is done in the
background, and so may not be complete by the time the function returns.

The *unuse* function indicates that a filesystem has finished using a cookie.
It optionally updates the stored coherency data and object size and then
decreases the in-use counter.  When the last user unuses the cookie, it is
scheduled for garbage collection.  If not reused within a short time, the
resources will be released to reduce system resource consumption.

A cookie must be marked in-use before it can be accessed for read, write or
resize - and an in-use mark must be kept whilst there is dirty data in the
pagecache in order to avoid an oops due to trying to open a file during process
exit.

Note that in-use marks are cumulative.  For each time a cookie is marked
in-use, it must be unused.
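
The cumulative behaviour can be pictured with a small userspace model.  This is
only a sketch of the counting rule described above: ``struct demo_cookie`` and
the ``demo_`` functions are hypothetical stand-ins, not the kernel's data
structures, and the background lookup and garbage collection are reduced to a
return value.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the in-use state of a cookie. */
struct demo_cookie {
	unsigned int n_active;	/* number of outstanding in-use marks */
	bool will_modify;	/* set once any user intends to write */
};

/* Models fscache_use_cookie(): each call adds one mark. */
static void demo_use_cookie(struct demo_cookie *c, bool will_modify)
{
	c->n_active++;
	if (will_modify)
		c->will_modify = true;
}

/*
 * Models fscache_unuse_cookie(): each call drops one mark.  Returns true when
 * the last mark is gone, which is the point at which the real cookie would
 * be scheduled for garbage collection.
 */
static bool demo_unuse_cookie(struct demo_cookie *c)
{
	if (c->n_active == 0)
		return false;	/* unbalanced unuse; nothing to drop */
	c->n_active--;
	return c->n_active == 0;
}
```

With two opens of the same file, the first close leaves the cookie in use and
only the second one drops the final mark.
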


Resizing A Data File (Truncation)
=================================

If a network filesystem file is resized locally by truncation, the following
should be called to notify the cache::

	void fscache_resize_cookie(struct fscache_cookie *cookie,
				   loff_t new_size);

The caller must have first marked the cookie in-use.  The cookie and the new
size are passed in and the cache is synchronously resized.  This is expected to
be called from the ``->setattr()`` inode operation under the inode lock.


Data I/O API
============

To do data I/O operations directly through a cookie, the following functions
are available::

	int fscache_begin_read_operation(struct netfs_cache_resources *cres,
					 struct fscache_cookie *cookie);
	int fscache_read(struct netfs_cache_resources *cres,
			 loff_t start_pos,
			 struct iov_iter *iter,
			 enum netfs_read_from_hole read_hole,
			 netfs_io_terminated_t term_func,
			 void *term_func_priv);
	int fscache_write(struct netfs_cache_resources *cres,
			  loff_t start_pos,
			  struct iov_iter *iter,
			  netfs_io_terminated_t term_func,
			  void *term_func_priv);

The *begin* function sets up an operation, attaching the resources required to
the cache resources block from the cookie.  Assuming it doesn't return an error
(for instance, it will return -ENOBUFS if given a NULL cookie, but otherwise do
nothing), then one of the other two functions can be issued.

The *read* and *write* functions initiate a direct-IO operation.  Both take the
previously set up cache resources block, an indication of the start file
position, and an I/O iterator that describes the buffer and indicates the
amount of data.

The read function also takes a parameter to indicate how it should handle a
partially populated region (a hole) in the disk content.  This may be to ignore
it, to skip over an initial hole and place zeros in the buffer, or to give an
error.

The read and write functions can be given an optional termination function that
will be run on completion::

	typedef
	void (*netfs_io_terminated_t)(void *priv, ssize_t transferred_or_error,
				      bool was_async);

If a termination function is given, the operation will be run asynchronously
and the termination function will be called upon completion.  If not given, the
operation will be run synchronously.  Note that in the asynchronous case, it is
possible for the operation to complete before the function returns.

Both the read and write functions end the operation when they complete,
detaching any pinned resources.

The read operation will fail with -ESTALE if invalidation occurred whilst the
operation was ongoing.
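
A termination function mostly has to separate the error case from the byte
count carried in ``transferred_or_error``.  The following userspace sketch
shows one way to do that.  The typedef is repeated so that the example stands
alone; ``struct demo_read_state`` and ``demo_read_done()`` are hypothetical
names, and what a filesystem does after seeing ``-ESTALE`` (here, just setting
a flag) is an assumption.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Same shape as the typedef above, repeated so the sketch stands alone. */
typedef void (*netfs_io_terminated_t)(void *priv, ssize_t transferred_or_error,
				      bool was_async);

/* Hypothetical per-request state handed over as term_func_priv. */
struct demo_read_state {
	size_t bytes_done;	/* updated when the read succeeds */
	int error;		/* 0 or a negative errno value */
	bool need_refetch;	/* set when the cache copy went stale */
};

/* Hypothetical termination function for a cache read. */
static void demo_read_done(void *priv, ssize_t transferred_or_error,
			   bool was_async)
{
	struct demo_read_state *st = priv;

	(void)was_async;	/* unused in this sketch */
	if (transferred_or_error < 0) {
		st->error = (int)transferred_or_error;
		/* -ESTALE means an invalidation raced with the read. */
		st->need_refetch = (transferred_or_error == -ESTALE);
		return;
	}
	st->bytes_done = (size_t)transferred_or_error;
	st->error = 0;
	st->need_refetch = false;
}
```

Because the operation may complete before the issuing call returns, the state
passed as ``term_func_priv`` has to be fully initialised before the read or
write is issued.
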


Data File Coherency
===================

To request an update of the coherency data and file size on a cookie, the
following should be called::

	void fscache_update_cookie(struct fscache_cookie *cookie,
				   const void *aux_data,
				   const loff_t *object_size);

This will update the cookie's coherency data and/or file size.


Data File Invalidation
======================

Sometimes it will be necessary to invalidate an object that contains data.
Typically this will be necessary when the server informs the network filesystem
of a remote third-party change - at which point the filesystem has to throw
away the state and cached data that it had for a file and reload from the
server.

To indicate that a cache object should be invalidated, the following should be
called::

	void fscache_invalidate(struct fscache_cookie *cookie,
				const void *aux_data,
				loff_t size,
				unsigned int flags);

This increases the invalidation counter in the cookie to cause outstanding
reads to fail with -ESTALE, sets the coherency data and file size from the
information supplied, blocks new I/O on the cookie and dispatches the cache to
go and get rid of the old data.

Invalidation runs asynchronously in a worker thread so that it doesn't block
too much.
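
The way an invalidation counter can make an outstanding read fail is sketched
below as a userspace model.  It assumes that the counter is sampled when the
operation begins and compared again on completion, which is one plausible
reading of the description above rather than a statement about the kernel
implementation.  All ``demo_`` names are hypothetical.

```c
#include <errno.h>

/* Hypothetical stand-in for a cookie; only the counter matters here. */
struct demo_inval_cookie {
	unsigned int inval_counter;
};

/* Hypothetical stand-in for the per-operation resources block. */
struct demo_op {
	const struct demo_inval_cookie *cookie;
	unsigned int inval_at_begin;	/* counter value seen at begin */
};

/* Models beginning an operation: remember the counter. */
static void demo_begin_read(struct demo_op *op,
			    const struct demo_inval_cookie *c)
{
	op->cookie = c;
	op->inval_at_begin = c->inval_counter;
}

/* Models invalidation: bump the counter. */
static void demo_invalidate(struct demo_inval_cookie *c)
{
	c->inval_counter++;
}

/* Models completion: report -ESTALE if an invalidation intervened. */
static int demo_end_read(const struct demo_op *op)
{
	return op->inval_at_begin == op->cookie->inval_counter ? 0 : -ESTALE;
}
```

A read that begins and ends with no invalidation in between completes normally
in this model, while one that straddles a call to ``demo_invalidate()`` reports
``-ESTALE``.
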


Write-Back Resource Management
==============================

To write data to the cache from network filesystem writeback, the cache
resources required need to be pinned at the point the modification is made (for
instance when the page is marked dirty) as it's not possible to open a file in
a thread that's exiting.

The following facilities are provided to manage this:

 * An inode flag, ``I_PINNING_FSCACHE_WB``, is provided to indicate that an
   in-use mark is held on the cookie for this inode.  It can only be changed if
   the inode lock is held.

 * A flag, ``unpinned_fscache_wb``, is placed in the ``writeback_control``
   struct that gets set if ``__writeback_single_inode()`` clears
   ``I_PINNING_FSCACHE_WB`` because all the dirty pages were cleared.

To support this, the following functions are provided::

	bool fscache_dirty_folio(struct address_space *mapping,
				 struct folio *folio,
				 struct fscache_cookie *cookie);
	void fscache_unpin_writeback(struct writeback_control *wbc,
				     struct fscache_cookie *cookie);
	void fscache_clear_inode_writeback(struct fscache_cookie *cookie,
					   struct inode *inode,
					   const void *aux);

The *dirty_folio* function is intended to be called from the filesystem's
``dirty_folio`` address space operation.  If ``I_PINNING_FSCACHE_WB`` is not
set, it sets that flag and increments the use count on the cookie (the caller
must already have called ``fscache_use_cookie()``).

The *unpin* function is intended to be called from the filesystem's
``write_inode`` superblock operation.  It cleans up after writing by unusing
the cookie if ``unpinned_fscache_wb`` is set in the ``writeback_control``
struct.

The *clear* function is intended to be called from the netfs's ``evict_inode``
superblock operation.  It must be called *after*
``truncate_inode_pages_final()``, but *before* ``clear_inode()``.  This cleans
up any hanging ``I_PINNING_FSCACHE_WB``.  It also allows the coherency data to
be updated.
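
The interplay between the inode flag, the ``writeback_control`` flag and the
cookie's use count can be followed in a small userspace model.  This is a
sketch of the bookkeeping described above, not kernel code: ``struct
demo_inode`` folds the inode flag and the cookie's use count into one
hypothetical structure, locking is left out, and the ``demo_`` function names
are made up.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the real state lives in the inode and the cookie. */
struct demo_inode {
	bool pinning_wb;		/* models I_PINNING_FSCACHE_WB */
	unsigned int cookie_use;	/* models the cookie's in-use count */
};

/* Models fscache_dirty_folio(): take one extra in-use mark per inode. */
static void demo_dirty_folio(struct demo_inode *inode)
{
	if (!inode->pinning_wb) {
		inode->pinning_wb = true;
		inode->cookie_use++;
	}
}

/*
 * Models writeback finding no dirty pages left: the flag is cleared and the
 * fact is reported, as unpinned_fscache_wb does in writeback_control.
 */
static bool demo_writeback_done(struct demo_inode *inode)
{
	bool unpinned = inode->pinning_wb;

	inode->pinning_wb = false;
	return unpinned;
}

/* Models fscache_unpin_writeback(): drop the mark only if it was unpinned. */
static void demo_unpin_writeback(struct demo_inode *inode, bool unpinned)
{
	if (unpinned && inode->cookie_use > 0)
		inode->cookie_use--;
}
```

However many folios are dirtied, only one extra in-use mark is taken per inode,
and it is dropped once writeback reports that the pin was released.
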


Caching of Local Modifications
==============================

If a network filesystem has locally modified data that it wants to write to the
cache, it needs to mark the pages to indicate that a write is in progress, and
if the mark is already present, it needs to wait for it to be removed first
(presumably due to an already in-progress operation).  This prevents multiple
competing DIO writes to the same storage in the cache.

Firstly, the netfs should determine if caching is available by doing something
like::

	bool caching = fscache_cookie_enabled(cookie);

If caching is to be attempted, pages should be waited for and then marked using
the following functions provided by the netfs helper library::

	void set_page_fscache(struct page *page);
	void wait_on_page_fscache(struct page *page);
	int wait_on_page_fscache_killable(struct page *page);

Once all the pages in the span are marked, the netfs can ask fscache to
schedule a write of that region::

	void fscache_write_to_cache(struct fscache_cookie *cookie,
				    struct address_space *mapping,
				    loff_t start, size_t len, loff_t i_size,
				    netfs_io_terminated_t term_func,
				    void *term_func_priv,
				    bool caching);

And if an error occurs before that point is reached, the marks can be removed
by calling::

	void fscache_clear_page_bits(struct address_space *mapping,
				     loff_t start, size_t len,
				     bool caching);

In these functions, a pointer to the mapping to which the source pages are
attached is passed in and start and len indicate the size of the region that's
going to be written (it doesn't have to align to page boundaries necessarily,
but it does have to align to DIO boundaries on the backing filesystem).  The
caching parameter indicates whether caching should be attempted; if it is
false, the functions do nothing.
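
One way to obtain a region that satisfies the DIO alignment requirement is to
round it outwards to the block size.  The following userspace sketch shows the
arithmetic only.  ``struct demo_region`` and ``demo_align_region()`` are
hypothetical, the alignment is assumed to be a power of two, and whether
rounding outwards is acceptable depends on the surrounding data being present
in the pagecache, which this sketch does not check.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region descriptor for this sketch. */
struct demo_region {
	int64_t start;	/* stands in for loff_t */
	size_t len;
};

/*
 * Expand [start, start + len) outwards so that both ends fall on multiples
 * of dio_align, which is assumed to be a power of two (for example 512 or
 * 4096; the real value depends on the backing filesystem).
 */
static struct demo_region demo_align_region(int64_t start, size_t len,
					    size_t dio_align)
{
	int64_t mask = (int64_t)dio_align - 1;
	int64_t end = (start + (int64_t)len + mask) & ~mask;
	struct demo_region r;

	r.start = start & ~mask;
	r.len = (size_t)(end - r.start);
	return r;
}
```

For example, a 5000-byte region starting at offset 100 becomes the 8192-byte
region starting at offset 0 when aligned to 4096 bytes, while a region that is
already aligned is returned unchanged.
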

The write function takes some additional parameters: the cookie representing
the cache object to be written to; i_size, which indicates the size of the
netfs file; and term_func, an optional completion function, to which
term_func_priv will be passed along with the error or amount written.

Note that the write function will always run asynchronously and will unmark all
the pages upon completion before calling term_func.


Page Release and Invalidation
=============================

Fscache keeps track of whether we have any data in the cache yet for a cache
object we've just created.  It knows it doesn't have to do any reading until it
has done a write and then the page it wrote from has been released by the VM,
after which it *has* to look in the cache.

To inform fscache that a page might now be in the cache, the following function
should be called from the ``release_folio`` address space op::

	void fscache_note_page_release(struct fscache_cookie *cookie);

if the page has been released (i.e. ``release_folio`` returned true).

Page release and page invalidation should also wait for any mark left on the
page to say that a DIO write is underway from that page::

	void wait_on_page_fscache(struct page *page);
	int wait_on_page_fscache_killable(struct page *page);


API Function Reference
======================

.. kernel-doc:: include/linux/fscache.h