cachepc-linux

Fork of AMDESE/linux with modifications for CachePC side-channel attack
git clone https://git.sinitax.com/sinitax/cachepc-linux

.. SPDX-License-Identifier: GPL-2.0

===================================
File management in the Linux kernel
===================================

This document describes how locking works for files (struct file)
and for the file descriptor table (struct files_struct).

Up until 2.6.12, the file descriptor table was protected
with a lock (files->file_lock) and a reference count (files->count).
->file_lock protected accesses to all the file-related fields
of the table. ->count was used for sharing the file descriptor
table between tasks cloned with the CLONE_FILES flag. Typically
this would be the case for POSIX threads. As with the common
refcounting model in the kernel, the last task doing
a put_files_struct() frees the file descriptor (fd) table.
The files (struct file) themselves are protected using
a reference count (->f_count).

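The reference-counting model described above can be illustrated with a
small userspace sketch. It is not kernel code: struct table,
table_alloc(), table_get() and table_put() are hypothetical names, and
C11 atomics stand in for the kernel's atomic helpers. The point it shows
is that the holder who drops the last reference is the one who frees the
object, as put_files_struct() does for the fd table.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a shared, reference-counted fd table. */
struct table {
	atomic_int count;	/* number of tasks sharing the table */
};

/* Allocate a table owned by a single task; returns NULL on failure. */
static struct table *table_alloc(void)
{
	struct table *t = malloc(sizeof(*t));

	if (t)
		atomic_init(&t->count, 1);
	return t;
}

/* Share the table with one more task, as a CLONE_FILES clone would. */
static void table_get(struct table *t)
{
	atomic_fetch_add(&t->count, 1);
}

/*
 * Drop one reference. Returns true if this was the last reference,
 * in which case the table has been freed and must not be used again.
 */
static bool table_put(struct table *t)
{
	/* atomic_fetch_sub() returns the value before the decrement. */
	if (atomic_fetch_sub(&t->count, 1) == 1) {
		free(t);
		return true;
	}
	return false;
}
```

With two sharers, the first table_put() returns false and leaves the
table alive; only the second one frees it.
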
In the new lock-free model of file descriptor management,
the reference counting is similar, but the locking is
based on RCU. The file descriptor table contains multiple
elements: the fd sets (open_fds and close_on_exec), the
array of file pointers, the sizes of the sets and the array,
etc. In order for the updates to appear atomic to
a lock-free reader, all the elements of the file descriptor
table are in a separate structure, struct fdtable.
files_struct contains a pointer to struct fdtable through
which the actual fd table is accessed. Initially the
fdtable is embedded in files_struct itself. On a subsequent
expansion of fdtable, a new fdtable structure is allocated
and files->fdt points to the new structure. The old fdtable
structure is freed with RCU, and lock-free readers see either
the old fdtable or the new fdtable, making the update
appear atomic. Here are the locking rules for
the fdtable structure:

1. All references to the fdtable must be done through
   the files_fdtable() macro::

	struct fdtable *fdt;

	rcu_read_lock();

	fdt = files_fdtable(files);
	....
	if (n <= fdt->max_fds)
		....
	...
	rcu_read_unlock();

   files_fdtable() uses the rcu_dereference() macro, which takes care of
   the memory barrier requirements for lock-free dereference.
   The fdtable pointer must be read within the read-side
   critical section.

2. Reading of the fdtable as described above must be protected
   by rcu_read_lock()/rcu_read_unlock().

3. For any update to the fd table, files->file_lock must
   be held.

4. To look up the file structure given an fd, a reader
   must use either the lookup_fd_rcu() or the files_lookup_fd_rcu() API.
   These take care of barrier requirements due to lock-free lookup.

   An example::

	struct file *file;

	rcu_read_lock();
	file = lookup_fd_rcu(fd);
	if (file) {
		...
	}
	....
	rcu_read_unlock();

5. Handling of the file structures is special. Since the look-up
   of the fd (fget()/fget_light()) is lock-free, it is possible
   that the look-up may race with the last put() operation on the
   file structure. This is avoided using atomic_long_inc_not_zero()
   on ->f_count::

	rcu_read_lock();
	file = files_lookup_fd_rcu(files, fd);
	if (file) {
		if (atomic_long_inc_not_zero(&file->f_count))
			*fput_needed = 1;
		else
			/* Didn't get the reference, someone's freed */
			file = NULL;
	}
	rcu_read_unlock();
	....
	return file;

   atomic_long_inc_not_zero() detects if the refcount is already zero or
   goes to zero during the increment. If it does, we fail
   fget()/fget_light().

6. Since both fdtable and file structures can be looked up
   lock-free, they must be installed using the rcu_assign_pointer()
   API. If they are looked up lock-free, rcu_dereference()
   must be used. However, it is advisable to use files_fdtable()
   and lookup_fd_rcu()/files_lookup_fd_rcu(), which take care of these issues.

7. While updating, the fdtable pointer must be looked up while
   holding files->file_lock. If ->file_lock is dropped, then
   another thread may expand the files, thereby creating a new
   fdtable and making the earlier fdtable pointer stale.

   For example::

	spin_lock(&files->file_lock);
	fd = locate_fd(files, file, start);
	if (fd >= 0) {
		/* locate_fd() may have expanded fdtable, load the ptr */
		fdt = files_fdtable(files);
		__set_open_fd(fd, fdt);
		__clear_close_on_exec(fd, fdt);
		spin_unlock(&files->file_lock);
	.....

   Since locate_fd() can drop ->file_lock (and reacquire ->file_lock),
   the fdtable pointer (fdt) must be loaded after locate_fd().

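The interplay of rules 1, 5 and 6 can be sketched in userspace with C11
atomics. This is an illustrative model under stated assumptions, not
kernel code: every name in it (struct file_model, struct fdtable_model,
struct files_model, inc_not_zero(), fdtable_alloc(), expand(),
lookup_get()) is hypothetical. An acquire load stands in for
rcu_dereference() and a release store for rcu_assign_pointer(). The
sketch has no read-side critical section and no grace period, so
expand() hands the old table back to the caller instead of freeing it.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for struct file and struct fdtable. */
struct file_model {
	atomic_long f_count;		/* 0 means the file is being freed */
};

struct fdtable_model {
	unsigned int max_fds;
	_Atomic(struct file_model *) *fd;	/* array of file pointers */
};

struct files_model {
	_Atomic(struct fdtable_model *) fdt;	/* currently published table */
};

/* Take a reference unless the count has already dropped to zero. */
static int inc_not_zero(atomic_long *count)
{
	long old = atomic_load(count);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(count, &old, old + 1))
			return 1;
	}
	return 0;
}

/* Allocate a table with max_fds empty slots; returns NULL on failure. */
static struct fdtable_model *fdtable_alloc(unsigned int max_fds)
{
	struct fdtable_model *fdt = malloc(sizeof(*fdt));

	if (!fdt)
		return NULL;
	fdt->fd = malloc(max_fds * sizeof(*fdt->fd));
	if (!fdt->fd) {
		free(fdt);
		return NULL;
	}
	for (unsigned int i = 0; i < max_fds; i++)
		atomic_init(&fdt->fd[i], NULL);
	fdt->max_fds = max_fds;
	return fdt;
}

/*
 * Updater side (the caller is assumed to hold the update lock): copy
 * the entries into a larger table and publish it with a release store.
 * Returns the old table, or NULL if nothing was changed.
 */
static struct fdtable_model *expand(struct files_model *files,
				    unsigned int new_max)
{
	struct fdtable_model *old =
		atomic_load_explicit(&files->fdt, memory_order_relaxed);
	struct fdtable_model *nfdt;

	if (new_max <= old->max_fds)
		return NULL;
	nfdt = fdtable_alloc(new_max);
	if (!nfdt)
		return NULL;
	for (unsigned int i = 0; i < old->max_fds; i++)
		atomic_store_explicit(&nfdt->fd[i],
			atomic_load_explicit(&old->fd[i], memory_order_relaxed),
			memory_order_relaxed);
	atomic_store_explicit(&files->fdt, nfdt, memory_order_release);
	return old;
}

/*
 * Reader side: load the table pointer once, bounds-check against that
 * same table, then try to take a reference on the file.
 */
static struct file_model *lookup_get(struct files_model *files,
				     unsigned int fd)
{
	struct fdtable_model *fdt =
		atomic_load_explicit(&files->fdt, memory_order_acquire);
	struct file_model *file;

	if (fd >= fdt->max_fds)
		return NULL;
	file = atomic_load_explicit(&fdt->fd[fd], memory_order_acquire);
	if (file && !inc_not_zero(&file->f_count))
		file = NULL;	/* lost the race with the last put */
	return file;
}
```

A reader that calls lookup_get() before or after expand() sees one
complete table, never a mix of the two. A file whose count has already
reached zero is reported as not found rather than revived.
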