.. SPDX-License-Identifier: GPL-2.0

========================
Linux and the Devicetree
========================

The Linux usage model for device tree data

:Author: Grant Likely <grant.likely@secretlab.ca>

This article describes how Linux uses the device tree.  An overview of
the device tree data format can be found on the device tree usage page
at devicetree.org\ [1]_.

.. [1] https://www.devicetree.org/specifications/

The "Open Firmware Device Tree", or simply Devicetree (DT), is a data
structure and language for describing hardware.  More specifically, it
is a description of hardware that is readable by an operating system
so that the operating system doesn't need to hard code details of the
machine.

Structurally, the DT is a tree, or acyclic graph with named nodes, and
nodes may have an arbitrary number of named properties encapsulating
arbitrary data.  A mechanism also exists to create arbitrary
links from one node to another outside of the natural tree structure.

Conceptually, a common set of usage conventions, called 'bindings',
is defined for how data should appear in the tree to describe typical
hardware characteristics including data busses, interrupt lines, GPIO
connections, and peripheral devices.

As much as possible, hardware is described using existing bindings to
maximize use of existing support code, but since property and node
names are simply text strings, it is easy to extend existing bindings
or create new ones by defining new nodes and properties.  Be wary,
however, of creating a new binding without first doing some homework
about what already exists.  There are currently two different,
incompatible, bindings for i2c busses that came about because the new
binding was created without first investigating how i2c devices were
already being enumerated in existing systems.

1. History
----------
The DT was originally created by Open Firmware as part of the
communication method for passing data from Open Firmware to a client
program (such as an operating system).  An operating system used the
Device Tree to discover the topology of the hardware at runtime, and
thereby support a majority of available hardware without hard coded
information (assuming drivers were available for all devices).

Since Open Firmware is commonly used on PowerPC and SPARC platforms,
the Linux support for those architectures has for a long time used the
Device Tree.

In 2005, when PowerPC Linux began a major cleanup and the merge of 32-bit
and 64-bit support, the decision was made to require DT support on all
powerpc platforms, regardless of whether or not they used Open
Firmware.  To do this, a DT representation called the Flattened Device
Tree (FDT) was created which could be passed to the kernel as a binary
blob without requiring a real Open Firmware implementation.  U-Boot,
kexec, and other bootloaders were modified to support both passing a
Device Tree Binary (dtb) and modifying a dtb at boot time.  DT was
also added to the PowerPC boot wrapper (``arch/powerpc/boot/*``) so that
a dtb could be wrapped up with the kernel image to support booting
existing non-DT aware firmware.

Some time later, FDT infrastructure was generalized to be usable by
all architectures.  At the time of this writing, 6 mainlined
architectures (arm, microblaze, mips, powerpc, sparc, and x86) and 1
out of mainline (nios) have some level of DT support.

2. Data Model
-------------
If you haven't already read the Device Tree Usage\ [1]_ page,
then go read it now.  It's okay, I'll wait....

2.1 High Level View
-------------------
The most important thing to understand is that the DT is simply a data
structure that describes the hardware.  There is nothing magical about
it, and it doesn't magically make all hardware configuration problems
go away.  What it does do is provide a language for decoupling the
hardware configuration from the board and device driver support in the
Linux kernel (or any other operating system for that matter).  Using
it allows board and device support to become data driven; to make
setup decisions based on data passed into the kernel instead of on
per-machine hard coded selections.

Ideally, data driven platform setup should result in less code
duplication and make it easier to support a wide range of hardware
with a single kernel image.

Linux uses DT data for three major purposes:

1) platform identification,
2) runtime configuration, and
3) device population.

2.2 Platform Identification
---------------------------
First and foremost, the kernel will use data in the DT to identify the
specific machine.  In a perfect world, the specific platform shouldn't
matter to the kernel because all platform details would be described
perfectly by the device tree in a consistent and reliable manner.
Hardware is not perfect though, and so the kernel must identify the
machine during early boot so that it has the opportunity to run
machine-specific fixups.

In the majority of cases, the machine identity is irrelevant, and the
kernel will instead select setup code based on the machine's core
CPU or SoC.  On ARM for example, setup_arch() in
arch/arm/kernel/setup.c will call setup_machine_fdt() in
arch/arm/kernel/devtree.c which searches through the machine_desc
table and selects the machine_desc which best matches the device tree
data.  It determines the best match by looking at the 'compatible'
property in the root device tree node, and comparing it with the
dt_compat list in struct machine_desc (which is defined in
arch/arm/include/asm/mach/arch.h if you're curious).

The 'compatible' property contains a sorted list of strings starting
with the exact name of the machine, followed by an optional list of
boards it is compatible with, sorted from most compatible to least.  For
example, the root compatible properties for the TI BeagleBoard and its
successor, the BeagleBoard xM board might look like, respectively::

	compatible = "ti,omap3-beagleboard", "ti,omap3450", "ti,omap3";
	compatible = "ti,omap3-beagleboard-xm", "ti,omap3450", "ti,omap3";

Where "ti,omap3-beagleboard-xm" specifies the exact model, it also
claims that it is compatible with the OMAP 3450 SoC, and the omap3 family
of SoCs in general.  You'll notice that the list is sorted from most
specific (exact board) to least specific (SoC family).
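
In a flattened tree, a string-list property such as 'compatible' is
stored as consecutive NUL-terminated strings.  The following is a minimal
userspace sketch, not kernel code, of finding the position of one string
in such a value.  The helper name compat_index() is made up for this
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the zero-based position of 'wanted' in a string-list property,
 * or -1 if it is absent.  'prop' holds 'len' bytes of consecutive
 * NUL-terminated strings.  compat_index() is a hypothetical helper,
 * not a kernel API.
 */
static int compat_index(const char *prop, size_t len, const char *wanted)
{
	size_t off = 0;
	int index = 0;

	assert(prop != NULL && wanted != NULL);

	while (off < len) {
		/* Bounded search so a missing final NUL cannot overrun. */
		const char *end = memchr(prop + off, '\0', len - off);
		size_t n = end ? (size_t)(end - (prop + off)) : len - off;

		if (n == strlen(wanted) && memcmp(prop + off, wanted, n) == 0)
			return index;
		off += n + 1;	/* skip the string and its terminator */
		index++;
	}
	return -1;
}

/* The BeagleBoard xM root 'compatible' value from the example above. */
static const char beagle_xm_compat[] =
	"ti,omap3-beagleboard-xm\0ti,omap3450\0ti,omap3";
```

A lower index means a more specific match, which is what makes the
ordering of the list significant.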

Astute readers might point out that the Beagle xM could also claim
compatibility with the original Beagle board.  However, one should be
cautioned about doing so at the board level since there is typically a
high level of change from one board to another, even within the same
product line, and it is hard to nail down exactly what is meant when one
board claims to be compatible with another.  For the top level, it is
better to err on the side of caution and not claim one board is
compatible with another.  The notable exception would be when one
board is a carrier for another, such as a CPU module attached to a
carrier board.

One more note on compatible values.  Any string used in a compatible
property must be documented as to what it indicates.  Add
documentation for compatible strings in Documentation/devicetree/bindings.

Again on ARM, for each machine_desc, the kernel looks to see if
any of the dt_compat list entries appear in the compatible property.
If one does, then that machine_desc is a candidate for driving the
machine.  After searching the entire table of machine_descs,
setup_machine_fdt() returns the 'most compatible' machine_desc based
on which entry in the compatible property each machine_desc matches
against.  If no matching machine_desc is found, then it returns NULL.

The reasoning behind this scheme is the observation that in the majority
of cases, a single machine_desc can support a large number of boards
if they all use the same SoC, or same family of SoCs.  However,
invariably there will be some exceptions where a specific board will
require special setup code that is not useful in the generic case.
Special cases could be handled by explicitly checking for the
troublesome board(s) in generic setup code, but doing so very quickly
becomes ugly and/or unmaintainable if it is more than just a couple of
cases.

Instead, the compatible list allows a generic machine_desc to provide
support for a wide common set of boards by specifying "less
compatible" values in the dt_compat list.  In the example above,
generic board support can claim compatibility with "ti,omap3" or
"ti,omap3450".  If a bug was discovered on the original beagleboard
that required special workaround code during early boot, then a new
machine_desc could be added which implements the workarounds and only
matches on "ti,omap3-beagleboard".
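
The selection rule can be sketched in plain C.  This is a simplified
userspace model under stated assumptions, not the kernel implementation:
struct machine stands in for struct machine_desc, and machine_score()
and select_machine() are hypothetical names.  A machine's score is the
position of the first root 'compatible' entry it claims, and the lowest
score wins.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the kernel's struct machine_desc. */
struct machine {
	const char *name;
	const char *const *dt_compat;	/* NULL-terminated list */
};

/*
 * Position in the root 'compatible' list (0 = most specific) of the
 * first entry that this machine claims, or -1 if it claims none.
 */
static int machine_score(const struct machine *m,
			 const char *const *root_compat)
{
	assert(m != NULL && root_compat != NULL);

	for (int i = 0; root_compat[i]; i++)
		for (int j = 0; m->dt_compat[j]; j++)
			if (strcmp(root_compat[i], m->dt_compat[j]) == 0)
				return i;
	return -1;
}

/* Return the machine with the lowest score, or NULL if nothing matches. */
static const struct machine *select_machine(const struct machine *table,
					    size_t count,
					    const char *const *root_compat)
{
	const struct machine *best = NULL;
	int best_score = -1;

	for (size_t k = 0; k < count; k++) {
		int score = machine_score(&table[k], root_compat);

		if (score >= 0 && (best == NULL || score < best_score)) {
			best = &table[k];
			best_score = score;
		}
	}
	return best;
}

static const char *const generic_compat[] = { "ti,omap3450", "ti,omap3", NULL };
static const char *const beagle_compat[] = { "ti,omap3-beagleboard", NULL };

static const struct machine machines[] = {
	{ "Generic OMAP3", generic_compat },
	{ "BeagleBoard workarounds", beagle_compat },
};

static const char *const beagle_root[] = {
	"ti,omap3-beagleboard", "ti,omap3450", "ti,omap3", NULL
};
static const char *const beagle_xm_root[] = {
	"ti,omap3-beagleboard-xm", "ti,omap3450", "ti,omap3", NULL
};
```

With this table, the original BeagleBoard selects the workaround entry
because it matches at position 0, while the xM falls through to the
generic entry, regardless of the order in which the table is written.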

PowerPC uses a slightly different scheme where it calls the .probe()
hook from each machine_desc, and the first one returning TRUE is used.
However, this approach does not take into account the priority of the
compatible list, and probably should be avoided for new architecture
support.
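
To see why the first-TRUE scheme ignores priority, here is a small
userspace sketch.  It is only a model: struct probed_machine,
first_probe_match() and the probe functions are hypothetical, and the
OMAP3 strings are reused from the ARM example purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a machine entry with a .probe() hook. */
struct probed_machine {
	const char *name;
	bool (*probe)(const char *const *root_compat);
};

static bool list_has(const char *const *list, const char *wanted)
{
	for (int i = 0; list[i]; i++)
		if (strcmp(list[i], wanted) == 0)
			return true;
	return false;
}

static bool generic_probe(const char *const *c)
{
	return list_has(c, "ti,omap3");
}

static bool beagle_probe(const char *const *c)
{
	return list_has(c, "ti,omap3-beagleboard");
}

/* The first entry whose probe succeeds wins; table order decides. */
static const struct probed_machine *
first_probe_match(const struct probed_machine *table, size_t count,
		  const char *const *root_compat)
{
	assert(table != NULL && root_compat != NULL);

	for (size_t k = 0; k < count; k++)
		if (table[k].probe(root_compat))
			return &table[k];
	return NULL;
}

static const struct probed_machine probe_table[] = {
	{ "Generic OMAP3", generic_probe },
	{ "BeagleBoard workarounds", beagle_probe },
};

static const char *const probe_root[] = {
	"ti,omap3-beagleboard", "ti,omap3450", "ti,omap3", NULL
};
```

Here the generic entry is chosen for a BeagleBoard simply because it
comes first in the table, even though a more specific entry exists.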

2.3 Runtime configuration
-------------------------
In most cases, a DT will be the sole method of communicating data from
firmware to the kernel, so it also gets used to pass in runtime and
configuration data like the kernel parameters string and the location
of an initrd image.

Most of this data is contained in the /chosen node, and when booting
Linux it will look something like this::

	chosen {
		bootargs = "console=ttyS0,115200 loglevel=8";
		initrd-start = <0xc8000000>;
		initrd-end = <0xc8200000>;
	};

The bootargs property contains the kernel arguments, and the initrd-*
properties define the address and size of an initrd blob.  Note that
initrd-end is the first address after the initrd image, so this doesn't
match the usual semantic of struct resource.  The chosen node may also
optionally contain an arbitrary number of additional properties for
platform-specific configuration data.
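
As a worked example of the initrd-* semantics, the sketch below decodes
the two cells from the example and computes the image size.  It is a
userspace illustration with hypothetical helper names (cell_to_cpu(),
initrd_size()); it relies on cells in a flattened tree being stored as
big-endian 32-bit values.

```c
#include <assert.h>
#include <stdint.h>

/* Decode one 32-bit cell; flattened device tree cells are big-endian. */
static uint32_t cell_to_cpu(const uint8_t cell[4])
{
	return ((uint32_t)cell[0] << 24) | ((uint32_t)cell[1] << 16) |
	       ((uint32_t)cell[2] << 8) | (uint32_t)cell[3];
}

/*
 * Size in bytes of the initrd described by the two /chosen properties.
 * The end value is the first address after the image, so no "+ 1" is
 * needed, unlike the inclusive end of a struct resource.
 */
static uint32_t initrd_size(const uint8_t start_cell[4],
			    const uint8_t end_cell[4])
{
	uint32_t start = cell_to_cpu(start_cell);
	uint32_t end = cell_to_cpu(end_cell);

	assert(end >= start);
	return end - start;
}

/* Raw bytes of <0xc8000000> and <0xc8200000> from the example. */
static const uint8_t initrd_start[4] = { 0xc8, 0x00, 0x00, 0x00 };
static const uint8_t initrd_end[4] = { 0xc8, 0x20, 0x00, 0x00 };
```

For the values in the example the result is 0x200000 bytes, that is,
2 MiB.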

During early boot, the architecture setup code calls of_scan_flat_dt()
several times with different helper callbacks to parse device tree
data before paging is set up.  The of_scan_flat_dt() code scans through
the device tree and uses the helpers to extract information required
during early boot.  Typically the early_init_dt_scan_chosen() helper
is used to parse the chosen node including kernel parameters,
early_init_dt_scan_root() to initialize the DT address space model,
and early_init_dt_scan_memory() to determine the size and
location of usable RAM.
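
The scan-with-callback pattern can be modelled in a few lines of
userspace C.  This is a sketch under assumptions, not the kernel API:
the real helpers receive a node offset, name, depth and data pointer,
whereas struct flat_node, scan_flat() and find_chosen() here are
hypothetical simplifications that walk a plain array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a node of the flattened tree: a name and a depth. */
struct flat_node {
	const char *uname;
	int depth;
};

typedef int (*scan_cb)(const struct flat_node *node, void *data);

/*
 * Visit nodes in blob order and stop as soon as a callback returns
 * non-zero, handing that value back to the caller.
 */
static int scan_flat(const struct flat_node *nodes, size_t count,
		     scan_cb cb, void *data)
{
	int rc = 0;

	assert(nodes != NULL && cb != NULL);

	for (size_t i = 0; i < count && rc == 0; i++)
		rc = cb(&nodes[i], data);
	return rc;
}

/* Helper in the spirit of early_init_dt_scan_chosen(). */
static int find_chosen(const struct flat_node *node, void *data)
{
	if (node->depth != 1 || strcmp(node->uname, "chosen") != 0)
		return 0;	/* not ours, keep scanning */

	*(const struct flat_node **)data = node;
	return 1;		/* found, stop the scan */
}

static const struct flat_node flat_tree[] = {
	{ "", 0 }, { "chosen", 1 }, { "aliases", 1 }, { "memory", 1 },
};
```

Each early-boot concern gets its own small callback, and the scanner
itself stays generic, which is why it can be called several times with
different helpers.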

On ARM, the function setup_machine_fdt() is responsible for early
scanning of the device tree after selecting the correct machine_desc
that supports the board.

2.4 Device population
---------------------
After the board has been identified, and after the early configuration data
has been parsed, then kernel initialization can proceed in the normal
way.  At some point in this process, unflatten_device_tree() is called
to convert the data into a more efficient runtime representation.
This is also when machine-specific setup hooks will get called, like
the machine_desc .init_early(), .init_irq() and .init_machine() hooks
on ARM.  The remainder of this section uses examples from the ARM
implementation, but all architectures will do pretty much the same
thing when using a DT.

As can be guessed by the names, .init_early() is used for any
machine-specific setup that needs to be executed early in the boot
process, and .init_irq() is used to set up interrupt handling.  Using a DT
doesn't materially change the behaviour of either of these functions.
If a DT is provided, then both .init_early() and .init_irq() are able
to call any of the DT query functions (of_* in include/linux/of*.h) to
get additional data about the platform.

The most interesting hook in the DT context is .init_machine() which
is primarily responsible for populating the Linux device model with
data about the platform.  Historically this has been implemented on
embedded platforms by defining a set of static clock structures,
platform_devices, and other data in the board support .c file, and
registering it en-masse in .init_machine().  When DT is used, then
instead of hard coding static devices for each platform, the list of
devices can be obtained by parsing the DT, and allocating device
structures dynamically.

The simplest case is when .init_machine() is only responsible for
registering a block of platform_devices.  A platform_device is a concept
used by Linux for memory or I/O mapped devices which cannot be detected
by hardware, and for 'composite' or 'virtual' devices (more on those
later).  While there is no 'platform device' terminology for the DT,
platform devices roughly correspond to device nodes at the root of the
tree and children of simple memory mapped bus nodes.

About now is a good time to lay out an example.  Here is part of the
device tree for the NVIDIA Tegra board::

  /{
	compatible = "nvidia,harmony", "nvidia,tegra20";
	#address-cells = <1>;
	#size-cells = <1>;
	interrupt-parent = <&intc>;

	chosen { };
	aliases { };

	memory {
		device_type = "memory";
		reg = <0x00000000 0x40000000>;
	};

	soc {
		compatible = "nvidia,tegra20-soc", "simple-bus";
		#address-cells = <1>;
		#size-cells = <1>;
		ranges;

		intc: interrupt-controller@50041000 {
			compatible = "nvidia,tegra20-gic";
			interrupt-controller;
			#interrupt-cells = <1>;
			reg = <0x50041000 0x1000>, < 0x50040100 0x0100 >;
		};

		serial@70006300 {
			compatible = "nvidia,tegra20-uart";
			reg = <0x70006300 0x100>;
			interrupts = <122>;
		};

		i2s1: i2s@70002800 {
			compatible = "nvidia,tegra20-i2s";
			reg = <0x70002800 0x100>;
			interrupts = <77>;
			codec = <&wm8903>;
		};

		i2c@7000c000 {
			compatible = "nvidia,tegra20-i2c";
			#address-cells = <1>;
			#size-cells = <0>;
			reg = <0x7000c000 0x100>;
			interrupts = <70>;

			wm8903: codec@1a {
				compatible = "wlf,wm8903";
				reg = <0x1a>;
				interrupts = <347>;
			};
		};
	};

	sound {
		compatible = "nvidia,harmony-sound";
		i2s-controller = <&i2s1>;
		i2s-codec = <&wm8903>;
	};
  };

At .init_machine() time, Tegra board support code will need to look at
this DT and decide which nodes to create platform_devices for.
However, looking at the tree, it is not immediately obvious what kind
of device each node represents, or even if a node represents a device
at all.  The /chosen, /aliases, and /memory nodes are informational
nodes that don't describe devices (although arguably memory could be
considered a device).  The children of the /soc node are memory mapped
devices, but the codec@1a is an i2c device, and the sound node
represents not a device, but rather how other devices are connected
together to create the audio subsystem.  I know what each device is
because I'm familiar with the board design, but how does the kernel
know what to do with each node?

The trick is that the kernel starts at the root of the tree and looks
for nodes that have a 'compatible' property.  First, it is generally
assumed that any node with a 'compatible' property represents a device
of some kind, and second, it can be assumed that any node at the root
of the tree is either directly attached to the processor bus, or is a
miscellaneous system device that cannot be described any other way.
For each of these nodes, Linux allocates and registers a
platform_device, which in turn may get bound to a platform_driver.

Why is using a platform_device for these nodes a safe assumption?
Well, for the way that Linux models devices, just about all bus_types
assume that their devices are children of a bus controller.  For
example, each i2c_client is a child of an i2c_master.  Each spi_device
is a child of an SPI bus.  Similarly for USB, PCI, MDIO, etc.  The
same hierarchy is also found in the DT, where I2C device nodes only
ever appear as children of an I2C bus node.  Ditto for SPI, MDIO, USB,
etc.  The only devices which do not require a specific type of parent
device are platform_devices (and amba_devices, but more on that
later), which will happily live at the base of the Linux /sys/devices
tree.  Therefore, if a DT node is at the root of the tree, then it
is probably best registered as a platform_device.

Linux board support code calls of_platform_populate(NULL, NULL, NULL, NULL)
to kick off discovery of devices at the root of the tree.  The
parameters are all NULL because when starting from the root of the
tree, there is no need to provide a starting node (the first NULL), a
parent struct device (the last NULL), and we're not using a match
table (yet).  For a board that only needs to register devices,
.init_machine() can be completely empty except for the
of_platform_populate() call.

In the Tegra example, this accounts for the /soc and /sound nodes, but
what about the children of the SoC node?  Shouldn't they be registered
as platform devices too?  For Linux DT support, the generic behaviour
is for child devices to be registered by the parent's device driver at
driver .probe() time.  So, an i2c bus device driver will register an
i2c_client for each child node, an SPI bus driver will register
its spi_device children, and similarly for other bus_types.
According to that model, a driver could be written that binds to the
SoC node and simply registers platform_devices for each of its
children.  The board support code would allocate and register an SoC
device, a (theoretical) SoC device driver could bind to the SoC device,
and register platform_devices for /soc/interrupt-controller, /soc/serial,
/soc/i2s, and /soc/i2c in its .probe() hook.  Easy, right?

Actually, it turns out that registering children of some
platform_devices as more platform_devices is a common pattern, and the
device tree support code reflects that and makes the above example
simpler.  The second argument to of_platform_populate() is an
of_device_id table, and any node that matches an entry in that table
will also get its child nodes registered.  In the Tegra case, the code
can look something like this::

  static void __init harmony_init_machine(void)
  {
	/* ... */
	/* Register root nodes, descending into buses listed in the table. */
	of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL);
  }

"simple-bus" is defined in the Devicetree Specification as a compatible
value meaning a simple memory mapped bus, so the of_platform_populate() code
could be written to just assume simple-bus compatible nodes will
always be traversed.  However, we pass it in as an argument so that
board support code can always override the default behaviour.
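
The traversal rule can be checked against the Tegra tree with a small
userspace model.  This is a sketch under assumptions, not the kernel
code: struct node, matches_bus() and populate() are hypothetical, and
populate() only counts the devices that would be created instead of
registering anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_COMPAT 2
#define MAX_CHILDREN 5

/* Toy device tree node; both arrays are NULL-terminated. */
struct node {
	const char *name;
	const char *compatible[MAX_COMPAT + 1];
	const struct node *children[MAX_CHILDREN + 1];
};

static bool matches_bus(const struct node *n, const char *const *bus_ids)
{
	for (int i = 0; bus_ids && bus_ids[i]; i++)
		for (int j = 0; n->compatible[j]; j++)
			if (strcmp(n->compatible[j], bus_ids[i]) == 0)
				return true;
	return false;
}

/*
 * Count the platform devices that would be created below 'parent':
 * one per child with a 'compatible' property, descending only into
 * children that match the bus table.
 */
static int populate(const struct node *parent, const char *const *bus_ids)
{
	int created = 0;

	assert(parent != NULL);

	for (int i = 0; parent->children[i]; i++) {
		const struct node *child = parent->children[i];

		if (!child->compatible[0])
			continue;	/* informational node, e.g. /chosen */
		created++;
		if (matches_bus(child, bus_ids))
			created += populate(child, bus_ids);
	}
	return created;
}

/* The Tegra example tree, reduced to names and compatible strings. */
static const struct node codec = { "codec@1a", { "wlf,wm8903" }, { NULL } };
static const struct node intc = { "interrupt-controller@50041000",
				  { "nvidia,tegra20-gic" }, { NULL } };
static const struct node serial = { "serial@70006300",
				    { "nvidia,tegra20-uart" }, { NULL } };
static const struct node i2s = { "i2s@70002800",
				 { "nvidia,tegra20-i2s" }, { NULL } };
static const struct node i2c = { "i2c@7000c000",
				 { "nvidia,tegra20-i2c" }, { &codec } };
static const struct node soc = { "soc", { "nvidia,tegra20-soc", "simple-bus" },
				 { &intc, &serial, &i2s, &i2c } };
static const struct node sound = { "sound", { "nvidia,harmony-sound" },
				   { NULL } };
static const struct node chosen = { "chosen", { NULL }, { NULL } };
static const struct node aliases = { "aliases", { NULL }, { NULL } };
static const struct node memory = { "memory", { NULL }, { NULL } };
static const struct node root = { "/", { "nvidia,harmony", "nvidia,tegra20" },
				  { &chosen, &aliases, &memory, &soc, &sound } };

static const char *const simple_bus_ids[] = { "simple-bus", NULL };
```

Without a match table only /soc and /sound are created.  With
"simple-bus" in the table the four children of /soc are added, while
codec@1a is left for the i2c bus driver to register.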

[Need to add discussion of adding i2c/spi/etc child devices]

Appendix A: AMBA devices
------------------------

ARM Primecells are a certain kind of device attached to the ARM AMBA
bus which include some support for hardware detection and power
management.  In Linux, struct amba_device and the amba_bus_type are
used to represent Primecell devices.  However, the fiddly bit is that
not all devices on an AMBA bus are Primecells, and for Linux it is
typical for both amba_device and platform_device instances to be
siblings of the same bus segment.

When using the DT, this creates problems for of_platform_populate()
because it must decide whether to register each node as either a
platform_device or an amba_device.  This unfortunately complicates the
device creation model a little bit, but the solution turns out not to
be too invasive.  If a node is compatible with "arm,primecell", then
of_platform_populate() will register it as an amba_device instead of a
platform_device.
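
The decision itself is a single string-list check, sketched below in
userspace C.  The names classify_node() and enum device_kind are
hypothetical.  PRIMECELL_COMPAT is set to "arm,primecell", which is my
understanding of the string the kernel's platform code tests for; treat
it as an assumption and adjust it if your tree differs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed marker string for Primecell devices; see the note above. */
#define PRIMECELL_COMPAT "arm,primecell"

enum device_kind { KIND_PLATFORM, KIND_AMBA };

/* Decide which kind of device a node's 'compatible' list calls for. */
static enum device_kind classify_node(const char *const *compatible)
{
	assert(compatible != NULL);

	for (int i = 0; compatible[i]; i++)
		if (strcmp(compatible[i], PRIMECELL_COMPAT) == 0)
			return KIND_AMBA;
	return KIND_PLATFORM;
}

/* A PL011 UART node typically lists the marker after its own name. */
static const char *const pl011_compat[] = {
	"arm,pl011", PRIMECELL_COMPAT, NULL
};
static const char *const tegra_uart_compat[] = {
	"nvidia,tegra20-uart", NULL
};
```

Because the marker is just another, less specific entry in the
'compatible' list, sibling nodes on the same bus can end up as either
kind of device.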