cachepc-linux

Fork of AMDESE/linux with modifications for CachePC side-channel attack
git clone https://git.sinitax.com/sinitax/cachepc-linux

boot-interrupts.rst (6379B)


.. SPDX-License-Identifier: GPL-2.0

===============
Boot Interrupts
===============

:Author: - Sean V Kelley <sean.v.kelley@linux.intel.com>

Overview
========

On PCI Express, interrupts are signaled either as MSI or as inbound
interrupt messages (Assert_INTx/Deassert_INTx). The integrated IO-APIC in a
given Core IO converts the legacy interrupt messages from PCI Express to
MSI interrupts.  If the IO-APIC is disabled (via the mask bits in the
IO-APIC table entries), the messages are routed to the legacy PCH instead.
This in-band interrupt mechanism was traditionally necessary for systems
that did not support the IO-APIC, and for boot. Intel has in the past used
the term "boot interrupts" to describe this mechanism. Further, the PCI
Express protocol describes this in-band legacy wire-interrupt INTx
mechanism for I/O devices to signal PCI-style level interrupts. The
following sections describe problems with the Core IO handling of INTx
message routing to the PCH and the mitigations within BIOS and the OS.


Issue
=====

When in-band legacy INTx messages are forwarded to the PCH, they in turn
trigger a new interrupt for which the OS likely lacks a handler. When
interrupts go unhandled over time, the Linux kernel tracks them as
spurious interrupts. Once the unhandled count reaches a specific
threshold, the kernel disables the IRQ and reports the error "nobody
cared". The disabled IRQ then prevents valid use by any existing interrupt
that happens to share the IRQ line::

  irq 19: nobody cared (try booting with the "irqpoll" option)
  CPU: 0 PID: 2988 Comm: irq/34-nipalk Tainted: 4.14.87-rt49-02410-g4a640ec-dirty #1
  Hardware name: National Instruments NI PXIe-8880/NI PXIe-8880, BIOS 2.1.5f1 01/09/2020
  Call Trace:

  <IRQ>
   ? dump_stack+0x46/0x5e
   ? __report_bad_irq+0x2e/0xb0
   ? note_interrupt+0x242/0x290
   ? nNIKAL100_memoryRead16+0x8/0x10 [nikal]
   ? handle_irq_event_percpu+0x55/0x70
   ? handle_irq_event+0x4f/0x80
   ? handle_fasteoi_irq+0x81/0x180
   ? handle_irq+0x1c/0x30
   ? do_IRQ+0x41/0xd0
   ? common_interrupt+0x84/0x84
  </IRQ>

  handlers:
  irq_default_primary_handler threaded usb_hcd_irq
  Disabling IRQ #19


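The counting behaviour described above can be illustrated with a small
user-space model. This is a minimal sketch, not the kernel's code: the
``irq_model`` struct, the ``irq_model_note()`` function and both threshold
constants are hypothetical names, and the thresholds (a window of 100000
interrupts, of which more than 99900 went unhandled) are an assumption about
the kernel's spurious-interrupt accounting that should be checked against
kernel/irq/spurious.c.

```c
#include <stdbool.h>

/* Assumed thresholds, loosely modelled on the kernel's accounting. */
#define WINDOW_IRQS      100000u /* interrupts per evaluation window */
#define UNHANDLED_LIMIT   99900u /* more than this many unhandled trips */

/* Hypothetical per-line bookkeeping for one IRQ. */
struct irq_model {
    unsigned int count;     /* interrupts seen in the current window */
    unsigned int unhandled; /* of those, how many no handler claimed */
    bool disabled;          /* set once the line has been shut off */
};

/* Record one interrupt; returns true if this call disabled the line. */
static bool irq_model_note(struct irq_model *m, bool handled)
{
    if (m->disabled)
        return false;
    if (!handled)
        m->unhandled++;
    if (++m->count < WINDOW_IRQS)
        return false;

    /* Window complete: decide, then start a fresh window. */
    bool trip = m->unhandled > UNHANDLED_LIMIT;
    m->count = 0;
    m->unhandled = 0;
    m->disabled = trip;
    return trip;
}
```

In this model a line shared with a well-behaved device is still disabled
when boot interrupts make up nearly all of the traffic in a window, which
is the failure shown in the log above.
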
Conditions
==========

The use of threaded interrupts is the most likely condition to trigger
this problem today. A threaded interrupt may not be re-enabled as soon as
the primary IRQ handler has woken the thread. In this "one shot" case, the
interrupt line must stay masked until the threaded handler has run.
Especially when dealing with high data rate interrupts, the thread needs to
run to completion; otherwise some handlers will end up in stack overflows
since the interrupt of the issuing device is still active.

Affected Chipsets
=================

The legacy interrupt forwarding mechanism exists today in a number of
devices including but not limited to chipsets from AMD/ATI, Broadcom, and
Intel. Changes made through the mitigations below have been applied to
drivers/pci/quirks.c.

Starting with ICX there are no longer any IO-APICs in the Core IO's
devices.  The IO-APIC is present only in the PCH.  Devices connected to the
Core IO's PCIe Root Ports will use native MSI/MSI-X mechanisms.

Mitigations
===========

The mitigations take the form of PCI quirks. The preference has been to
first identify and make use of a means to disable the routing to the PCH.
In such a case a quirk to disable boot interrupt generation can be
added. [1]_

Intel® 6300ESB I/O Controller Hub
  Alternate Base Address Register:
   BIE: Boot Interrupt Enable

          ==  ===========================
          0   Boot interrupt is enabled.
          1   Boot interrupt is disabled.
          ==  ===========================

Intel® Sandy Bridge through Sky Lake based Xeon servers:
  Coherent Interface Protocol Interrupt Control
   dis_intx_route2pch/dis_intx_route2ich/dis_intx_route2dmi2:
          When this bit is set, local INTx messages received from the
          Intel® Quick Data DMA/PCI Express ports are not routed to the
          legacy PCH; they are either converted into MSI via the integrated
          IO-APIC (if the IO-APIC mask bit is clear in the appropriate
          entries) or cause no further action (when the mask bit is set).

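Both quirks above amount to a read-modify-write that sets one bit in a
configuration register. The sketch below shows that pattern against a
simulated configuration space so that it can run in user space.
``struct fake_cfg`` and the ``cfg_*`` helpers are hypothetical. The register
offsets and bit positions are assumptions believed to match
drivers/pci/quirks.c and should be verified against that file and the
datasheets. A real quirk would use the kernel's PCI configuration accessors
instead.

```c
#include <stdint.h>
#include <string.h>

/* Assumed values; verify against drivers/pci/quirks.c and the datasheets. */
#define ESB_ABAR_OFFSET    0x40u       /* 6300ESB Alternate Base Address Register */
#define ESB_ABAR_BIE       (1u << 14)  /* BIE: 1 = boot interrupt disabled */
#define CIPINTRC_OFFSET    0x14Cu      /* Xeon cipintrc register */
#define CIPINTRC_DIS_INTX  (1u << 25)  /* dis_intx_route2pch */

/* Hypothetical stand-in for one device's PCI configuration space. */
struct fake_cfg {
    uint8_t bytes[4096];
};

static uint16_t cfg_read16(const struct fake_cfg *c, unsigned int off)
{
    uint16_t v;
    memcpy(&v, &c->bytes[off], sizeof v);
    return v;
}

static void cfg_write16(struct fake_cfg *c, unsigned int off, uint16_t v)
{
    memcpy(&c->bytes[off], &v, sizeof v);
}

static uint32_t cfg_read32(const struct fake_cfg *c, unsigned int off)
{
    uint32_t v;
    memcpy(&v, &c->bytes[off], sizeof v);
    return v;
}

static void cfg_write32(struct fake_cfg *c, unsigned int off, uint32_t v)
{
    memcpy(&c->bytes[off], &v, sizeof v);
}

/* 6300ESB: set BIE so that boot interrupts are no longer generated. */
static void esb_disable_boot_irq(struct fake_cfg *c)
{
    uint16_t v = cfg_read16(c, ESB_ABAR_OFFSET);
    cfg_write16(c, ESB_ABAR_OFFSET, (uint16_t)(v | ESB_ABAR_BIE));
}

/* Xeon: set dis_intx_route2pch so that INTx is not routed to the PCH. */
static void xeon_disable_intx_route(struct fake_cfg *c)
{
    uint32_t v = cfg_read32(c, CIPINTRC_OFFSET);
    cfg_write32(c, CIPINTRC_OFFSET, v | CIPINTRC_DIS_INTX);
}
```

Each helper preserves all other bits of the register, which matters because
the same registers carry unrelated configuration.
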
In the absence of a way to directly disable the routing, another approach
has been to use the PCI interrupt pin to INTx routing tables to redirect
the interrupt handler to the rerouted interrupt line by default.
Therefore, on chipsets where this INTx routing cannot be disabled, the
Linux kernel will reroute the valid interrupt to its legacy interrupt.
Redirecting the handler in this way prevents the spurious interrupt
detection from triggering, which would otherwise disable the IRQ line due
to excessive unhandled counts. [2]_

The config option X86_REROUTE_FOR_BROKEN_BOOT_IRQS exists to enable (or
disable) the redirection of the interrupt handler to the PCH interrupt
line. The option can be overridden at boot by either pci=ioapicreroute or
pci=noioapicreroute. [3]_


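How the build-time default and the two ``pci=`` options combine can be
sketched as follows. ``reroute_enabled()`` is a hypothetical helper, and the
rule that the last option given wins is an assumption of this sketch, not a
statement about the kernel's parser.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Resolve whether handler rerouting is active.
 * config_default: value implied by X86_REROUTE_FOR_BROKEN_BOOT_IRQS.
 * pci_opts:       the comma-separated pci= values, already split.
 */
static bool reroute_enabled(bool config_default,
                            const char *const *pci_opts, size_t n)
{
    bool enabled = config_default;

    for (size_t i = 0; i < n; i++) {
        if (strcmp(pci_opts[i], "ioapicreroute") == 0)
            enabled = true;   /* force rerouting on */
        else if (strcmp(pci_opts[i], "noioapicreroute") == 0)
            enabled = false;  /* force rerouting off */
        /* unrelated pci= options are ignored here */
    }
    return enabled;
}
```

With no matching option the build-time default applies, so a kernel built
without the config option can still opt in with pci=ioapicreroute.
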
More Documentation
==================

There is an overview of the legacy interrupt handling in several datasheets
(6300ESB and 6700PXH below). While largely the same, they provide insight
into how the handling has evolved across chipsets.

Example of disabling of the boot interrupt
------------------------------------------

      - Intel® 6300ESB I/O Controller Hub (Document # 300641-004US)
        5.7.3 Boot Interrupt
        https://www.intel.com/content/dam/doc/datasheet/6300esb-io-controller-hub-datasheet.pdf

      - Intel® Xeon® Processor E5-1600/2400/2600/4600 v3 Product Families
        Datasheet - Volume 2: Registers (Document # 330784-003)
        6.6.41 cipintrc Coherent Interface Protocol Interrupt Control
        https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e5-v3-datasheet-vol-2.pdf

Example of handler rerouting
----------------------------

      - Intel® 6700PXH 64-bit PCI Hub (Document # 302628)
        2.15.2 PCI Express Legacy INTx Support and Boot Interrupt
        https://www.intel.com/content/dam/doc/datasheet/6700pxh-64-bit-pci-hub-datasheet.pdf


If you have any legacy PCI interrupt questions that aren't answered, email me.

Cheers,
    Sean V Kelley
    sean.v.kelley@linux.intel.com

.. [1] https://lore.kernel.org/r/12131949181903-git-send-email-sassmann@suse.de/
.. [2] https://lore.kernel.org/r/12131949182094-git-send-email-sassmann@suse.de/
.. [3] https://lore.kernel.org/r/487C8EA7.6020205@suse.de/