Kernel researchers are familiar with the concept of interrupts, and often come across a variety of interrupt-related terms, categorized by software and hardware:

Hardware CPU related:

  • IRQ
  • IDT
  • cli&sti

Software operating system related:

  • APC
  • DPC
  • IRQL

My understanding of interrupts, and of how the operating system and the CPU cooperate to handle them, had always been vague. I recently spent some time working through this topic properly. If anything here is wrong, corrections from more experienced readers are welcome; thanks in advance!

This article aims to answer the following questions:

  • What is the relationship between IRQ and IRQL?
  • How does Windows virtualize IRQL at the software level?
  • APC and DPC are both software interrupts. Since they are interrupts, where are the processing routines in the corresponding IDT entries?

0x00 Interrupts on the Intel 80386 processor

First, let’s forget Windows and start with the original 80386 processor and see how Intel designed it to handle interrupts.

Let’s take a look at this CPU from 1985:

Take a look at the pins that stick out. Here’s how they look:

Notice the two pins circled in red. These are the two pins the 80386 processor sets aside for interrupts. INTR is the maskable interrupt input interface, and NMI is the non-maskable interrupt input interface.

So how do interrupts reach the processor? With so many peripherals and only one pin (consider only maskable interrupts for now), the CPU needs a secretary to manage interrupts on its behalf: the Programmable Interrupt Controller (PIC). What does the secretary do? It receives interrupt signals from peripherals and forwards interrupt requests to the CPU according to priority. The original PIC role was played by a chip named the 8259A, which looked like this:


Here is its pin diagram:

The chip has eight pins, IR0-IR7, for connecting external devices. Each IR pin of the 8259A PIC is attached to an IRQ line that carries interrupt signals from a peripheral. The INT pin connects to the INTR pin of the CPU and is used to raise an interrupt request to the CPU. Typically two 8259A chips are cascaded: one, called the master, is connected to the CPU, and the other, called the slave, is connected to the IR2 pin of the master, so a total of 8+7=15 peripherals can be connected. As shown below:

By default, the 8259A gives IR0 on the master chip the highest priority and IR7 on the master chip the lowest, and every request from IR0-IR7 on the slave chip ranks at the priority of the master's IR2. The IRQ lines are therefore ordered, from highest to lowest priority, IRQ0, IRQ1, IRQ8-15, IRQ3-7. This is only the default and can be changed programmatically.
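The default ordering described above can be reproduced with a few lines of C. This is only an illustrative sketch: the function name pic_default_priority_order is made up for this example and is not part of any real API.

```c
#include <assert.h>
#include <stddef.h>

/* Fills `order` with the 15 usable IRQ lines, highest priority first,
 * for two cascaded 8259A chips in their default fixed-priority mode.
 * The slave (IRQ8-15) hangs off master IR2, so its lines slot in
 * where IRQ2 would otherwise be. Returns the number of entries written. */
static size_t pic_default_priority_order(int order[15])
{
    size_t n = 0;

    assert(order != NULL);
    for (int master_ir = 0; master_ir < 8; master_ir++) {
        if (master_ir == 2) {
            /* IR2 is the cascade input: expand it into the slave's lines. */
            for (int slave_ir = 0; slave_ir < 8; slave_ir++)
                order[n++] = 8 + slave_ir;
        } else {
            order[n++] = master_ir;
        }
    }
    return n;
}
```

Calling it on an array of 15 ints yields IRQ0, IRQ1, IRQ8 through IRQ15, then IRQ3 through IRQ7, which matches the order given in the text.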

There are several important registers inside the 8259A chip:

Interrupt request register (IRR): 8 bits, one per pin IR0-IR7. When a pin raises an interrupt signal, the corresponding bit is set to 1.

Interrupt service register (ISR): 8 bits, one per pin IR0-IR7. A bit is set to 1 while the interrupt from the corresponding pin is being handled by the CPU.

Interrupt mask register (IMR): 8 bits, one per pin IR0-IR7. When a bit is 1, interrupt signals from the corresponding pin are masked.

There is also an interrupt priority resolver (PR). When a signal arrives on an interrupt pin, the PR weighs the IRQ that raised it against the interrupts currently being handled, as recorded in the ISR, and decides by priority whether to report the new interrupt to the CPU. This is what makes nested interrupts possible.
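To make the interplay of the three registers and the priority resolver concrete, here is a minimal software model of a single chip in the default fixed-priority mode. It is a sketch for illustration only: the Pic8259 type and the function names are hypothetical, and real hardware has more modes than are modeled here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical single-chip model; fields are named after the 8259A registers. */
typedef struct {
    uint8_t irr; /* requests latched from IR0-IR7 */
    uint8_t isr; /* requests currently being serviced by the CPU */
    uint8_t imr; /* 1 = the corresponding line is masked */
} Pic8259;

/* Index of the lowest set bit (the highest priority in fixed mode), or -1. */
static int highest_priority(uint8_t bits)
{
    for (int i = 0; i < 8; i++)
        if (bits & (1u << i))
            return i;
    return -1;
}

/* Priority resolver: returns the IR line to report on INT, or -1 if none.
 * A request is reported only if it is unmasked and has strictly higher
 * priority than every interrupt already in service. */
static int pic_resolve(const Pic8259 *pic)
{
    assert(pic != 0);
    int candidate  = highest_priority((uint8_t)(pic->irr & ~pic->imr));
    int in_service = highest_priority(pic->isr);

    if (candidate < 0)
        return -1;
    if (in_service >= 0 && candidate >= in_service)
        return -1;
    return candidate;
}

/* Models the acknowledge step: the winning request moves from IRR to ISR. */
static int pic_acknowledge(Pic8259 *pic)
{
    int line = pic_resolve(pic);

    if (line >= 0) {
        pic->irr &= (uint8_t)~(1u << line);
        pic->isr |= (uint8_t)(1u << line);
    }
    return line;
}
```

In this model, a request on IR6 that arrives while IR4 is in service stays pending, whereas a request on IR1 is reported immediately and nests on top of IR4.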

Here are the peripherals that the 15 IRQ lines are connected to:

Now let’s look at how the secretary works in coordination with the CPU.

Suppose we press a key on the keyboard, which generates an interrupt event. The keyboard signals the master PIC over IRQ1. After some internal arbitration, the master PIC raises an electrical signal on the CPU's INTR pin through its INT pin. After finishing the current instruction, the CPU sees the signal on INTR, which means an interrupt request is pending, and then checks that the IF bit in EFLAGS is not zero, which means interrupts are currently enabled. It then signals the PIC on the -INTA pin, asking for the vector number of this interrupt. On receiving the -INTA signal, the master PIC places the interrupt vector number on the data bus through pins D0-D7 (the interaction is simplified here; two INTA pulses are actually sent). Once the CPU has this number, it looks up the corresponding interrupt service routine (ISR) in the IDT and runs it.

Where does the interrupt vector number in the PIC come from, and how is each IRQ matched to an entry in the IDT? The answer lies in the programmable nature of the interrupt controller.

The PIC is called a programmable interrupt controller, so what exactly can be programmed? Reference 2, “i8259A Interrupt Controller Analysis I”, describes this in detail: the interrupt vector numbers in the IDT that the master and slave IRQ lines map to, as well as the 8259A's interrupt trigger mode, priority mode, nesting mode, masking mode, end-of-interrupt mode, and so on. The operating system can set all of these programmatically. The exact programming format is covered in reference 3, “i8259A Interrupt Controller Analysis II”.

Back to the question of how IRQ lines are matched to IDT entries. During initialization, the operating system programs the 8259A chips (by reading and writing I/O ports) and gives each PIC a starting vector number. The low three bits of this starting vector must be zero, that is, it must be aligned to 8. The reason is that when an interrupt occurs, the low three bits are filled in automatically with the IR pin number, so the result equals the starting vector plus that number and can be placed directly on the data bus for the CPU. Windows programs the PIC during system initialization as follows: the starting interrupt vector number of the master chip is 0x30, and that of the slave chip is 0x38. The 15 peripherals attached to the interrupt controllers are thus mapped linearly onto the range 0x30-0x3F in the IDT. During Windows kernel startup, the HAL routine HalpInitializePICs programs the 8259A chips. The code in ReactOS is as follows:

0x20 and 0x21 are the I/O ports of the master chip; 0xA0 and 0xA1 are the I/O ports of the slave chip:

PRIMARY_VECTOR_BASE is defined as:

The 8259A is programmed by writing control commands to its I/O ports; we will not go into the details of the encoding here. Let’s look at what Windows specifies when it programs the 8259A.

  • 1. The master chip works in cascade mode and is edge-triggered
  • 2. The base interrupt vector for the master chip's IRQs is 0x30
  • 3. The master chip is cascaded through its own IR2 pin
  • 4. The master chip works in 80x86 mode and uses the normal end-of-interrupt mode
  • 5. The slave chip works in cascade mode and is edge-triggered
  • 6. The base interrupt vector for the slave chip's IRQs is 0x38
  • 7. The slave chip is cascaded to the IR2 pin of the master chip
  • 8. The slave chip works in 80x86 mode and uses the normal end-of-interrupt mode
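The eight steps above correspond to the four initialization command words (ICW1 to ICW4) sent to each chip. The following is a rough sketch of that sequence, not the ReactOS or Windows source. The port numbers and vector bases come from the text and the ICW byte values are the standard 8259A encodings, but pic_init, PortWriteFn, and the recorder at the bottom are hypothetical helpers that exist only so the sketch can run outside a kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Port numbers as given in the text. */
enum { PIC1_CMD = 0x20, PIC1_DATA = 0x21, PIC2_CMD = 0xA0, PIC2_DATA = 0xA1 };

/* Initialization command words (standard 8259A encodings). */
enum {
    ICW1_INIT_EDGE_CASCADE_ICW4 = 0x11, /* init, edge-triggered, cascade, ICW4 follows */
    ICW3_MASTER_SLAVE_ON_IR2    = 0x04, /* bit mask: a slave sits on IR2 */
    ICW3_SLAVE_ID_2             = 0x02, /* slave identity: attached to master IR2 */
    ICW4_8086_NORMAL_EOI        = 0x01  /* 80x86 mode, normal (non-automatic) EOI */
};

/* Hypothetical port-write callback; a kernel would use a real OUT instruction. */
typedef void (*PortWriteFn)(uint16_t port, uint8_t value);

static void pic_init(PortWriteFn out, uint8_t master_base, uint8_t slave_base)
{
    /* Vector bases must be 8-aligned: the low 3 bits carry the IR number. */
    assert((master_base & 7) == 0 && (slave_base & 7) == 0);

    out(PIC1_CMD,  ICW1_INIT_EDGE_CASCADE_ICW4); /* step 1 */
    out(PIC1_DATA, master_base);                 /* step 2 */
    out(PIC1_DATA, ICW3_MASTER_SLAVE_ON_IR2);    /* step 3 */
    out(PIC1_DATA, ICW4_8086_NORMAL_EOI);        /* step 4 */

    out(PIC2_CMD,  ICW1_INIT_EDGE_CASCADE_ICW4); /* step 5 */
    out(PIC2_DATA, slave_base);                  /* step 6 */
    out(PIC2_DATA, ICW3_SLAVE_ID_2);             /* step 7 */
    out(PIC2_DATA, ICW4_8086_NORMAL_EOI);        /* step 8 */
}

/* Recorder used to exercise pic_init without real hardware. */
typedef struct { uint16_t port; uint8_t value; } PortWrite;
static PortWrite g_log[16];
static int g_log_len;

static void record_write(uint16_t port, uint8_t value)
{
    if (g_log_len < 16) {
        g_log[g_log_len].port = port;
        g_log[g_log_len].value = value;
        g_log_len++;
    }
}
```

Calling pic_init(record_write, 0x30, 0x38) leaves eight entries in g_log, one per step in the list, which makes it easy to compare the byte sequence against the description.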

So far we know that, on a computer using the 8259A interrupt controller, the 15 maskable-interrupt peripherals attached through IRQ lines are mapped linearly by the operating system onto a range of the IDT. In Windows that range is 0x30-0x3F (PS: 0x20-0x2F in Linux). The interrupt controller is set to edge-triggered mode and to normal end-of-interrupt mode (that is, the CPU side must tell the controller when interrupt handling has finished so that the corresponding bit is reset; the controller does not reset it automatically).
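The linear mapping can be written down in a few lines. This sketch assumes the Windows base values quoted in the text (0x30 for the master chip, 0x38 for the slave chip); the constant and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Base vectors quoted in the text for Windows. */
enum { MASTER_VECTOR_BASE = 0x30, SLAVE_VECTOR_BASE = 0x38 };

/* Vector that the PIC pair places on the data bus for an IRQ line (0-15). */
static uint8_t irq_to_vector(unsigned irq)
{
    assert(irq < 16);
    uint8_t base = (irq < 8) ? MASTER_VECTOR_BASE : SLAVE_VECTOR_BASE;

    /* The base has its low 3 bits clear, so OR-ing in the IR pin number
     * gives the same result as adding it. */
    return (uint8_t)(base | (irq & 7u));
}
```

With these bases the keyboard on IRQ1 lands on vector 0x31 and the last slave line, IRQ15, lands on vector 0x3F.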

0x02 Windows IRQL on 8259A

Let’s look at IRQL.

As we have seen, the hardware already provides solid support for interrupt handling. What the operating system has to do is, first, program the PIC at initialization time to set its operating mode and map the IRQs so that each interrupt corresponds to an IDT entry and, second, implement the interrupt service routines that those IDT entries point to. That seems to be enough, so why does Windows add an IRQL mechanism on top?

Take a look at the Windows Internals book’s definition of IRQL:

When writing drivers you will often run into the concept of IRQL, which implements the interrupt priority system in Windows. High-priority interrupts are always handled first, and low-priority interrupts must wait until the high-priority ones have been handled. But how can a mechanism created in software dictate hardware priorities? How does this work?

Let’s solve two problems first:

  • 1. What is the relationship between IRQ and IRQL?
  • 2. KeRaiseIrql raises the current IRQL. Why does that guarantee the code will not be disturbed by lower-priority interrupts?

For the first question: on a computer using the 8259A interrupt controller, the relationship is linear, IRQL = 27 - IRQ.
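As a quick illustration of this formula, the following sketch converts in both directions. The function names are hypothetical, and the valid IRQL range of 12 to 27 simply follows from applying the formula to IRQ 15 and IRQ 0.

```c
#include <assert.h>

/* IRQL assigned to a PIC interrupt line, per the relationship in the text. */
static unsigned irq_to_irql(unsigned irq)
{
    assert(irq < 16);
    return 27u - irq;
}

/* Inverse mapping; only IRQLs 12..27 correspond to a PIC line. */
static unsigned irql_to_irq(unsigned irql)
{
    assert(irql >= 12 && irql <= 27);
    return 27u - irql;
}
```

Under this mapping a lower IRQ number means a higher IRQL: IRQ0 maps to IRQL 27, the keyboard on IRQ1 maps to IRQL 26, and IRQ15 maps to IRQL 12.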

On the second question, Windows Internals answers it like this:

Let’s look at the implementation of Windows in detail:

IRQL is a purely virtual concept. To implement it, Windows emulates an entire interrupt controller in software, whose state lives in the KPCR:

+0x024 Irql      : UChar   // IRQL
+0x028 IRR       : Uint4B
+0x02c IrrActive : Uint4B
+0x030 IDR       : Uint4B  // Virtual interrupt mask register

As mentioned in the first part, the 15 interrupt sources attached to the two 8259A chips are mapped onto a range of the processor's IDT, which for Windows is 0x30-0x3F. The interrupt service routines (ISRs) described by these IDT entries differ from KiTrap03 for int 3 and KiTrap0E for int 0x0E: each of them points to the code in the DispatchCode of its own interrupt object, KINTERRUPT. Here is the definition of this structure:

typedef struct _KINTERRUPT {
    CSHORT Type;
    CSHORT Size;
    LIST_ENTRY InterruptListEntry;
    PKSERVICE_ROUTINE ServiceRoutine;
    PVOID ServiceContext;
    KSPIN_LOCK SpinLock;
    ULONG TickCount;
    PKSPIN_LOCK ActualLock;
    PVOID DispatchAddress;
    ULONG Vector;
    KIRQL Irql;
    KIRQL SynchronizeIrql;
    BOOLEAN FloatingSave;
    BOOLEAN Connected;
    CHAR Number;
    UCHAR ShareVector;
    KINTERRUPT_MODE Mode;
    ULONG ServiceCount;
    ULONG DispatchCount;
    ULONG DispatchCode[106];
} KINTERRUPT, *PKINTERRUPT;

The DispatchCode is copied from a template. These ISRs begin the same way KiTrap03 does, by building a trap frame, and then obtain the address of their own KINTERRUPT object. With these two parameters in hand, they dispatch the interrupt through KiInterruptDispatch or KiChainedDispatch (the latter is used when several KINTERRUPT structures have been registered for the same interrupt and form a linked list). Both dispatch functions call HalBeginSystemInterrupt before the interrupt is actually handled and HalEndSystemInterrupt once handling is complete. So let’s focus on these two functions.

BOOLEAN
HalBeginSystemInterrupt(
    IN KIRQL Irql,
    IN CCHAR Vector,
    OUT PKIRQL OldIrql);

The input parameter Irql is the IRQL of the interrupt that occurred, and Vector is its interrupt vector number. As mentioned earlier, DispatchCode obtains both of these from its KINTERRUPT object.

Internally, HalBeginSystemInterrupt dispatches through a table of functions. Almost every entry in the table is the same function, called HalpDismissIrqGeneric in ReactOS; the few exceptions only add one more layer of checks. That function in turn calls its underscore-prefixed version, _HalpDismissIrqGeneric, and this is where IRQL priority is actually implemented. The function is not long. Here is the code from ReactOS (the Windows 2000 code is less readable than the C used by ReactOS, so the ReactOS code is used for illustration):

First, the IRQL of this interrupt is compared with the current IRQL of the processor (stored in KPCR). If it is greater, a higher-priority interrupt has arrived: the IRQL in KPCR is set to the new, higher value and the function returns TRUE, meaning the interrupt request should be handled. If it is not greater, the interrupt is first recorded in the IRR of the virtual interrupt controller in KPCR; then the mask corresponding to the processor's current IRQL is taken from KiI8259MaskTable and written to the PIC, masking the interrupt sources whose IRQL is no higher than the current one. The function returns FALSE, meaning the interrupt request will not be handled now. Why isn't the mask written at the moment the processor's new IRQL is set? Windows Internals explains it this way:

The return value of HalpDismissIrqGeneric will be directly used as the return value of HalBeginSystemInterrupt. Take the KiInterruptDispatch function as an example to see how it uses this return value:

As you can see, if HalBeginSystemInterrupt returns FALSE, interrupt processing ends early; the actual interrupt handling routine runs only when it returns TRUE. In every case KiExitInterrupt is called last to finish interrupt processing. Take a look at this function. Together with the KiInterruptDispatch code, it shows that the following if condition is true only when HalBeginSystemInterrupt returned TRUE, and only then is HalEndSystemInterrupt entered.
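The accept-or-defer decision made inside HalBeginSystemInterrupt can be modeled in user-mode C. This is a simplified sketch based only on the description above, not the ReactOS or Windows code. CpuState stands in for the relevant KPCR fields, the mask is computed from IRQL = 27 - IRQ instead of being read from KiI8259MaskTable, the pending bitmap uses one bit per IRQ, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the KPCR fields discussed in the text. */
typedef struct {
    uint8_t  irql;     /* current virtual IRQL of the processor */
    uint32_t irr;      /* deferred interrupts, one bit per IRQ in this sketch */
    uint16_t pic_mask; /* last mask written to the (simulated) PIC pair */
} CpuState;

static unsigned irq_to_irql(unsigned irq) { return 27u - irq; }

/* Mask every IRQ line whose IRQL is not above `irql`. Computed here for
 * clarity; the text says the real code looks it up in KiI8259MaskTable. */
static uint16_t mask_for_irql(unsigned irql)
{
    uint16_t mask = 0;

    for (unsigned irq = 0; irq < 16; irq++)
        if (irq_to_irql(irq) <= irql)
            mask |= (uint16_t)(1u << irq);
    return mask;
}

/* Returns true if the interrupt should be serviced now, false if deferred. */
static bool begin_interrupt(CpuState *cpu, unsigned irq, uint8_t *old_irql)
{
    assert(cpu != 0 && old_irql != 0 && irq < 16);
    uint8_t irql = (uint8_t)irq_to_irql(irq);

    if (irql > cpu->irql) {
        /* Higher priority: raise the virtual IRQL. The PIC mask is left
         * untouched, so no slow port access happens on this path. */
        *old_irql = cpu->irql;
        cpu->irql = irql;
        return true;
    }

    /* Not allowed at the current IRQL: remember it and only now program
     * the PIC so that such lines stop interrupting. */
    cpu->irr |= 1u << irq;
    cpu->pic_mask = mask_for_irql(cpu->irql);
    return false;
}
```

For example, starting at IRQL 0, an interrupt on IRQ1 is accepted and raises the IRQL to 26 without touching the mask. A following interrupt on IRQ12 (IRQL 15) is deferred, its bit is set in irr, and the mask becomes 0xFFFE so that only IRQ0 can still get through.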

Finally, HalEndSystemInterrupt. As mentioned earlier, if an interrupt arrives with an IRQL lower than the processor's IRQL, its ISR is not executed, but the interrupt is recorded in the IRR of the virtual interrupt controller in KPCR. When the processor finishes its high-IRQL work and reaches HalEndSystemInterrupt, the processor's IRQL is lowered and the PIC's interrupt mask is reset. Just as importantly, the IRR is checked, and any recorded interrupt whose IRQL is higher than the lowered IRQL is dispatched.
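The lowering step can be sketched in the same simplified style. Again this is an illustrative model under stated assumptions rather than the actual HAL code: CpuState, mask_for_irql, and end_interrupt are hypothetical names, the pending bitmap uses one bit per IRQ, and instead of re-dispatching the deferred interrupt the function just returns which IRQ should be replayed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the KPCR fields discussed in the text. */
typedef struct {
    uint8_t  irql;     /* current virtual IRQL of the processor */
    uint32_t irr;      /* deferred interrupts, one bit per IRQ in this sketch */
    uint16_t pic_mask; /* last mask written to the (simulated) PIC pair */
} CpuState;

static unsigned irq_to_irql(unsigned irq) { return 27u - irq; }

/* Mask every IRQ line whose IRQL is not above `irql`. */
static uint16_t mask_for_irql(unsigned irql)
{
    uint16_t mask = 0;

    for (unsigned irq = 0; irq < 16; irq++)
        if (irq_to_irql(irq) <= irql)
            mask |= (uint16_t)(1u << irq);
    return mask;
}

/* Lowers the IRQL, reprograms the mask, and returns the deferred IRQ that
 * may now be replayed, or -1 if nothing pending is above the new IRQL. */
static int end_interrupt(CpuState *cpu, uint8_t new_irql)
{
    assert(cpu != 0);
    cpu->irql = new_irql;
    cpu->pic_mask = mask_for_irql(new_irql);

    /* With IRQL = 27 - IRQ, the lowest IRQ number has the highest IRQL,
     * so scanning upward finds the most urgent deferred interrupt first. */
    for (unsigned irq = 0; irq < 16; irq++) {
        if ((cpu->irr & (1u << irq)) && irq_to_irql(irq) > new_irql) {
            cpu->irr &= ~(1u << irq);
            return (int)irq;
        }
    }
    return -1;
}
```

If IRQ5 and IRQ12 were both deferred while the processor ran at IRQL 26, dropping to IRQL 0 returns IRQ5 first (IRQL 22) and leaves IRQ12 pending for the next pass.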


Conclusion

Finally, a summary of Windows IRQL on a computer using the 8259A interrupt controller.

First, the 8259A chips are programmed at system startup: their operating mode is set and the 15 interrupt sources (IRQs) are mapped onto the range 0x30-0x3F in the IDT.

Second, Windows defines the concept of IRQL (interrupt request level) to describe interrupt priority. IRQL is a small integer (a UChar in KPCR) with 32 levels in total. Windows maps IRQ to IRQL with a simple linear relationship: IRQL=27-IRQ.

Third, the IDT entries in the range 0x30-0x3F that the interrupt requests are mapped to each point to the DispatchCode of a KINTERRUPT structure, and this DispatchCode dispatches the interrupt through KiInterruptDispatch or KiChainedDispatch.

The dispatch process is as follows. HalBeginSystemInterrupt first examines the IRQL of the interrupt to decide whether it should be handled now. If not, the interrupt controller's mask is set to block further interrupts and the interrupt is recorded in the IRR of the virtual interrupt controller in KPCR. If so, the IRQL is raised and the actual handling routine for the interrupt is executed; afterwards HalEndSystemInterrupt lowers the IRQL and checks whether the IRR holds any unhandled interrupts that can now be processed.