kernel-hacking-2024-linux-s.../arch
Eric W. Biederman ef3e28c5b9 x86_64: check remote IRR bit before migrating level triggered irq
On x86_64 kernels, level triggered irq migration is initiated in the
context of that interrupt (after the irq handler has run), and the
following steps perform the migration:

1. mask IOAPIC RTE entry;     // write to IOAPIC RTE
2. EOI;                       // processor EOI write
3. reprogram IOAPIC RTE entry // write to IOAPIC RTE with new destination
                              // and interrupt vector due to per cpu vector
                              // allocation.
4. unmask IOAPIC RTE entry;   // write to IOAPIC RTE
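
The four steps above can be sketched as a toy model in C. This is only an
illustration, not kernel code: struct toy_rte, struct toy_trace and
toy_migrate() are hypothetical names, the struct is not the real RTE layout,
and the model assumes the EOI reaches the IOAPIC immediately (the case where
nothing goes wrong).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for one IOAPIC redirection table entry (not the real layout). */
struct toy_rte {
	uint8_t vector;     /* interrupt vector delivered to the cpu */
	uint8_t dest;       /* destination cpu */
	bool masked;        /* mask bit */
	bool remote_irr;    /* set while a level triggered irq awaits EOI */
};

enum toy_step { STEP_MASK, STEP_EOI, STEP_REPROGRAM, STEP_UNMASK };

/* Records the order in which the steps were performed. */
struct toy_trace {
	enum toy_step steps[4];
	int count;
};

static void toy_record(struct toy_trace *t, enum toy_step s)
{
	assert(t->count < 4);
	t->steps[t->count++] = s;
}

/*
 * Perform the four migration steps in order.
 * Assumption: the EOI is seen by the IOAPIC at once, so remote IRR is
 * already clear when the entry is reprogrammed.
 */
static void toy_migrate(struct toy_rte *rte, struct toy_trace *t,
			uint8_t new_dest, uint8_t new_vector)
{
	rte->masked = true;           /* 1. mask IOAPIC RTE entry */
	toy_record(t, STEP_MASK);

	rte->remote_irr = false;      /* 2. EOI (idealized: no delay) */
	toy_record(t, STEP_EOI);

	rte->dest = new_dest;         /* 3. reprogram with new destination */
	rte->vector = new_vector;     /*    and new per cpu vector */
	toy_record(t, STEP_REPROGRAM);

	rte->masked = false;          /* 4. unmask IOAPIC RTE entry */
	toy_record(t, STEP_UNMASK);
}
```

Calling toy_migrate() on an entry with vector 0x31 on cpu 0 and asking for
vector 0x41 on cpu 1 leaves the entry unmasked with the new vector and
destination.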

Because of the per cpu vector allocation in x86_64 kernels, when the irq
migrates to a different cpu, a new vector (corresponding to the new cpu)
gets allocated.

An EOI write to the local APIC has the side effect of generating an EOI
write for level triggered interrupts (normally a broadcast to all IOAPICs).
This EOI broadcast may be delayed while the other IOAPIC writes (steps 3
and 4) go through.

Normally, the EOI generated by the local APIC for a level triggered
interrupt contains the vector number.  The IOAPIC takes this vector number,
searches its RTE entries for one with a matching vector number, and clears
the remote IRR bit (indicating EOI).  However, if the vector number has
changed (as in step 3), the IOAPIC will not find the RTE entry when the EOI
arrives later.  The remote IRR bit then stays set, which hangs the
interrupt (no more interrupts from this RTE).
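
The failure described above can be reproduced in a small simulation. This is
a sketch under simplifying assumptions: toy_ioapic_eoi() and
toy_delayed_eoi_sticks() are hypothetical names, and the table models only
the vector and the remote IRR bit of each entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Only the two fields that matter for EOI matching. */
struct toy_rte {
	uint8_t vector;
	bool remote_irr;
};

/*
 * Model of an EOI arriving at the IOAPIC: clear remote IRR on every
 * entry whose vector matches.  Returns the number of matching entries.
 */
static int toy_ioapic_eoi(struct toy_rte *table, size_t n, uint8_t vector)
{
	int matched = 0;

	assert(table != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (table[i].vector == vector) {
			table[i].remote_irr = false;
			matched++;
		}
	}
	return matched;
}

/*
 * Reprogram the vector (step 3) before the delayed EOI arrives, then
 * deliver the EOI carrying the old vector.  Returns true if remote IRR
 * is left stuck.
 */
static bool toy_delayed_eoi_sticks(uint8_t old_vector, uint8_t new_vector)
{
	struct toy_rte table[1] = { { .vector = old_vector, .remote_irr = true } };

	table[0].vector = new_vector;          /* step 3 lands first */
	toy_ioapic_eoi(table, 1, old_vector);  /* late EOI, old vector */
	return table[0].remote_irr;
}
```

With different old and new vectors the late EOI matches nothing and remote
IRR stays set; if the vector is unchanged, the same late EOI still clears it.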

The current x86_64 kernel assumes that the remote IRR bit is cleared by the
time the IOAPIC RTE is reprogrammed.  Fix this assumption by checking the
remote IRR bit and, if it is still set, delaying the irq migration to the
next interrupt arrival event (hopefully, next time the remote IRR bit will
be cleared before the IOAPIC RTE is reprogrammed).
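
The shape of the check can be sketched as follows. This is a toy model and
not the code in this patch: struct toy_irq, its fields and
toy_try_migrate() are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_rte {
	uint8_t vector;
	uint8_t dest;
	bool remote_irr;    /* still set means the EOI has not arrived yet */
};

struct toy_irq {
	struct toy_rte rte;
	bool move_pending;        /* a migration has been requested */
	uint8_t pending_dest;     /* requested destination cpu */
	uint8_t pending_vector;   /* vector allocated on that cpu */
};

/*
 * Called once per interrupt, after the handler has run.
 * Returns true if the entry was reprogrammed, false if there was
 * nothing to do or the migration was deferred to the next interrupt.
 */
static bool toy_try_migrate(struct toy_irq *irq)
{
	assert(irq != NULL);

	if (!irq->move_pending)
		return false;

	/*
	 * Remote IRR still set: changing the vector now would make the
	 * late EOI miss this entry.  Leave the request pending and retry
	 * when the next interrupt arrives.
	 */
	if (irq->rte.remote_irr)
		return false;

	irq->rte.dest = irq->pending_dest;
	irq->rte.vector = irq->pending_vector;
	irq->move_pending = false;
	return true;
}
```

A deferred call leaves the entry untouched and the request pending, so a
later call made once remote IRR has cleared completes the migration.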

Initial analysis and patch from Nanhai.

Clean up patch from Suresh.

Rewritten to be less intrusive, and to contain a big fat comment by Eric.

[akpm@linux-foundation.org: fix comments]
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Nanhai Zou <nanhai.zou@intel.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Keith Packard <keith.packard@intel.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:10 -07:00
..
alpha some kmalloc/memset ->kzalloc (tree wide) 2007-07-19 10:04:50 -07:00
arm clockevents: fix resume logic 2007-07-21 17:49:15 -07:00
arm26 mm: Remove slab destructors from kmem_cache_create(). 2007-07-20 10:11:58 +09:00
avr32 mm: fault feedback #2 2007-07-19 10:04:41 -07:00
blackfin some kmalloc/memset ->kzalloc (tree wide) 2007-07-19 10:04:50 -07:00
cris Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild 2007-07-19 14:28:19 -07:00
frv FRV: work around a possible compiler bug 2007-07-19 10:04:50 -07:00
h8300 PTRACE_POKEDATA consolidation 2007-07-17 10:23:03 -07:00
i386 x86: round_jiffies() for i386 and x86-64 non-critical/corrected MCE polling 2007-07-21 18:37:10 -07:00
ia64 revert "PIE randomization" 2007-07-21 17:49:14 -07:00
m32r Merge branch 'release' of git://lm-sensors.org/kernel/mhoffman/hwmon-2.6 2007-07-19 14:24:57 -07:00
m68k m68k: exclude more unbuildable drivers 2007-07-20 08:24:49 -07:00
m68knommu m68knommu: reformat show_cpuinfo() 2007-07-20 08:44:20 -07:00
mips [MIPS] User stack pointer randomisation 2007-07-20 18:57:40 +01:00
parisc define new percpu interface for shared data 2007-07-19 10:04:44 -07:00
powerpc spufs: make signal-notification files readonly for NOSCHED contexts 2007-07-21 17:49:16 -07:00
ppc Merge branch 'release' of git://lm-sensors.org/kernel/mhoffman/hwmon-2.6 2007-07-19 14:24:57 -07:00
s390 s390: Put allocated ELF notes in read-only data segment 2007-07-19 10:04:47 -07:00
sh clockevents: fix resume logic 2007-07-21 17:49:15 -07:00
sh64 sh64: Flag sh64_get_page() as __init_refok. 2007-07-20 17:46:42 +09:00
sparc [SPARC]: Make sure dev_archdata is filled in for all devices. 2007-07-20 17:13:42 -07:00
sparc64 NTP: move the cmos update code into ntp.c 2007-07-21 17:49:15 -07:00
um fallout from kbuild changes 2007-07-19 18:37:54 -07:00
v850 PTRACE_POKEDATA consolidation 2007-07-17 10:23:03 -07:00
x86_64 x86_64: check remote IRR bit before migrating level triggered irq 2007-07-21 18:37:10 -07:00
xtensa Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild 2007-07-19 14:28:19 -07:00