[parisc-linux-cvs] linux-2.6 willy

James Bottomley James.Bottomley at SteelEye.com
Sun Jan 2 15:53:10 MST 2005


On Sun, 2005-01-02 at 14:29 -0700, Grant Grundler wrote:
> On Sun, Jan 02, 2005 at 02:47:43PM -0600, James Bottomley wrote:
> > 1) Disable the interrupt routine.  The problem here is that the card
> > still asserts the interrupt so we get an unhandled interrupt, which
> > causes the kernel to get annoyed and disable the interrupt anyway.
> 
> I didn't realize the kernel would eventually shut down unhandled
> interrupts. It's a good idea (e.g. IRQs left active by BIOS) in any case.
> 
> > 2) Disable the interrupt.  Here we've stopped listening to any other
> > sharers of this interrupt as well.
> > 
> > Clearly, the best choice is to beat all the HW vendors up until they
> > allow the driver to disable the interrupt on the card.  However, if
> > we're reduced to a choice between these evils, I think what Linux
> > chooses (number 2) is the slightly less evil one.
> 
> If it's possible for the kernel to consistently diagnose (1),
> I don't see how (2) is less evil. Worst case is now we end up
> in the same situation for broken HW.

Consider the end result:  assume we have a PCI card with level-triggered
interrupts we can't shut down, but we need it to stop squeaking while we
do packet processing or packet receive mitigation (this is, sadly, quite
a common case).

In 1) we disable the routine while the card is processing, but since its
interrupt line is asserted, it's already masking any other shared
interrupt.  If we just allow this to go on, every time we enable
interrupts, we get another one on this line.  Eventually the machine
bogs (no forward progress) because it's simply re-interrupting every
time we try to handle the line's interrupt but find no-one wants it.
Thus, we're forced to shut the line up.  Even if the other cards are
trying to interrupt, they're usually lost in the noise.
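
As a rough illustration of case 1), here is a toy Python model (not
kernel code). The function name, the step counter and the
unhandled_limit parameter are all made up for the sketch; it only
shows that no useful work happens until the line is finally masked.

```python
def simulate_case_1(steps, unhandled_limit):
    """Toy model of case 1): the card's handler is disabled, but the card
    keeps its level-triggered line asserted.

    Returns (useful_steps, line_masked, unhandled_count).
    """
    line_asserted = True     # the card cannot be told to deassert
    handler_enabled = False  # we disabled the routine, not the line
    line_masked = False
    unhandled = 0
    useful_steps = 0

    for _ in range(steps):
        if line_asserted and not line_masked:
            # The line is still high, so we take the interrupt again
            # straight away instead of doing any real work.
            if not handler_enabled:
                unhandled += 1
                # Hypothetical threshold: give up and mask the line.
                if unhandled >= unhandled_limit:
                    line_masked = True
        else:
            useful_steps += 1  # forward progress

    return useful_steps, line_masked, unhandled


print(simulate_case_1(steps=10, unhandled_limit=5))
```

With a limit larger than the number of steps, useful_steps stays at
zero, which is the "machine bogs" outcome described above.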

In 2), we disable the IRQ line proactively.  Now, we stop listening to
all cards on the line (interference) but at least the kernel makes
forward progress until the bad card is ready to re-enable the interrupt
line.
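
The trade-off in case 2) can be sketched the same way. Again this is
a toy Python model under made-up assumptions: busy_steps is how long
the bad card needs the line kept quiet, and other_irq_at is the step
at which some other (hypothetical) sharer raises its interrupt.

```python
def simulate_case_2(busy_steps, other_irq_at):
    """Toy model of case 2): the whole IRQ line is disabled up front and
    re-enabled once the bad card's processing is done.

    Returns (useful_steps, other_latency), where other_latency is how many
    steps the other sharer waited for service (None if it never fired).
    """
    line_masked = True        # disabled proactively
    other_pending_since = None
    other_latency = None
    useful_steps = 0

    for step in range(busy_steps + 1):
        if step == other_irq_at:
            other_pending_since = step  # sharer asserts; we may not hear it yet
        if step == busy_steps:
            line_masked = False         # processing finished, re-enable the line

        if line_masked:
            useful_steps += 1           # no interrupt storm, so work gets done
            continue

        # Line is live again: the pending sharer is finally serviced.
        if other_pending_since is not None and other_latency is None:
            other_latency = step - other_pending_since

    return useful_steps, other_latency


print(simulate_case_2(busy_steps=8, other_irq_at=3))
```

The other device pays in latency (the interference), but every step
of the busy window is spent on real work.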

> I consider (2) more evil because it always interferes with
> other devices even when the HW is not broken.
> But I'm assuming (1) is "consistently diagnosed" and I'm not
> real comfortable yet it's a valid assumption - it's a hard one to prove.

The general rule should be: don't share interrupts, not at all, never,
not even when the docs say it's OK (and sadly a lot of ACPI and APIC
programming rules force shared interrupts on us anyway).  Then, if the
line isn't shared, (2) is clearly better.  For shared lines, I'd still
argue that (2) is the lesser of the evils.

James



