Redhat 7.2 kernel-smp-2.4.9-7 failure on 2550

Matt_Domsch at Dell.com
Mon Nov 5 08:37:00 CST 2001


> Our 2550 fails to boot with latest smp kernel errata
> (kernel-smp-2.4.9-7). It says:

Latest is actually 2.4.9-13 now.   Progress...

>   AAC: NMI ISR: NMI_DMA_0_ERROR
> 
> and hangs. This is just after the "Press 'I' to enter interactive
> startup" message.
> 
> The non-smp kernel (kernel-2.4.9-7) boots fine.
> 
> Not sure how to proceed. I tried comparing the dmesg output between
> the 2.4.7-10smp kernel (this is what we were using before the upgrade)
> and the 2.4.9-7 kernel which boots, and there are some suspicious
> messages about shared and conflicting IRQs... could this be the
> problem? 

Absolutely.  Sounds like there's something weird with IRQ routing on your
system.  Can you post the diff (diff -bu) between the working and
non-working kernel dmesg?  (serial console is your friend here!)  On the
non-smp kernel, you would expect to see IRQ sharing (it doesn't use the
IOAPICs, so you've got only IRQ 0-15 at most to use, and most of those are
already assigned to something).
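
If plain diff isn't available wherever the two captures end up, something
close to "diff -bu" can be sketched in Python.  This is only an
approximation: it collapses runs of whitespace before comparing, so the
output shows the normalized lines rather than the originals.  The function
name and the default file labels are made up for illustration.

```python
import difflib
import re


def normalize(line):
    # Approximate diff -b: collapse whitespace runs, drop trailing whitespace.
    return re.sub(r"\s+", " ", line).rstrip()


def dmesg_diff(old_text, new_text, old_name="dmesg.old", new_name="dmesg.new"):
    """Return unified-diff lines between two dmesg captures."""
    old_lines = [normalize(line) for line in old_text.splitlines()]
    new_lines = [normalize(line) for line in new_text.splitlines()]
    return list(
        difflib.unified_diff(old_lines, new_lines, old_name, new_name, lineterm="")
    )
```

Read each saved dmesg capture into a string, pass the working kernel's
output first, and print the returned lines one per line.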

> PCI: Found IRQ 9 for device 02:04.0
> IRQ routing conflict for 02:04.0, have irq 10, want irq 9

Hmmm.  That's kind of weird.  What BIOS are you running?  A05 is the latest
for the PE2550.
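
To pull every such conflict out of a long dmesg capture, a small Python
sketch like the one below may help.  The pattern is taken only from the
"IRQ routing conflict" line quoted above; other kernel versions may word
the message differently, and the function name is hypothetical.

```python
import re

# Matches lines such as:
#   IRQ routing conflict for 02:04.0, have irq 10, want irq 9
CONFLICT_RE = re.compile(
    r"IRQ routing conflict for (\S+), have irq (\d+), want irq (\d+)"
)


def find_irq_conflicts(dmesg_text):
    """Return a list of (device, have_irq, want_irq) tuples."""
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)))
        for m in CONFLICT_RE.finditer(dmesg_text)
    ]
```

Running it over the dmesg output of both kernels shows which devices the
two kernels route differently.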

 
> Apart from that, the aacraid driver has changed from the "Sep 6 2001"
> version to "Oct 18 2001".

This part is normal.  The code grabs this date at build time, and that's
when RH rebuilt the kernel (thus the driver).

> AAC:Batte

This indicates that you need to perform a battery recondition (and that the
driver has a bug in the printing of that message, but that's now
understood).


--
Matt Domsch
Sr. Software Engineer
Dell Linux Solutions
www.dell.com/linux
#2 Linux Server provider with 17% in the US and 14% Worldwide (IDC)!
#3 Unix provider with 18% in the US (Dataquest)!


