MCE problems with SMP kernel on 2850

bloch at verdurin.com
Wed May 3 09:08:15 CDT 2006


The other day there was an error on a 2850 here:

E07F0
PROC 1 IERR
PROC 2 IERR

I rebooted it, but it now panics whenever I try to boot an SMP kernel,
either with an NMI watchdog error or with an "uncorrected machine
check".  In the latter case, some machine check events are briefly
listed, but they refer to "CPU 6" and "CPU 7", whereas the box has two
hyperthreaded Xeons.

If I boot with the "nomce" option, the machine reboots before the kernel
has fully started.

I've seen some indications that this might be a BIOS problem, but I'm
running the latest 2850 BIOS.  Is there an option I might change to
clear these errors?

The machine is running CentOS 4.3.  When I tried booting from a recent
Knoppix DVD, that didn't start up properly either.

Any ideas appreciated.

Adam



More information about the Linux-PowerEdge mailing list