Kernel panic 2.4.18-10smp PERC3/Di

Delahunty, Mark MDelahunty at cc.ucc.ie
Mon Nov 4 07:48:01 CST 2002


Matt,

thanks for your response.

Here's what afacli says:

AFA0> disk show smart
Executing: disk show smart

        Smart    Method of         Enable 
        Capable  Informational     Exception  Performance  Error  
B:ID:L  Device   Exceptions(MRIE)  Control    Enabled      Count
------  -------  ----------------  ---------  -----------  ------
0:00:0     Y            6             Y           Y             0
0:01:0     Y            6             Y           Y             0
0:02:0     Y            6             Y           Y             0
0:03:0     Y            6             Y           N             0
0:06:0     N
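
For anyone who wants to check this output from a script rather than by eye, here is a minimal sketch in Python. It assumes the column order shown in the header above (capable, MRIE, exception control, performance enabled, error count) and that the text has already been captured from afacli. The names parse_smart and suspects are hypothetical, not part of any Dell or Adaptec tool.

```python
import re

# Output pasted from the afacli session above.
SAMPLE = """\
B:ID:L  Device   Exceptions(MRIE)  Control    Enabled      Count
------  -------  ----------------  ---------  -----------  ------
0:00:0     Y            6             Y           Y             0
0:01:0     Y            6             Y           Y             0
0:02:0     Y            6             Y           Y             0
0:03:0     Y            6             Y           N             0
0:06:0     N
"""

# A device row is "bus:id:lun", a Y/N capable flag, then (optionally) the
# remaining four columns. Column order is assumed from the header above.
ROW = re.compile(
    r"^(\d+:\d+:\d+)\s+([YN])(?:\s+(\d+)\s+([YN])\s+([YN])\s+(\d+))?\s*$"
)


def parse_smart(text):
    """Return one dict per device row found in 'disk show smart' output."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue  # header, separator, prompt or blank line
        dev, capable, mrie, exc, perf, errors = m.groups()
        rows.append({
            "device": dev,
            "smart_capable": capable == "Y",
            "mrie": int(mrie) if mrie is not None else None,
            "exception_control": (exc == "Y") if exc is not None else None,
            "performance_enabled": (perf == "Y") if perf is not None else None,
            "error_count": int(errors) if errors is not None else None,
        })
    return rows


def suspects(rows):
    """Devices that are SMART capable and report a non-zero error count."""
    return [r["device"] for r in rows if r["smart_capable"] and r["error_count"]]


if __name__ == "__main__":
    print(suspects(parse_smart(SAMPLE)))
```

On the output above this finds five device rows and no device with a non-zero error count, which matches reading the table by hand.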

Also, here are some more symptoms in case they're relevant:
On both occasions when the system hung, the light on the second disk of the
RAID-1 container was flashing red. The status light on the front of the 2550
was also flashing red. I don't know how the PERC connects to the SCSI
controller (any useful links?), but a colleague here suggests that the PERC
or SCSI controller is seeing or generating a fault and sending error
conditions to the kernel that the kernel cannot handle.

I will upgrade the kernel as suggested, but if the same problem happens
again, I will need to look at alternatives, as it's an important server for
us. One idea I had was that, if the problem recurs, I would:

1. Reconfigure the disks to be plain (legacy or non-PERC) SCSI disks.  Will
"container split" do this?
2. Then power down and pull the PERC chip to disable the PERC. 
3. Reboot from the same disks, which are now plain SCSI disks.

This would allow us to at least narrow the problem down to the disk drive
or SCSI controller if we get further failures.

If this fallback to legacy SCSI disks is not possible, can I instead make a
copy of the filesystems from the RAID-1 diskset to a separate non-PERC SCSI
disk in the same server? I would then be able to boot from that one even
if the PERC is suspect. I would need to have one disk as plain SCSI - can I
do this on a 2550 with PERC 3/Di?

Any other options?

thanks

Mark

-----Original Message-----
From: Matt_Domsch at Dell.com [mailto:Matt_Domsch at Dell.com]
Sent: 03 November 2002 04:22
To: MDelahunty at cc.ucc.ie; Linux-PowerEdge at Dell.com
Subject: RE: Kernel panic 2.4.18-10smp PERC3/Di


> we're getting kernel panics on a new 2550. Nothing syslogged at the
> time. Here are some details
> uname -a : Linux SERVER3.ucc.ie 2.4.18-10smp #1 SMP Wed Aug 7 11:17:48

You should upgrade to kernel 2.4.18-17.* as it's had a lot more testing than
-10 got by virtue of being nearly identical to the RHL8.0 retail kernel.
Also, does afacli disk show smart report any suspect disks?

Thanks,
Matt

--
Matt Domsch
Sr. Software Engineer, Lead Engineer, Architect
Dell Linux Solutions www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com



