PE2650 / Perc 3Di crash

Salyzyn, Mark mark_salyzyn at adaptec.com
Tue Aug 5 09:13:18 CDT 2003


If this is the case, AAC_NUM_IO_FIB, defined in
drivers/scsi/aacraid/aacraid.h, might have to be reduced. It was originally
set to 512 and is reduced to 116 in the 2.4.19 generic variant of the
driver. The 2.4.20 driver has this value *increased* to 512 (!!!!)

In Adaptec's release of the driver it is reduced to a value of 100, only
because we determined experimentally that 128 would crash the adapter, while
100 did not under any test circumstances for a sample of card variants. I
have *no* idea where the 116 came from. The theoretical maximum in
the adapter is 512 with *one* array, and that includes the RAID splitting and
other firmware tasks, which have to absorb some of the spares above this limit.

My suggestion is to drop the AAC_NUM_IO_FIB to 100, *maybe* 116, but *not*
leave it at 512.
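
As a rough sketch of the change suggested above, the edit to
drivers/scsi/aacraid/aacraid.h would look something like the fragment below.
Only the macro name, the file path and the values come from this thread; the
surrounding header contents are not reproduced, and the #if guard is a
hypothetical addition, not something present in the driver.

```c
/* Sketch only: in the real tree this macro lives in
 * drivers/scsi/aacraid/aacraid.h alongside other definitions
 * that are not reproduced here. */
#define AAC_NUM_IO_FIB 100	/* 512 or 116 in other kernel trees */

/* Hypothetical build-time guard (an assumption, not part of the driver):
 * refuse values in the range reported above to crash the adapter. */
#if AAC_NUM_IO_FIB >= 128
#error "AAC_NUM_IO_FIB of 128 or more was reported to crash the adapter"
#endif
```

The driver has to be rebuilt after changing the value for it to take effect.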

Sincerely -- Mark Salyzyn

Value of AAC_NUM_IO_FIB for various kernels:

Kernel            AAC_NUM_IO_FIB
2.4.18-14         512
2.4.18-24.7.x     512
2.4.18-24.8.0     512
2.4.18-26.7.x     512
2.4.18-26.8.0     512
2.4.18-27.7.x     512
2.4.18-27.8.0     512
2.4.18-3          116
2.4.18-5          116
2.4.18-e.12       512
2.4.18-e.25       512
2.4.18.SuSE       116
2.4.18.generic    116
2.4.19.SLES       116
2.4.19.SuSE       116
2.4.19.generic    116
2.4.20-13.7       512
2.4.20-13.8       512
2.4.20-13.9       512
2.4.20-18.7       512
2.4.20-18.8       512
2.4.20-18.9       512
2.4.20-6          512
2.4.20-8          512
2.4.20-9          512
2.4.20.SuSE       512
2.4.20.generic    512
2.4.21-rc2-ac2    512
2.4.21-rc5-ac1    100
2.4.21.generic    512
2.4.22-pre6-ac1   100
2.4.9-31.22ml     116
2.4.9-34          116
2.4.9-38          116
2.4.9-e.3         116
2.4.9-e.5         116
2.4.9-e.8         116
2.4.9-e.9         116
2.4.9-e.10        116
2.4.9-e.12        116
2.4.9-e.16        116
2.4.9-e.23        512
2.4.9-e.24        512
2.4.9-e.25        512
2.5.70.generic    100
2.5.74.generic    100
2.6.0-test2       100

-----Original Message-----
From: James Bourne [mailto:jbourne at mtroyal.ab.ca]
Sent: Tuesday, August 05, 2003 9:43 AM
To: Matthias Pigulla
Cc: linux-poweredge at dell.com; linux-aacraid-devel at dell.com;
matt_domsch at dell.com
Subject: Re: PE2650 / Perc 3Di crash


On Tue, 5 Aug 2003, Matthias Pigulla wrote:

> Hello everyone,
> 
> Tonight, I lost one of my PowerEdge boxes with a kernel panic. I'm
> running a PERC 3/Di, RAID10, on Debian woody with a custom 2.4.19
> kernel. I'll try to provide all the information I can collect, and I hope
> someone can help me track this issue down. Please bear with me,
> even if it's long :)

FYI, this is what we have seen on our aacraid systems under heavy I/O
and CPU load.  It's unclear at this time whether this is a firmware issue
or a driver issue, but I do know that Dell and Adaptec are now working
on a resolution...

Turning off write caching will provide a workaround: you will still get
timeouts, but it looks as though the crashes will be prevented.

Regards
James Bourne

-- 
James Bourne, Supervisor Data Centre Operations
Mount Royal College, Calgary, AB, CA
www.mtroyal.ab.ca

"There are only 10 types of people in this world: those who
understand binary and those who don't."
