Re: PE2650 / Perc 3Di crash

Matthias Pigulla mp at webfactory.de
Tue Aug 5 09:28:04 CDT 2003


Hi,

I have added deanna_bonds at adaptec.com to the list of recipients, because the drivers/scsi/aacraid/README file says that this driver is supported by Adaptec and that she may be contacted. It also lists:

Deanna Bonds <deanna_bonds at adaptec.com> (non-DASD support, PAE fibs and 64 bit, added new adaptec controllers
                     added new ioctls, changed scsi interface to use new error handler,
                     increased the number of fibs and outstanding commands to a container)

... she seems to have increased the number of fibs (whatever they are :).

I'd like to hear some more ("official") opinions on either decreasing AAC_NUM_IO_FIB to 100 and rebuilding one of the newer kernel versions, or immediately switching to 2.4.22-pre6-ac1 with a value of 100. Possible consequences, side effects?
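
In case it helps to compare kernel trees before rebuilding, here is a minimal Python sketch for reading and changing the value. It assumes the header contains a plain line of the form "#define AAC_NUM_IO_FIB <decimal number>"; I have not checked that layout against every kernel version, and the helper names read_num_io_fib and set_num_io_fib are made up for this example.

```python
import re

# Assumed layout: "#define AAC_NUM_IO_FIB <decimal>" on a line of its own.
# Group 1 keeps the original spacing, group 2 is the numeric value.
_DEFINE = re.compile(
    r"^([ \t]*#[ \t]*define[ \t]+AAC_NUM_IO_FIB[ \t]+)(\d+)",
    re.MULTILINE,
)


def read_num_io_fib(header_text):
    """Return AAC_NUM_IO_FIB as an int, or None if no such define is found."""
    match = _DEFINE.search(header_text)
    return int(match.group(2)) if match else None


def set_num_io_fib(header_text, value):
    """Return a copy of header_text with the first AAC_NUM_IO_FIB define set to value."""
    return _DEFINE.sub(lambda m: m.group(1) + str(value), header_text, count=1)
```

One would read drivers/scsi/aacraid/aacraid.h into a string, call read_num_io_fib on it to see what a given tree ships with, and write back the result of set_num_io_fib(text, 100) before rebuilding. This only edits the source file; whether 100 is the right value is exactly the question above.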

Best regards,
Matthias

PS. @Mark I did the dd as well as the afacli/disk verify and got no errors.

> -----Original Message-----
> From: Salyzyn, Mark [mailto:mark_salyzyn at adaptec.com] 
> Sent: Tuesday, August 5, 2003 16:13
> To: 'James Bourne'; Matthias Pigulla
> Cc: linux-poweredge at dell.com; linux-aacraid-devel at dell.com; 
> matt_domsch at dell.com
> Subject: RE: PE2650 / Perc 3Di crash
> 
> 
> If this is the case ... AAC_NUM_IO_FIB, defined in 
> drivers/scsi/aacraid/aacraid.h, which was originally set to 
> 512 and is reduced to 116 in the 2.4.19 generic variant of 
> the driver, might have to be reduced. The 2.4.20 driver has 
> this value *increased* to 512 (!!!!)
> 
> In Adaptec's release of the driver it is reduced to a value 
> of 100, only because we determined experimentally that 128 
> would crash the adapter, and 100 did not under all test 
> circumstances for a sample of card variants. I have *no* idea 
> where the 116 came from in the . The theoretical maximum in 
> the adapter is 512 with *one* array including the RAID 
> splitting and other Firmware tasks which have to absorb some 
> of the spares above this limit.
> 
> My suggestion is to drop the AAC_NUM_IO_FIB to 100, *maybe* 
> 116, but *not* leave it at 512.
> 
> Sincerely -- Mark Salyzyn
> 
> Value of AAC_NUM_IO_FIB for various kernels:
...

> 
> -----Original Message-----
> From: James Bourne [mailto:jbourne at mtroyal.ab.ca]
> Sent: Tuesday, August 05, 2003 9:43 AM
> To: Matthias Pigulla
> Cc: linux-poweredge at dell.com; linux-aacraid-devel at dell.com; 
> matt_domsch at dell.com
> Subject: Re: PE2650 / Perc 3Di crash
> 
> 
> On Tue, 5 Aug 2003, Matthias Pigulla wrote:
> 
> > Hello everyone,
> > 
> > tonight, I lost one of my PowerEdge boxes with a kernel panic. I'm 
> > running a PERC 3/Di, RAID10, on Debian woody with a custom 2.4.19 
> > kernel. I'll try to provide all information I can collect, I hope 
> > someone can help me to track this issue down. Please bear with me, 
> > even if it's long :)
> 
> FYI, this is what we have seen on our aacraid systems under 
> heavy I/O and CPU load.  It's unclear at this time if this is 
> a firmware issue or a driver issue, but I do know that now 
> Dell and Adaptec are working on a resolution...
> 
> Turning off write caching will provide a workaround: 
> although you will still get timeouts, it looks as though the 
> crashes will be prevented.
> 
> Regards
> James Bourne



