PE2650 / Perc 3Di crash
rstuart at lubemobile.com.au
Sun Aug 10 17:48:07 CDT 2003
Another data point. I lowered AAC_NUM_IO_FIB to 20. Still crashed.
Lowering it in this way kills I/O speed - even more so than turning off
On Sun, 2003-08-10 at 03:43, James Bourne wrote:
> Yesterday at 0700 and 52 seconds I received a timeout on the raid, then
> shortly after that the adapter hung and I started to get I/O errors.
> Here's the kernel log for the event.
My gut feeling is that there is a hardware/software bug in the RAID
controller somewhere triggered by a change in the SCSI protocol timing -
possibly caused by disk retries. It is definitely sensitive to the way
you access the disk. I cannot trigger it by doing badblock tests, for
instance. In my case it only happens while doing a drive to drive
I always structure the badblock test so the cache is not useful - the
amount tested always exceeds the size of any cache in use, for obvious
reasons. This is interesting because turning off the controller's cache
also fixes the problem in my case.
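
The sizing rule described above (test more data than any cache can hold)
can be sketched roughly as follows. This is only an illustration, not the
commands actually used: the function name, the safety factor of 2 and the
128 MiB cache size are all hypothetical, and the 1024-byte block size is
assumed to match the default block size of the badblocks tool.

```python
def blocks_to_exceed_cache(cache_bytes, block_size=1024, factor=2):
    """Return a block count covering at least `factor` times the cache size.

    All parameters are illustrative assumptions:
      cache_bytes - size of the largest cache in the I/O path, in bytes
      block_size  - test block size in bytes (assumed badblocks default)
      factor      - safety margin; 2 means test twice the cache size
    """
    if cache_bytes <= 0 or block_size <= 0 or factor < 1:
        raise ValueError("cache_bytes and block_size must be positive, factor >= 1")
    total_bytes = cache_bytes * factor
    # Ceiling division so the tested region never falls short of the target.
    return -(-total_bytes // block_size)


if __name__ == "__main__":
    # Hypothetical controller with a 128 MiB cache.
    print(blocks_to_exceed_cache(128 * 1024 * 1024))
```

The resulting count would be the minimum number of blocks to cover in one
pass so that reads cannot simply be served back out of the cache.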
More information about the Linux-PowerEdge