afacli says Number of PRIMARY defects on drive: 5903

Matt_Domsch@Dell.com Matt_Domsch at Dell.com
Wed Aug 28 10:50:59 CDT 2002


> scsi : aborting command due to timeout : pid 778229, scsi 3, channel 0,
> id 0, lun 0, write (10) 00 00 03 7a 85 00 00 10 00
> SCSI host 3 abort (pid 778229 timed out - resetting
> SCSI bus is being reset for host 3 channel 0
> Kernel panic : scsi_free : Bad offset
> In interrupt handler - not syncing

The timeout shouldn't happen, so that's weird.  Any idea what you were doing
at the time?  Hopefully not a container rebuild?

The controller can't abort an operation in progress, so being told to abort
doesn't matter, but there's apparently a bug in the mid-layer error handling
path that causes the panic.  That's not surprising: that path is due for a
rewrite, though probably not in time for the 2.5 kernel feature freeze...


> I run afacli (open afa0) and "disk show defects 5" on a drive that is
> currently not mounted. The result is:
> Number of PRIMARY defects: 5903
> Number of GROWN defects on drive: 0
> 
> Is this like in the old days, where you expected to discover some bad
> blocks on the first format, or is this serious?

Nope, that's normal.  Zero grown is good.  It means the defects found during
the first low-level format seem to be all the disk has found.  The number of
primary defects isn't out of line, especially now that we're moving into
larger and larger disks.  You've got 72GB disks, I see, of which the disk
thinks about 2.8MB is bad and has remapped it for you.  Not too shabby.
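
For anyone who wants to check the arithmetic, here's a rough sketch.  It
assumes each defect entry covers one 512-byte sector and that "72GB" means
72 * 10^9 bytes; neither is stated by afacli, so treat both as assumptions.

```python
# Rough check of how much space the primary defect list represents.
# ASSUMPTION: one defect entry == one 512-byte sector (not stated by afacli).
# ASSUMPTION: "72GB" is decimal, i.e. 72 * 10**9 bytes.
SECTOR_BYTES = 512
DISK_BYTES = 72 * 10**9

primary_defects = 5903  # from "disk show defects"
grown_defects = 0       # from "disk show defects"


def remapped_bytes(defects, sector_bytes=SECTOR_BYTES):
    """Bytes covered by the given number of defective sectors."""
    return defects * sector_bytes


bad = remapped_bytes(primary_defects)
print(f"{bad} bytes, about {bad / 2**20:.2f} MiB remapped")
print(f"{bad / DISK_BYTES:.4%} of the disk")
```

Under those assumptions the primary list works out to a bit under 3MB, a tiny
fraction of the disk.  The number to watch over time is the grown count.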

> I ran fsck and checked for bad blocks (only reading as far as I could see)
> without getting any errors on the drive.

Good.

Thanks,
Matt

--
Matt Domsch
Sr. Software Engineer, Lead Engineer, Architect
Dell Linux Solutions www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com
#1 US Linux Server provider for 2001 and Q1/2002! (IDC May 2002)



