Preventing I/O starvation on MD1000s triggered by a failed disk.
cmadams at hiwaay.net
Mon Aug 23 23:01:46 CDT 2010
Once upon a time, Bond Masuda <bond.masuda at jlbond.com> said:
> Do you know if you encountered a URE? I've been running into URE's on
> large RAID-5 arrays that use 500GB, 750GB, 1TB drives... basically,
> RAID-5 might survive a single disk failure, but the rebuild will kill it
> due to URE. In this situation, usually you can force "online" the disk
> that had the URE and still be able to read data in degraded state as
> long as you avoid the block that has the URE. If URE is the issue, then
> you should start considering RAID-6.
I assume you are talking about the case where one drive in a RAID-5
fails, and while rebuilding, a bad block is discovered on another drive
(which results in at least that block being unrecoverable, and possibly
in the whole array being taken offline, depending on the RAID controller).
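
To put a rough number on that risk, here is a back-of-the-envelope sketch. It is only an estimate under stated assumptions: the rate of one unrecoverable read error per 1e14 bits is a commonly quoted datasheet figure for consumer SATA drives and is not from this thread, the six-disk array (five surviving disks) is hypothetical, and errors are treated as independent. The function name is made up for illustration.

```python
import math

def p_ure_during_rebuild(surviving_disks, disk_bytes, ure_per_bit=1e-14):
    """Chance of hitting at least one URE while reading every surviving disk.

    ure_per_bit is an ASSUMED datasheet-style rate, not a measured value.
    """
    bits_read = surviving_disks * disk_bytes * 8
    # P(at least one error) = 1 - (1 - p)^n, computed without losing precision
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))

# Hypothetical 6-disk RAID-5: a rebuild must read all 5 surviving disks in full.
for size_gb in (500, 750, 1000):
    p = p_ure_during_rebuild(surviving_disks=5, disk_bytes=size_gb * 10**9)
    print(f"{size_gb} GB drives: {p:.0%} chance of a URE during rebuild")
```

Under these assumptions the risk grows with both drive size and drive count, which is the argument for RAID-6 made above.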
Linux software RAID has a way to validate an array. Newer RHEL and
Fedora (and others, I expect) run this check weekly and report any
faults. Is there a way to do the same thing on the hardware RAID
controllers (either internal cards or external arrays like EqualLogic)?
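
For reference, the software RAID check can also be started by hand through the md driver's sysfs files. The sketch below assumes an array named md0 (hypothetical) and uses the `sync_action` and `mismatch_cnt` files under `/sys/block/<array>/md/`; the helper names are made up for illustration, and starting a check needs root.

```python
from pathlib import Path

def md_dir(array, sysfs_root="/sys/block"):
    """Directory holding the md driver's control files for one array."""
    return Path(sysfs_root) / array / "md"

def start_check(array, sysfs_root="/sys/block"):
    """Ask md to read every stripe and verify redundancy (requires root)."""
    (md_dir(array, sysfs_root) / "sync_action").write_text("check\n")

def check_status(array, sysfs_root="/sys/block"):
    """Return (current sync action, mismatch count from the last check)."""
    d = md_dir(array, sysfs_root)
    action = (d / "sync_action").read_text().strip()
    mismatches = int((d / "mismatch_cnt").read_text().strip())
    return action, mismatches

# Example use on a real system (array name "md0" is an assumption):
#   start_check("md0")
#   print(check_status("md0"))   # action reads "idle" again once the check ends
```

A read error found during the check is handled while the array still has full redundancy, which is why running it regularly helps catch latent bad blocks before a disk fails.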
Chris Adams <cmadams at hiwaay.net>
Systems and Network Administrator - HiWAAY Internet Services
I don't speak for anybody but myself - that's enough trouble.