Fatal I/O Failures on 2650

James Bourne jbourne at mtroyal.ca
Sun Aug 24 12:21:01 CDT 2003


On Wed, 20 Aug 2003, Johnathan Conley wrote:

> We are running RedHat Linux 7.3, latest kernel - the file systems are
> buffered, and the RAID cache is enabled read/write.

Try turning off RAID write caching.

run afacli
open afa0
container show cache 1 (1 being the container you want to show)
container set cache /write_cache_enable=FALSE 1 (1 being the container you want to change)
container show cache 1 (to confirm the new setting)

We have seen much the same problem and are trying to track it down,
more or less in our spare time.

Regards
James Bourne

> 
> We have also run 2 sets of Dell Diagnostics provided by tech support
> with no errors.
> 
> 
> At some random time when the builds are running - the system console
> starts flooding with messages like the ones below. These are not written
> to the system logs, so these were manually captured and reproduced as
> best as possible here. It also appears that no actual disk I/O occurs
> (no lights flashing) once the system gets in this state. Since nothing
> is logged, it's impossible to tell if there is some more important
> message up front that all of these follow.
> 
> Most of the time after rebooting, the file system is corrupt and has to
> be fixed (we are running both ext3 and ext2).
> 
> Any help would be appreciated - this box is completely unreliable. (can
> crash several times a day - crash frequency seems tied to the amount of
> random I/O we throw at it)
> 
> 
> EXT3-fs error (device sd(8,1))
> ext3_reserve_inode_write: IO failure
> ext3_reserve_inode_write: IO failure
> ext3_get_inode_loc
> 
> EXT2-fs error (device sd(8,17))
> ext2_write_inode_loc:
> 	unable to write inode
> ext2_write_inode
> 
> I/O Error: dev 08:11, sector 82313288	(tons of sectors following)
> 
> Inode=####(some #), block=####(some #)
> 
> dev 08:01
> dev 08:11
> 
> 
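For reference, the device numbers in the messages above can be mapped back
to partitions. This is only a minimal sketch, assuming the 2.4 kernel prints
"dev 08:11" as hexadecimal major:minor and that SCSI disks on major 8 get
16 minors each. decode_sd is a made-up helper name, not part of any tool.

```python
import string

SD_MAJOR = 8          # assumed: first SCSI disk major number
PARTS_PER_DISK = 16   # assumed: minors reserved per SCSI disk


def decode_sd(dev):
    """Map a hex "major:minor" pair such as "08:11" to a name like "sdb1"."""
    major_hex, minor_hex = dev.split(":")
    major, minor = int(major_hex, 16), int(minor_hex, 16)
    if major != SD_MAJOR:
        raise ValueError("not a SCSI disk on major 8: %s" % dev)
    # Minor 0 of each group is the whole disk, 1-15 are its partitions.
    disk, part = divmod(minor, PARTS_PER_DISK)
    name = "sd" + string.ascii_lowercase[disk]
    return name + (str(part) if part else "")


if __name__ == "__main__":
    for dev in ("08:01", "08:11"):
        print(dev, "->", decode_sd(dev))
```

On this reading, sd(8,1) and dev 08:01 are both sda1 (the ext3 file system),
and sd(8,17) and dev 08:11 are both sdb1 (the ext2 one). Errors on both at
once would be consistent with a problem below the file system layer rather
than in ext2 or ext3 specifically.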

-- 
James Bourne, Supervisor Data Centre Operations
Mount Royal College, Calgary, AB, CA
www.mtroyal.ca

"There are only 10 types of people in this world: those who
understand binary and those who don't."
