Linux x86_64 frozen by heavy I/O on PE2950 with PERC 5/i

Peter Grandi pg_dlxpe at dlxpe.for.sabi.co.UK
Mon Aug 4 02:50:47 CDT 2008


[ ... system freezes during IO ... ]

>>> My chassis has 6x 1TB 7200RPM SATA drives in it.  The first two
>>> are a 1TB RAID1 and the last four are a 3TB RAID5.  The problem
>>> exists when using either logical drive.

Uh, on a PE with a PERC 6 I get 150-200MB/s writes and 400-500MB/s
reads with RAID10 using Linux MD in 'f2' mode with 6 drives like
that. The PERC 6 is being used as a JBOD HA.

>>> Does anyone have any other suggestions for investigating or
>>> remediating this problem?

Have you checked read rates? With 'hdparm -t' and with

  sysctl vm/drop_caches=1; dd bs=1M count=50000 if=/dev/... of=/dev/null

with and without 'iflag=direct'.
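
To make the two read runs easy to compare, something like the
following sketch may help. The function name 'read_test' and its
arguments are my own invention for illustration; it assumes a GNU
'dd' that understands 'iflag=direct', and it only drops the page
cache if it is allowed to (that needs root).

```shell
# Sketch: time a buffered and a direct sequential read of the same source.
# $1 is the block device or file to read, $2 the number of 1MiB blocks.
read_test() {
    src=$1
    count=${2:-1000}
    # Drop the page cache first, if permitted, so that the buffered
    # run reads from the disk and not from memory.
    if [ -w /proc/sys/vm/drop_caches ]; then
        sync
        echo 1 > /proc/sys/vm/drop_caches
    fi
    # 'dd' reports its transfer rate on stderr; keep only that last line.
    printf 'buffered: '
    dd bs=1M count="$count" if="$src" of=/dev/null 2>&1 | tail -n 1
    printf 'direct:   '
    dd bs=1M count="$count" if="$src" of=/dev/null iflag=direct 2>&1 | tail -n 1
}
```

It would be called as 'read_test /dev/sdX 50000', where '/dev/sdX' is
a placeholder for whichever logical drive is being tested.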

>> You might want to tune /proc/sys/vm parameters as suggested
>> in this thread:
>> http://lists.us.dell.com/pipermail/linux-poweredge/2008-April/035855.html

> I've done so (trying with percentages of 1, 5, 10) to no avail.
> All this does is cause the system to freeze up sooner in
> the copying process.  I really don't think this is a write-buffer
> issue. It does nothing to solve the problem that the disk i/o is
> abysmally slow.

That's a completely different issue from the system freezing. Even
if IO were as slow as reported below, only applications that do IO
should be affected by sharing a very slow IO channel. Also check
which block device elevator your system is defaulting to, as some
elevators are very unfair (e.g. anticipatory) and a lot of writes
on a slow IO channel can hold up reads (and 'sshd' does some reads
on a new connection).
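
The elevator in use is reported per device under '/sys/block', with
the active one shown in square brackets. A small sketch for listing
them follows; the function names are made up for this example, and
on some kernels or devices the file may be absent or show no
bracketed entry at all.

```shell
# Print the active elevator from a scheduler line such as
# "noop anticipatory deadline [cfq]": the active one is in brackets.
active_elevator() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Show the active elevator of every block device that exposes one.
show_elevators() {
    for f in /sys/block/*/queue/scheduler; do
        [ -r "$f" ] || continue
        dev=${f#/sys/block/}      # strip the leading path
        dev=${dev%%/*}            # keep only the device name
        printf '%s: %s\n' "$dev" "$(active_elevator "$(cat "$f")")"
    done
}
```

Running 'show_elevators' lists each device. To try a different
elevator, write its name to the same file as root, for example
'echo deadline > /sys/block/sdX/queue/scheduler', with 'sdX' again
being a placeholder.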

> I'm seeing around 6MB/sec on writes to a two-disk RAID1 of
> 1TB/7200RPM/32MB disks. [ ... ] "dd if=/dev/zero of=testfile
> bs=1M count=2000 && sync".

If you have a recent version of 'coreutils', adding either
'oflag=direct' or 'conv=fsync' (not both) is somewhat better. Doing
two separate tests, one with each setting, would be best
('conv=fsync' writes through the cache and flushes it to disk before
'dd' exits, and 'oflag=direct' bypasses the cache entirely).
Also test writing to the block device, not just the filesystem.
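
As a rough sketch of the two separate write tests (the function name
'write_test' is hypothetical, and it assumes a 'dd' that supports
both flags):

```shell
# Sketch: run the same write twice, once flushed through the page
# cache and once bypassing it.
# $1 is the file or block device to write, $2 the number of 1MiB blocks.
# WARNING: writing to a block device destroys whatever is stored on it.
write_test() {
    target=$1
    count=${2:-2000}
    # Through the cache, but 'dd' does not finish until the data is on disk.
    printf 'conv=fsync:   '
    dd if=/dev/zero of="$target" bs=1M count="$count" conv=fsync 2>&1 | tail -n 1
    # Bypassing the cache entirely.
    printf 'oflag=direct: '
    dd if=/dev/zero of="$target" bs=1M count="$count" oflag=direct 2>&1 | tail -n 1
}
```

For example 'write_test testfile 2000' repeats the test quoted above
on the filesystem. Point it at a block device only if that device
holds nothing you want to keep.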

> [ ... ] those 5 minutes, it takes 2-3 minutes just to open an
> SSH connection to the machine.

The SSH daemon does very little IO, so perhaps something else is
going on. Have a look at the output of 'vmstat 1' while running
'dd' to spot, for example, high values of system CPU time. But that's
unlikely with a hardware-based HA.
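
If the 'vmstat' output scrolls past too quickly, the system CPU
column can be pulled out with a filter along these lines. This is
only a sketch: 'vmstat_sy' is a name invented here, and it assumes
the usual 'vmstat' layout where a header line names the column 'sy'.

```shell
# Print the 'sy' (system CPU %) column from vmstat output read on stdin.
# The column is located by its header name, not by a fixed position.
vmstat_sy() {
    awk '
        # Until the header is seen, look for a field named "sy".
        col == 0 { for (i = 1; i <= NF; i++) if ($i == "sy") col = i; next }
        # After that, print the column for numeric (data) lines only,
        # which skips the headers vmstat repeats from time to time.
        col > 0 && $col ~ /^[0-9]+$/ { print $col }
    '
}
```

It would be used as 'vmstat 1 30 | vmstat_sy' in a second terminal
while the 'dd' test runs.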

Also, IO rates of a few MB/s often mean that the cache/buffer on
the affected disks (and/or that of the HA itself) is disabled.

This is likely to be the default for your RAID HA unless you have
battery backup on the HA or the system. If read IO rates are also a
few MB/s that is likely to mean that the disk cache/buffer is
indeed disabled.


