MD1000-PERC5/E Performance issues

Stephen Dowdy sdowdy at ucar.edu
Fri Jan 2 19:33:10 CST 2009


Mike,

Mike McGrath wrote, On 01/02/09 14:17:
>>>> http://thias.marmotte.net/archives/2008/01/05/Dell-PERC5E-and-MD1000-performance-tweaks.html

Yep, that mostly concurs with my findings.  I disable READAHEAD caching in the
PERC/LSI, and use DIRECTIO and WRITEBACK.  Also, the CERN Oetiker dudes ran some
tests on the base offset when using labels (partition tables) on the RAID
device, showing dramatic improvement when the base offset is aligned to a
stripe boundary.  I've started using the full block device and not laying
labels on them anymore.
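
To make the alignment point concrete, here is a minimal sketch of the
arithmetic.  The 64 KiB stripe size and the 512-byte sector size are assumed
example values (check what your virtual disk actually uses), and the function
names are made up for illustration.  Sector 63 is the traditional start of the
first partition under an old DOS-style label.

```python
SECTOR_BYTES = 512  # assumed logical sector size


def is_stripe_aligned(start_sector, stripe_kib, sector_bytes=SECTOR_BYTES):
    """True if the partition's byte offset is a multiple of the stripe size."""
    offset_bytes = start_sector * sector_bytes
    return offset_bytes % (stripe_kib * 1024) == 0


def next_aligned_sector(start_sector, stripe_kib, sector_bytes=SECTOR_BYTES):
    """Smallest sector >= start_sector that sits on a stripe boundary."""
    stripe_sectors = (stripe_kib * 1024) // sector_bytes
    # Ceiling division, then scale back up to a sector number.
    return -(-start_sector // stripe_sectors) * stripe_sectors


if __name__ == "__main__":
    # Hypothetical example: old DOS label default start vs. a 64 KiB stripe.
    print(is_stripe_aligned(63, 64))    # offset 32256 bytes: misaligned
    print(next_aligned_sector(63, 64))  # first aligned start sector
    print(is_stripe_aligned(0, 64))     # whole device, no label
```

Using the whole block device corresponds to a start offset of 0, which is
aligned for any stripe size.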

I also generally use 'blockdev --setra 4096' on my MD1000 devices.  I think you
do NOT want to do this if your I/O patterns are small random reads, as you will
be pulling in more data than you need, but for sequential I/O it is a BIG win
over the defaults.
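
For scale: 'blockdev --setra' counts 512-byte sectors, so 4096 works out to
2 MiB of readahead.  The sketch below does that conversion and, on Linux, reads
the current setting back from sysfs.  The device name 'sde' is taken from the
thread, the function names are made up for illustration, and the 256-sector
figure is only a commonly seen default, not something measured on this system.

```python
SECTOR_BYTES = 512  # blockdev --setra is expressed in 512-byte sectors


def setra_to_kib(sectors):
    """Convert a 'blockdev --setra' value to KiB of readahead."""
    return sectors * SECTOR_BYTES // 1024


def current_readahead_kib(device):
    """Read the kernel's current readahead for a block device (Linux sysfs)."""
    path = "/sys/block/%s/queue/read_ahead_kb" % device
    with open(path) as f:
        return int(f.read().strip())


if __name__ == "__main__":
    print(setra_to_kib(4096))  # the value suggested above, in KiB
    print(setra_to_kib(256))   # a commonly seen default, in KiB
    try:
        print(current_readahead_kib("sde"))  # assumed device name
    except OSError:
        print("no such device on this machine")
```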

>>>> Why was there such a significant jump there?  Also, why does sar think the
>>>> array was 100% utilized basically the whole time even though there's such
>>>> a difference in sec reads/s?

I have no idea how 'sar' and 'iostat' calculate %utilization, but i suspect
it's not a very trustworthy figure.
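
For what it's worth, my understanding from the iostat man page and the kernel's
iostats documentation is that %util is the share of elapsed time during which
the device had at least one request outstanding, taken from the "time spent
doing I/Os" counter in /proc/diskstats.  If that is right, a RAID array that is
never idle reads as 100% no matter how many MB/s it is moving.  A rough sketch
of that calculation follows; the function names and the sample line are made up
for illustration.

```python
def parse_io_ticks(diskstats_text, device):
    """Return the 'ms spent doing I/Os' counter for one device.

    Each /proc/diskstats line is: major, minor, name, then the stat fields.
    The tenth stat field (index 12 after splitting) is the busy-time counter.
    """
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 14 and fields[2] == device:
            return int(fields[12])
    raise KeyError(device)


def percent_util(ticks_before, ticks_after, elapsed_ms):
    """Busy time as a percentage of wall-clock time between two samples."""
    return min(100.0, 100.0 * (ticks_after - ticks_before) / elapsed_ms)


if __name__ == "__main__":
    # Fabricated sample line, only to show the field layout.
    sample = "   8      64 sde 100 0 800 50 200 0 1600 70 0 1000 120"
    before = parse_io_ticks(sample, "sde")
    # Pretend the device was busy for the whole next second.
    print(percent_util(before, before + 1000, 1000))
```

Note that nothing in this calculation involves throughput, which would explain
a constant 100% alongside very different sectors-read rates.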

>>>> Other issues.  /dev/sde is the underlying physical device (raid5).
>>>> However, it is divided up using lvm.  The copy mentioned above was from
>>>> a logical volume.  While I expected some overhead and performance hit, I
>>>> didn't expect it to be so huge and sporadic.  I've seen reads as low as
>>>> 3MB/s during a copy.

I am curious about the LVM overhead, but everywhere i look i see handwave
statements like "LVM adds NO, ZERO, NADA overhead", which sounds like utter
BS to me.  Now, i doubt it would be larger than 3-5%, though.

>>> Ok, upon further testing I don't think the majority of my issue is related
>>> to the MD1000 or the raid controller.  I'm seeing decent read speeds from
>>> large files.  So I'll get back to debugging unless someone has a tip.  I
>>> suspect another list would be better.

> https://www.redhat.com/archives/fedora-infrastructure-list/2008-December/msg00142.html
> https://www.redhat.com/archives/fedora-infrastructure-list/2009-January/msg00000.html

In one of those threads you performed a 'dd', but without a 'bs' option.
I think 'dd' uses a 512-byte default block size.  Try using 'bs=4096k' or larger.
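
To illustrate why the block size matters, here is a small sketch that reads a
file sequentially with a given block size and counts the read() calls it takes.
It is not a benchmark and not what 'dd' does internally; the function name is
made up, and the path in the usage example is a placeholder.

```python
import os


def sequential_read(path, block_size):
    """Read a file front to back; return (bytes_read, number_of_read_calls)."""
    fd = os.open(path, os.O_RDONLY)
    total = 0
    calls = 0
    try:
        while True:
            chunk = os.read(fd, block_size)  # one read() system call
            if not chunk:
                break  # end of file
            total += len(chunk)
            calls += 1
    finally:
        os.close(fd)
    return total, calls


if __name__ == "__main__":
    path = "/path/to/large/file"  # placeholder, substitute a real file
    if os.path.exists(path):
        for bs in (512, 4096 * 1024):
            total, calls = sequential_read(path, bs)
            print("bs=%d: %d bytes in %d read() calls" % (bs, total, calls))
```

With a 512-byte block size, every GiB copied costs about two million read()
calls, versus 256 with 'bs=4096k'.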

Also, since this is a RedHat related issue (apparently), i definitely
noticed a 15-20% improvement in I/O going from RHEL4 to RHEL5.

It's always possible that an errant drive is slowing the whole process down.
Use 'megasasctl -t long a0' (or whatever controller) to initiate some SMART
diagnostic tests on the drives.  'megasasctl -svv a0' will report success/fail.
(megactl is on sourceforge)

That wouldn't necessarily jibe with your finding that large files are doing
okay for you, though.

Also, dump the TTYLOG from the PERC controller to make sure it's not reporting
timeouts and the like.  (megacli -eventlog -getevents -a0  -- or something like
that)

FWIW, i easily get 500-600MB/s streaming IOZone read/write on MD1000s.

--stephen


