Time-honoured problem : RHEL3 iowait performance
Aaron
dell at microchp.org
Sun Feb 18 00:29:27 CST 2007
Brendan,
No worries. I initially kept it off the list as I often bump into folk
that have religious beliefs in particular configurations. :-)
I was not aware of your LVM configuration. That brings to mind a few
different issues that have come across the list; though as my memory has
been sucking as of late, I am having issues recalling the specific
details. Certainly a good starting point would be to run through the
list archive searching for LVM and multipathing/multipath as search
terms. I recall a few instances where people were having problems
with both IO wait and overall performance because of their
LVM/multipathing configuration. A few bugs in that area have been
highlighted, though I have no idea of their current status. Most of my
servers are using NetApp for storage and only using the local disks for
the OS. My prior employer used HP SureStore, which also handled the
striping, so I have been spoiled in this area and it has admittedly
become my weaker area.
This may not be logistically feasible, but you might try creating a
non-LVM volume and testing the performance on it using the same
controller, the same model of disks and the same RAID configuration. I
am just throwing that out there in case you have enough channels/drive
bays to do this. Should it work out, you could always rsync the data
over to the new volume.
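
A rough way to compare the two volumes is to time the same sequential
write on each. The following is only a sketch, assuming Python 3 is
available on the box; the function name and the two mount points are
made up for illustration, and bonnie++ remains the more thorough test.

```python
import os
import tempfile
import time


def sequential_write_mb_per_s(directory, total_mb=64, block_kb=64):
    """Write total_mb of zeros to a scratch file in `directory`, fsync it,
    and return the observed throughput in MB/s."""
    block = b"\0" * (block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # make sure the data actually reached the disk
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return total_mb / elapsed


if __name__ == "__main__":
    # Hypothetical mount points: substitute the real LVM and non-LVM paths.
    for mount in ("/mnt/lvm_volume", "/mnt/plain_volume"):
        if os.path.isdir(mount):
            print(mount, round(sequential_write_mb_per_s(mount), 1), "MB/s")
```

Use a total_mb well above the amount of RAM and controller cache if you
want the number to reflect the disks rather than the caches.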
The dag repo is an rpm package repository that is becoming very popular
due to the large number of packages being incorporated and the excellent
standards being followed in the packaging. There are a few repos
growing in popularity, though dag seems to have most of the packages I
am looking for. What I especially like about it is that I can easily
contact the maintainer on IRC if I run into a conflict, etc. (#centos
on freenode)
To sum up, I won't be much help with this particular IO problem. I hope
you find what is causing the wait time.
Cheers!
Aaron
Brendan Heading wrote:
> Aaron wrote:
>> Have you tried disabling jumbo frames? If so, have you run tcpdump
>> and pulled up the capture file in ethereal to see if you are doing
>> a lot of retransmissions (also possibly visible in netstat -s)?
>
> Aaron,
>
> Thanks for the reply. I hope you don't mind if I CC the list.
>
> As of right now, the server has rx'd 224507725 segments, tx'd
> 262637305, and there have been 25457 retransmits. Not that high in the
> scheme of things, I reckon.
>
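For scale, the counters quoted above can be turned into a rough
retransmit ratio. This is just a sketch of the arithmetic; the function
name is made up for illustration, and it treats the tx'd figure as the
total number of segments sent.

```python
def retransmit_ratio(segments_sent, segments_retransmitted):
    """Fraction of transmitted TCP segments that were retransmissions."""
    if segments_sent <= 0:
        raise ValueError("segments_sent must be positive")
    return segments_retransmitted / segments_sent


# Counters quoted above: 262637305 segments sent, 25457 retransmits.
ratio = retransmit_ratio(262637305, 25457)
print(f"{ratio:.4%}")  # prints 0.0097%
```

That is roughly one retransmit per ten thousand segments sent.
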
>> If you do local file transfers, many at the same time, do you see the
>> same problem? i.e. /one/group/of/disks to /another/group/of/disks
>> and/or /var/tmp local transfers.
>
> I will have to check again but I believe yes, I see the problem if I
> simulate the operation locally.
>
>> I assume you have also set noatime on the ext mounts. (Lots of
>> simultaneous reads add up to a lot of atime writes and thrashing.)
>
> Yes, noatime is set (atime updates are turned off), and the ext3
> commit value is set to 30 to try to reduce any thrash caused by
> commits to the journal.
>
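One way to double-check the mount options actually in effect is to read
/proc/mounts rather than /etc/fstab. A minimal sketch, assuming Python 3
and ext2/ext3 filesystems; the function name is made up for
illustration.

```python
import os


def ext_mounts_without_noatime(mounts_text):
    """Return mount points of ext2/ext3 filesystems whose options lack noatime.

    mounts_text is expected in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fs_type, options = fields[:4]
        if fs_type in ("ext2", "ext3") and "noatime" not in options.split(","):
            missing.append(mount_point)
    return missing


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as f:
        print(ext_mounts_without_noatime(f.read()))
```

An empty list means every ext2/ext3 mount currently has noatime set.
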
>> Have you also tried running bonnie++ ? It is in the dag repo and can
>> show you individual disk performance.
>
> I am not sure what you mean by "dag repo". Is this a set of test tools
> somewhere, and if so, where can I get it? I have a version of
> bonnie++ (1.03) from a while back; running it locally gives me the
> following results:
>
> =====================================
> Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> xxxxx            8G 17948  42 13785  16  9561  14 22268  60 26889  17 122.8   3
>                     ------Sequential Create------ --------Random Create--------
>                     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
>               files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
>                  16   613   3 21720  38  1610   5   690   4 28433  41  1183   3
>
> ,8G,17948,42,13785,16,9561,14,22268,60,26889,17,122.8,3,16,613,3,21720,38,1610,5,690,4,28433,41,1183,3
>
> =====================================
>
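The comma-separated summary line in the bonnie++ output above is easier
to compare between runs once each value has a name. A small sketch,
assuming Python 3; the field names below are my own labels derived from
the column headers in the table, not names defined by bonnie++.

```python
# Labels follow the column order of the bonnie++ 1.03 table above.
BONNIE_FIELDS = [
    "machine", "size",
    "out_char_kps", "out_char_cpu", "out_block_kps", "out_block_cpu",
    "rewrite_kps", "rewrite_cpu",
    "in_char_kps", "in_char_cpu", "in_block_kps", "in_block_cpu",
    "seeks_per_s", "seeks_cpu",
    "files",
    "seq_create_per_s", "seq_create_cpu", "seq_read_per_s", "seq_read_cpu",
    "seq_delete_per_s", "seq_delete_cpu",
    "rand_create_per_s", "rand_create_cpu", "rand_read_per_s", "rand_read_cpu",
    "rand_delete_per_s", "rand_delete_cpu",
]


def parse_bonnie_csv(line):
    """Map one bonnie++ 1.03 CSV summary line to a dict of labelled strings."""
    values = line.strip().split(",")
    if len(values) != len(BONNIE_FIELDS):
        raise ValueError(
            "expected %d fields, got %d" % (len(BONNIE_FIELDS), len(values))
        )
    return dict(zip(BONNIE_FIELDS, values))


# The summary line from the run above (the machine name was left blank).
line = (",8G,17948,42,13785,16,9561,14,22268,60,26889,17,122.8,3,"
        "16,613,3,21720,38,1610,5,690,4,28433,41,1183,3")
result = parse_bonnie_csv(line)
print(result["out_block_kps"], result["in_block_kps"], result["seeks_per_s"])
```

For this run that picks out 13785 K/sec block writes, 26889 K/sec block
reads and 122.8 random seeks per second.
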
> It's a bit hard to test reliably, since the volume is LVM so you can't
> tell which of the two RAID arrays it's writing stuff to. LVM seems to
> try to distribute activity across the volumes in a group, rather than
> wait for one PV to fill up before going to the next one. This makes
> things "interesting" because the larger RAID array has a larger stripe
> size, 128K, than the other one which is 64K. (in both cases, read
> policy is adaptive, write policy is writeback, and cache policy is
> direct I/O).
>
>> As far as performance, there are also many things you can do in
>> /etc/sysctl.conf to tune for being a file server, but those won't
>> help until you fix the IO wait issues.
>>
>> Just a few thoughts.
>>
>> --Aarön
>>
>> P.S. - Oh.. and anything interesting in dmesg?
>
> There is nothing in dmesg that looks like an error...