Time-honoured problem : RHEL3 iowait performance
brendanheading at clara.co.uk
Sat Feb 17 18:37:03 CST 2007
> Have you tried disabling jumbo frames? If so, have you ran tcpdump and
> pulled up the capture file in ethereal to see if you are doing allot of
> retransmissions (also possibly visible in netstat -s)?
Thanks for the reply. I hope you don't mind if I CC the list.
As of right now, the server has rx'd 224507725 segments, tx'd 262637305,
and there have been 25457 retransmits. Not that high in the scheme of
things, I reckon.
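(For what it's worth, the counters netstat -s prints come straight out of /proc/net/snmp; a quick one-liner to pull just the TCP segment and retransmit counters, in case anyone wants to watch them directly - the awk is just an illustration, not anything special :)

```shell
# Pair up the header and value "Tcp:" lines in /proc/net/snmp and print
# only the segments-in/out and retransmit counters.
awk '/^Tcp:/ {
    if (h == "") { h = $0 }
    else {
        split(h, k); split($0, v)
        for (i = 2; i <= NF; i++)
            if (k[i] ~ /InSegs|OutSegs|RetransSegs/) print k[i], v[i]
    }
}' /proc/net/snmp
```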
> If you do local file transfers, many at the same time, do you see the
> same problem? i.e. /one/group/of/disks to /another/group/of/disks
> and/or /var/tmp local transfers.
I will have to check again but I believe yes, I see the problem if I
simulate the operation locally.
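(The crude stand-in I use looks roughly like this - the path and the 64 MB size are just illustrative, and conv=fdatasync forces the write out to disk so the numbers aren't just page-cache speed :)

```shell
# Write a scratch file with a forced flush, read it back, then clean up.
# /var/tmp/iotest.dat and the size are illustrative, not a recommendation.
dd if=/dev/zero of=/var/tmp/iotest.dat bs=1024k count=64 conv=fdatasync
dd if=/var/tmp/iotest.dat of=/dev/null bs=1024k
size=$(stat -c %s /var/tmp/iotest.dat)
rm -f /var/tmp/iotest.dat
# Run a few of these in parallel and watch iowait with `iostat -x 5` or vmstat.
```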
> I assume you have also set noatime on the ext mounts. (Lots of
> simultaneous reads add up to allot of atime writes and thrashing)
Yes, noatime is set, and the ext3 commit value is set to 30 to
try to reduce any thrash caused by commits to the journal.
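(For reference, the relevant fstab line looks something like the following - the device and mount point here are placeholders, not the real ones :)

```
/dev/vg0/lv_data  /export  ext3  defaults,noatime,commit=30  0 2
```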
> Have you also tried running bonnie++ ? It is in the dag repo and can
> show you individual disk performance.
I am not sure what you mean by "dag repo" - is this a set of test tools
somewhere, and if so, where can I get it ? I have a version of
bonnie++ (1.03) from a while back; running it locally gives me the
following results :
Version  1.03       ------Sequential Output------ --Sequential Input-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP
xxxxx            8G 17948  42 13785  16  9561  14 22268  60 26889  17
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read---
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   613   3 21720  38  1610   5   690   4 28433  41
It's a bit hard to test reliably: the volume is LVM, so you can't tell
which of the two RAID arrays it's writing to. LVM seems to distribute
activity across the physical volumes in a group rather than wait for
one PV to fill up before moving on to the next. This makes things
"interesting", because the larger RAID array has a larger stripe size
(128K) than the other one (64K). (In both cases the read policy is
adaptive, the write policy is writeback, and the cache policy is
direct I/O.)
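(Incidentally, if it's LVM2 that's installed, the stock tools can at least show which PV each LV segment lives on, which helps attribute the I/O to one array or the other - these are just the standard commands, run as root :)

```shell
# Show each logical volume's segments and the underlying devices (LVM2).
lvs --segments -o +devices
# Or map a PV's extents back to the logical volumes using them:
pvdisplay -m
```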
> As far as performance, there are also many things you can do in
> /etc/sysctl.conf to tune for being a file server, but those won't help
> until you fix the IO wait issues.
> Just a few thoughts.
> P.S. - Oh.. and anything interesting in dmesg?
There is nothing in dmesg that looks like an error...
More information about the Linux-PowerEdge mailing list