Time-honoured problem : RHEL3 iowait performance

Brendan Heading brendanheading at clara.co.uk
Sat Feb 17 18:37:03 CST 2007

Aaron wrote:
> Have you tried disabling jumbo frames?  If so, have you run tcpdump and 
> pulled up the capture file in ethereal to see if you are doing a lot of 
> retransmissions (also possibly visible in netstat -s)?


Thanks for the reply. I hope you don't mind if I CC the list.

As of right now, the server has rx'd 224507725 segments, tx'd 262637305, 
and there have been 25457 retransmits. Not that high in the scheme of 
things, I reckon.
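
To put a number on "not that high", here is a minimal sketch that turns the 
two counters quoted above into a retransmit percentage. The function name 
retransmit_percent is made up for illustration; only the two counter values 
come from this message.

```python
def retransmit_percent(segments_sent: int, segments_retransmitted: int) -> float:
    """Return retransmitted segments as a percentage of segments sent."""
    if segments_sent <= 0:
        raise ValueError("segments_sent must be positive")
    return 100.0 * segments_retransmitted / segments_sent


# Counters quoted in this message: 262637305 segments sent, 25457 retransmits.
pct = retransmit_percent(262637305, 25457)
print(f"{pct:.4f}% of sent segments were retransmitted")
```

That works out to roughly one retransmit per ten thousand segments sent.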

> If you do local file transfers, many at the same time, do you see the 
> same problem?  i.e. /one/group/of/disks to /another/group/of/disks 
> and/or /var/tmp local transfers.

I will have to check again but I believe yes, I see the problem if I 
simulate the operation locally.
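
For anyone wanting to repeat the local test, a rough sketch of "many 
transfers at the same time" might look like the following. It assumes a 
Python 3 interpreter is available (the stock RHEL3 Python is much older), and 
the function name concurrent_copy and the worker count are placeholders of my 
own, not anything from the original test. Watching iowait in top or vmstat 
while it runs against the real volumes is the point of the exercise; the 
demo at the bottom only uses throwaway temporary directories.

```python
import os
import shutil
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def concurrent_copy(src_dir, dst_dir, workers=8):
    """Copy every regular file in src_dir to dst_dir using `workers` threads.

    Returns (files_copied, elapsed_seconds).
    """
    names = [n for n in os.listdir(src_dir)
             if os.path.isfile(os.path.join(src_dir, n))]

    def copy_one(name):
        shutil.copyfile(os.path.join(src_dir, name),
                        os.path.join(dst_dir, name))

    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces completion and re-raises any copy error.
        list(pool.map(copy_one, names))
    return len(names), time.monotonic() - start


if __name__ == "__main__":
    # Demo with small throwaway files; point src/dst at real mounts to test disks.
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
        for i in range(4):
            with open(os.path.join(src, f"file{i}.bin"), "wb") as fh:
                fh.write(os.urandom(1024))
        count, seconds = concurrent_copy(src, dst, workers=4)
        print(f"copied {count} files in {seconds:.3f}s")
```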

> I assume you have also set noatime on the ext mounts.  (Lots of 
> simultaneous reads add up to a lot of atime writes and thrashing)

Yes, noatime is set (so atime updates are turned off), and the ext3 commit 
value is set to 30 to try to reduce any thrash caused by commits to the 
journal.
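
A quick way to double-check those mount options is to parse fstab-style 
lines (device, mount point, type, options). The sketch below is only an 
illustration: the function name ext3_mount_options and the sample device and 
mount point names are hypothetical, and whether commit= shows up in 
/proc/mounts (as opposed to /etc/fstab) depends on the kernel.

```python
def ext3_mount_options(mounts_text):
    """Map mount point -> list of options for each ext3 line in fstab-style text."""
    result = {}
    for line in mounts_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) >= 4 and fields[2] == "ext3":
            result[fields[1]] = fields[3].split(",")
    return result


# Hypothetical sample; in practice read /etc/fstab or /proc/mounts instead.
sample = (
    "/dev/vg0/data /export ext3 rw,noatime,commit=30 0 0\n"
    "/dev/sda1 /boot ext3 rw 0 0\n"
)
for mount_point, opts in ext3_mount_options(sample).items():
    commit = [o for o in opts if o.startswith("commit=")]
    print(mount_point, "noatime" in opts, commit)
```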

> Have you also tried running bonnie++ ?  It is in the dag repo and can 
> show you individual disk performance.

I am not sure what you mean by "dag repo". Is this a set of test tools 
somewhere, and if so, where can I get it? I have a version of 
bonnie++ (1.03) from a while back; running it locally gives me the 
following results:

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
xxxxx            8G 17948  42 13785  16  9561  14 22268  60 26889  17 122.8   3
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   613   3 21720  38  1610   5   690   4 28433  41  1183   3


It's a bit hard to test reliably, since the volume is LVM so you can't 
tell which of the two RAID arrays it's writing stuff to. LVM seems to 
try to distribute activity across the volumes in a group, rather than 
wait for one PV to fill up before going to the next one. This makes 
things "interesting" because the larger RAID array has a larger stripe 
size, 128K, than the other one which is 64K. (in both cases, read policy 
is adaptive, write policy is writeback, and cache policy is direct I/O).

> As far as performance, there are also many things you can do in 
> /etc/sysctl.conf to tune for being a file server, but those won't help 
> until you fix the IO wait issues.
> Just a few thoughts.
> --Aarön
> P.S. - Oh.. and anything interesting in dmesg?

There is nothing in dmesg that looks like an error...
