Experiencing disk latency problems from your PERC controller?
Seth Mos
seth.mos at xs4all.nl
Sat Sep 16 11:20:46 CDT 2006
> If so, then try turning off "Patrol Read" either from the PERC BIOS,
> or via megapr (http://ftp1.us.dell.com/scsi-raid/MegaPR_Linux_A02.tar.gz).
>
> Comparing some representative iostat output from a PowerEdge 2800,
> PERC 4e/Di, 4x15,000rpm SCSI disks in RAID10 (apologies for the
> formatting):
This is the exact same problem I am currently having, on the exact same
machine, only with more disks (8) and 8GB of RAM. Serious interactivity
issues have kept me from taking the machine into production since 3-2006.
I have since understood from the Dell technician I am working with that
Patrol Reads are also problematic with respect to benchmarks.
I was under the impression that Patrol Reads occur when the disks are
idle, not when you are flogging them.
It's currently running Windows 2003, where I had the same sort of issues
with disk access being fast and then being slow. That doesn't make for a
dependable server, and it voids my performance benchmarks, which are my
indication of whether I can use it in production.
The issue affects both Windows and Linux, and quite severely at that.
Although Dell recommends turning Patrol Read on, I recommend against it.
It cuts into performance so hard that the machine is unusable for
production.
If copying files from a network share on the machine takes 30 minutes from
the old machine and over 3 hours from the new machine with Patrol Read on
(more than six times as long), it should be off by default.
The idea needs more work.
Cheers,
Seth
>
> With PR:
> avg-cpu:  %user  %nice  %sys  %iowait  %idle
>            2.02   0.00  5.33    17.40  75.24
>
> Device:  rrqm/s  wrqm/s     r/s   w/s    rsec/s  wsec/s    rkB/s  wkB/s
> sda        1.37    1.24  419.19  1.67  17716.62   23.27  8858.31  11.63
>
> Device:  avgrq-sz  avgqu-sz  await  svctm  %util
> sda         42.15      2.83   6.72   1.34  56.47
>
>
> Without PR:
> avg-cpu:  %user  %nice  %sys  %iowait  %idle
>            3.13   0.00  6.21     9.30  81.36
>
> Device:  rrqm/s  wrqm/s     r/s   w/s    rsec/s  wsec/s    rkB/s  wkB/s
> sda        1.57    2.44  368.23  2.01  18979.53   35.59  9489.77  17.79
>
> Device:  avgrq-sz  avgqu-sz  await  svctm  %util
> sda         51.36      0.72   1.94   1.23  45.40
>
> A couple of things to note:
> 1) iowait drops from about 17% to about 9%.
> 2) the queue length (avgqu-sz) drops from about 3 to about 1.
> 3) and most importantly for me, await - the average time (in
> milliseconds) for I/O requests issued to the device to be served -
> falls from about 7 to 2.
>
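As a rough cross-check of the quoted figures, here is a minimal Python
sketch. It assumes Little's law applies to the device queue, so that await
is roughly avgqu-sz divided by requests per second. The dictionary keys and
function names are made up for illustration; only the numbers come from the
iostat output above.

```python
# iostat figures for sda, copied from the quoted output above.
samples = {
    "with_pr":    {"r_s": 419.19, "w_s": 1.67, "avgqu_sz": 2.83, "await_ms": 6.72},
    "without_pr": {"r_s": 368.23, "w_s": 2.01, "avgqu_sz": 0.72, "await_ms": 1.94},
}


def littles_law_await_ms(sample):
    """Estimate await in ms as average queue length / requests per second.

    Assumption: Little's law (L = lambda * W) holds for the device queue.
    """
    requests_per_second = sample["r_s"] + sample["w_s"]
    return sample["avgqu_sz"] / requests_per_second * 1000.0


def percent_change(before, after):
    """Relative change from 'before' to 'after', in percent."""
    return (after - before) / before * 100.0


for name, sample in samples.items():
    estimate = littles_law_await_ms(sample)
    print(f"{name}: reported await {sample['await_ms']:.2f} ms, "
          f"estimated {estimate:.2f} ms")

await_change = percent_change(samples["with_pr"]["await_ms"],
                              samples["without_pr"]["await_ms"])
print(f"await change with Patrol Read off: {await_change:.0f}%")
```

For these two samples the estimate lands within rounding of the reported
await in both cases. That would suggest the extra latency with PR on is
mostly time spent queued rather than slower individual requests, which fits
svctm barely moving (1.34 vs 1.23).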
> That last one becomes very important if you are running applications
> where data latency is important. In my case video streaming. I was
> seeing data starvation in the video server, but only at certain times
> (when PR was running and the video wasn't cached in RAM), and more
> with certain videos than others (the higher the video bitrate the more
> likely for the server not to get the data when it needed it).
>
> I found a few other posts in the archives referencing bandwidth issues
> when PR is running, but nothing specific to latency. Hopefully someone
> else will find this post before spending days tearing their hair out
> playing with kernel vm tunables, RAID parameters and I/O schedulers
> :)
>
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> http://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
>