Experiencing disk latency problems from your PERC controller?

Seth Mos seth.mos at xs4all.nl
Sat Sep 16 11:20:46 CDT 2006


> If so, then try turning off "Patrol Read", either from the PERC BIOS
> or via megapr (http://ftp1.us.dell.com/scsi-raid/MegaPR_Linux_A02.tar.gz).
>
> Comparing some representative iostat output from a PowerEdge 2800,
> PERC 4e/Di, 4x15,000rpm SCSI disks in RAID10 (apologies for the
> formatting):

This is the exact same problem I am currently having on the exact same
machine, only with more disks (8) and 8GB of RAM. Serious interactivity
issues have kept me from taking the machine into production since March 2006.

I have since understood from the Dell technician I am working with that
Patrol Reads are also problematic with respect to benchmarks.

I was under the impression that Patrol Reads occur when the disks are
idle, not when you are flogging them.

It's currently running Windows 2003, where I had the same sort of issues
with disk access being fast and then being slow. That doesn't make for a
dependable server, and it voids my performance benchmarks, which are my
indication of whether I can use it in production.

The issue affects both Windows and Linux, and quite severely at that.
Although Dell recommends turning Patrol Read on, I recommend against it.
It cuts into performance so hard that the machine is unusable for
production.

If copying files from a network share on the machine takes 30 minutes from
the old machine and over 3 hours from the new machine with Patrol Read
enabled, it should be off by default.

The idea needs more work.

Cheers,

Seth
>
> With PR:
> avg-cpu:  %user   %nice    %sys %iowait   %idle
>            2.02    0.00    5.33   17.40   75.24
>
> Device:  rrqm/s wrqm/s     r/s   w/s    rsec/s  wsec/s    rkB/s  wkB/s
> sda        1.37   1.24  419.19  1.67  17716.62   23.27  8858.31  11.63
>
> Device:  avgrq-sz avgqu-sz  await  svctm  %util
> sda         42.15     2.83   6.72   1.34  56.47
>
>
> Without PR:
> avg-cpu:  %user   %nice    %sys %iowait   %idle
>            3.13    0.00    6.21    9.30   81.36
>
> Device:  rrqm/s wrqm/s     r/s   w/s    rsec/s  wsec/s    rkB/s  wkB/s
> sda        1.57   2.44  368.23  2.01  18979.53   35.59  9489.77  17.79
>
> Device:  avgrq-sz avgqu-sz  await  svctm  %util
> sda         51.36     0.72   1.94   1.23  45.40
>
> A couple of things to note:
> 1) iowait drops from about 17% to about 9%.
> 2) the queue length (avgqu-sz) drops from about 3 to 1.
> 3) and most importantly for me, await - the average time (in
> milliseconds) for I/O requests issued to the device to be served -
> falls from about 7 to 2.
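
As a rough check on the three points above, the relative drops can be
worked out directly from the two iostat samples. This is only a sketch:
the numbers are copied from the quoted output, while the dictionary keys
and the relative_drop helper are names made up here for illustration.

```python
# Values copied from the quoted iostat output (with / without Patrol Read).
# The key names are illustrative labels, not iostat column names.
with_pr = {"iowait_pct": 17.40, "avgqu_sz": 2.83, "await_ms": 6.72, "util_pct": 56.47}
without_pr = {"iowait_pct": 9.30, "avgqu_sz": 0.72, "await_ms": 1.94, "util_pct": 45.40}


def relative_drop(before, after):
    """Percentage reduction going from `before` to `after`."""
    return (before - after) / before * 100.0


# Relative improvement for every metric once Patrol Read is switched off.
drops = {name: relative_drop(with_pr[name], without_pr[name]) for name in with_pr}

for name, pct in drops.items():
    print(f"{name:>10}: {with_pr[name]:6.2f} -> {without_pr[name]:6.2f}  ({pct:.0f}% lower)")
```

On these two samples await comes out roughly 71% lower without Patrol
Read, which is a much bigger change than the drop in %util would suggest.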
>
> That last one becomes very important if you are running applications
> where data latency is important. In my case video streaming. I was
> seeing data starvation in the video server, but only at certain times
> (when PR was running and the video wasn't cached in RAM), and more
> with certain videos than others (the higher the video bitrate the more
> likely for the server not to get the data when it needed it).
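
Since the starvation only shows up at certain times, it may help to log
await continuously rather than eyeball iostat. A minimal sketch, assuming
a Linux 2.6 style /proc/diskstats (whole-disk lines with at least 11
fields): await over an interval is the extra milliseconds spent on reads
and writes divided by the extra I/Os completed. The function names and
the sample counter values below are invented for illustration.

```python
import time


def parse_diskstats(text, device):
    """Return (ios_completed, ms_spent) for `device` from /proc/diskstats text."""
    for line in text.splitlines():
        fields = line.split()
        # Layout: major minor name reads rd_merged rd_sectors rd_ms
        #         writes wr_merged wr_sectors wr_ms ...
        if len(fields) >= 11 and fields[2] == device:
            reads, read_ms = int(fields[3]), int(fields[6])
            writes, write_ms = int(fields[7]), int(fields[10])
            return reads + writes, read_ms + write_ms
    raise ValueError(f"device {device!r} not found")


def await_ms(before, after, device="sda"):
    """Average ms per completed I/O between two diskstats snapshots."""
    ios0, ms0 = parse_diskstats(before, device)
    ios1, ms1 = parse_diskstats(after, device)
    ios = ios1 - ios0
    return (ms1 - ms0) / ios if ios else 0.0


def sample_live(device="sda", interval=5.0):
    """Take two snapshots `interval` seconds apart on a live Linux box."""
    with open("/proc/diskstats") as f:
        before = f.read()
    time.sleep(interval)
    with open("/proc/diskstats") as f:
        after = f.read()
    return await_ms(before, after, device)


# Synthetic snapshots (made-up counters) so the arithmetic can be followed.
snap_before = "   8    0 sda 1000 10 80000 5000 200 5 1600 1000 0 3000 6000"
snap_after = "   8    0 sda 1400 12 96000 7600 210 6 1680 1155 0 4000 9000"
example_await = await_ms(snap_before, snap_after)  # 2755 ms over 410 I/Os
print(f"await: {example_await:.2f} ms")
```

Calling sample_live() in a loop and logging the result with a timestamp
would show whether the latency spikes line up with Patrol Read running.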
>
> I found a few other posts in the archives referencing bandwidth issues
> when PR is running, but nothing specific to latency. Hopefully someone
> else will find this post before spending days tearing their hair out
> playing with kernel vm tunables, RAID parameters and I/O schedulers
> :)
>
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> http://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
>


