R515/H700 high iowait

Mark Nelson mark.nelson at inktank.com
Tue Jul 17 10:27:16 CDT 2012

Hi Guys,

I was wondering if anyone has had any problems either on the R515 or 
with the H700 card where there are very high io wait times when multiple 
writers write to the same raid group.  Basically we first noticed this 
due to inconsistent buffered IO performance under heavy IO load.  We saw 
IO wait times of 6 seconds or more.

After that we started testing with fio.  What we noticed is that when 
writing to, say, a large raid0 with 7-8 drives, we could achieve 800MB/s, 
but only with large (256MB) IOs using direct IO and a single "job" (i.e. 
one fio process).  Simply increasing the number of jobs to 2 reduced 
performance to 80MB/s and dramatically increased IO wait times.  The 
controller has a BBU and the raid array was configured with adaptive 
readahead and writeback cache.
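
For anyone wanting to reproduce the comparison, a rough sketch of the two 
fio runs follows.  It only builds and prints the command lines.  The target 
path (/dev/sdX), the total size, and the job name are placeholders and 
assumptions, not the exact settings used in the tests described above.

```python
import shlex


def fio_write_cmd(target, numjobs, bs="256m", size="8g"):
    """Build an fio command line for a sequential direct-IO write test.

    target  : file or block device to write to (placeholder, assumed)
    numjobs : number of fio processes writing at the same time
    bs      : IO size; 256m matches the large IOs mentioned above
    size    : total bytes written per job (assumed value)
    """
    return [
        "fio",
        "--name=seqwrite",        # arbitrary job name
        f"--filename={target}",
        "--rw=write",             # sequential writes
        f"--bs={bs}",
        "--direct=1",             # bypass the page cache (O_DIRECT)
        f"--numjobs={numjobs}",
        f"--size={size}",
        "--group_reporting",      # report aggregate stats across jobs
    ]


if __name__ == "__main__":
    # Print the single-writer and two-writer variants for comparison.
    for jobs in (1, 2):
        print(shlex.join(fio_write_cmd("/dev/sdX", jobs)))
```

Note that writing to a raw device destroys its contents, so point this at 
a scratch volume.  Watching iostat alongside each run should show whether 
the wait times climb when the second job is added.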

One thought is that perhaps this is not a problem with the raid 
controller, but some kind of preemption issue with the io scheduler.  At 
one point early on I tried switching from cfq to deadline with little 
apparent change, though I have not yet done exhaustive tests.  I will 
try to record blktrace output soon to see if that provides any insight 
into what is going on.
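
In case it helps others repeat the scheduler comparison, here is a small 
sketch of checking and switching the scheduler through sysfs.  The device 
name passed in is a placeholder, the helper names are made up for 
illustration, and writing the file needs root.

```python
from pathlib import Path


def parse_scheduler(text):
    """Parse a sysfs scheduler line such as 'noop deadline [cfq]'.

    Returns (active, available); the active scheduler is the one the
    kernel prints in square brackets.
    """
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def scheduler_path(dev):
    # dev is a bare block device name, e.g. "sdb" (placeholder)
    return Path(f"/sys/block/{dev}/queue/scheduler")


def get_scheduler(dev):
    """Read the active and available schedulers for a block device."""
    return parse_scheduler(scheduler_path(dev).read_text())


def set_scheduler(dev, name):
    """Switch the scheduler for a block device (requires root)."""
    _, available = get_scheduler(dev)
    if name not in available:
        raise ValueError(f"{name!r} not offered for {dev}: {available}")
    scheduler_path(dev).write_text(name)
```

For example, set_scheduler("sdb", "deadline") followed by the same fio 
runs would repeat the cfq versus deadline comparison on that device.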

In the meantime, has anyone else seen anything like this?


