perc5e/MD1000 and Oracle/LVM I/O configuration

Andrew Geiger andrew_geiger at tstna.com
Thu May 28 16:34:01 CDT 2009


Tim,
What version of Oracle is this?  Have you considered splitting the array into 7 RAID 1s and then adding them to an ASM disk group?

Andrew Geiger 
MIS Administrator 
TS Trim Industries, Inc. 

-----Original Message-----
From: linux-poweredge-bounces at lists.us.dell.com [mailto:linux-poweredge-bounces at lists.us.dell.com] On Behalf Of Tim Pickard
Sent: Thursday, May 28, 2009 2:58 PM
To: John LLOYD
Cc: Linux-PowerEdge at dell.com
Subject: Re: perc5e/MD1000 and Oracle/LVM I/O configuration

Resent to CC the list. (Doh!)

John LLOYD wrote:
>> I will be redoing the disk layout on a 2950 RHEL4 server with an 
>> attached MD1000 filled with 146GB 15k SAS drives.
>>
>> The current config is the MD1000 configured as one large 
>> RAID 10 and 
>> using LVM to slice it into 4 volumes striped across the disks.  This 
>> device has been hitting 100% utilization regularly while the 
>> overlaying 
>> LVMs are 70-95% utilized.
>>     
>
> The changes you propose might make a difference; however at best you
> might get 20% improvement, without re-architecting your oracle database
> layout.  This might or might not be worth the risk of changing things.
>
>   
We do have the data and indexes spread on the different volumes fairly
well and logs going to a local raid1 pair.
>> I would like to redo the configuration to be most efficient 
>> and achieve 
>> the best I/O on the random read/write.  RAID5 is not an option 
>> for us.  While Oracle recommends RAID 01 for data files, it does not 
>> work for us as one failed disk leaves us with a big RAID 0.  
>>     
>
> Hmmm?  Raid 01 and Raid 10 are often used interchangeably - you want
> striping of mirrored pairs; you should have that now with the PERC.
>
>
>   
I did mean RAID 0+1.
>> Speed and 
>> protection are important for us.
>>
>> What I currently have in mind is breaking the MD1000 into 2 
>> and having 
>> the PERC5/e have each side on a separate channel.  Within each side, 
>> have 2 4-disk RAID 10 volumes.  This basically takes the role 
>> the LVM had 
>> been handling and gives it to the hardware on the 2 separate 
>> channels.  
>> Our data growth will not be a factor.  The Seagate drives have a 
>> reported max 125MB/s so hopefully I will get close to 200 
>> MB/s per new 
>> volume.
>>     
>
> With Oracle, the usual measure is IO per second, not megabytes per
> second.  Most transfers are smallish, even after Linux gets through with
> them (Linux merges transfers into fewer, bigger ones while they wait in
> the queue to be processed).  Small transfers randomly sprinkled around
> the disk mean few megabytes per second.   Remember, at 15k rpm, disks
> can only process 250 operations per second, max, (assuming seek time is
> less than rotational delay, which it isn't.)
>
>
>   
This is a good distinction to make for me.  I want to set my
expectations correctly and then accurately test them.
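
To put rough numbers on the IOPS-versus-MB/s point, here is a minimal
back-of-envelope sketch in Python. The 3.5 ms average seek time and the
8 KiB Oracle block size are assumptions for illustration, not figures
from this thread, and the RAID 10 scaling is an idealized ceiling that
ignores controller cache and queueing.

```python
# Back-of-envelope random I/O estimate for a 15k rpm drive and a
# 4-disk RAID 10 volume.
# ASSUMPTIONS (not from the thread): 3.5 ms average seek, 8 KiB block size.

RPM = 15_000
AVG_SEEK_MS = 3.5   # assumed; check the drive's data sheet
BLOCK_KIB = 8       # assumed Oracle block size


def rotation_ms(rpm):
    """Time for one full platter rotation, in milliseconds."""
    return 60_000 / rpm


def random_iops(rpm, avg_seek_ms):
    """Random IOPS for one drive: an average seek plus half a rotation per I/O."""
    service_ms = avg_seek_ms + rotation_ms(rpm) / 2
    return 1000 / service_ms


def raid10_iops(disks, per_disk_iops):
    """Idealized RAID 10 ceilings: reads can use every disk,
    while each write has to land on both disks of a mirror pair."""
    return {"read": disks * per_disk_iops,
            "write": disks / 2 * per_disk_iops}


per_disk = random_iops(RPM, AVG_SEEK_MS)
volume = raid10_iops(4, per_disk)

print(f"rotations per second:     {1000 / rotation_ms(RPM):.0f}")
print(f"random IOPS per disk:     {per_disk:.0f}")
print(f"4-disk RAID 10 read IOPS: {volume['read']:.0f}")
print(f"4-disk RAID 10 write IOPS:{volume['write']:.0f}")
print(f"read MiB/s at {BLOCK_KIB} KiB:      {volume['read'] * BLOCK_KIB / 1024:.1f}")
```

Under these assumptions a fully random small-block workload moves only a
few MiB/s per volume, far below the drive's sequential rating, which is
why IOPS is the more useful target to test against.
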
>> I do plan on using the NoReadAhead read policy, Direct I/O 
>> cache policy 
>> and WriteBack  write policy.
>>
>> Is this a logical way to approach the situation?  Will I see the 
>> performance gains I expect?
>>     
>
> My guess is maybe 20%.   Given you are at 100% utilization already, it
> will stay at 100% after you are done and you might see 20% improvement
> in "oracle throughput", whatever that means to you.
>
>   
>> Also, I have seen in my research online using "/sbin/blockdev --setra 
>> 8192 DEVICE" on RAID5 volumes up from the default 256.  Will 
>> that matter 
>> with my DB Random I/O profile?  It feels like not to me but I plan to 
>> test it.
>>     
>
> Probably not; Oracle caches lots of things in its global shared memory
> pool, whatever caching Linux does will be invisible.  An impact on
> synthetic benchmark figures is likely but not very useful to your actual
> situation.
>
>   
I did find and download a tool from Oracle, Orion.   (Oracle I/O
Calibration Tool)
http://www.oracle.com/technology/software/tech/orion/index.html  I will
let you all know how that goes.
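
On the readahead question, it may help to see what the two values mean
in bytes. "blockdev --setra" counts in 512-byte sectors, so the small
sketch below converts the default and the proposed value. The 8 KiB
database block size is an assumption for illustration only.

```python
# Convert `blockdev --setra` values (512-byte sectors) into KiB and
# into a count of database blocks.
# ASSUMPTION (not from the thread): 8 KiB database block size.

SECTOR_BYTES = 512
BLOCK_KIB = 8  # assumed Oracle block size


def readahead_kib(sectors):
    """Readahead window in KiB for a given --setra sector count."""
    return sectors * SECTOR_BYTES // 1024


for sectors in (256, 8192):
    kib = readahead_kib(sectors)
    print(f"--setra {sectors}: {kib} KiB, about {kib // BLOCK_KIB} blocks per readahead")
```

A larger window only pays off when the next read really is adjacent on
disk, so for a mostly random profile it is worth measuring both settings
rather than assuming a gain.
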

>> Is hdparm the best way to test this kind of I/O performance on the 
>> volumes?  Are there other open source tools you would recommend?
>>
>> I cannot test the current config in any fashion as it is in 
>> production 
>> at the moment.  My maintenance window is next week and I want 
>> to have my 
>> ducks in a row.  I can test the current before I redo anything.
>>
>> I will be happy to share the results with the list.
>>
>> Thanks,
>> Tim
>>     
>
> You may get more mileage out of adding 2 or 8 GB of RAM to your machine,
> and upping the configured Oracle system global area (SGA) memory size,
> than twiddling minor disk layout / stripe parameters.  There are also a
> few other memory-related Oracle parameters, and IO related parameters,
> etc that can sometimes help.  An index or two, judiciously applied, can
> eliminate table-scans and consequently reduce IO workload by a huge
> amount.
>
> First rule of tuning hardware: fix the software first.
>
> Second rule of tuning hardware: see the first rule.
>
>
> --John
>
>
>   
I would agree with you on the RAM upgrade.  It is at 16g now.  This
machine is being moved to the standby/failover role and a newer 2950 with
32g and faster processors will be put in the active/primary role.   And
I would like to see a couple indexes added to some tables that make the
application a real dog when that function starts up with a couple of
ugly queries.  But I am just an admin.
>
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at lists.us.dell.com
> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
>   

