perc5e/MD1000 and Oracle/LVM I/O configuration

Andrew Geiger andrew_geiger at tstna.com
Fri May 29 09:15:00 CDT 2009


ASM (Automatic Storage Management) was introduced with Oracle 10g, so it is not available on 9i.

ASM handles extent balancing across the members of the disk group.  You could get the same effect by balancing datafiles across multiple disks, but you would need to reorganize tables frequently to keep the I/O balanced.

Andrew Geiger 
MIS Administrator 
TS Trim Industries, Inc. 

-----Original Message-----
From: Tim Pickard [mailto:tpickard at crossref.org] 
Sent: Friday, May 29, 2009 9:17 AM
To: Andrew Geiger
Cc: John LLOYD; Linux-PowerEdge at dell.com
Subject: Re: perc5e/MD1000 and Oracle/LVM I/O configuration

Well, this is the fun thing.  It is Oracle 9i on RHEL4.  Yes it is old 
and there are plans to migrate to one of the newer versions.  However, 
this system will be 9i on RHEL4 until it is repurposed after that upgrade.

I have only just seen mention of ASM while planning out this redo and am 
not familiar with it.  How does it compare to LVM? 

Thanks for the input.

Tim

Andrew Geiger wrote:
> Tim,
> What version of Oracle is this?  Have you considered splitting the array into 7 RAID 1s and then adding them to an ASM disk group?
>
> Andrew Geiger 
> MIS Administrator 
> TS Trim Industries, Inc. 
>
> -----Original Message-----
> From: linux-poweredge-bounces at lists.us.dell.com [mailto:linux-poweredge-bounces at lists.us.dell.com] On Behalf Of Tim Pickard
> Sent: Thursday, May 28, 2009 2:58 PM
> To: John LLOYD
> Cc: Linux-PowerEdge at dell.com
> Subject: Re: perc5e/MD1000 and Oracle/LVM I/O configuration
>
> Resent to CC the list. (Doh!)
>
> John LLOYD wrote:
>   
>>> I will be redoing the disk layout on a 2950 RHEL4 server with an 
>>> attached MD1000 filled with 146 15k SAS drives.
>>>
>>> The current config is the MD1000 configured as one large 
>>> RAID 10 and 
>>> using LVM to slice it into 4 volumes striped across the disks.  This 
>>> device has been hitting 100% utilization regularly while the 
>>> overlaying 
>>> LVMs are 70-95% utilized.
>>>     
>>>       
>> The changes you propose might make a difference; however at best you
>> might get 20% improvement, without re-architecting your oracle database
>> layout.  This might or might not be worth the risk of changing things.
>>
>>   
>>     
> We do have the data and indexes spread on the different volumes fairly
> well and logs going to a local raid1 pair.
>   
>>> I would like to redo the configuration to be most efficient 
>>> and achieve 
>>> the best I/O on the random read/write.  RAID5 is not an option 
>>> for us.  While Oracle recommends RAID 01 for data files, it does not 
>>> work for us as one failed disk leaves us with a big RAID 0.  
>>>     
>>>       
>> Hmmm?  Raid 01 and Raid 10 are often used interchangeably - you want
>> striping of mirrored pairs; you should have that now with the PERC.
>>
>>
>>   
>>     
> I did mean RAID 0+1.
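
The difference between RAID 10 and RAID 0+1 after one disk has already failed can be sketched with a little arithmetic. This is an illustrative model only: it assumes the second failing disk is equally likely to be any of the survivors, that RAID 0+1 means two stripe sets mirrored against each other, and a 14-disk array (the disk count and the function names are assumptions for the example, not figures from this thread).

```python
from fractions import Fraction


def second_failure_fatal_raid10(n_disks):
    # RAID 10 (stripe of mirrored pairs): after one failure, only the
    # failed disk's mirror partner is fatal if it fails next.
    return Fraction(1, n_disks - 1)


def second_failure_fatal_raid01(n_disks):
    # RAID 0+1 (mirror of two stripe sets): the first failure takes a whole
    # stripe set offline, so any disk in the surviving set is fatal.
    return Fraction(n_disks // 2, n_disks - 1)


N_DISKS = 14  # assumed array size for illustration

r10 = second_failure_fatal_raid10(N_DISKS)
r01 = second_failure_fatal_raid01(N_DISKS)
print("RAID 10  : chance a second failure is fatal =", r10)
print("RAID 0+1 : chance a second failure is fatal =", r01)
```

Under these assumptions the surviving half of a RAID 0+1 set behaves like a plain RAID 0, which matches the concern raised above.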
>   
>>> Speed and 
>>> protection are important for us.
>>>
>>> What I currently  have in mind is breaking the md1000 into 2 
>>> and having 
>>> the PERC5/e have each side on a separate channel.  Within each side, 
>>> have two 4-disk RAID 10 volumes.  This would basically take the role 
>>> the LVM had 
>>> been handling and give it to the hardware on the 2 separate 
>>> channels.  
>>> Our data growth will not be a factor.  The Seagate drives have a 
>>> reported max 125MB/s so hopefully I will get close to 200 
>>> MB/s per new 
>>> volume.
>>>     
>>>       
>> With Oracle, the usual measure is IO per second, not megabytes per
>> second.  Most transfers are smallish, even after Linux gets through with
>> them (Linux merges transfers into fewer, bigger ones while they wait in
>> the queue to be processed).  Small transfers randomly sprinkled around
>> the disk mean few megabytes per second.   Remember, at 15k rpm, disks
>> can only process 250 operations per second, max (assuming seek time is
>> less than rotational delay, which it isn't).
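
A rough sketch of the arithmetic behind the IO-per-second point. The 250 figure is simply 15,000 rpm divided by 60. The 3.5 ms average seek time and the 8 KB transfer size below are assumptions chosen for illustration; they are not measurements from this system.

```python
def rotations_per_second(rpm):
    # 15,000 rpm -> 250 revolutions per second
    return rpm / 60.0


def estimated_iops(rpm, avg_seek_ms):
    # On average the head waits half a revolution for the sector to arrive.
    rotational_latency_ms = 0.5 * 1000.0 / rotations_per_second(rpm)
    return 1000.0 / (avg_seek_ms + rotational_latency_ms)


def random_throughput_mb_s(iops, transfer_kb):
    # Throughput for small random transfers, in MB/s (1 MB = 1024 KB here)
    return iops * transfer_kb / 1024.0


ASSUMED_SEEK_MS = 3.5     # assumption, not from the thread
ASSUMED_TRANSFER_KB = 8   # assumption, not from the thread

rps = rotations_per_second(15000)
iops = estimated_iops(15000, ASSUMED_SEEK_MS)
mb_s = random_throughput_mb_s(iops, ASSUMED_TRANSFER_KB)
print("revolutions/s: %.0f" % rps)
print("estimated random IOPS per disk: %.0f" % iops)
print("random throughput per disk: %.2f MB/s" % mb_s)
```

With these assumed numbers a single disk delivers well under 2 MB/s of small random I/O, far below its sequential rating, which is why IO per second is the more useful yardstick here.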
>>
>>
>>   
>>     
> This is a good distinction to make for me.  I want to set my
> expectations correctly and then accurately test them.
>   
>>> I do plan on using the NoReadAhead read policy, Direct I/O 
>>> cache policy 
>>> and WriteBack  write policy.
>>>
>>> Is this a logical way to approach the situation?  Will I see the 
>>> performance gains I expect?
>>>     
>>>       
>> My guess is maybe 20%.   Given you are at 100% utilization already, it
>> will stay at 100% after you are done and you might see 20% improvement
>> in "oracle throughput", whatever that means to you.
>>
>>   
>>     
>>> Also, I have seen in my research online people using "/sbin/blockdev --setra 
>>> 8192 DEVICE" on RAID5 volumes, up from the default 256.  Will 
>>> that matter 
>>> with my DB Random I/O profile?  It feels like not to me but I plan to 
>>> test it.
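
For scale, `blockdev --setra` takes its value in 512-byte sectors, so the two settings mentioned above convert as sketched below. The 8 KiB database block size is an assumption for comparison only; it is not stated in this thread.

```python
SECTOR_BYTES = 512  # blockdev --setra counts 512-byte sectors


def readahead_kib(sectors):
    # Convert a --setra value to KiB
    return sectors * SECTOR_BYTES // 1024


ASSUMED_DB_BLOCK_KIB = 8  # assumption, not from the thread

default_kib = readahead_kib(256)
tuned_kib = readahead_kib(8192)
print("setra 256  = %d KiB" % default_kib)
print("setra 8192 = %d KiB" % tuned_kib)
print("8192 sectors spans %d assumed DB blocks"
      % (tuned_kib // ASSUMED_DB_BLOCK_KIB))
```

Whether the kernel actually reads ahead that far depends on the access pattern it detects, so this only shows the size of the window being configured, not its effect on random I/O.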
>>>     
>>>       
>> Probably not; Oracle caches lots of things in its global shared memory
>> pool, so whatever caching Linux does will be invisible.  An impact on
>> synthetic benchmark figures is likely but not very useful to your actual
>> situation.
>>
>>   
>>     
> I did find and download a tool from Oracle, Orion.   (Oracle I/O
> Calibration Tool)
> http://www.oracle.com/technology/software/tech/orion/index.html  I will
> let you all know how that goes.
>
>   
>>> Is hdparm the best way to test this kind of I/O performance on the 
>>> volumes?  Are there other open source tools you would recommend?
>>>
>>> I cannot test the current config in any fashion as it is in 
>>> production 
>>> at the moment.  My maintenance window is next week and I want 
>>> to have my 
>>> ducks in a row.  I can test the current before I redo anything.
>>>
>>> I will be happy to share the results with the list.
>>>
>>> Thanks,
>>> Tim
>>>     
>>>       
>> You may get more mileage out of adding 2 or 8 GB of RAM to your machine,
>> and upping the configured Oracle system global area (SGA) memory size,
>> than twiddling minor disk layout / stripe parameters.  There are also a
>> few other memory-related Oracle parameters, and IO related parameters,
>> etc that can sometimes help.  An index or two, judiciously applied, can
>> eliminate table-scans and consequently reduce IO workload by a huge
>> amount.
>>
>> First rule of tuning hardware: fix the software first.
>>
>> Second rule of tuning hardware: see the first rule.
>>
>>
>> --John
>>
>>
>>   
>>     
> I would agree with you on the RAM upgrade.  It is at 16g now.  This
> machine is being moved to the standby/failover role and a newer 2950 with
> 32g and faster processors will be put in the active/primary role.   And
> I would like to see a couple indexes added to some tables that make the
> application a real dog when that function starts up with a couple of
> ugly queries.  But I am just an admin.
>   
>> _______________________________________________
>> Linux-PowerEdge mailing list
>> Linux-PowerEdge at lists.us.dell.com
>> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
>> Please read the FAQ at http://lists.us.dell.com/faq
>>   
>>     
>
>
>   



