perc5e/MD1000 and Oracle/LVM I/O configuration
Andrew Geiger
andrew_geiger at tstna.com
Thu May 28 16:34:01 CDT 2009
Tim,
What version of Oracle is this? Have you considered splitting the array into seven RAID 1 pairs and then adding them to an ASM disk group?
Andrew Geiger
MIS Administrator
TS Trim Industries, Inc.
-----Original Message-----
From: linux-poweredge-bounces at lists.us.dell.com [mailto:linux-poweredge-bounces at lists.us.dell.com] On Behalf Of Tim Pickard
Sent: Thursday, May 28, 2009 2:58 PM
To: John LLOYD
Cc: Linux-PowerEdge at dell.com
Subject: Re: perc5e/MD1000 and Oracle/LVM I/O configuration
Resent to CC the list. (Doh!)
John LLOYD wrote:
>> I will be redoing the disk layout on a 2950 RHEL4 server with an
>> attached MD1000 filled with 146GB 15k SAS drives.
>>
>> The current config is the MD1000 configured as one large RAID 10,
>> using LVM to slice it into 4 volumes striped across the disks. This
>> device has been hitting 100% utilization regularly while the
>> overlying LVM volumes are 70-95% utilized.
>>
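For anyone wanting to reproduce utilization figures like these without iostat, here is a minimal sketch of how a busy percentage can be derived from /proc/diskstats. It assumes the full-device line layout, where the 13th whitespace-separated field is the number of milliseconds the device spent doing I/O. The function names are made up for illustration.

```python
def parse_io_ticks(diskstats_text, device):
    """Return the 'ms spent doing I/O' counter for one device.

    Assumes the full-device /proc/diskstats layout:
    major minor name, then 11 counters; index 12 is the busy time in ms.
    Shorter lines (old-style partition entries) are skipped.
    """
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 14 and fields[2] == device:
            return int(fields[12])
    raise ValueError("device not found: %s" % device)


def read_io_ticks(device, path="/proc/diskstats"):
    """Read the busy-time counter for a device from the live system."""
    with open(path) as f:
        return parse_io_ticks(f.read(), device)


def utilization_percent(ticks_before, ticks_after, interval_s):
    """Percent of the sampling interval during which the device was busy."""
    busy_ms = ticks_after - ticks_before
    return min(100.0, 100.0 * busy_ms / (interval_s * 1000.0))
```

Calling read_io_ticks("sdb") twice a few seconds apart and passing both values to utilization_percent() gives the busy percentage for that interval. "sdb" is only a placeholder device name. A device pinned at 100% here is busy all the time, which says nothing by itself about how many operations per second it is completing.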
>
> The changes you propose might make a difference; however at best you
> might get 20% improvement, without re-architecting your oracle database
> layout. This might or might not be worth the risk of changing things.
>
>
We do have the data and indexes spread on the different volumes fairly
well and logs going to a local raid1 pair.
>> I would like to redo the configuration to be most efficient and achieve
>> the best I/O on random read/write. RAID5 is not an option
>> for us. While Oracle recommends RAID 01 for data files, it does not
>> work for us as one failed disk leaves us with a big RAID 0.
>>
>
> Hmmm? Raid 01 and Raid 10 are often used interchangeably - you want
> striping of mirrored pairs; you should have that now with the PERC.
>
>
>
I did mean RAID 0+1.
>> Speed and
>> protection are important for us.
>>
>> What I currently have in mind is breaking the MD1000 into 2 and having
>> the PERC5/e put each side on a separate channel. Within each side,
>> have 2 4-disk RAID 10 volumes. This basically takes the role the LVM had
>> been handling and gives it to the hardware on the 2 separate channels.
>> Our data growth will not be a factor. The Seagate drives have a
>> reported max of 125MB/s, so hopefully I will get close to 200 MB/s per
>> new volume.
>>
>
> With Oracle, the usual measure is IO per second, not megabytes per
> second. Most transfers are smallish, even after Linux gets through with
> them (Linux merges transfers into fewer, bigger ones while they wait in
> the queue to be processed). Small transfers randomly sprinkled around
> the disk mean few megabytes per second. Remember, 15k rpm is 250
> revolutions per second, so a disk can process at most 250 operations
> per second (assuming seek time is less than rotational delay, which it
> isn't).
>
>
>
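To put rough numbers on the IOPS-versus-MB/s point, here is a small back-of-the-envelope sketch. The 3.5 ms average seek time and the 8 KB I/O size are assumptions picked for illustration, not figures from this thread; substitute the values from the drive's data sheet and the database block size.

```python
def rotational_ceiling_iops(rpm):
    """Upper bound on ops/s if every operation costs one full revolution."""
    return rpm / 60.0


def random_iops(rpm, avg_seek_ms):
    """Rough random IOPS: average seek plus half a revolution of latency."""
    half_rev_ms = 0.5 * 60000.0 / rpm
    return 1000.0 / (avg_seek_ms + half_rev_ms)


def throughput_mb_s(iops, io_size_kb):
    """Convert an IOPS figure into MB/s for a fixed I/O size."""
    return iops * io_size_kb / 1024.0


if __name__ == "__main__":
    rpm = 15000
    seek_ms = 3.5   # assumed average seek time
    io_kb = 8       # assumed I/O size
    per_disk = random_iops(rpm, seek_ms)
    print("rotational ceiling: %.0f ops/s" % rotational_ceiling_iops(rpm))
    print("estimated random IOPS per disk: %.0f" % per_disk)
    print("random MB/s per disk: %.2f" % throughput_mb_s(per_disk, io_kb))
```

Under those assumptions a single disk doing small random I/O moves only a few megabytes per second, far below its sequential rating, which is why IOPS is the more useful number to set expectations against.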
This is a good distinction to make for me. I want to set my
expectations correctly and then accurately test them.
>> I do plan on using the NoReadAhead read policy, Direct I/O cache
>> policy and WriteBack write policy.
>>
>> Is this a logical way to approach the situation? Will I see the
>> performance gains I expect?
>>
>
> My guess is maybe 20%. Given you are at 100% utilization already, it
> will stay at 100% after you are done and you might see 20% improvement
> in "oracle throughput", whatever that means to you.
>
>
>> Also, in my research online I have seen "/sbin/blockdev --setra
>> 8192 DEVICE" used on RAID5 volumes, up from the default of 256. Will
>> that matter with my DB random I/O profile? It feels like not to me
>> but I plan to test it.
>>
>
> Probably not; Oracle caches lots of things in its global shared memory
> pool, so whatever caching Linux does will be invisible. An impact on
> synthetic benchmark figures is likely but not very useful to your actual
> situation.
>
>
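For scale, blockdev --setra takes its value in 512-byte sectors, so the two settings mentioned above translate as in the sketch below. The helper name is just for illustration.

```python
SECTOR_BYTES = 512  # blockdev --setra counts 512-byte sectors


def readahead_kib(sectors):
    """Convert a --setra value (in sectors) to KiB."""
    return sectors * SECTOR_BYTES / 1024.0


for sectors in (256, 8192):
    print("%d sectors = %.0f KiB" % (sectors, readahead_kib(sectors)))
# prints:
# 256 sectors = 128 KiB
# 8192 sectors = 4096 KiB
```

A 4 MiB readahead window helps when reads are long and sequential. With small reads scattered across the volume, most of that prefetched data would likely never be used.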
I did find and download a tool from Oracle, Orion (Oracle I/O
Calibration Tool):
http://www.oracle.com/technology/software/tech/orion/index.html
I will let you all know how that goes.
>> Is hdparm the best way to test this kind of I/O performance on the
>> volumes? Are there other open source tools you would recommend?
>>
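On the hdparm question: hdparm -t times sequential reads, so it says little about a random I/O profile. As a rough illustration of what a random-read test measures, here is a minimal sketch that issues fixed-size reads at random aligned offsets and reports operations per second. The 8 KB I/O size and the function name are assumptions for illustration. The reads go through the Linux page cache, so against a small or recently read target the result will be far higher than what the disks can deliver, and a purpose-built tool is the better choice for real numbers.

```python
import os
import random
import time


def random_read_iops(path, io_size=8192, count=1000, seed=0):
    """Issue `count` reads of `io_size` bytes at random aligned offsets.

    `path` may be a large file or a block device (read-only access).
    Returns the observed operations per second. Reads are buffered,
    so cached data will inflate the result.
    """
    rng = random.Random(seed)
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)
        blocks = size // io_size
        if blocks == 0:
            raise ValueError("target is smaller than one I/O")
        start = time.perf_counter()
        for _ in range(count):
            offset = rng.randrange(blocks) * io_size
            data = os.pread(fd, io_size, offset)
            if len(data) != io_size:
                raise IOError("short read at offset %d" % offset)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return count / max(elapsed, 1e-9)
```

Running it against a target much larger than RAM, with several copies in parallel, gets closer to how a database drives the array than a single sequential pass does.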
>> I cannot test the current config in any fashion as it is in production
>> at the moment. My maintenance window is next week and I want to have my
>> ducks in a row. I can test the current config before I redo anything.
>>
>> I will be happy to share the results with the list.
>>
>> Thanks,
>> Tim
>>
>
> You may get more mileage out of adding 2 or 8 GB of RAM to your machine,
> and upping the configured Oracle system global area (SGA) memory size,
> than twiddling minor disk layout / stripe parameters. There are also a
> few other memory-related Oracle parameters, and IO related parameters,
> etc that can sometimes help. An index or two, judiciously applied, can
> eliminate table-scans and consequently reduce IO workload by a huge
> amount.
>
> First rule of tuning hardware: fix the software first.
>
> Second rule of tuning hardware: see the first rule.
>
>
> --John
>
>
>
I would agree with you on the RAM upgrade. It is at 16g now. This
machine is being moved to the standby/failover role and a newer 2950 with
32g and faster processors will be put in the active/primary role. And
I would like to see a couple indexes added to some tables that make the
application a real dog when that function starts up with a couple of
ugly queries. But I am just an admin.
>
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at lists.us.dell.com
> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
>