Redundant NFS storage setup (part 3) : The disappointing PERC 5/E (solved?)

Matthias Saou thias at spam.spam.spam.spam.spam.spam.spam.egg.and.spam.freshrpms.net
Fri Jan 4 11:00:43 CST 2008


Harald_Jensas at Dell.com wrote :

> I have had a look at the GPT partition layout on the disk, and I believe
> it also has alignment problems.
> 
> I will try to explain my reasoning: (I hope the drawing stays intact in
> your mail clients.)
> 
> LBA	       0  1  2  3         34
> GPT	       |--|--|--|----------|-----------------Partition 1---------
> PhysDisks    |--------------Stripe 1-----------------|-----Stripe 2----
> Filesystem                       |----I/O-------|----I/O-------|---I/O-
> 
> I got the information about the GPT layout from the Wikipedia entry:
> http://en.wikipedia.org/wiki/GUID_Partition_Table
> 
> Each LBA is 512 bytes in size.
> LBA 0 = Protective MBR
> LBA 1 = Primary GPT Header
> LBA 2 - 33 = Partition entries
> LBA 34 = This is where the first partition will be created.
> 
> LBA 34 is (34 * 512 bytes) = 17408 bytes into the disk. This is quite
> close to what parted reports as the start of the xfs partition Matthias
> created.
> 
> Now if the RAID stripe size is 64KB (65 536 bytes) the remaining space
> on the first stripe is (65 536 - 17 408) = 48 128 bytes.
> If the filesystem element size is 32KB the first filesystem block will
> be created on the first stripe, then the next 32KB filesystem block will
> be split between stripe 1 and stripe 2 and so on.
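
The arithmetic above can be checked with a short sketch. It assumes 512-byte sectors, a 64KB stripe and 32KB filesystem I/Os laid end to end from the partition start, as in the drawing. The function name and the number of blocks examined are only for illustration.

```python
SECTOR = 512  # bytes per LBA, as stated above

def straddling_blocks(start_lba, stripe_bytes, io_bytes, count):
    """Return the indexes of the I/O blocks that cross a stripe boundary."""
    start = start_lba * SECTOR
    crossing = []
    for i in range(count):
        first = start + i * io_bytes   # first byte of block i
        last = first + io_bytes - 1    # last byte of block i
        if first // stripe_bytes != last // stripe_bytes:
            crossing.append(i)
    return crossing

offset = 34 * SECTOR             # bytes from the start of the disk to LBA 34
remaining = 64 * 1024 - offset   # bytes left in stripe 1 after that offset
print(offset, remaining)
print(straddling_blocks(34, 64 * 1024, 32 * 1024, 8))    # partition at LBA 34
print(straddling_blocks(128, 64 * 1024, 32 * 1024, 8))   # partition at LBA 128
```

With the partition at LBA 34 every second 32KB block ends up split across two stripes; with the partition at LBA 128 none of them are.
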
> 
> I believe you can get the partition aligned by specifying start and end
> of the partition in megabytes when you create the partition.
> 
> mkpart part-type [fs-type] start end
> 
> 
> So for a striped RAID volume with Stripe Size set to 64K in the
> controller, do the following to create a partition approximately 10 TB
> in size:
> 
> mkpart part-type [fs-type] 0.0625 10485760
> 
> The result should be something like this:
> PhysDisks    |--------------Stripe 1-----------------|-----Stripe 2----
> Filesystem                                           |----I/O-------|--
> 
> 
> The only way I can think of to verify alignment is to use a sector
> analyzer and check that the start LBA in the partition entry is
> divisible by 128 for a 64KB stripe size array...
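
A sketch of that divisibility check. On Linux the start sector of a partition can usually be read from sysfs (for example /sys/block/sdb/sdb1/start, in 512-byte units) and then tested against the stripe size. The helper name is made up for illustration, and the values passed in are examples, not readings from this system.

```python
SECTOR = 512  # bytes per LBA

def is_stripe_aligned(start_lba, stripe_bytes):
    """True if the partition's first sector sits on a stripe boundary."""
    sectors_per_stripe = stripe_bytes // SECTOR   # 128 for a 64KB stripe
    return start_lba % sectors_per_stripe == 0

print(is_stripe_aligned(34, 64 * 1024))     # default GPT start, 64KB stripe
print(is_stripe_aligned(128, 64 * 1024))    # first 64KB boundary
print(is_stripe_aligned(256, 128 * 1024))   # first 128KB boundary
```
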
> 
> 
> Please do the math over again, I might have made a mistake. And remember
> to use values that fit your setup, stripe-size configured in HW RAID
> Controller  etc.

Very interesting, thanks.

I often like parted trying to hide all of the complexity of
partitioning for us, but in this particular case it's annoying : I tried
creating a partition like you suggested, changing to match my 128k
stripes :

mkpart MD1 xfs 0.125 100%

This doesn't work, it seems that the default isn't megabytes. This does
work, though :

mkpart MD1 xfs 128kB 100%

Now I have this :

Model: DELL PERC 5/E Adapter (scsi)
Disk /dev/sdb: 13.0TB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start  End     Size    File system  Name  Flags
 1      128kB  13.0TB  13.0TB               MD1        
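
One thing worth double-checking at this point: the parted documentation describes "kB" and "MB" as decimal units (1000 and 1000000 bytes) and "KiB" and "MiB" as the binary ones. If that holds for this parted version, a start of 128kB means 128000 bytes, which is not a multiple of a 128KiB stripe. The sketch below only does that arithmetic. The unit table is an assumption taken from the documentation, and the sector parted actually chose is not visible in the output above, so the real value should be confirmed with "unit s" followed by "print".

```python
from fractions import Fraction

SECTOR = 512
# Unit sizes as described in the parted documentation (an assumption for
# the parted version used here): kB/MB are decimal, KiB/MiB are binary.
UNITS = {"kB": 1000, "MB": 1000 ** 2, "KiB": 1024, "MiB": 1024 ** 2}

def start_offset(value, unit):
    """Byte offset for a parted-style start such as ("128", "kB")."""
    return Fraction(value) * UNITS[unit]

def aligned(value, unit, stripe_bytes):
    """True if that byte offset falls exactly on a stripe boundary."""
    return start_offset(value, unit) % stripe_bytes == 0

stripe = 128 * 1024
for value, unit in [("128", "kB"), ("128", "KiB"),
                    ("0.125", "MB"), ("0.125", "MiB")]:
    off = start_offset(value, unit)
    print(value + unit, "bytes:", float(off),
          "sector:", float(off / SECTOR),
          "aligned:", aligned(value, unit, stripe))
```

Under these assumptions only the binary units land on a 128KiB boundary, so a start given as 128KiB (or as sector 256) would be the value to try.
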

FWIW, here is how I now create my XFS filesystem (since my RAID5
consists of 14 disks and has 128k stripes) :
mkfs.xfs -f -L data -i size=512 -d su=128k,sw=14 /dev/sdb1
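
For reference, a sketch of how su and sw relate to the array geometry. The mkfs.xfs man page describes sw as a multiplier of su, usually the number of data disks in a RAID device. For RAID5 that would normally be one less than the number of member disks, so sw=13 may be worth testing against sw=14 here. The parity table and helper name are assumptions for illustration, and the disk count is assumed not to include hot spares.

```python
# Parity disks per RAID level (assumption: `disks` excludes hot spares).
PARITY_DISKS = {"raid0": 0, "raid5": 1, "raid6": 2}

def xfs_stripe_opts(level, disks, stripe_unit_kb):
    """Return (su, sw, full_stripe_kb) for mkfs.xfs -d su=...,sw=..."""
    data_disks = disks - PARITY_DISKS[level]
    return ("%dk" % stripe_unit_kb, data_disks, stripe_unit_kb * data_disks)

su, sw, full = xfs_stripe_opts("raid5", 14, 128)
print("mkfs.xfs -d su=%s,sw=%d   (full stripe: %dk)" % (su, sw, full))
```
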

I just ran some more tests, but don't see any change in the results,
neither for reads nor for writes. I'm still getting good performance, though.

One thing I've just realized which puzzles me a little is this :

# hdparm -t /dev/sdb
/dev/sdb:
 Timing buffered disk reads:  1876 MB in  3.00 seconds = 624.76 MB/sec
# hdparm -t /dev/sdb1
/dev/sdb1:
 Timing buffered disk reads:  872 MB in  3.00 seconds = 290.66 MB/sec

The speeds vary slightly from one run to the other, but the difference
between the entire block device and the partition I've created above is
a factor of 2. With sdb1 starting at 17.4kB, same thing.
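
To compare runs without eyeballing them, the MB/sec figure can be pulled out of the hdparm output and turned into a ratio. This is only a sketch: the function name is made up, and the regular expression assumes the output format shown above.

```python
import re

def throughput(hdparm_output):
    """Extract the MB/sec figure from `hdparm -t` output."""
    match = re.search(r"=\s*([\d.]+)\s*MB/sec", hdparm_output)
    if match is None:
        raise ValueError("no MB/sec figure found")
    return float(match.group(1))

whole = throughput(" Timing buffered disk reads:  1876 MB in  3.00 seconds = 624.76 MB/sec")
part = throughput(" Timing buffered disk reads:  872 MB in  3.00 seconds = 290.66 MB/sec")
print("sdb / sdb1 ratio: %.2f" % (whole / part))
```
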

Could this be a result of the misalignment? I can't help but think it
actually could... I just tested the same against sda and sda1 (RAID1 of
15k SAS drives on the internal SAS5i), and both give me the same speed.

Matthias

-- 
Clean custom Red Hat Linux rpm packages : http://freshrpms.net/
Fedora release 8 (Werewolf) - Linux kernel 2.6.23.9-85.fc8
Load : 0.38 0.64 0.49



More information about the Linux-PowerEdge mailing list