>16tb filesystems on linux

Nick Stephens nick at ceiva.com
Thu Aug 26 13:30:37 CDT 2010


I actually gave that a shot myself but didn't think it was available yet 
due to getting the same error message.  Now that I think about it 
though, it could be a different issue I'm encountering. 

    [root@localhost ~]# mkfs.ext4dev -T news -m0 -L backup -E stride=16,stripe-width=208 /dev/sda1
    mke2fs 1.41.12 (17-May-2010)
    mkfs.ext4dev: Size of device /dev/sda1 too big to be expressed in 32 bits
            using a blocksize of 4096.
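
For reference, the limit in that message is plain arithmetic: 32-bit block numbers with 4096-byte blocks top out at 2^32 x 4096 bytes, i.e. 16 TiB. A rough sketch in Python follows. The array size used here is an assumption: 13 data disks of 2 TB each, counted as decimal terabytes.

```python
# Largest filesystem addressable with 32-bit block numbers.
BLOCK_SIZE = 4096        # bytes, taken from the mke2fs message
MAX_BLOCKS = 2 ** 32     # how many block numbers fit in 32 bits

max_bytes = MAX_BLOCKS * BLOCK_SIZE
max_tib = max_bytes / 2 ** 40

# Assumption: 13 data disks x 2 TB (decimal) of usable space.
array_bytes = 13 * 2 * 10 ** 12
blocks_needed = array_bytes // BLOCK_SIZE

print(max_tib)                      # 16.0
print(blocks_needed > MAX_BLOCKS)   # True: the device needs more than 2^32 blocks
```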


To explain:

In our environment we don't handle large files at all, but rather 
millions of JPG images.  Our file sizes range from around 4 KB to 
1 MB.  Because each read or write moves so little data, the hardware 
never gets close to its maximum potential for sustained 
reads/writes.

Because of this, I typically create new filesystems with the -T news 
flag to get the largest number of inodes, like so:

    [root@localhost ~]# mkfs.ext4 -T news -m0 -L backup -E stride=16,stripe-width=208 /dev/sdb1
    mke4fs 1.41.5 (23-Apr-2009)
    Filesystem label=backup
    OS type: Linux
    Block size=4096 (log=2)
    Fragment size=4096 (log=2)
    ....
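
To put a number on what -T news buys: in the stock mke2fs.conf the "news" type sets inode_ratio = 4096, meaning roughly one inode per 4 KiB of space. The sketch below estimates inode counts for a hypothetical 1 TiB filesystem. The function name, the 1 TiB size, and the 16384 default ratio are assumptions for illustration (check your own /etc/mke2fs.conf), and mke2fs rounds per block group, so real counts will differ slightly.

```python
def estimated_inodes(fs_bytes, inode_ratio=4096):
    """Approximate inode count: one inode per inode_ratio bytes."""
    return fs_bytes // inode_ratio

one_tib = 2 ** 40  # hypothetical filesystem size

news_inodes = estimated_inodes(one_tib)            # ratio used by -T news
default_inodes = estimated_inodes(one_tib, 16384)  # assumed default ratio

print(news_inodes)                    # 268435456
print(default_inodes)                 # 67108864
print(news_inodes // default_inodes)  # 4
```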

The MD1000 is populated with 15 2TB 7200rpm SAS drives in a RAID-5 
with 1 hot spare (leaving 13 data disks).  I know that conventional 
wisdom says that RAID-5 is a poor choice when you are looking for 
performance, but our own benchmarking has shown that in our scenario 
the extra capacity we get from RAID-5 outweighs the redundancy 
provided by RAID-10 (since we were unable to get a significant 
performance increase from RAID-10).
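
For anyone wondering where stride=16,stripe-width=208 in the commands above comes from: stride is the RAID chunk size in filesystem blocks, and stripe-width is stride times the number of data disks. A small sketch, assuming a 64 KiB chunk size (not stated here, but implied by stride=16 with 4096-byte blocks); the helper name is made up for illustration.

```python
def raid_extended_options(chunk_bytes, block_bytes, data_disks):
    """Build the -E argument for mke2fs from the RAID geometry."""
    stride = chunk_bytes // block_bytes   # chunk size in filesystem blocks
    stripe_width = stride * data_disks    # blocks in one full data stripe
    return f"stride={stride},stripe-width={stripe_width}"

# Assumed 64 KiB chunk, 4 KiB blocks, 13 data disks (15 drives, 1 hot spare, 1 parity)
print(raid_extended_options(64 * 1024, 4096, 13))
# stride=16,stripe-width=208
```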

It's been a bit over a year since we did the XFS testing, but IIRC we 
ditched it due to poor delete performance and (I think) overall 
performance issues.  Again, it's been a while, but I do remember doing a 
lot of performance tuning research that did not seem to help us.

Going with an OpenSolaris-type option COULD work and has been kicking 
around in the back of my mind for some time, but I'm hesitant to add a 
new OS to the environment if I don't need to.  I'm trying to keep things 
as simple for the rest of the team as is practical.

Thanks all
Nick


Jeff Layton wrote:
>  You can always build the latest e2fsprogs yourself. They have
> the 16TB fixes in them but haven't gotten a lot of testing so
> be careful (test it out first). I've heard it's mostly the resizing
> piece of ext4 that hasn't been exercised much but you can
> ask the ext4 mailing list.
>
> Jeff
>
>> Hi all,
>>
>> I recently purchased a PE610 with a PERC6 card attached to an MD1000
>> with about 26TB of space.  I know from my own research that ext4
>> supports up to an exabyte, however it appears that the e2fs team has not
>> yet created a mkfs.ext4 that supports anything bigger than 16TB.
>>
>> I have played with XFS in the past, and sadly its performance is
>> severely lacking for our environment, so it is not an option.
>>
>> I am very interested in ZFS, but it seems like it will never make it (in
>> a stable fashion) into the Linux world at this rate.
>>
>> Does anyone have any tips or tricks for this scenario?  I am utilizing
>> RHEL5 based installations, btw.
>>
>> Thanks!
>> Nick


