Software raid rules!

Sigbjorn Strommen sigbjorn.strommen at roxar.com
Thu Apr 18 08:32:00 CDT 2002


jason andrade wrote:

> is your machine going to be lightly loaded ?  your raid ops will

NFS/Samba server only.  No user applications or databases running
locally.


> how does your testing work for when you have a disk failure ?  i

Well, if a disk dies while you are writing to it, and the server
goes down at the same time, then you may have some problems with
inconsistent data...


> have no doubts it should work in degraded mode, but have you tested
> the i/o there, since that should be using a lot more cpu as it has

I have not tested i/o with disks down.  To me that's not so important, as
the user(s) on that volume hopefully can live with degraded performance
until I've replaced the disk (and the volume has been rebuilt).  And I
have no reason to believe that the rebuild would be faster on the hw
controllers.


> tested how the system behaves on a reboot ?  if you have no hardware
> raid at all, you have a disk failure on the OS disk and you have
> a panic/reboot, will it automatically boot from the failover disk
> in software raid ?

Yes.  I was a bit sceptical about putting the system disks in sw raid,
and would have felt more comfortable setting up a cloning scheme.
But several sources told me that sw raid1 works fine, so I took the
chance.


> 
> > For the fun of it I did a stress test, running 8 bonnie sessions in
> > parallel (1 session for each of the partitions on all the raid 5 volumes),
> > and even then I get better block read speed for each process than
> > single runs with hw raid...
> 
> what size of bonnie sessions are you doing ?  i am slightly sceptical
> of those speeds (they are amazing) just in case the machine has a lot
> of ram and these are being cached and read out of ram.

The server has 1 GB of RAM.  I therefore used 2000MB as the data size
in bonnie++.  But because I, like you, was sceptical of the numbers I
got, I reran all 10 tests using 4000MB, and those are the numbers I
reported.
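
Just to illustrate the point about sizing the data set well beyond RAM
(so block reads can't be served from the page cache), here is a rough,
untested Python sketch; the mount point and the 4x multiplier are only
placeholders, not my exact setup:

import subprocess

def ram_mb():
    # Read total physical memory (kB) from /proc/meminfo, return MB.
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemTotal:"):
                return int(line.split()[1]) // 1024
    raise RuntimeError("MemTotal not found in /proc/meminfo")

def run_bonnie(directory, multiplier=4):
    # Use a data set several times larger than RAM (1 GB RAM -> 4000 MB)
    # so the reads really hit the disks and not the cache.
    size_mb = ram_mb() * multiplier
    subprocess.call(["bonnie++", "-d", directory, "-s", str(size_mb)])

run_bonnie("/raid5/test")    # placeholder mount point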

> 
> also, i would do a lot more intensive write tests, as software raid
> has historically been much worse on writes than reads.

Well, the rewrite test in bonnie++ reports almost 3 times better
performance with sw raid...  I also ran 8 bonnie sessions in parallel
(two sessions on each raid 5 volume), and I still got quite decent
write speed numbers.
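
The parallel test was nothing fancy, just several bonnie++ processes
started at the same time, one per partition.  As a rough Python sketch
(the mount points below are placeholders, not my actual layout):

import subprocess

# Placeholder list: 8 test directories, two per raid 5 volume.
MOUNTS = ["/raid5-%d/part%d" % (v, p) for v in (1, 2, 3, 4)
                                      for p in (1, 2)]

def parallel_bonnie(mounts, size_mb=4000):
    # Start one bonnie++ session per directory, then wait for them all.
    procs = [subprocess.Popen(["bonnie++", "-d", d, "-s", str(size_mb)])
             for d in mounts]
    for p in procs:
        p.wait()

parallel_bonnie(MOUNTS)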

Another factor was that the raid controller performance also went
down with more activity, and the dual channel megaraids were not
really dual channels at all (they were lousy at handling i/o to both
channels simultaneously).


> 
> > Another big advantage is that deactivating the aacraid card (using
> > it in scsi mode) - kernel upgrades runs much more smoothly, no more
> > struggling to get the system partitions recognized at boot time.
> 
> is this an issue ? i thought it was just a matter of passing the correct
> arguments to the kernel at boot and it should detect ok.

So did I.  But it didn't work.  Another list member had had the same
problem, and sent me a workaround.


> 
> > So...finally it seems I can put this server in full production. (I only
> > hope that it also will be stable with sw raid, and with this kernel!)
> 
> best of luck with the software raid - it's very interesting to see other
> people's experiences with this.  i will be sticking with hardware raid

Believe me.  Before this server I would not have put sw raid 5 on any
box!  And before I moved to sw raid I think I did just about everything
to get hw raid to perform.  In the end (after months of testing, tweaking,
advice from other people, etc.) the hw raid performance was still so
incredibly bad that I had three options: get better controllers from
elsewhere, get my money back for the whole setup and buy other equipment,
or try sw raid.  With my time limits long since exceeded, sw raid was the
only realistic option.

> though, as i have a higher comfort factor with it (and also, the aacraid
> only has to have enough performance to keep up with the OS - we do all
> the serious i/o using fibrechannel cards/raid servers/disks)


If you have scanned the archives you may have seen that our plan was
also to buy fiber (PV660F), and that we were promised Linux support
within just a couple of months (at the time).  Then we were told that
the 660F would not be supported under Linux at all, so we were left
with the 210S's and scsi hw raid as the only option.

I also intended to use the aacraid for system disks only, and two dual
channel megaraids for the data disks.

Then I did some of the tests you asked about above on the system disks
attached to the aacraid controller, and was not at all happy with the
recovery procedures.

At one of the many reinstalls/upgrades I tried (to get the megaraids
to perform) I wanted to keep the current OS in case of disaster.
To also test the failover of the system mirror I unplugged one of the
disks, and this worked fine.  I then tried to add a disk back into the
mirror (same size, but a different brand), and was unable to get the
mirror working again with two disks.  Maybe that was due to my lack of
experience with the aacraid command interface, but at least I feel more
comfortable having the same recovery procedure for all disks on this
system by using software raid.



So, to reach some kind of conclusion on all of this:  If you are concerned
about performance, and you do more reads than writes (which is the case
in most environments), you should use software raid (and it's much cheaper
to buy more CPU power than raid controllers :-)




-Sigbjorn

PS!
I got bonnie++ numbers from a guy with a fiber setup (660F) on Linux,
and I was not impressed by those either...



