> how does your testing work for when you have a disk failure ?  i
> have no doubts it should work in degraded mode, but have you tested
> the i/o there, since that should be using a lot more cpu as it has
> to xor the data from the disks to recreate it.   have you also
> tested how the system behaves on a reboot ?  if you have no hardware
> raid at all, you have a disk failure on the OS disk and you have
> a panic/reboot, will it automatically boot from the failover disk
> in software raid ?

Our most common server configuration is all software RAID: the OS is
RAID 1 on the first 2 disks, and the data drives are either RAID 1 or
RAID 5.  Over the years of using software RAID on heavily used systems,
I've seen drives die with no performance hit on the system.  I do
recommend having a spare disk, not for performance reasons, but because
one drive may die silently and go unnoticed for another week, until a
second drive dies and takes out the array.
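
To catch the silent-failure case, something along these lines can be run
from cron.  This is only a rough sketch, assuming Linux md arrays and the
usual /proc/mdstat layout, where a status map like [UU] means all members
are up and [_U] means one is missing.  The function name degraded_arrays
is made up for illustration; it is not part of any tool.

```python
import re

# Array header line, e.g. "md0 : active raid1 sdb1[1] sda1[0]".
ARRAY_RE = re.compile(r"^(md\d+)\s*:")
# Member status map, e.g. "[UU]" or "[_U]"; "_" marks a missing member.
STATUS_RE = re.compile(r"\[([U_]+)\]")


def degraded_arrays(mdstat_text):
    """Return the names of arrays whose status map shows a missing member.

    Assumes the text follows the common /proc/mdstat layout: a header line
    per array, followed by a line carrying the [UU]-style status map.
    """
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if current and status and "_" in status.group(1):
            degraded.append(current)
            current = None  # report each array only once
    return degraded


# Typical use (hypothetical): read the live file and mail or log the result.
#   with open("/proc/mdstat") as f:
#       names = degraded_arrays(f.read())
```

If the returned list is non-empty, at least one array is running without
a member and the dead drive should be replaced before a second one goes.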

I've lost the primary OS disks and the machine kept running.  Later on we
rebooted the machine (after pulling the former sda1 from the box), and the
system booted normally.  For this to happen you must be using lilo.  Only
once did we have a kernel panic that caused an unrecoverable error.  Most
of the time you can just reboot and carry on.

