In 2.6.18 the default I/O scheduler changed from Anticipatory to CFQ.
CFQ generally allows more throughput, and therefore more concurrent
load, to reach the controller. Unfortunately, that also puts more stress
on a buggy controller, which would explain the crashes.
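
If you want to test this theory, you can check which scheduler a disk is
using and switch it back at runtime through sysfs. Below is a rough
sketch, not something I have run against your setup. The device name
"sda" is only a placeholder and the helper names are made up for
illustration. The sysfs file lists the available schedulers with the
active one in square brackets, and writing a name to it (as root)
switches the scheduler for that device.

```python
from pathlib import Path


def parse_scheduler_line(line):
    """Split a sysfs scheduler line into (active, available).

    The kernel prints something like "noop anticipatory deadline [cfq]",
    where the bracketed entry is the scheduler currently in use.
    """
    active = None
    available = []
    for name in line.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def scheduler_path(device):
    # 'device' is a block device name such as "sda" (placeholder).
    return Path("/sys/block") / device / "queue" / "scheduler"


def get_scheduler(device):
    """Read the active and available schedulers for a device."""
    return parse_scheduler_line(scheduler_path(device).read_text())


def set_scheduler(device, name):
    """Switch the scheduler for a device; needs root privileges."""
    active, available = get_scheduler(device)
    if name not in available:
        raise ValueError(f"{name!r} not offered by this kernel: {available}")
    scheduler_path(device).write_text(name)


if __name__ == "__main__":
    # Placeholder device name; substitute the disk behind the controller.
    print(get_scheduler("sda"))
```

If the crashes go away after something like set_scheduler("sda",
"anticipatory") under the same load, that would point at the scheduler
change rather than the driver version.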

This is very interesting to me: firmware 5.0.3-00(10/01?) has been very
stable for us. Has 5.1.1-0040 regressed?

> I was wrong; the following is my updated test table:
>                     megasas-commit    megasas      dm-crypt    works
> 2.6.15-gentoo-r5    2005-11-10        02.00-rc4    1.1.0       yes
> 2.6.16-hardened-r11 2006-02-28        02.04        1.1.0       yes
> 2.6.16-hardened-r11 2006-07-02        03.01        1.1.0       yes, with RESETs
>                     2006-07-02        03.01        1.1.0       no
>                     2006-07-03        03.01        1.1.0       no
> 2.6.18-hardened-r6  2006-07-03        03.01        1.1.0       no
> 2.6.20-hardened-r2  current/hand      03.05/03.09  1.3.0       no
> It seems to indicate that, instead of the version of the driver, the
> bug is dependent on the version of the kernel.  All of the panic
> traces I have indicate consistent failure in megasas_isr, and are
> either "Unable to handle kernel paging request" or "Unable to handle
> kernel NULL pointer dereference".  To my untrained eye, this looks
> like a race condition - megasas_isr (Interrupt Service Routine) is
> trying to service an interrupt that has already been handled by
> another thread.  I can do singleton writes to the disk (small edits to
> files), but once the load increases and there are multiple, parallel
> interrupts and several items on the queue, the issue immediately
> appears.  It's likely exacerbated by my use of dm-crypt and XFS as
> well.
> I know this may be beyond the level of this list, but it's worth a try...
