megasas: MFI FW status 0x3

Joe Malicki jmalicki at metacarta.com
Wed Apr 25 12:44:26 CDT 2007


In 2.6.18 the default disk scheduler changed from Anticipatory to CFQ.
CFQ is a much better scheduler in that it allows more throughput and
load to go to the controller... unfortunately, this also puts more
stress on a buggy controller, which can then crash.
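
If you want to test whether the scheduler change is the trigger, the
active scheduler for a disk can be read from (and, as root, written to)
/sys/block/<dev>/queue/scheduler, where the active one is shown in
square brackets. Below is a minimal sketch of that check. The device
name "sda" and the helper names are just assumptions for illustration;
adjust them for your system.

```python
def parse_scheduler(text):
    """Parse the contents of /sys/block/<dev>/queue/scheduler.

    The kernel lists the available schedulers on one line and marks
    the active one with square brackets, for example:
        noop anticipatory deadline [cfq]
    Returns (active, available).
    """
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def read_scheduler(device="sda"):
    # "sda" is an assumed device name, not taken from this thread.
    path = "/sys/block/%s/queue/scheduler" % device
    with open(path) as f:
        return parse_scheduler(f.read())


def set_scheduler(name, device="sda"):
    # Needs root. Takes effect at runtime and does not survive a reboot.
    path = "/sys/block/%s/queue/scheduler" % device
    with open(path, "w") as f:
        f.write(name)
```

Calling set_scheduler("anticipatory") and re-running the failing
workload would show whether the crash still occurs with the old default.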

This is very interesting to me... firmware 5.0.3-00(10/01?) has been very
stable for us... has 5.1.1-0040 regressed?

RB wrote:
>> the 2.6.16 kernel next.  All indications are that it will panic and
>> clear any of the other kernel updates/patches of causing the issue.
>> If so, one of the [many] changes between 2006/02/28 and 2006/07/03 is
>> suspect for introducing this bug.
>>     
>
> I was wrong; the following is my updated test table:
>
> kernel              megasas-commit    megasas-ver  dm-crypt    works
> 2.6.15-gentoo-r5    2005-11-10        02.00-rc4    1.1.0       yes
> 2.6.16-hardened-r11 2006-02-28        02.04        1.1.0       yes
> 2.6.16-hardened-r11 2006-07-02        03.01        1.1.0       yes, with RESETs
> 2.6.18.8            2006-07-02        03.01        1.1.0       no
> 2.6.18.8            2006-07-03        03.01        1.1.0       no
> 2.6.18-hardened-r6  2006-07-03        03.01        1.1.0       no
> 2.6.20-hardened-r2  current/hand      03.05/03.09  1.3.0       no
>
> It seems to indicate that, instead of the version of the driver, the
> bug is dependent on the version of the kernel.  All of the panic
> traces I have indicate consistent failure in megasas_isr, and are
> either "Unable to handle kernel paging request" or "Unable to handle
> kernel NULL pointer dereference".  To my untrained eye, this looks
> like a race condition - megasas_isr (Interrupt Service Routine) is
> trying to service an interrupt that has already been handled by
> another thread.  I can do singleton writes to the disk (small edits to
> files), but once the load increases and there are multiple, parallel
> interrupts and several items on the queue, the issue immediately
> appears.  It's likely exacerbated by my use of dm-crypt and XFS as
> well.
>
> I know this may be beyond the level of this list, but it's worth a try...
>


