Megasas driver lock?

Christopher Stura christophers at anthesi.it
Mon Mar 2 01:36:22 CST 2009


I had the same problem using kernel 2.6.18 on a configuration identical
to yours. Here is the trick: you have to download the Dell OMSA
(OpenManage) package, which installs a program that loads a
supplementary mptsas module into the kernel (supposedly) so that
OpenManage can communicate with the controller. That module wraps calls
to the megasas driver. Once it is installed, you should no longer get
crashes of the whole machine, only random crashes of the megasas module
under the wrapper driver. This is not good, but the machine stays up
and running and you only experience random delays in writeback to the
disk array.
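
To confirm that the wrapper module actually got loaded after installing
the package, something like the following rough sketch could be used. It
only parses the text of /proc/modules. The module names it looks for
(megaraid_sas for the megasas driver, and mptsas as mentioned above) are
assumptions; the module installed on a given system may be named
differently, so adjust the list as needed.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first whitespace-separated field is the module name.
            names.add(fields[0])
    return names


def report(proc_modules_text, wanted=("megaraid_sas", "mptsas")):
    """Map each wanted module name to True/False (loaded or not).

    The default names are assumptions for illustration only.
    """
    present = loaded_modules(proc_modules_text)
    return {name: name in present for name in wanted}


if __name__ == "__main__":
    with open("/proc/modules") as f:
        for name, is_loaded in report(f.read()).items():
            print(name, "loaded" if is_loaded else "NOT loaded")
```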

Once the wrapper driver was installed, it produced errors that show up
in dmesg and nowhere else. Based on those errors, Dell eventually
replaced the backplane and the PERC SAS controller on my 2950, and even
the random writeback crashes disappeared. It took me 45 days with Dell
technical support to get to this solution. The only catch is that you
might need to revert to kernel 2.6.18 to get the wrapper driver to work
on your server. I am sure you can find an Ubuntu 2.6.18 kernel for your
distribution if that is a problem.
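
Since the errors only appear in dmesg, it can help to pull the relevant
lines out and save them before calling support. Below is a minimal
sketch of that. The match patterns are assumptions (driver messages
containing "megasas" or "megaraid", plus the "Disabling IRQ" message
quoted below); the real error text on your machine may differ, so widen
the patterns if nothing matches.

```python
import re
import subprocess

# Assumed patterns for illustration; extend them to match your own logs.
PATTERNS = [
    re.compile(r"megasas|megaraid", re.IGNORECASE),
    re.compile(r"Disabling IRQ #\d+"),
]


def filter_kernel_log(text):
    """Return the lines of a kernel log that match any pattern above."""
    return [
        line
        for line in text.splitlines()
        if any(p.search(line) for p in PATTERNS)
    ]


def read_dmesg():
    """Run dmesg and return its output (may need root on some systems)."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    for line in filter_kernel_log(read_dmesg()):
        print(line)
```

Redirecting the output to a file from cron every few minutes would keep
a copy around even if the box locks up before you can look at it.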

Hope this helps

Chris.

Avleen Vig wrote:
> Poweredge 2950
> PERC 6/i
> Running Ubuntu with a 2.6.24-23 kernel.
>
> For a few months, at seemingly "random" times, the server has been crashing
> and resetting or locking up, with no diagnostic data returned.
>
> Running various diagnostic tools hadn't revealed anything.
> Most of the crashes I'm aware of had a common trait, where there was a
> lot of disk activity (lots of interrupts from megasas).
> I tried to reproduce it a few times with bonnie but couldn't make the
> system crash.
>
> It happened again tonight, and I saw this on a terminal I fortunately
> had open:
>   kernel: [3542884.516547] Disabling IRQ #16
> IRQ16 is megasas.
>
> This was moments after our monitoring system reported a system load of
> just over 85.
> The box deadlocked.
>
>
> Does anyone have any suggestions? I'm out of ideas.
>
>   



More information about the Linux-PowerEdge mailing list