serious stability issues with Dell C6145 and C410x

Mark Nipper nipsy at
Fri Jul 29 11:04:37 CDT 2011

On 29 Jul 2011, Stijn De Weirdt wrote:
> we run a recompiled 2.6.32-131.4 and saw that this really mattered a lot
> wrt compute times. the main changes were to disable no_hz and set the
> cpu_freq to 100Hz. we also stripped a lot of unnecessary stuff from the
> default kernels (these are compute nodes after all).
> (the bios settings are performance, so no power saving features enabled)

	Well, I was hoping to avoid making these somewhat
Frankenstein-ish by running custom compiled kernels or installing
non-package managed software (as much as possible).  But it might
be unavoidable unfortunately.  I guess I can start with disabling
dynamic ticks via the kernel command line though since it should
be rather painless (assuming nohz=off works to disable it).
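
	A minimal sketch of how one might confirm after a reboot
that the option actually reached the kernel, assuming Python is
available on the node.  The helper names (kernel_params,
nohz_disabled) are made up for illustration, and this only shows
that the parameter was passed on the command line, not that the
kernel honoured it.

```python
def kernel_params(cmdline):
    """Split a kernel command line string into a {name: value} dict.

    Parameters without '=' (e.g. 'quiet') map to None.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def nohz_disabled(cmdline):
    """Return True if 'nohz=off' appears on the given command line."""
    return kernel_params(cmdline).get("nohz") == "off"


if __name__ == "__main__":
    # /proc/cmdline holds the parameters the running kernel was booted with.
    try:
        with open("/proc/cmdline") as f:
            booted = f.read()
    except OSError:
        booted = ""
    print("nohz=off present:", nohz_disabled(booted))
```

	Keeping the parsing separate from the file read means the
same check can be run against a saved grub entry as a plain string.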

> we also had performance issues with the raid0 of the SAS2008 cards we
> have. new firmware fixed that, but it was not standard (we got help from
> dell support though)

	We're just doing RAID-5 here.  I did notice that the RHEL
6.1 formatting of our ext4 root file system took significantly
longer than the Debian testing installer's format.  I couldn't
tell if RHEL simply wasn't using sparse_super or if it was an
actual problem with the megaraid_sas driver in the RHEL kernel.

	I assume you're running a direct from LSI firmware now?

> for now things are starting to look good, my only remaining issue with
> the boxes is that i can't get the pcie max payload higher than 128 bytes on
> our IB cards (something also important for your setup i assume). 

	Just getting the GPGPU cards stable in the C410x
enclosure is the first step.  I'm not at all concerned about
performance right now if trying to use the cards at all means the
system locks up.
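
	For when performance does matter, a rough sketch of pulling
the PCIe payload sizes mentioned above out of saved `lspci -vv`
output.  It assumes the values appear as "MaxPayload N bytes" in
that output; the function name and the sample text are invented
for illustration and are not taken from real hardware.

```python
import re


def max_payload_values(lspci_text):
    """Return every 'MaxPayload N bytes' value found, in order of appearance."""
    return [int(n) for n in re.findall(r"MaxPayload (\d+) bytes", lspci_text)]


# Hand-written sample for illustration only, not captured from a real device.
SAMPLE = """\
DevCap: MaxPayload 256 bytes, PhantFunc 0
DevCtl: MaxPayload 128 bytes, MaxReadReq 512 bytes
"""

if __name__ == "__main__":
    print(max_payload_values(SAMPLE))
```

	In this sample the first value is what the device advertises
and the second is what is currently configured, so a gap between
the two would be the situation described in the quoted text.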

Mark Nipper
nipsy at
+1 512 471 3483 - office
+1 979 575 3193 - cell
I cannot tolerate intolerant people.

More information about the Linux-PowerEdge mailing list