serious stability issues with Dell C6145 and C410x

Mark Nipper nipsy at mail.utexas.edu
Fri Jul 29 11:31:35 CDT 2011


On 29 Jul 2011, Mark Nipper wrote:
> On 29 Jul 2011, Stijn De Weirdt wrote:
> > we run a recompiled 2.6.32-131.4 and saw that this really mattered a lot
> > wrt compute times. the main changes were to disable no_hz and set the
> > cpu_freq to 100Hz. we also stripped a lot of unnecessary stuff from the
> > default kernels (these are compute nodes after all).
> > (the bios settings are performance, so no power saving features enabled)
> 
> 	Well, I was hoping to avoid making these somewhat
> Frankenstein-ish by running custom compiled kernels or installing
> non-package managed software (as much as possible).  But it might
> be unavoidable unfortunately.  I guess I can start with disabling
> dynamic ticks via the kernel command line though since it should
> be rather painless (assuming nohz=off works to disable it
> completely).

	Well, nohz=off is literally a night and day difference.
So far it has meant that simple things like "nvidia-smi -pm 1"
don't hang the system and python-pyopencl is actually detecting
both GPGPU devices correctly now and talking to them.  This is
already a huge improvement!

	Hopefully disabling dynamic ticks was the only issue
here.  Thanks for the pointer.
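
	For anyone wanting to confirm that a node actually booted
with the option, here is a minimal sketch in Python.  It only
checks that the literal token nohz=off appears on the kernel
command line as exposed in /proc/cmdline; it does not prove the
kernel honored it.  The function names are made up for
illustration and are not from any existing tool.

```python
def nohz_disabled(cmdline: str) -> bool:
    """Return True if the token 'nohz=off' appears on the command line.

    Assumption: an exact whitespace-separated token match is enough;
    this does not inspect the kernel's actual tick behaviour.
    """
    return "nohz=off" in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the kernel command line of the running system (Linux only)."""
    with open(path) as f:
        return f.read()
```

	On a booted node, nohz_disabled(read_cmdline()) should
return True once the boot loader entry has been updated and the
machine rebooted.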

-- 
Mark Nipper
nipsy at mail.utexas.edu
+1 512 471 3483 - office
+1 979 575 3193 - cell
-
There are 10 kinds of people in the world; those who know binary
and those who don't.



More information about the Linux-PowerEdge mailing list