serious stability issues with Dell C6145 and C410x

Stijn De Weirdt stijn.deweirdt at ugent.be
Fri Jul 29 11:11:38 CDT 2011


> > we run a recompiled 2.6.32-131.4 and saw that this really mattered a lot
> > wrt compute times. the main changes were to disable no_hz and set the
> > timer frequency (CONFIG_HZ) to 100Hz. (we also stripped a lot of
> > unnecessary stuff from the default kernels; these are compute nodes
> > after all.)
> > (the bios settings are performance, so no power saving features enabled)
> 
> 	Well, I was hoping to avoid making these somewhat
> Frankenstein-ish by running custom compiled kernels or installing
> non-package managed software (as much as possible).  
we have rpms of these kernels ;)

> But it might
> be unavoidable unfortunately.  I guess I can start with disabling
> dynamic ticks via the kernel command line though since it should
> be rather painless (assuming nohz=off works to disable it
> completely).
> 
it should. (the default timer frequency is still 1kHz though; nohz=off
doesn't change that)
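
to check what a node is actually running, something like the following might help. it is only a sketch: it assumes the kernel config is installed as /boot/config-<release> (which is where distro kernels usually put it), and the helper names are made up for illustration.

```python
import os
import re


def nohz_disabled(cmdline):
    """True if the kernel command line contains the nohz=off parameter."""
    return "nohz=off" in cmdline.split()


def config_hz(config_text):
    """Return the CONFIG_HZ value from a kernel config, or None if absent."""
    match = re.search(r"^CONFIG_HZ=(\d+)$", config_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def report():
    """Print the tick settings of the running kernel (Linux only)."""
    try:
        with open("/proc/cmdline") as f:
            print("nohz=off on cmdline:", nohz_disabled(f.read()))
    except OSError:
        print("cannot read /proc/cmdline")
    # Assumption: the distro installs the config next to the kernel image.
    path = "/boot/config-" + os.uname().release
    try:
        with open(path) as f:
            print("CONFIG_HZ:", config_hz(f.read()))
    except OSError:
        print("no readable kernel config at", path)
```

calling report() on a compute node prints both values; the two parsing helpers can also be fed text collected from other nodes.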

> > we also had performance issues with the raid0 of the SAS2008 cards we
> > have. new firmware fixed that, but it was not standard (we got help from
> > dell support though)
> 
> 	We're just doing RAID-5 here.  I did notice that the RHEL
> 6.1 formatting of our ext4 root file system took significantly
> longer than the Debian testing installer's format.  I couldn't
> tell if RHEL simply wasn't using sparse_super or if it was an
> actual problem with the megaraid_sas driver in the RHEL kernel.
> 
> 	I assume you're running a direct from LSI firmware now?
> 
yes, but ours are simple JBOD/RAID0/RAID1 cards, not real raid
controllers. performance of raid0 with two 15k rpm sas disks was 50MB/s
(dd, direct write flag). after the update it's 300+
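
the dd test above can be wrapped roughly like this. a sketch only: the block size and count are assumptions (the exact values used are not given here), the target path is hypothetical, and the target gets overwritten, so point it at a scratch file or a disk you can destroy.

```python
import subprocess
import time

MIB = 1024 * 1024


def dd_direct_cmd(target, count=1024):
    """Build a GNU dd command writing `count` 1 MiB blocks with O_DIRECT."""
    return [
        "dd",
        "if=/dev/zero",
        "of=" + target,
        "bs=1M",
        "count=" + str(count),
        "oflag=direct",  # bypass the page cache, like the test above
    ]


def mb_per_s(total_bytes, seconds):
    """Throughput in MB/s (10^6 bytes per second, the unit dd reports)."""
    return total_bytes / seconds / 1e6


def measure(target, count=1024):
    """Run dd against `target` (which is OVERWRITTEN) and return MB/s."""
    start = time.monotonic()
    subprocess.run(dd_direct_cmd(target, count), check=True,
                   stderr=subprocess.DEVNULL)
    return mb_per_s(count * MIB, time.monotonic() - start)
```

for example, measure("/scratch/ddtest.bin") would write 1 GiB to that (hypothetical) file and return the rate.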


stijn
> > for now things are starting to look good, my only remaining issue with
> > the boxes is that i can't get the pcie max payload higher than 128 bytes
> > on our IB cards (something also important for your setup i assume).
> 
> 	Just getting the GPGPU cards stable in the C410x
> enclosure is the first step.  I'm not at all concerned about
> performance right now if trying to use the cards at all means the
> system locks up.
> 
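
on the pcie max payload point: the supported and the configured value can be compared from `lspci -vv` output. a rough sketch, assuming the usual layout where the first "MaxPayload" of a device is under DevCap (what the card supports) and the second under DevCtl (what is currently set). the function name and the sample text are made up for illustration.

```python
import re


def max_payload(lspci_vv_device_text):
    """Return (supported, configured) MaxPayload in bytes for one device.

    Assumes the first MaxPayload entry is under DevCap and the second
    under DevCtl; returns None if two entries are not found.
    """
    sizes = re.findall(r"MaxPayload (\d+) bytes", lspci_vv_device_text)
    if len(sizes) < 2:
        return None
    return int(sizes[0]), int(sizes[1])


# Illustrative sample only, not output captured from a real node.
SAMPLE = """\
        Capabilities: [60] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal-
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
"""

supported, configured = max_payload(SAMPLE)
print("supported:", supported, "configured:", configured)
```

feeding it the output of `lspci -vv -s <slot>` for the IB card (run as root so the capabilities are shown) shows whether the card itself supports more than what was negotiated.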

-- 
http://hasthelhcdestroyedtheearth.com/



