Problems with R410

Dennis Jacobfeuerborn dennisml at conversis.de
Fri Jan 6 18:00:53 CST 2012


Hi,
just a little heads up for those who are having similar problems:

On 12/30/2011 02:30 AM, Dennis Jacobfeuerborn wrote:
> On 12/30/2011 12:11 AM, Mark Nipper wrote:
>> On 29 Dec 2011, Dennis Jacobfeuerborn wrote:
>>> That bug isn't accessible but I found it referenced in other posts on
>>> this mailing list which mention problems with the C-state handling of
>>> the CPUs.
>>> I'm not sure though what the best workaround is. From what I gather,
>>> changing the settings in the BIOS does not work because the OS
>>> resets them. Does that mean I can keep the BIOS settings and that
>>> simply providing the kernel option "intel_idle.max_cstate=2" should be
>>> enough? I also saw that somebody set "intel_idle.max_cstate=0
>>> processor.max_cstate=1".
>>
>> 	We're using just:
>> ---
>> intel_idle.max_cstate=2
>>
>> and it's basically solved the problem.  Setting the options in
>> BIOS did NOT fix our problem since, as you pointed out, the
>> kernel simply does its own thing.
>
> Ok, I'll try that and see if it fixes our problems as well, thanks.

The system has been running fine since I added the kernel parameter, so we
were indeed suffering from the C-state issue, and the workaround has worked
for us so far.
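
For anyone who wants to confirm that the parameter actually reached the
running kernel after a reboot, here is a minimal sketch in Python. It only
parses a kernel command line string for the two options mentioned in this
thread; the function names are my own and purely illustrative, and reading
/proc/cmdline assumes a Linux host.

```python
def cstate_limits(cmdline):
    """Return the C-state limits found in a kernel command line string."""
    wanted = ("intel_idle.max_cstate", "processor.max_cstate")
    limits = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in wanted:
            try:
                limits[key] = int(value)
            except ValueError:
                # Ignore values that are not plain integers.
                pass
    return limits


def running_cstate_limits(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return cstate_limits(f.read())
```

Calling running_cstate_limits() on the affected machine should return a
dictionary containing intel_idle.max_cstate if the boot loader entry was
edited correctly; an empty dictionary means the option was not passed.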

>>> The iDRAC problem makes this worse as we cannot reboot the machine
>>> remotely when it goes down. So far we have used Supermicro based
>>> systems and there I just changed the IPMI network settings and was
>>> ready to go. On the two R410s we now have (more arriving next week)
>>> the iDRACs just tell me there is no link even though there clearly
>>> is.
>>
>> 	No promises that you're even running into this same
>> problem.  But, it's definitely the first thing to exclude as a
>> possibility since it seems to affect a lot of people.  The DRAC
>> issue you're having sounds unrelated to the hangs, and is
>> something I have not personally seen on any of our R series
>> machines.  I would say you might have a hardware issue there if
>> it weren't for the problem being duplicated across all of your
>> R410s.  It might be worth trying a different switch to rule out
>> a bad interaction between your particular switch and the DRACs.
>> I'm also assuming your Ethernet cables have all been checked too.
>>
>
> Yes, we used a different cable and had both ports connected, but we haven't
> tried another switch yet. That's probably what we are going to do next,
> although it would be really weird since the system itself seems to run fine
> so far without any issues, so at least physically server and switch seem to
> get along quite well.
> Maybe the fact that I never configured the DRAC through the console but
> only using the shell under CentOS has something to do with this? That would
> be strange but I'm running out of options.

As it turned out, the systems were delivered with the management
functions turned off. I had to create a DOS boot USB stick with the
Broadcom uxdiag.exe tool and issue a "uxdiag -t abcd -mfw 1" to enable the
management.
After that the DRAC detected the link and everything started working as
expected.
I'm not exactly happy that it took Dell almost two weeks to figure this out,
during which they made us reconfigure the DRAC several times and gave us
dubious information ("If you use vlans on the host you can no longer use
vlans with DRAC"), but at least we finally got to the bottom of this.

Regards,
   Dennis
