IPMItool returns false value on PowerEdge 1950

Harald_Jensas at Dell.com Harald_Jensas at Dell.com
Tue Jan 2 02:36:37 CST 2007


> -----Original Message-----
> From: Simon Gao [mailto:gao at schrodinger.com] 
> Sent: 29 December 2006 20:39
> To: Jensas, Harald
> Cc: linux-poweredge-Lists
> Subject: Re: IPMItool returns false value on PowerEdge 1950
> 
> Harald_Jensas at dell.com wrote:
> >> To: linux-poweredge-Lists
> >> Subject: Re: IPMItool returns false value on PowerEdge 1950
> >>
> >>     
> >
> > Have a look at page 86, 87 and 89 in this document.
> >
> > http://www.intel.com/design/xeon/datashts/313355.htm
> >
> > There is no diode in the new processors to read an absolute CPU
> > temperature from. The value you are seeing as negative is a relative
> > value that the system uses to control the fans.
> >
> > Quote:
> > "Fan speed control solutions utilize a TControl value stored in the
> > processor IA32_TEMPERATURE_TARGET MSR. Prior to Dual-Core Intel Xeon
> > Processor 5100 Series, TControl represented a diode temperature. With
> > Dual-Core Intel Xeon Processor 5100 Series, TControl represents an
> > offset from TCC activation temperature. The DTS outputs temperature
> > offsets over the PECI interface in response to a GetTemp0() command
> > and these offsets are relative values vs. an absolute values."
> >
> >
> > Since GetTemp0() is used both on older CPUs with a diode (absolute
> > values) and on newer CPUs with digital thermal sensors (relative
> > values), the BMC will add the sensor to the SDR list. Thus, on systems
> > with a newer CPU you will get the confusing relative readings.
> >
> > I am not sure whether this can be fixed, or whether a fix would belong
> > in BMC firmware or in ipmitool...
> >
> >
> > Your hardware is working as expected. Just disregard the CPU temp
> > values you are seeing.
> >
> >   

> I don't think this is an acceptable option. Dell should put a
> patch out to help convert the relative value to an accurate absolute
> value. There will be a time when such an absolute value is
> necessary and required in troubleshooting a system problem that
> might be related to CPU or system overheating. It's very
> helpful to tell if fans fail to keep up with work.
> 

I do not see why you cannot use the negative values, which move closer to zero as the system gets hotter, to see whether a CPU is overheating or whether the fans are not efficient enough.

Tjmax is processor-specific, so if you use an absolute value you will have to read the CPU datasheet to know whether you are getting close to the maximum temperature.
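
To make the point concrete, here is a rough Python sketch of how a relative reading could be interpreted. It is only an illustration under stated assumptions: the reference temperature (ASSUMED_REFERENCE_C) and the two margin thresholds are hypothetical values, not figures from Dell or Intel, and the function names are made up for this example. The real reference temperature has to come from the datasheet for your particular processor.

```python
# Sketch only: interpret a relative CPU temperature reading, i.e. a
# negative offset in degrees C below a processor-specific reference.

# Hypothetical value. Look up the real reference temperature in the
# datasheet for your specific processor.
ASSUMED_REFERENCE_C = 85.0

# Hypothetical alert thresholds, expressed as remaining margin (degrees C).
WARN_MARGIN_C = 10.0
CRIT_MARGIN_C = 2.0


def to_absolute(relative_c, reference_c=ASSUMED_REFERENCE_C):
    """Estimate an absolute temperature from a relative (negative) reading."""
    return reference_c + relative_c


def classify(relative_c, warn=WARN_MARGIN_C, crit=CRIT_MARGIN_C):
    """Classify a relative reading by how close it is to zero."""
    margin = -relative_c  # degrees C left before the reference is reached
    if margin <= crit:
        return "critical"
    if margin <= warn:
        return "warning"
    return "ok"


if __name__ == "__main__":
    for reading in (-40.0, -8.0, -1.0):
        print(reading, classify(reading), to_absolute(reading))
```

Note that classify() needs no processor-specific data at all, while to_absolute() is only as good as the reference temperature you supply.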

> Maybe you should check if HP Proliant machines with the same
> CPU have a similar problem. If not, then you guys have got to do
> something about it.
> 

In my humble opinion the way it works now is better. Regardless of the specification of the CPU, you will know that a value too close to zero is bad.



--
Harald Jensås