[Linux-PowerEdge] Fwd: syslog gets hosed with CPU power

Anthony Ciani aciani at sivananthanlabs.us
Thu Sep 5 15:32:11 CDT 2013


Hi Fred,

 

The clearcpuid option may not be present in your kernel.  It just masks certain CPU features so that the kernel doesn’t use them.  It is also possible that the feature bit numbers on your kernel and CPUs are different from the ones used by the person who suggested it.
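
A quick way to see whether the CPUs advertise these features at all is to look at the flags line in /proc/cpuinfo.  This is only a sketch: has_cpu_flag is a helper name I made up, and I am assuming the power limit notification features show up there as "pln" (core) and "pts" (package), which may not hold for every kernel.

```shell
#!/bin/sh
# Report whether a CPU feature flag is listed in a cpuinfo-style file.
# Usage: has_cpu_flag FLAG [FILE]   (FILE defaults to /proc/cpuinfo)
# has_cpu_flag is a hypothetical helper, not an existing tool.
has_cpu_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    # Take the first "flags" line, split it into one flag per line,
    # and look for an exact, literal match.
    grep -m1 '^flags' "$file" | cut -d: -f2 | tr -s '[:space:]' '\n' | grep -qxF "$flag"
}

# Example:
#   for f in pln pts; do
#       if has_cpu_flag "$f"; then echo "$f: advertised"; else echo "$f: not listed"; fi
#   done
```

If neither flag is listed, masking them with clearcpuid would not be expected to change anything.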

 

These log messages appear on a lot more than just PowerEdge systems.  Everything with the Intel Sandy Bridge and later architectures with frequency stepping enabled will generate them.

 

As mentioned, these are nothing but harmless notifications about the load on the system.  The “bug” was solved by disabling these messages by default, but allowing the user to turn them on by passing “int_pln_enable” as a kernel parameter.

 

I only saw one post about performance degradation.  Perhaps that person had their cpufreq governor set at some very high usage limit, like 95% (so the CPU frequency would never increase).

 

Until Red Hat applies the kernel patch to deactivate the messages, you’ll just have to live with them.  I can’t imagine they make the log all that large, but if they are really annoying, you could:

 

1) Write a cron job to run a sed script to delete the lines from the log every hour or day.

 

sed -i '/Core power/d;/Package power/d' /var/log/messages

 

OR

 

2) Try modifying the load limits on your cpufreq governor.  Perhaps stepping up at a lower load will prevent reaching the power limits.  Just a guess though.

 

OR

 

3) Turn off frequency stepping in the BIOS and/or run the system at full speed/power using the performance governor.
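
Options 2 and 3 could be sketched roughly as follows.  This assumes the ondemand governor and the usual cpufreq sysfs layout; the location of up_threshold has moved between kernel versions, so check where it lives on your system.  CPU_SYSFS, set_up_threshold and set_governor are names I made up for the example, and the value 60 is only illustrative.

```shell
#!/bin/sh
# CPU_SYSFS is a hypothetical override so the functions can be tried
# against a scratch directory; it defaults to the real sysfs tree.
CPU_SYSFS=${CPU_SYSFS:-/sys/devices/system/cpu}

# Option 2: make the ondemand governor step up at a lower load.
# The argument is a percentage (assumed path; verify on your kernel).
set_up_threshold() {
    knob="$CPU_SYSFS/cpufreq/ondemand/up_threshold"
    [ -w "$knob" ] || { echo "no writable $knob" >&2; return 1; }
    echo "$1" > "$knob"
}

# Option 3: switch every CPU to the named governor (e.g. performance).
# Returns non-zero if no governor file could be written.
set_governor() {
    status=1
    for knob in "$CPU_SYSFS"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -w "$knob" ] || continue
        echo "$1" > "$knob" && status=0
    done
    return $status
}

# Example (as root):
#   set_up_threshold 60
#   set_governor performance
```

Values written this way do not survive a reboot, so they would have to be reapplied at boot if they turn out to help.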

 

 

From: Fred van Zwieten [mailto:fvzwieten at gmail.com] 
Sent: Thursday, September 05, 2013 2:37 AM
To: Anthony Ciani
Cc: linux-poweredge at dell.com
Subject: Re: [Linux-PowerEdge] Fwd: syslog gets hosed with CPU power

 

Well, I tried the clearcpuid=299, but that did not help. When googling around I see a lot of people complaining about the same and it's all on PowerEdge hardware only. Red Hat mostly says it's harmless but it's also investigating it together with Dell. There are reports from people that they also experience slowness when these messages occur.

 

Any other ways to switch this spam off?




Groeten,

 

Fred

 

Science flies us to the moon. Religion flies us into buildings (Victor Stenger)

 

On Tue, Sep 3, 2013 at 9:47 PM, Anthony Ciani <aciani at sivananthanlabs.us> wrote:

Fred,

Those are just notifications that the load on the CPU reached some limit and
then went back down to normal.  From the times you gave, it looks like a
high load program runs on the machine for about an hour, stops, and then
about an hour later the machine is used again.

The messages are just informative, and really do nothing other than tell you
that the CPU is being used at 80+%.  For an HPC cluster, those messages may
as well not even exist.

They can supposedly be disabled by passing clearcpuid=299 as a kernel
parameter.

You could also edit your syslog.conf to reduce the verbosity of the message
or direct them to another log location.

https://bugzilla.kernel.org/show_bug.cgi?id=36182




Date: Mon, 2 Sep 2013 20:57:38 +0200
From: Fred van Zwieten <fvzwieten at vxcompany.com>

Hi,

We have a bunch of R620's with up-to-date RHEL6.4 OS on them. BIOS is 1.6.0
and system firmware 1.37.35 (Build 02). Afaik all is current level.

I get this in my /var/log/messages file:

Sep  2 10:28:14 svg008 kernel: CPU1: Core power limit normal
Sep  2 10:28:14 svg008 kernel: CPU3: Core power limit normal
Sep  2 10:28:14 svg008 kernel: CPU5: Core power limit normal
Sep  2 10:28:14 svg008 kernel: CPU7: Core power limit normal
Sep  2 10:28:14 svg008 kernel: CPU3: Package power limit normal
Sep  2 10:28:14 svg008 kernel: CPU5: Package power limit normal
Sep  2 11:36:22 svg008 kernel: CPU1: Core power limit notification (total events = 93314)
Sep  2 11:36:22 svg008 kernel: CPU3: Core power limit notification (total events = 93290)
Sep  2 11:36:22 svg008 kernel: CPU5: Core power limit notification (total events = 93304)
Sep  2 11:36:22 svg008 kernel: CPU7: Core power limit notification (total events = 92976)
Sep  2 11:36:22 svg008 kernel: CPU3: Package power limit notification (total events = 93802)
Sep  2 11:36:22 svg008 kernel: CPU5: Package power limit notification (total events = 93911)
Sep  2 11:36:22 svg008 kernel: CPU7: Package power limit notification (total events = 93754)
Sep  2 11:36:22 svg008 kernel: CPU1: Package power limit notification (total events = 93562)
Sep  2 11:36:22 svg008 kernel: CPU1: Core power limit normal
Sep  2 11:36:22 svg008 kernel: CPU3: Core power limit normal
Sep  2 11:36:22 svg008 kernel: CPU5: Core power limit normal
Sep  2 11:36:22 svg008 kernel: CPU7: Core power limit normal
<snip>
Temperatures on all machines are well below warning level according to iDrac.
<snip>
Any thoughts?

Regards,

Fred


_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge at dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge

 
