About KIPMI0 process

Chris - Dell PowerEdge List linux-poweredge at dotcomdesigners.com
Wed May 23 15:03:07 CDT 2007


----- Original Message ----- 
From: "Matt Domsch" <Matt_Domsch at dell.com>
Sent: Monday, May 21, 2007 12:39 PM
Subject: Re: About KIPMI0 process


> On Mon, May 21, 2007 at 01:10:49PM -0500, hai wu wrote:
>> IPMI is already off here; the one kipmi0 currently running on the
>> system, I think, comes from Dell OpenManage. After stopping Dell
>> OpenManage (srvadmin-services stop), the kipmi0 process no longer
>> shows high CPU usage.
>
> Yes, Dell OpenManage Server Administrator (OMSA) issues IPMI commands
> to the IPMI device driver.  The kipmi0 kernel thread greatly speeds up
> all IPMI commands, but is only active when commands are being issued.
> So, if no OMSA, and no other tool issuing IPMI commands, the kipmi0
> thread won't consume cycles.  But, the kipmi0 thread runs at idle
> priority, so it only uses CPU cycles when nothing else needs them.

Matt (and all),

We have one stubborn PowerEdge 2850 that is giving us a lot of headaches 
with this issue.  We turned off IPMI and OpenManage and stopped every service 
we could think of that relates to this issue, yet every few minutes we see 
the CPU load spike as high as 0.52.  This server is not even running anything 
yet - it's a completely clean install of RHEL 4.4 (minimal install).  No 
Apache, no email, no MySQL, no FTP - absolutely nothing running or even 
installed on the server.

"srvadmin-services.sh status" reports this:
dell_rbu (module) is running
ipmi driver is stopped
dsm_om_shrsvc32d is stopped

I tried "modprobe -r dell_rbu", but the status still showed it as 
"running".  After a few minutes, though, the status changed to "stopped" - 
but the CPU load continues to spike to ~0.5 anyway.
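
Since the status script and the kernel can disagree for a while, one way to 
cross-check is to read the kernel's own module list.  The following is only a 
rough sketch, assuming Python 3 is available (it is newer than what RHEL 4 
ships) and assuming the modules of interest are dell_rbu plus anything whose 
name starts with "ipmi_" (for example ipmi_si, ipmi_devintf, 
ipmi_msghandler).  The function names are made up for illustration.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first whitespace-separated field is the module name.
            names.add(fields[0])
    return names


def suspect_modules(proc_modules_text, prefixes=("ipmi_", "dell_rbu")):
    """Return loaded modules whose names start with one of the prefixes.

    The default prefixes are an assumption about which modules matter here.
    """
    return sorted(
        name for name in loaded_modules(proc_modules_text)
        if name.startswith(prefixes)
    )


if __name__ == "__main__":
    try:
        with open("/proc/modules") as f:
            print(suspect_modules(f.read()))
    except OSError:
        print("/proc/modules is not readable on this system")
```

An empty list would suggest that neither the IPMI driver nor dell_rbu is still 
loaded, whatever the status script says.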

Here's a sample MRTG graph that shows what the CPU load looks like (scale is 
x100):
http://000a.com/sample_load.gif

We've reloaded this server 3 times now, each time with the standard 'minimal 
install' (RHEL 4.4) and the problem persists.

The hardware configuration of this system:  PowerEdge 2850, dual Xeon 
3.6GHz, 4GB RAM, 4 x 36GB 15K RPM SCSI drives, PERC RAID controller.  We have 
over 150 identical systems installed and running, and this is the only 
one that appears to have this problem.  None of the other systems, so far, 
experience the CPU load hovering at 0.50 levels without doing anything.

What do we have to do to turn all this stuff completely off to bring the CPU 
load down to 0.00 when it's not running anything at all?  I'm open to any 
and all suggestions at this point.  We've resisted putting this server into 
production.  I know this is considered "harmless" load by Dell, but it 
really messes up our monitoring systems and alters the true CPU load that we 
monitor for best application processing.  There's no reason we should be 
seeing anything but 0.00 on a system that has nothing installed and nothing 
running on it.
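
For anyone wanting to see what is actually behind the number: the Linux load 
average counts tasks that are runnable (state R) or in uninterruptible sleep 
(state D), so listing those tasks at the moment of a spike should show whether 
kipmi0 or something else is responsible.  Below is a minimal sketch, again 
assuming Python 3; the function names are hypothetical, and it only looks at 
the top-level /proc/<pid>/stat entries, not at individual threads.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the text of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the first "(" and the last ")".
    lpar = stat_text.index("(")
    rpar = stat_text.rindex(")")
    pid = int(stat_text[:lpar])
    comm = stat_text[lpar + 1:rpar]
    state = stat_text[rpar + 1:].split()[0]
    return pid, comm, state


def load_contributors(proc_root="/proc"):
    """List (pid, comm, state) for tasks currently in state R or D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # task exited or its stat file could not be parsed
        if state in ("R", "D"):
            found.append((pid, comm, state))
    return sorted(found)


if __name__ == "__main__":
    try:
        with open("/proc/loadavg") as f:
            print("loadavg:", f.read().strip())
        for pid, comm, state in load_contributors():
            print(pid, comm, state)
    except OSError:
        print("no Linux-style /proc on this system")
```

The script itself will show up in state R while it runs, so that entry can be 
ignored.  Running it from cron or in a loop around the time of a spike would 
give a list of candidates to compare against the MRTG graph.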

Thank you,

Chris


