About KIPMI0 process
Michael E Brown
Michael_E_Brown at dell.com
Tue May 29 16:38:41 CDT 2007
On Tue, May 29, 2007 at 11:37:48AM -0700, Chris - PowerEdge Linux List wrote:
> ----- Original Message -----
> From: "Michael E Brown" <Michael_E_Brown at dell.com>
> Sent: Tuesday, May 29, 2007 9:49 AM
> Subject: Re: About KIPMI0 process
>
>
> > On Fri, May 25, 2007 at 05:35:20PM -0700, Chance Reschke wrote:
> >>
> >> I don't follow this list closely and I'm just jumping into this
> >> thread in the middle and apologize if I'm bringing something up
> >> that's already been covered.
> >>
> >> Anyway, the load average doesn't necessarily have anything to do with
> >> how busy or not busy any of your CPUs are. Rather, it's a reflection
> >> of the depth of the run queue. The run queue can back up because
> >> there aren't enough cycles to service the load being offered by the
> >> computing you want to do, but it can just as easily be the product of
> >> the CPUs waiting around for access to a device - usually a disk.
> >> Look for processes in state 'D' - device-wait - and you might find
> >> the culprit(s). Even with very fast storage, this can easily happen
> >> when two or more processes attempt to write to a single file
> >> simultaneously. Fast CPUs, fast disk, almost nothing even trying to
> >> run, but high load average anyway.
> >
> > +1. I was just about to say the exact same thing.
>
> Guys - really appreciate these suggestions, and while what you suggest does
> make logical sense - this server was loaded with RHEL 4.4 "minimal install"
> and as of yet has *nothing* running on it, no apps waiting for disk or
> anything else. I just spent another half hour staring non-stop at 'top'
> watching the load go from 0.00 to 0.60 every few minutes and did not see any
> processes in the 'D' state - just the usual 'S', while 'top' itself shows up
> as state 'R' which is normal - and there's nothing else running - all other
> active processes are permanently in the 'S' state and nothing changes when
> the load hits 0.60. Nothing. The top 25 processes remain the same, all in
> 'S' state, all showing 0.00% CPU usage, yet load goes from 0.00 to 0.60
> within a 10-20 second span every few minutes.
What about the rest of the processes? If one of the bottom 25 processes
were stuck in an uninterruptible wait, it would bring the load average up
without showing up on the screen in top.
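A minimal sketch of one way to look at every process rather than only the
ones top has room to display, assuming a Linux /proc filesystem. The function
names parse_state and count_states are made up for illustration. Tasks in
'R' (runnable) and 'D' (uninterruptible wait) are the states to look for
first; the tally also reports 'Z' and everything else it finds.

```python
import os
from collections import Counter


def parse_state(stat_text):
    """Return the one-letter state from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so take whatever follows the last ')'.
    after_comm = stat_text.rsplit(")", 1)[1]
    return after_comm.split()[0]


def count_states(proc_root="/proc"):
    """Tally the state of every process found under proc_root."""
    counts = Counter()
    if not os.path.isdir(proc_root):
        return counts
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                counts[parse_state(f.read())] += 1
        except (OSError, IndexError):
            # The process exited while we were scanning, or the file was odd.
            continue
    return counts


if __name__ == "__main__":
    for state, n in sorted(count_states().items()):
        print(state, n)
```

Running this in a loop while the load climbs would show whether anything
outside the first screen of top changes state at that moment.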
>
> Are you suggesting that perhaps this system somehow spins down disks every
That isn't what I was suggesting. Every process in an uninterruptible
wait will drive up the load average, whether or not it uses any CPU. If,
for example, a process blocks like that for only a few seconds, you will
see the load average spike. The standard example, as the original poster
mentioned, is processes waiting on disk I/O.
The point, really, is that load average isn't always an accurate measure
of CPU 'busyness'.
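To put rough numbers on that, here is a small model of the one-minute load
average. It assumes the figure is an exponentially damped average that is
updated every 5 seconds with a 60-second time constant, which is how the
Linux calculation is generally described; simulate_load and its parameters
are hypothetical names, and this is only a sketch, not the kernel code.

```python
import math


def simulate_load(active_counts, interval=5.0, window=60.0, start=0.0):
    """Model the 1-minute load average.

    active_counts holds, for each sampling interval, the number of tasks
    that are runnable or in uninterruptible wait at that moment.
    Returns the modelled load average after each sample.
    """
    decay = math.exp(-interval / window)
    load = start
    history = []
    for n in active_counts:
        # Old value fades by `decay`; the new sample fills in the rest.
        load = load * decay + n * (1.0 - decay)
        history.append(load)
    return history


if __name__ == "__main__":
    one_task_one_minute = simulate_load([1] * 12)
    three_tasks_15s = simulate_load([3] * 3)
    print(round(one_task_one_minute[-1], 2))
    print(round(three_tasks_15s[-1], 2))
```

Under these assumptions, one task blocked for a full minute lifts the figure
to about 0.63, and three tasks blocked for 15 seconds give about 0.66, with
no CPU time consumed in either case.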
Also, your earlier suggestion that ipmi or dell_rbu might have something
to do with it seems rather unlikely to me. If you don't have the ipmi
modules loaded, I don't see how they could cause your system to be busy.
And the dell_rbu driver doesn't do *anything* unless you are specifically
doing a BIOS update at that exact second.
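A quick way to confirm whether those modules are loaded at all is to read
/proc/modules. The sketch below assumes the usual format of that file, with
the module name as the first field on each line; the helper names are made up
for illustration.

```python
def loaded_modules(modules_text):
    """Return the set of module names from /proc/modules-style text."""
    return {line.split()[0] for line in modules_text.splitlines() if line.strip()}


def suspects(modules_text, prefixes=("ipmi_",), exact=("dell_rbu",)):
    """Return the loaded modules that match the names discussed here."""
    names = loaded_modules(modules_text)
    return sorted(n for n in names if n.startswith(prefixes) or n in exact)


if __name__ == "__main__":
    try:
        with open("/proc/modules") as f:
            found = suspects(f.read())
        print(found if found else "no ipmi or dell_rbu modules loaded")
    except OSError as err:
        print("could not read /proc/modules:", err)
```

An empty result on the affected box would rule both drivers out.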
> few minutes (5 or so?) and then has to spin them up again, which is what
> might be causing this? That makes sense, but out of 150+ identical hardware
> boxes we have, this one is the only one that is experiencing this problem so
> I find it odd that the SCSI or RAID bios would have settings set to spin
> drives down like this - again, with identical hardware and software loaded.
> I don't think we would have ever set it up like this, given the choice. If
> this is the case, how do we turn off disk spin-down?
>
> For reference, I checked all drives and they all pass with flying colors, so
> there are no issues with the drives going bad or RAID controller reporting any
> errors either.
Shooting in the dark: have you checked all of the RAID card settings to
ensure they are identical between systems? BIOS settings? Things like
'patrol read' on some RAID cards *might* do something similar.
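If the settings from a good box and the odd one can be saved as text, the
comparison is easy to automate. This sketch assumes each dump is a list of
"name: value" lines; that format, the function names, and the setting names
in the example are all hypothetical, and how to export the real settings
depends on the controller's own tools.

```python
def parse_settings(text):
    """Turn 'name: value' lines into a dict, skipping anything else."""
    settings = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        settings[key.strip()] = value.strip()
    return settings


def diff_settings(good_text, bad_text):
    """Return {name: (good_value, bad_value)} for every setting that differs."""
    good = parse_settings(good_text)
    bad = parse_settings(bad_text)
    return {
        key: (good.get(key), bad.get(key))
        for key in sorted(set(good) | set(bad))
        if good.get(key) != bad.get(key)
    }


if __name__ == "__main__":
    # Made-up dumps, only to show the shape of the output.
    good_dump = "Patrol Read: Disabled\nWrite Policy: Write Back\n"
    bad_dump = "Patrol Read: Enabled\nWrite Policy: Write Back\n"
    for name, (good, bad) in diff_settings(good_dump, bad_dump).items():
        print(name, "good:", good, "bad:", bad)
```

A setting missing from one dump shows up with None on that side.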
--
Michael