About KIPMI0 process

RB aoz.syn at gmail.com
Tue May 29 20:11:55 CDT 2007


http://people.redhat.com/sgrubb/audit/

[bob at tst ~] sudo zgrep AUDIT /proc/config.gz
Password:
# CONFIG_AUDIT is not set

My system's not set up to do it, but if you're looking to find out
*every* userspace process consuming resources, CONFIG_AUDIT and its
accompanying tools are where you'll need to go.
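
For anyone who wants to script the check above across many boxes, here is a
minimal sketch in Python. It assumes the kernel exposes its configuration at
/proc/config.gz (which is only the case when the kernel was built to do so);
the function names config_option_state and read_kernel_config are made up for
this example and are not part of any audit tooling.

```python
import gzip


def config_option_state(config_text, option="CONFIG_AUDIT"):
    """Return the option's value ('y', 'm', ...), 'not set', or None if absent."""
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as a comment: "# CONFIG_FOO is not set"
        if line == f"# {option} is not set":
            return "not set"
        # Enabled options appear as "CONFIG_FOO=value"; the trailing "="
        # keeps CONFIG_AUDIT from matching CONFIG_AUDITSYSCALL.
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None


def read_kernel_config(path="/proc/config.gz"):
    """Read the gzip-compressed kernel config as text (path is an assumption)."""
    with gzip.open(path, "rt") as fh:
        return fh.read()


if __name__ == "__main__":
    try:
        print("CONFIG_AUDIT:", config_option_state(read_kernel_config()))
    except OSError as exc:
        print(f"could not read kernel config: {exc}")
```

On a box like mine this should report "not set"; a None result would mean the
option does not appear in the config at all.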


RB


On 5/29/07, Chance Reschke <reschke at bakerlab.org> wrote:
> Hi,
>
> A couple of things:
>
> 1) ditto to ME_Brown at dell on the disk spin-down, etc.
>
> 2) You say you have 150 identical boxes and that only one exhibits
> this behavior in the top(1) output.  Really?  You've set up all 150
> with minimal installs, killed every daemon, fired up top and stared
> at the screen for hours?  Sounds hellish! ;)
>
> My guess is that there's nothing wrong and that some normal low-level
> process is occasionally having to wait a bit for access to a device.
> There's lots of nonsense in even a minimal install these days -
> selinux, cpuspeed management, irq balancing, and on and on.  Any of
> this stuff could be the culprit.
>
> Good luck!
>
>   - Chance
>
>
> --
> Chance Reschke
> Biochemistry Department
> University of Washington
>
>
>
> On May 29, 2007, at 2:38 PM, Michael E Brown wrote:
>
> > On Tue, May 29, 2007 at 11:37:48AM -0700, Chris - PowerEdge Linux
> > List wrote:
> >> ----- Original Message -----
> >> From: "Michael E Brown" <Michael_E_Brown at dell.com>
> >> Sent: Tuesday, May 29, 2007 9:49 AM
> >> Subject: Re: About KIPMI0 process
> >>
> >>
> >>> On Fri, May 25, 2007 at 05:35:20PM -0700, Chance Reschke wrote:
> >>>>
> >>>> I don't follow this list closely and I'm just jumping into this
> >>>> thread in the middle and apologize if I'm bringing something up
> >>>> that's already been covered.
> >>>>
> >>>> Anyway, the load average doesn't necessarily have anything to do
> >>>> with how busy or not busy any of your CPUs are.  Rather, it's a
> >>>> reflection of the depth of the run queue.  The run queue can back
> >>>> up because there aren't enough cycles to service the load being
> >>>> offered by the computing you want to do, but it can just as easily
> >>>> be the product of the CPUs waiting around for access to a device -
> >>>> usually a disk.  Look for processes in state 'D' - device-wait -
> >>>> and you might find the culprit(s).  Even with very fast storage,
> >>>> this can easily happen when two or more processes attempt to write
> >>>> to a single file simultaneously.  Fast CPUs, fast disk, almost
> >>>> nothing even trying to run, but high load average anyway.
> >>>
> >>> +1. I was just about to say the exact same thing.
> >>
> >> Guys - really appreciate these suggestions, and while what you
> >> suggest does make logical sense - this server was loaded with RHEL
> >> 4.4 "minimal install" and as of yet has *nothing* running on it, no
> >> apps waiting for disk or anything else.  I just spent another half
> >> hour staring non-stop at 'top' watching the load go from 0.00 to
> >> 0.60 every few minutes and did not see any processes in the 'D'
> >> state - just the usual 'S', while 'top' itself shows up as state 'R'
> >> which is normal - and there's nothing else running - all other
> >> active processes are permanently in the 'S' state and nothing
> >> changes when the load hits 0.60.  Nothing.  The top 25 processes
> >> remain the same, all in 'S' state, all showing 0.00% CPU usage, yet
> >> load goes from 0.00 to 0.60 within a 10-20 second span every few
> >> minutes.
> >
> > What about the rest of the processes? If one of the bottom 25
> > processes went zombie, it would bring the load average up without
> > showing up in top.
> >
> >>
> >> Are you suggesting that perhaps this system somehow spins down
> >> disks every
> >
> > That isn't what I was suggesting. Every process in an uninterruptible
> > wait will drive up the load average. For example, zombie processes do
> > this. If, for example, you have a process that forks off another and
> > then doesn't clean it up for a few seconds when it quits, you will see
> > load average spike. Another example would be as the original poster
> > mentioned, processes waiting on disk io.
> >
> > The point, really, is that load average isn't always an accurate
> > measure of CPU 'busyness'.
> >
> > Also, your earlier suggestion that ipmi or dell_rbu might have
> > something to do with it seems rather unlikely to me. If you don't
> > have the ipmi modules loaded, I don't see that it could cause your
> > system to be busy. And the dell_rbu driver doesn't do *anything*
> > unless you specifically are doing a BIOS update at that exact second.
> >
> >> few minutes (5 or so?) and then has to spin them up again, which is
> >> what might be causing this?  That makes sense, but out of 150+
> >> identical hardware boxes we have, this one is the only one that is
> >> experiencing this problem so I find it odd that the SCSI or RAID
> >> BIOS would have settings set to spin drives down like this - again,
> >> with identical hardware and software loaded.  I don't think we
> >> would have ever set it up like this, given the choice.  If this is
> >> the case, how do we turn off disk spin-down?
> >>
> >> For reference, I checked all drives and they all pass with flying
> >> colors, so there are no issues with the drives going bad or RAID
> >> controller reporting any errors either.
> >
> > Shooting in the dark: have you checked all of the RAID card settings
> > to ensure they are identical between systems? BIOS Settings? Things
> > like 'patrol read' on some raid cards *might* do something similar.
> > --
> > Michael
> >
> > _______________________________________________
> > Linux-PowerEdge mailing list
> > Linux-PowerEdge at dell.com
> > http://lists.us.dell.com/mailman/listinfo/linux-poweredge
> > Please read the FAQ at http://lists.us.dell.com/faq
>
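
Since the spikes in the thread above are too short to catch by staring at
top, here is a rough sketch in Python of how one might snapshot the load
average together with every task that could be contributing to it. It reads
/proc/loadavg and /proc/<pid>/stat directly; the function names are made up
for this example. As far as I understand it, Linux counts tasks that are
runnable ('R') or in uninterruptible sleep ('D') toward the load average, so
those are the two states the sketch looks for by default; tasks in other
states such as 'S' or 'Z' should not add to the figure. One limitation to
keep in mind: it only looks at each process's main thread, so a secondary
thread stuck in 'D' would be missed.

```python
import os


def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages from /proc/loadavg text."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def parse_stat_state(stat_line):
    """Return the one-letter state from a /proc/<pid>/stat line.

    The command name sits in parentheses and may itself contain spaces or
    ')', so split on the last ')' rather than on whitespace.
    """
    return stat_line.rsplit(")", 1)[1].split()[0]


def tasks_in_states(states=("R", "D"), proc_root="/proc"):
    """List (pid, name, state) for every process whose state is in `states`."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                line = fh.read()
        except OSError:
            continue  # the process exited while we were scanning
        state = parse_stat_state(line)
        if state in states:
            name = line[line.index("(") + 1:line.rindex(")")]
            found.append((int(entry), name, state))
    return found


if __name__ == "__main__":
    try:
        with open("/proc/loadavg") as fh:
            print("load average:", parse_loadavg(fh.read()))
        for pid, name, state in tasks_in_states():
            print(pid, name, state)
    except OSError as exc:
        print(f"could not read /proc: {exc}")
```

Running something like this from a loop every second or two and logging the
output should show which task, if any, is in 'R' or 'D' at the moment the
load jumps.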


