About KIPMI0 process
Michael E Brown
Michael_E_Brown at dell.com
Tue May 29 11:49:03 CDT 2007
On Fri, May 25, 2007 at 05:35:20PM -0700, Chance Reschke wrote:
> Hi,
>
> I don't follow this list closely and I'm just jumping into this
> thread in the middle and apologize if I'm bringing something up
> that's already been covered.
>
> Anyway, the load average doesn't necessarily have anything to do with
> how busy or not busy any of your CPUs are. Rather, it's a reflection
> of the depth of the run queue. The run queue can back up because
> there aren't enough cycles to service the load being offered by the
> computing you want to do, but it can just as easily be the product of
> the CPUs waiting around for access to a device - usually a disk.
> Look for processes in state 'D' - device-wait - and you might find
> the culprit(s). Even with very fast storage, this can easily happen
> when two or more processes attempt to write to a single file
> simultaneously. Fast CPUs, fast disk, almost nothing even trying to
> run, but high load average anyway.
+1. I was just about to say the exact same thing.
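
A minimal sketch of the check described above, assuming a Linux /proc
filesystem and a Python interpreter on the box. The function names are
made up for illustration. It prints the load averages and every task that
is currently runnable (R) or in uninterruptible sleep (D), since those are
the two states that Linux counts toward the load average. It only looks at
the entries directly under /proc, not at per-thread entries under
/proc/<pid>/task, so treat it as a starting point rather than a complete
tool.

```python
import os


def parse_stat_line(stat_line):
    """Return (pid, command, state) from the text of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces
    # or parentheses, so split around the first "(" and the last ")".
    left = stat_line.index("(")
    right = stat_line.rindex(")")
    pid = int(stat_line[:left])
    command = stat_line[left + 1:right]
    state = stat_line[right + 1:].split()[0]
    return pid, command, state


def tasks_counted_in_load(proc_root="/proc"):
    """List (pid, command, state) for tasks in state R or D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, command, state = parse_stat_line(handle.read())
        except (OSError, ValueError):
            continue  # the process exited while we were reading it
        if state in ("R", "D"):
            found.append((pid, command, state))
    return found


if __name__ == "__main__":
    with open("/proc/loadavg") as handle:
        print("load average:", " ".join(handle.read().split()[:3]))
    for pid, command, state in tasks_counted_in_load():
        print(pid, state, command)
```

Running it in a loop while the load is elevated should show whether
anything is sitting in 'D' even though 'top' reports 0% CPU for every
process.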
--
Michael
>
> On May 25, 2007, at 5:20 PM, Chris - PowerEdge Linux List wrote:
>
> > ----- Original Message -----
> > From: "Michael E Brown" <Michael_E_Brown at dell.com>
> > Sent: Wednesday, May 23, 2007 3:56 PM
> > Subject: Re: About KIPMI0 process
> >
> >>> What do we have to do to turn all this stuff completely off to
> >>> bring the CPU load down to 0.00 when it's not running anything at
> >>> all? I'm open to any and all suggestions at this point. We've
> >>> resisted putting this server into production. I know this is
> >>> considered "harmless" load by Dell, but it really messes up our
> >>> monitoring systems and alters the true CPU load that we monitor
> >>> for best application processing. There's no reason we should be
> >>> seeing anything but 0.00 on a system that has nothing installed
> >>> and nothing running on it.
> >>
> >> You sure it isn't some random system daemon? You haven't provided
> >> any data to show what is causing the CPU load.
> >
> > That's the problem. I disabled virtually all daemons that get
> > installed with 'RHEL4 minimal install' and I've been watching 'top'
> > and even have a script running that constantly checks the load and,
> > if it exceeds 0.40, logs the top 20 processes once a second, and
> > NOTHING is showing... I just spent the last 15 minutes staring
> > non-stop at 'top' and here's what the results look like when the
> > load suddenly spikes to 0.60:
> >
> > ----------------------------------------------------------------------------------
> > Thu May 24 17:04:02 PDT 2007
> > top - 17:04:03 up 1 day, 22:08,  2 users,  load average: 0.60, 0.23, 0.08
> > Tasks:  59 total,   1 running,  58 sleeping,   0 stopped,   0 zombie
> > Cpu0 :  0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si
> > Cpu1 :  0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si
> > Cpu2 :  0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si
> > Cpu3 :  0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si
> > Mem:   4149240k total,   336916k used,  3812324k free,    46620k buffers
> > Swap:  4192956k total,        0k used,  4192956k free,   231060k cached
> >
> >   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> >     1 root      15   0  3536  548  472 S    0  0.0   0:00.63 init
> >     2 root      RT   0     0    0    0 S    0  0.0   0:00.02 migration/0
> >     3 root      34  19     0    0    0 S    0  0.0   0:00.00 ksoftirqd/0
> >     4 root      RT   0     0    0    0 S    0  0.0   0:00.01 migration/1
> >     5 root      34  19     0    0    0 S    0  0.0   0:00.00 ksoftirqd/1
> >     6 root      RT   0     0    0    0 S    0  0.0   0:00.02 migration/2
> >     7 root      34  19     0    0    0 S    0  0.0   0:00.00 ksoftirqd/2
> >     8 root      RT   0     0    0    0 S    0  0.0   0:00.01 migration/3
> >     9 root      34  19     0    0    0 S    0  0.0   0:00.00 ksoftirqd/3
> >    10 root       5 -10     0    0    0 S    0  0.0   0:00.00 events/0
> > ----------------------------------------------------------------------------------
> >
> > The 2nd user is me: one SSH session running 'top', and a second
> > session where I'm grabbing data from the log. Nothing else is
> > running. 'init' seems to stay at the top of the 'top' list, and
> > somehow, for no reason whatsoever, the load goes from 0.00 to around
> > 0.5 to 0.6 for about 20-40 seconds, then drops back down to 0.00. I
> > don't see any other processes running when the load spikes to ~0.60,
> > and as you can see from the 'top' list above, there's nothing in the
> > %CPU column either, which is the part that is driving me nuts.
> > SOMETHING is causing the CPU load to spike, but not a single process
> > is showing in 'top' as using anything but 0% CPU.
> >
> > Any ideas what else I can try or how else to troubleshoot this to
> > figure out what on earth might be causing this? I really don't see
> > any "processes" using CPU resources when this load issue occurs.
> >
> > Chris
> >
> > _______________________________________________
> > Linux-PowerEdge mailing list
> > Linux-PowerEdge at dell.com
> > http://lists.us.dell.com/mailman/listinfo/linux-poweredge
> > Please read the FAQ at http://lists.us.dell.com/faq