About KIPMI0 process
Chance Reschke
reschke at bakerlab.org
Fri May 25 19:35:20 CDT 2007
Hi,
I don't follow this list closely and I'm just jumping into this
thread in the middle, so I apologize if I'm bringing up something
that's already been covered.
Anyway, the load average doesn't necessarily have anything to do with
how busy or idle any of your CPUs are. Rather, it's a reflection
of the depth of the run queue, and on Linux it also counts processes
in uninterruptible sleep. The queue can back up because there aren't
enough cycles to service the load being offered by the computing you
want to do, but it can just as easily be the product of processes
waiting around for access to a device - usually a disk.
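To make that concrete, here is a rough sketch of how a 1-minute load average responds to a single waiting task. It assumes the usual Linux behaviour of sampling about every 5 seconds and applying an exponential decay. It uses floating point where the kernel uses fixed point, and the function name and the 55-second scenario are made up for illustration, not measured on the system in this thread.

```python
import math

SAMPLE_INTERVAL_S = 5.0   # assumption: the kernel samples roughly every 5 seconds
WINDOW_S = 60.0           # the 1-minute average


def simulate_load(active_counts, load=0.0):
    """Return the 1-minute load average after each sample.

    active_counts holds, for each sample, the number of tasks that are
    runnable or in uninterruptible sleep (state 'D') at that moment.
    """
    decay = math.exp(-SAMPLE_INTERVAL_S / WINDOW_S)
    history = []
    for n in active_counts:
        # Exponentially damped moving average of the task count.
        load = load * decay + n * (1.0 - decay)
        history.append(load)
    return history


if __name__ == "__main__":
    # Hypothetical scenario: one task blocked for 55 s (11 samples),
    # followed by 60 s with nothing runnable or blocked.
    samples = [1] * 11 + [0] * 12
    for i, value in enumerate(simulate_load(samples), start=1):
        print(f"t={i * 5:3d}s  load={value:.2f}")
```

In this model one task that stays runnable or blocked for just under a minute lifts the 1-minute figure to about 0.60, even if it never uses a measurable amount of CPU time.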
Look for processes in state 'D' - device-wait - and you might find
the culprit(s). Even with very fast storage, this can easily happen
when two or more processes attempt to write to a single file
simultaneously. Fast CPUs, fast disk, almost nothing even trying to
run, but high load average anyway.
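A minimal sketch of one way to look for state 'D' processes, assuming a Linux /proc filesystem where each /proc/<pid>/stat line starts with the PID, the command name in parentheses, and then the state letter. The helper names are hypothetical. It checks only one entry per process (not individual threads), and because device waits can be brief it would need to be run repeatedly while the load is elevated.

```python
import os


def parse_stat(text):
    """Split one /proc/<pid>/stat line into (pid, command, state)."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the first '(' and the last ')'.
    open_paren = text.index("(")
    close_paren = text.rindex(")")
    pid = int(text[:open_paren])
    comm = text[open_paren + 1:close_paren]
    state = text[close_paren + 1:].split()[0]
    return pid, comm, state


def tasks_in_state(wanted="D", proc_root="/proc"):
    """Return (pid, command) for every process currently in the wanted state."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited or the line was unreadable
        if state == wanted:
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in tasks_in_state("D"):
        print(pid, comm)
```

Kernel threads appear here as well, which matters on a box where nothing from userland is running.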
Good luck!
- Chance
--
Chance Reschke
Department of Biochemistry
University of Washington
On May 25, 2007, at 5:20 PM, Chris - PowerEdge Linux List wrote:
> ----- Original Message -----
> From: "Michael E Brown" <Michael_E_Brown at dell.com>
> Sent: Wednesday, May 23, 2007 3:56 PM
> Subject: Re: About KIPMI0 process
>
>>> What do we have to do to turn all this stuff completely off to bring
>>> the CPU load down to 0.00 when it's not running anything at all? I'm
>>> open to any and all suggestions at this point. We've resisted putting
>>> this server into production. I know this is considered "harmless" load
>>> by Dell, but it really messes up our monitoring systems and alters the
>>> true CPU load that we monitor for best application processing. There's
>>> no reason we should be seeing anything but 0.00 on a system that has
>>> nothing installed and nothing running on it.
>>
>> You sure it isn't some random system daemon? You haven't provided any
>> data to show what is causing the CPU load.
>
> That's the problem. I disabled virtually all daemons that get installed
> with 'RHEL4 minimal install' and I've been watching 'top' and even have a
> script running that constantly checks the load and, if it exceeds 0.40,
> logs the top 20 processes once a second, and NOTHING is showing... I just
> spent the last 15 minutes staring non-stop at 'top' and here's what the
> results look like when the load suddenly spikes to 0.60:
>
> ----------------------------------------------------------------------
> Thu May 24 17:04:02 PDT 2007
> top - 17:04:03 up 1 day, 22:08, 2 users, load average: 0.60, 0.23, 0.08
> Tasks: 59 total, 1 running, 58 sleeping, 0 stopped, 0 zombie
> Cpu0 : 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si
> Cpu1 : 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si
> Cpu2 : 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si
> Cpu3 : 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si
> Mem: 4149240k total, 336916k used, 3812324k free, 46620k buffers
> Swap: 4192956k total, 0k used, 4192956k free, 231060k cached
>
>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
>     1 root  15   0  3536  548  472 S    0  0.0  0:00.63 init
>     2 root  RT   0     0    0    0 S    0  0.0  0:00.02 migration/0
>     3 root  34  19     0    0    0 S    0  0.0  0:00.00 ksoftirqd/0
>     4 root  RT   0     0    0    0 S    0  0.0  0:00.01 migration/1
>     5 root  34  19     0    0    0 S    0  0.0  0:00.00 ksoftirqd/1
>     6 root  RT   0     0    0    0 S    0  0.0  0:00.02 migration/2
>     7 root  34  19     0    0    0 S    0  0.0  0:00.00 ksoftirqd/2
>     8 root  RT   0     0    0    0 S    0  0.0  0:00.01 migration/3
>     9 root  34  19     0    0    0 S    0  0.0  0:00.00 ksoftirqd/3
>    10 root   5 -10     0    0    0 S    0  0.0  0:00.00 events/0
> ----------------------------------------------------------------------
>
> The 2nd user is me - one SSH session running 'top', second session me
> grabbing data from the log. Nothing else is running. 'init' seems to stay
> at the top of the 'top' list, and somehow, for no reason whatsoever, the
> load goes from 0.00 to around 0.5 to 0.6 for about 20-40 seconds, then
> drops back down to 0.00. I don't see any other processes running when the
> load spikes to ~0.60, and as you can see from the 'top' list above,
> there's nothing in the %CPU column either, which is the part that is
> driving me nuts. SOMETHING is causing the CPU load to spike, but not a
> single process is showing in 'top' as using anything but 0% CPU.
>
> Any ideas what else I can try or how else to troubleshoot this to figure
> out what on earth might be causing this? I really don't see any
> "processes" using CPU resources when this load issue occurs.
>
> Chris