c2100 CentOS 6.2 kernel hang.

Scott Clark scott.clark at webfusion.com
Mon Mar 26 04:00:09 CDT 2012


I've just received a batch of c2100s and installed CentOS 6.2 on them.
After a few minutes of running, I get the following on the console:

Uhhuh. NMI received for unknown reason 2d on CPU 0.
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue.

And the 4-port gigabit Ethernet adaptor goes offline:
igb 0000:06:00.0: eth0 reset adapter
igb 0000:07:00.1: eth3 reset adapter
igb 0000:06:00.1: eth1 reset adapter
igb 0000:07:00.0: eth2 reset adapter

Output from dmesg:
Uhhuh. NMI received for unknown reason 2d on CPU 0.
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue
------------[ cut here ]------------
WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0x26d/0x280() (Not tainted)
Hardware name: PowerEdge C2100
NETDEV WATCHDOG: eth3 (igb): transmit queue 0 timed out
Modules linked in: ipmi_si mpt2sas scsi_transport_sas raid_class mptctl
mptbase ipmi_devintf ipmi_msghandler dell_rbu 8021q garp stp llc bonding
ipv6 dm_mod ses enclosure sg igb dca dcdbas serio_raw i2c_i801 i2c_core
iTCO_wdt iTCO_vendor_support i7core_edac edac_core shpchp ext3 jbd
mbcache sd_mod crc_t10dif megaraid_sas pata_acpi ata_generic ata_piix
[last unloaded: ipmi_si]
Pid: 0, comm: swapper Not tainted 2.6.32-220.7.1.el6.x86_64 0000001
Call Trace:
 <IRQ> [<ffffffff81069a17>] ? warn_slowpath_common+0x87/0xc0
 [<ffffffff81069b06>] ? warn_slowpath_fmt+0x46/0x50
 [<ffffffff8144a60d>] ? dev_watchdog+0x26d/0x280
 [<ffffffff8107cff4>] ? mod_timer+0x144/0x220
 [<ffffffff8144a3a0>] ? dev_watchdog+0x0/0x280
 [<ffffffff8107c7f7>] ? run_timer_softirq+0x197/0x340
 [<ffffffff810a0b20>] ? tick_sched_timer+0x0/0xc0
 [<ffffffff8102af2d>] ? lapic_next_event+0x1d/0x30
 [<ffffffff81072001>] ? __do_softirq+0xc1/0x1d0
 [<ffffffff81095610>] ? hrtimer_interrupt+0x140/0x250
 [<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
 [<ffffffff8100de85>] ? do_softirq+0x65/0xa0
 [<ffffffff81071de5>] ? irq_exit+0x85/0x90
 [<ffffffff814f4eb0>] ? smp_apic_timer_interrupt+0x70/0x9b
 [<ffffffff8100bc13>] ? apic_timer_interrupt+0x13/0x20
 <EOI> [<ffffffff812c4b0e>] ? intel_idle+0xde/0x170
 [<ffffffff812c4af1>] ? intel_idle+0xc1/0x170
 [<ffffffff813fa027>] ? cpuidle_idle_call+0xa7/0x140
 [<ffffffff81009e06>] ? cpu_idle+0xb6/0x110
 [<ffffffff814d420a>] ? rest_init+0x7a/0x80
 [<ffffffff81c1ff76>] ? start_kernel+0x424/0x430
 [<ffffffff81c1f33a>] ? x86_64_start_reservations+0x125/0x129
 [<ffffffff81c1f438>] ? x86_64_start_kernel+0xfa/0x109
---[ end trace 120c4b9c89ff5465 ]---
igb 0000:07:00.1: eth3: Reset adapter
bonding: bond0: link status definitely down for interface eth3, disabling it
igb 0000:06:00.0: eth0: Reset adapter
bonding: bond0: link status definitely down for interface eth0, disabling it
igb 0000:06:00.1: eth1: Reset adapter
bonding: bond0: link status definitely down for interface eth1, disabling it
igb 0000:07:00.0: eth2: Reset adapter
bonding: bond0: link status definitely down for interface eth2, disabling it

I also have CentOS 5.7 installed on other c2100s, and those don't
experience this issue.

Any ideas what's causing this?

-- 
Regards,

Scott Clark
Unix System Administrator
Webfusion
Web: http://www.webfusion.com/


