RHEL4 System Crash: Unable to handle kernel paging request at virtual address

Shane Presley shane.presley at gmail.com
Wed Jun 6 06:30:00 CDT 2007


Hello,

I had a RHEL4 system crash a day or two ago.  It's the first Red Hat
system I've ever seen hang completely, requiring a hard power cycle.
Felt like my Windows days.  Then it happened again yesterday, so
something is wrong with this server.

It's a Dell 2850, fully patched (up2date).  I captured
/var/log/messages from right before it panicked; here are the logs:

Jun  4 21:26:42 myhost kernel: Unable to handle kernel paging request
at virtual address 0f3514db
Jun  4 21:26:42 myhost kernel:  printing eip:
Jun  4 21:26:42 myhost kernel: c01705b8
Jun  4 21:26:42 myhost kernel: *pde = 33c68001
Jun  4 21:26:42 myhost kernel: Oops: 0000 [#1]
Jun  4 21:26:42 myhost kernel: SMP
Jun  4 21:26:42 myhost kernel: Modules linked in: mptctl mptbase
ipmi_devintf ipmi_si ipmi_msghandler dell_rbu autofs4 i2c_dev i2c_core
sunrpc dm_mirror dm_mod button battery ac md5 ipv6 uhci_hcd ehci_hcd
e1000 floppy ata_piix libata sg ext3 jbd megaraid_mbox megaraid_mm
sd_mod scsi_mod
Jun  4 21:26:42 myhost kernel: CPU:    0
Jun  4 21:26:42 myhost kernel: EIP:    0060:[<c01705b8>]    Not tainted VLI
Jun  4 21:26:42 myhost kernel: EFLAGS: 00010206   (2.6.9-55.ELsmp)
Jun  4 21:26:42 myhost kernel: EIP is at __d_lookup+0x65/0x109
Jun  4 21:26:42 myhost kernel: eax: c2155c30   ebx: cada98f6   ecx:
00000011   edx: c212e200
Jun  4 21:26:42 myhost kernel: esi: 0f3514db   edi: cada98f6   ebp:
f43aa50c   esp: f3789e0c
Jun  4 21:26:42 myhost kernel: ds: 007b   es: 007b   ss: 0068
Jun  4 21:26:42 myhost kernel: Process bbtest-net (pid: 2942,
threadinfo=f3789000 task=f24723b0)
Jun  4 21:26:42 myhost kernel: Stack: 00000000 c2155c30 e1cbe00e
cada98f6 0000000c f3789e80 cada98f6 00000000
Jun  4 21:26:42 myhost kernel:        cada98f6 f3789f50 c0166ba3
f7f1be00 f3789e78 f3789e80 cada98f6 f543b548
Jun  4 21:26:42 myhost kernel:        cada98f6 f3789f50 c0167475
00000000 00000000 00000000 fffcf000 c1c18aa0
Jun  4 21:26:42 myhost kernel: Call Trace:
Jun  4 21:26:42 myhost kernel:  [<c0166ba3>] do_lookup+0x23/0xb1
Jun  4 21:26:42 myhost kernel:  [<c0167475>] __link_path_walk+0x844/0xc25
Jun  4 21:26:42 myhost kernel:  [<c0167899>] link_path_walk+0x43/0xbe
Jun  4 21:26:42 myhost kernel:  [<c02d443f>] __cond_resched+0x14/0x39
Jun  4 21:26:42 myhost kernel:  [<c01c3e8a>] direct_strncpy_from_user+0x3e/0x5d
Jun  4 21:26:42 myhost kernel:  [<c011b01b>] do_page_fault+0x1ae/0x5c6
Jun  4 21:26:42 myhost kernel:  [<c0167c2e>] path_lookup+0x14b/0x17f
Jun  4 21:26:42 myhost kernel:  [<c0168309>] open_namei+0x99/0x579
Jun  4 21:26:42 myhost kernel:  [<c015a599>] filp_open+0x45/0x70
Jun  4 21:26:42 myhost kernel:  [<c02d443f>] __cond_resched+0x14/0x39
Jun  4 21:26:42 myhost kernel:  [<c01c3e8a>] direct_strncpy_from_user+0x3e/0x5d
Jun  4 21:26:42 myhost kernel:  [<c015a8f5>] sys_open+0x31/0x7d
Jun  4 21:26:42 myhost kernel:  [<c02d5ee3>] syscall_call+0x7/0xb
Jun  4 21:26:42 myhost kernel: Code: 24 0c 89 c2 81 f2 01 00 37 9e d3
ea 31 d0 8b 15 e8 a0 44 c0 23 05 e0 a0 44 c0 8d 04 82 89 44 24 04 8b
30 85 f6 0f 84 99 00 00 00 <8b> 06 0f 18 00 90 8d 5e 98 0f ae e8 8d 76
00 8b 44 24 0c 39 43
Jun  4 21:26:42 myhost kernel:  <0>Fatal exception: panic in 5 seconds

So I'm not sure what to make of that.  The one process name I noticed
in there (bbtest-net) is part of my BigBrother monitoring system, but
that has been running OK for years and hasn't been changed recently.
I'm not sure where else to look.  Could this be a hardware (memory?)
problem?
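
In case it helps anyone reading the dump, here is a rough sketch of a
script that pulls the key fields out of an oops like the one above:
the faulting address, the EIP symbol, the process, the call trace, and
which registers hold the faulting address.  The function name
summarize_oops, the SAMPLE excerpt, and the regular expressions are my
own assumptions, written against this one i386 2.6.9 dump; this is not
a general oops parser.

```python
import re

# Syslog prefix as it appears in the post, e.g. "Jun  4 21:26:42 myhost kernel: "
SYSLOG_PREFIX = re.compile(r"^\w{3}\s+\d+ \d\d:\d\d:\d\d \S+ kernel: ?", re.MULTILINE)

# Short excerpt of the dump above, including lines wrapped by the mail client.
SAMPLE = """\
Jun  4 21:26:42 myhost kernel: Unable to handle kernel paging request
at virtual address 0f3514db
Jun  4 21:26:42 myhost kernel: EIP is at __d_lookup+0x65/0x109
Jun  4 21:26:42 myhost kernel: esi: 0f3514db   edi: cada98f6   ebp:
f43aa50c   esp: f3789e0c
Jun  4 21:26:42 myhost kernel: Process bbtest-net (pid: 2942,
threadinfo=f3789000 task=f24723b0)
Jun  4 21:26:42 myhost kernel:  [<c0166ba3>] do_lookup+0x23/0xb1
Jun  4 21:26:42 myhost kernel:  [<c015a8f5>] sys_open+0x31/0x7d
"""


def summarize_oops(text):
    """Extract a few key fields from an i386 oops dump (hypothetical helper)."""
    # Drop the syslog prefix, then collapse whitespace so that fields
    # wrapped across lines by the mail client become contiguous again.
    flat = " ".join(SYSLOG_PREFIX.sub("", text).split())

    fault = re.search(r"virtual address ([0-9a-f]{8})", flat)
    eip = re.search(r"EIP is at (\S+)", flat)
    proc = re.search(r"Process (\S+) \(pid: (\d+)", flat)
    # General-purpose registers printed as "name: 8-hex-digits".
    regs = dict(re.findall(r"\b(e[a-d]x|e[sd]i|e[bs]p): ([0-9a-f]{8})", flat))
    # Call trace entries look like "[<c0166ba3>] do_lookup+0x23/0xb1".
    trace = re.findall(r"\[<[0-9a-f]{8}>\] (\w+)\+0x", flat)

    fault_addr = fault.group(1) if fault else None
    return {
        "fault_address": fault_addr,
        "eip_symbol": eip.group(1) if eip else None,
        "process": proc.group(1) if proc else None,
        "pid": int(proc.group(2)) if proc else None,
        "call_trace": trace,
        # Registers whose value equals the faulting address.
        "regs_holding_fault_addr": sorted(
            name for name, value in regs.items() if value == fault_addr
        ),
    }


if __name__ == "__main__":
    for key, value in summarize_oops(SAMPLE).items():
        print(f"{key}: {value}")
```

Run against the full dump, the only register matching the faulting
address is esi (0f3514db), so the fault in __d_lookup appears to come
from dereferencing whatever pointer was in esi at the time.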

Shane


