PowerEdge 2800 lockup running RHEL 4 U2

Greg Cope greg.cope at e-dba.net
Fri Apr 21 03:06:00 CDT 2006


Sounds like the disks went offline (input/output errors).

We sometimes get this on our Dell; powering off is the only fix.
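
To see which device is flooding the log, a rough sketch like the one below can tally kernel messages by their leading "name:" token (for example "hdf" or "usb 2-1"). It assumes classic syslog formatting with each entry on a single line (mail wrapping undone); the function name and both regular expressions are illustrative, not part of any Dell or Red Hat tool.

```python
import re
from collections import Counter

# Matches classic syslog kernel lines such as
# "Apr 20 20:58:17 appsrv kernel: hdf: drive not ready for command".
KERNEL_LINE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]{8}\s+\S+\s+kernel:\s+(?P<msg>.*)$"
)
# A leading "name:" token (e.g. "hdf:", "usb 2-1:") is treated as the source.
SOURCE = re.compile(r"^(?P<src>[\w./ -]+?):\s")


def count_kernel_sources(lines):
    """Count kernel messages per source token; non-kernel lines are skipped."""
    counts = Counter()
    for line in lines:
        m = KERNEL_LINE.match(line.rstrip("\n"))
        if not m:
            continue
        s = SOURCE.match(m.group("msg"))
        counts[s.group("src") if s else "(other)"] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Apr 20 20:58:17 appsrv kernel: usb 2-1: USB disconnect, address 2",
        "Apr 20 20:58:17 appsrv hal.hotplug[18784]: DEVPATH is not set",
        "Apr 20 20:58:17 appsrv kernel: hdf: drive not ready for command",
        "Apr 20 20:58:17 appsrv kernel: hdf: ATAPI reset complete",
        "Apr 20 20:58:32 appsrv kernel: Disabling IRQ #193",
    ]
    for src, n in count_kernel_sources(sample).most_common():
        print(n, src)
```

Feeding it the lines of /var/log/messages instead of the sample list would show at a glance whether one device accounts for most of the noise.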

Greg

On Fri, 2006-04-21 at 12:54 +1000, Victor Orgos wrote:
> Hi,
> 
> Our Oracle 10g Application Server was misbehaving this morning, and no
> matter what I did I could not get it to restart or even get details as
> to the cause. From the little I could find out, it appears that the
> cause was the Dell module related to the virtual CD-ROM from the DRAC.
> 
> The system, though not hung, was VERY slow. A developer logged on, but
> his profile was all screwed up, as if the login scripts had not run, and
> an input/output error was reported. When I logged on remotely as root, it
> took several minutes to get to the prompt, but there was no error. The
> system load reported by uptime was steady at about 14. ls and ps
> responded immediately, but top and vmstat seemed to hang. I tried to kill
> some processes, but as far as I could tell they never got the signal. ps
> reported only one defunct process, [40-hal-hotplug].
> 
> I tried to stop the Oracle application server using our start/stop
> scripts but got an input/output error after a long time. After trying
> the reboot and init 0 commands and not getting anywhere, we powered off
> the machine. After the reboot everything is OK.
> 
> The system is a PowerEdge 2800 with 2 Xeon CPUs and 4 GB of RAM. We are
> running the SMP kernel that comes with RHEL and the Dell
> management/system software provided. We've never had any issues before,
> and the system has been running fine for several months. Below are log
> excerpts that may be related.
> 
> I would appreciate any assistance. The system is due to go into
> production in a few months and we need to be sure that it's stable.
> 
> 
> Victor
> 
> ------------------------------------------------
> 
> Apr 20 20:40:01 appsrv crond(pam_unix)[17467]: session closed for user root
> Apr 20 20:50:01 appsrv crond(pam_unix)[18151]: session opened for user root by (uid=0)
> Apr 20 20:50:01 appsrv crond(pam_unix)[18151]: session closed for user root
> Apr 20 20:58:17 appsrv kernel: drivers/usb/input/hid-core.c: input irq status -84 received
> Apr 20 20:58:17 appsrv last message repeated 62 times
> Apr 20 20:58:17 appsrv kernel: usb 2-1: USB disconnect, address 2
> Apr 20 20:58:17 appsrv hal.hotplug[18784]: DEVPATH is not set
> Apr 20 20:58:17 appsrv hal.hotplug[18806]: DEVPATH is not set
> Apr 20 20:58:17 appsrv kernel: hdf: status error: status=0x7f { DriveReady DeviceFault SeekComplete DataRequest CorrectedError Index Error }
> Apr 20 20:58:17 appsrv kernel: hdf: status error: error=0x7fIllegalLengthIndication EndOfMedia Aborted Command MediaChangeRequested LastFailedSense 0x07
> Apr 20 20:58:17 appsrv kernel: hdf: drive not ready for command
> Apr 20 20:58:17 appsrv kernel: hdf: ATAPI reset complete
> Apr 20 20:58:17 appsrv kernel: hdf: status error: status=0x7f { DriveReady DeviceFault SeekComplete DataRequest CorrectedError Index Error }
> 
> The last few lines repeat over 10000 times. Just before the power-off:
> 
> 
> Apr 20 20:58:31 appsrv kernel: hdf: status error: error=0x7fIllegalLengthIndication EndOfMedia Aborted Command MediaChangeRequested LastFailedSense 0x07
> Apr 20 20:58:31 appsrv kernel: hdf: drive not ready for command
> Apr 20 20:58:31 appsrv kernel: hdf: status error: status=0x7f { DriveReady DeviceFault SeekComplete DataRequest CorrectedError Index Error }
> Apr 20 20:58:31 appsrv kernel: hdf: status error: error=0x7fIllegalLengthIndication EndOfMedia Aborted Command MediaChangeRequested LastFailedSense 0x07
> Apr 20 20:58:31 appsrv kernel: hdf: drive not ready for command
> Apr 20 20:58:31 appsrv kernel: hdf: status error: status=0x80 { Busy }
> Apr 20 20:58:31 appsrv kernel: hdf: status error: error=0x80LastFailedSense 0x08
> Apr 20 20:58:31 appsrv kernel: hdf: drive not ready for command
> Apr 20 20:58:32 appsrv kernel: irq 193: nobody cared! (screaming interrupt?)
> Apr 20 20:58:32 appsrv kernel: irq 193: Please try booting with acpi=off and report a bug
> Apr 20 20:58:32 appsrv kernel:  [<c01074c2>] __report_bad_irq+0x3a/0x77
> Apr 20 20:58:32 appsrv kernel:  [<c0107739>] note_interrupt+0xea/0x115
> Apr 20 20:58:32 appsrv kernel:  [<c01079e5>] do_IRQ+0x143/0x1ae
> Apr 20 20:58:32 appsrv kernel:  [<c02d1a8c>] common_interrupt+0x18/0x20
> Apr 20 20:58:32 appsrv kernel:  [<c01040e5>] mwait_idle+0x33/0x42
> Apr 20 20:58:32 appsrv kernel:  [<c010409d>] cpu_idle+0x26/0x3b
> Apr 20 20:58:32 appsrv kernel: handlers:
> Apr 20 20:58:32 appsrv kernel: [<c023f519>] (ide_intr+0x0/0x11e)
> Apr 20 20:58:32 appsrv kernel: [<c0257f38>] (usb_hcd_irq+0x0/0x4b)
> Apr 20 20:58:32 appsrv kernel: Disabling IRQ #193
> Apr 20 20:58:43 appsrv kernel: usb 2-1: new full speed USB device using address 3
> Apr 20 20:58:43 appsrv kernel: usb 2-1: device not accepting address 3, error -71
> Apr 20 20:58:44 appsrv kernel: usb 2-1: new full speed USB device using address 4
> Apr 20 20:58:44 appsrv hal.hotplug[18938]: DEVPATH is not set
> Apr 20 20:58:44 appsrv kernel: input: USB HID v1.10 Keyboard [Dell DRAC4] on usb-0000:00:1d.0-1
> Apr 20 20:58:44 appsrv hal.hotplug[18997]: DEVPATH is not set
> Apr 20 20:58:45 appsrv kernel: input: USB HID v1.10 Mouse [Dell DRAC4] on usb-0000:00:1d.0-1
> Apr 21 11:51:12 appsrv syslogd 1.4.1: restart.
> Apr 21 11:51:12 appsrv syslog: syslogd startup succeeded
> Apr 21 11:51:12 appsrv kernel: klogd 1.4.1, log source = /proc/kmsg started.
> Apr 21 11:51:12 appsrv kernel: Linux version 2.6.9-22.ELsmp (bhcompile at porky.build.redhat.com) (gcc version 3.4.4 20050721 (Red Hat 3.4.4-2)) #1 SMP Mon Sep 19 18:32:14 EDT 2005
> Apr 21 11:51:12 appsrv kernel: BIOS-provided physical RAM map:


