FW: PE600sc + RH 7.3, hangs under heavy IDE load?
bryan@flyingiranch.com
Sat Jan 25 18:09:00 CST 2003
I hadn't revisited this problem in quite a long time (see below), but
after upgrading to a more recent kernel (RH 2.4.18-19.7.x #1 via
up2date), I now get the following message when the hang occurs:
------------------------
Message from syslogd at mule at Sat Jan 25 15:34:45 2003 ...
mule kernel: Bank 0: be0000001008081f[0000000000000000] at 0000000000000000
Message from syslogd at mule at Sat Jan 25 15:34:45 2003 ...
mule kernel: CPU 0: Machine Check Exception: 0000000000000004
Message from syslogd at mule at Sat Jan 25 15:34:45 2003 ...
mule kernel: Kernel panic: CPU context corrupt
------------------------
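For anyone trying to read the numbers in that log, here is a minimal sketch that splits the bank 0 value into its fields. It assumes the bank status follows the architectural IA32_MCi_STATUS layout from the Intel manuals, and that the value printed after "Machine Check Exception:" is the MCG_STATUS register; neither is confirmed for this particular CPU. The function and table names are made up for illustration.

```python
# Sketch: split a machine-check bank status value into its fields.
# Bit positions follow the architectural IA32_MCi_STATUS / IA32_MCG_STATUS
# layout; treating this machine's registers as conforming is an assumption.

STATUS_FLAGS = [
    (63, "VAL"),    # register contains a valid error
    (62, "OVER"),   # an earlier error was overwritten
    (61, "UC"),     # error was not corrected by hardware
    (60, "EN"),     # error reporting was enabled
    (59, "MISCV"),  # the MISC register holds valid data
    (58, "ADDRV"),  # the ADDR register holds valid data
    (57, "PCC"),    # processor context corrupt
]

MCG_FLAGS = [(0, "RIPV"), (1, "EIPV"), (2, "MCIP")]


def set_flags(value, table):
    """Return the names of all flags from `table` that are set in `value`."""
    return [name for bit, name in table if (value >> bit) & 1]


def decode_bank_status(status):
    """Break a 64-bit bank status value into flags and error-code fields."""
    return {
        "flags": set_flags(status, STATUS_FLAGS),
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }


if __name__ == "__main__":
    bank0 = decode_bank_status(0xBE0000001008081F)
    print(bank0["flags"])
    print(hex(bank0["model_specific_code"]), hex(bank0["mca_error_code"]))
    print(set_flags(0x4, MCG_FLAGS))
```

Under those assumptions the PCC flag comes out set, which is consistent with the "CPU context corrupt" panic line, and the 0x4 value decodes to a machine check in progress with no valid restart address.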
I really don't believe it's the drive - happens with either of 2 WD
disks I have tried to use in the machine.
Any ideas?
TIA,
Bryan
-----Original Message-----
From: Steve.Boley
Sent: Monday, October 21, 2002 4:33 PM
To: Bryan White
Subject: RE: PE600sc + RH 7.3, hangs under heavy IDE load?
Let me know which kernel you are running and all specifics and I'll see
if I can get a 600 and reproduce your error.
Steve
-----Original Message-----
From: bryan at flyingiranch.com [mailto:bryan at flyingiranch.com]
Sent: Monday, October 21, 2002 5:57 PM
To: Steve_Boley at exchange.dell.com
Subject: RE: PE600sc + RH 7.3, hangs under heavy IDE load?
OK, cool. I am going to have to do this in the evening, so I'll report
back any findings tomorrow.
Again, I appreciate your insight and help...
Bryan
> -----Original Message-----
> From: Steve.Boley [mailto:Steve_Boley at Dell.com]
> Sent: Monday, October 21, 2002 3:52 PM
> To: Bryan White
> Subject: RE: PE600sc + RH 7.3, hangs under heavy IDE load?
>
>
> Yeah, pull the drive that's in there, since it is running the operating
> system, and even if you tar on the other drive it is still accessing the
> original. This requires a reinstall but can isolate whether it is the
> drive or not.
> Steve
>
> -----Original Message-----
> From: bryan at flyingiranch.com [mailto:bryan at flyingiranch.com]
> Sent: Monday, October 21, 2002 5:49 PM
> To: Steve_Boley at exchange.dell.com
> Subject: RE: PE600sc + RH 7.3, hangs under heavy IDE load?
>
>
> The drive is on the primary channel. I do have a spare 40gb WD Caviar
> drive sitting right here in front of me - to sufficiently test this
> issue, am I going to need to boot *without* the now-in-question drive on
> the bus, or can I leave it on the chain and slave the new drive to it? I
> am assuming the former (which is more work, but oh well) in order to do
> a clean test...?
>
> Thanks for the fast info - very helpful indeed. Too bad we already
> migrated our mail to this machine before realizing the stability
> problems...
>
> Bryan
>
> > -----Original Message-----
> > From: Steve.Boley [mailto:Steve_Boley at Dell.com]
> > Sent: Monday, October 21, 2002 3:41 PM
> > To: Bryan White
> > Subject: RE: PE600sc + RH 7.3, hangs under heavy IDE load?
> >
> >
> > Alan is saying he thinks it's something hardware. Do you have a spare
> > IDE drive lying around that you can throw in and load real quick to see
> > if it is possibly the drive that is causing it? Also, that box has the
> > dual-channel IDE as well as the tertiary (3rd channel) controller.
> > Which channel do you have the drive in question on?
> > Steve
> >
> > -----Original Message-----
> > From: bryan at flyingiranch.com [mailto:bryan at flyingiranch.com]
> > Sent: Monday, October 21, 2002 5:37 PM
> > To: Steve_Boley at exchange.dell.com
> > Subject: RE: PE600sc + RH 7.3, hangs under heavy IDE load?
> >
> >
> > Thanks - very much appreciated!
> >
> > Bryan
> >
> > > -----Original Message-----
> > > From: Steve.Boley [mailto:Steve_Boley at Dell.com]
> > > Sent: Monday, October 21, 2002 3:33 PM
> > > To: Bryan White
> > > Subject: RE: PE600sc + RH 7.3, hangs under heavy IDE load?
> > >
> > >
> > > It's an issue with ServerWorks IDE and UDMA that has been known since
> > > Red Hat 7.1. I've got an email out to Alan Cox to find out if anyone has
> > > touched this as far as a kernel source driver patch and will get back
> > > to you as soon as I find something out.
> > > Steve
> > >
> > > -----Original Message-----
> > > From: bryan at flyingiranch.com [mailto:bryan at flyingiranch.com]
> > > Sent: Monday, October 21, 2002 4:36 PM
> > > To: Steve_Boley at exchange.dell.com
> > > Subject: RE: PE600sc + RH 7.3, hangs under heavy IDE load?
> > >
> > >
> > > Hmm. No tape drive in this situation, just the barebones w/ one 20GB
> > > drive... behavior seems the same with UDMA / MDMA2. Do you know of an
> > > up-to-date info source on this "serverworks ide bug"? From the way you
> > > phrase it, it sounds like a known thing...
> > >
> > > It's pretty unacceptable to me if I can't put this machine under even
> > > the load of "tar czf"...!
> > >
> > > Thanks for the response,
> > >
> > > Bryan
> > >
> > > > -----Original Message-----
> > > > From: Steve.Boley [mailto:Steve_Boley at Dell.com]
> > > > Sent: Monday, October 21, 2002 1:36 PM
> > > > To: Bryan White
> > > > Subject: RE: PE600sc + RH 7.3, hangs under heavy IDE load?
> > > >
> > > >
> > > > Probably would be better to leave udma enabled for the hard drive and
> > > > disable for the tape drive if it's ide tbu. Otherwise it looks like the
> > > > serverworks ide bug has got ya good.
> > > > Steve
> > > >
> > > > -----Original Message-----
> > > > From: bryan at flyingiranch.com [mailto:bryan at flyingiranch.com]
> > > > Sent: Monday, October 21, 2002 3:24 PM
> > > > To: rappleye at cse.Buffalo.EDU
> > > > Cc: linux-poweredge at exchange.dell.com
> > > > Subject: RE: PE600sc + RH 7.3, hangs under heavy IDE load?
> > > >
> > > >
> > > > Unfortunately, about 14 hours after my initial positive report re: the
> > > > switch to MDMA2 mode, I experienced another hard hang during a tar +
> > > > gzip of a large archive containing some particularly large individual
> > > > files.
> > > >
> > > > Is anyone else experiencing system hangs with the 600sc (or other
> > > > systems with the ServerWorks chipset) during intense IDE operation?
> > > >
> > > > FYI, just in case, I am pasting (below) the output of "hdparm -I -v
> > > > /dev/hda"...
> > > >
> > > > Bryan
> > > >
> > > > [root at mule root]# hdparm -I -v /dev/hda
> > > >
> > > > /dev/hda:
> > > > multcount = 16 (on)
> > > > I/O support = 0 (default 16-bit)
> > > > unmaskirq = 0 (off)
> > > > using_dma = 1 (on)
> > > > keepsettings = 0 (off)
> > > > nowerr = 0 (off)
> > > > readonly = 0 (off)
> > > > readahead = 8 (on)
> > > > geometry = 2431/255/63, sectors = 39062500, start = 0
> > > >
> > > > non-removable ATA device, with non-removable media
> > > > Model Number: WDC WD200BB-18DEA0
> > > >
> > > > Serial Number: WD-WMAD21182988
> > > > Firmware Revision: 05.03E05
> > > > Standards:
> > > > Supported: 1 2 3 4 5
> > > > Likely used: 5
> > > > Configuration:
> > > > Logical max current
> > > > cylinders 16383 16383
> > > > heads 16 16
> > > > sectors/track 63 63
> > > > bytes/track: 57600 (obsolete)
> > > > bytes/sector: 600 (obsolete)
> > > > current sector capacity: 16514064
> > > > LBA user addressable sectors = 39062500
> > > > Capabilities:
> > > > LBA, IORDY(can be disabled)
> > > > Buffer size: 2048.0kB ECC bytes: 40 Queue depth: 1
> > > > Standby timer values: spec'd by standard, with device specific minimum
> > > > r/w multiple sector transfer: Max = 16 Current = 16
> > > > DMA: mdma0 mdma1 *mdma2 udma0 udma1 udma2 udma3 udma4 udma5
> > > > Cycle time: min=120ns recommended=120ns
> > > > PIO: pio0 pio1 pio2 pio3 pio4
> > > > Cycle time: no flow control=120ns IORDY flow control=120ns
> > > > Commands/features:
> > > > Enabled Supported:
> > > > * READ BUFFER cmd
> > > > * WRITE BUFFER cmd
> > > > * Host Protected Area feature set
> > > > * look-ahead
> > > > * write cache
> > > > * Power Management feature set
> > > > * SMART feature set
> > > > SET MAX security extension
> > > > * DOWNLOAD MICROCODE cmd
> > > > HW reset results:
> > > > CBLID- above Vih
> > > > Device num = 0 determined by CSEL
> > > > Checksum: correct
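As a quick cross-check of the hdparm output quoted above, the following is a rough sketch that pulls the DMA mode list and the LBA sector count out of the text. It assumes the "DMA:" entry is on a single line, that the asterisk marks the currently selected mode, and that sectors are 512 bytes; the function names are hypothetical and other hdparm versions may format these lines differently.

```python
# Sketch: extract DMA modes and capacity from "hdparm -I" text.
# Line formats are taken from the output quoted in this thread; the
# 512-byte sector size is an assumption.
import re

SECTOR_BYTES = 512


def parse_dma_modes(text):
    """Return (supported_modes, active_mode) from the 'DMA:' line."""
    match = re.search(r"DMA:\s*(.+)", text)
    if match is None:
        return [], None
    tokens = match.group(1).split()
    supported = [tok.lstrip("*") for tok in tokens]
    # The active mode is the token carrying a leading asterisk, if any.
    active = next((tok[1:] for tok in tokens if tok.startswith("*")), None)
    return supported, active


def parse_capacity_bytes(text):
    """Return the drive capacity in bytes from the LBA sector count."""
    match = re.search(r"LBA user addressable sectors\s*=\s*(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)) * SECTOR_BYTES


if __name__ == "__main__":
    sample = (
        "LBA user addressable sectors = 39062500\n"
        "DMA: mdma0 mdma1 *mdma2 udma0 udma1 udma2 udma3 udma4 udma5\n"
    )
    print(parse_dma_modes(sample))
    print(parse_capacity_bytes(sample))
```

With the values from this drive, the active mode comes out as mdma2 (matching the reported switch away from UDMA) and the capacity as 20,000,000,000 bytes, which agrees with the 20GB drive mentioned earlier in the thread.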