FW: PE600sc + RH 7.3, hangs under heavy IDE load?

bryan@flyingiranch.com bryan at flyingiranch.com
Sat Jan 25 18:09:00 CST 2003


I hadn't revisited this problem in quite a long time (see below), but 
after upgrading to a more recent kernel (RH 2.4.18-19.7.x #1 via 
up2date), I now get the following message when the hang occurs:

------------------------
Message from syslogd@mule at Sat Jan 25 15:34:45 2003 ...
mule kernel: Bank 0: be0000001008081f[0000000000000000] at 0000000000000000

Message from syslogd@mule at Sat Jan 25 15:34:45 2003 ...
mule kernel: CPU 0: Machine Check Exception: 0000000000000004

Message from syslogd@mule at Sat Jan 25 15:34:45 2003 ...
mule kernel: Kernel panic: CPU context corrupt
------------------------
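
In case it helps anyone read the numbers, here is a rough sketch of how the
Bank 0 status word and the Machine Check Exception value might be decoded. It
assumes the Intel IA32_MCi_STATUS / IA32_MCG_STATUS bit layouts apply to this
CPU, which I have not confirmed for this box; the function names are my own
and purely illustrative.

```python
# Sketch only: decode the values from the panic message, ASSUMING the Intel
# IA32_MCi_STATUS and IA32_MCG_STATUS register layouts (not confirmed here).

# High-order flag bits of a bank status register (assumed Intel layout).
STATUS_FLAGS = {
    63: "VAL",    # register holds a valid error
    62: "OVER",   # a previous error was overwritten
    61: "UC",     # error was not corrected
    60: "EN",     # reporting was enabled for this error
    59: "MISCV",  # MISC register flagged as valid
    58: "ADDRV",  # ADDR register flagged as valid
    57: "PCC",    # processor context corrupt
}

# Low-order bits of the global machine-check status (assumed Intel layout).
MCG_FLAGS = {0: "RIPV", 1: "EIPV", 2: "MCIP"}


def decode_mci_status(status):
    """Split a 64-bit bank status word into flags and error codes."""
    flags = [name
             for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
             if (status >> bit) & 1]
    return {
        "flags": flags,
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }


def decode_mcg_status(value):
    """List the global machine-check status flags that are set."""
    return [name for bit, name in sorted(MCG_FLAGS.items())
            if (value >> bit) & 1]


if __name__ == "__main__":
    bank0 = decode_mci_status(0xBE0000001008081F)
    print(bank0["flags"])
    print(hex(bank0["model_specific_code"]), hex(bank0["mca_error_code"]))
    print(decode_mcg_status(0x4))
```

Under that assumption, the PCC flag comes out set, which would be consistent
with the "CPU context corrupt" panic, and the low 16 bits (0x081f) appear to
fall into the bus/interconnect error family rather than anything that names
the disk itself.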

I really don't believe it's the drive - happens with either of 2 WD 
disks I have tried to use in the machine.

Any ideas?

TIA,

Bryan

-----Original Message-----
From: Steve.Boley 	
Sent: Monday, October 21, 2002 4:33 PM
To: Bryan White
Subject: RE: PE600sc + RH 7.3, hangs under heavy IDE load?


Let me know which kernel you are running and all specifics and I'll see
if I can get a 600 and reproduce your error.
Steve

-----Original Message-----
From: bryan at flyingiranch.com [mailto:bryan at flyingiranch.com]
Sent: Monday, October 21, 2002 5:57 PM
To: Steve_Boley at exchange.dell.com
Subject: RE: PE600sc + RH 7.3, hangs under heavy IDE load?


OK, cool. I am going to have to do this in the evening, so I'll report
back any findings tomorrow.

Again, I appreciate your insight and help...

Bryan

> -----Original Message-----
> From: Steve.Boley [mailto:Steve_Boley at Dell.com]
> Sent: Monday, October 21, 2002 3:52 PM
> To: Bryan White
> Subject: RE: PE600sc + RH 7.3, hangs under heavy IDE load?
> 
> 
> Yeah, pull the drive that's in there, since it is running the operating
> system and even if you tar on the other drive it is still accessing the
> original.  This requires a reinstall but can isolate whether it is the
> drive or not.
> Steve
> 
> -----Original Message-----
> From: bryan at flyingiranch.com [mailto:bryan at flyingiranch.com]
> Sent: Monday, October 21, 2002 5:49 PM
> To: Steve_Boley at exchange.dell.com
> Subject: RE: PE600sc + RH 7.3, hangs under heavy IDE load?
> 
> 
> The drive is on the primary channel. I do have a spare 40gb WD Caviar
> drive sitting right here in front of me - to sufficiently test this
> issue, am I going to need to boot *without* the now-in-question drive on
> the bus, or can I leave it on the chain and slave the new drive to it? I
> am assuming the former (which is more work, but oh well) in order to do
> a clean test...?
> 
> Thanks for the fast info - very helpful indeed. Too bad we already
> migrated our mail to this machine before realizing the stability
> problems...
> 
> Bryan
> 
> > -----Original Message-----
> > From: Steve.Boley [mailto:Steve_Boley at Dell.com]
> > Sent: Monday, October 21, 2002 3:41 PM
> > To: Bryan White
> > Subject: RE: PE600sc + RH 7.3, hangs under heavy IDE load?
> > 
> > 
> > Alan is saying he thinks it's something hardware.  Do you have a spare
> > ide drive lying around that you can throw in and load real quick to see
> > if it is possibly the drive that is causing it?  Also, that box has the
> > dual channel ide as well as the tertiary (3rd channel) controller.
> > Which channel do you have the drive in question on?
> > Steve
> > 
> > -----Original Message-----
> > From: bryan at flyingiranch.com [mailto:bryan at flyingiranch.com]
> > Sent: Monday, October 21, 2002 5:37 PM
> > To: Steve_Boley at exchange.dell.com
> > Subject: RE: PE600sc + RH 7.3, hangs under heavy IDE load?
> > 
> > 
> > Thanks - very much appreciated!
> > 
> > Bryan
> > 
> > > -----Original Message-----
> > > From: Steve.Boley [mailto:Steve_Boley at Dell.com]
> > > Sent: Monday, October 21, 2002 3:33 PM
> > > To: Bryan White
> > > Subject: RE: PE600sc + RH 7.3, hangs under heavy IDE load?
> > > 
> > > 
> > > It's an issue with serverworks ide and udma that has been known since
> > > redhat 7.1.  I've got an email out to Alan Cox to find out if anyone
> > > has touched this as far as a kernel source driver patch and will get
> > > back to you as soon as I find something out.
> > > Steve
> > > 
> > > -----Original Message-----
> > > From: bryan at flyingiranch.com [mailto:bryan at flyingiranch.com]
> > > Sent: Monday, October 21, 2002 4:36 PM
> > > To: Steve_Boley at exchange.dell.com
> > > Subject: RE: PE600sc + RH 7.3, hangs under heavy IDE load?
> > > 
> > > 
> > > Hmm. No tape drive in this situation, just the barebones w/ one 20GB
> > > drive... behavior seems the same with UDMA / MDMA2. Do you know of an
> > > up-to-date info source on this "serverworks ide bug"? From the way you
> > > phrase it, it sounds like a known thing...
> > > 
> > > It's pretty unacceptable to me if I can't put this machine under even
> > > the load of "tar czf"...!
> > > 
> > > Thanks for the response,
> > > 
> > > Bryan
> > > 
> > > > -----Original Message-----
> > > > From: Steve.Boley [mailto:Steve_Boley at Dell.com]
> > > > Sent: Monday, October 21, 2002 1:36 PM
> > > > To: Bryan White
> > > > Subject: RE: PE600sc + RH 7.3, hangs under heavy IDE load?
> > > > 
> > > > 
> > > > Probably would be better to leave udma enabled for the hard drive and
> > > > disable for the tape drive if it's ide tbu.  Otherwise it looks like
> > > > the serverworks ide bug has got ya good.
> > > > Steve
> > > > 
> > > > -----Original Message-----
> > > > From: bryan at flyingiranch.com [mailto:bryan at flyingiranch.com]
> > > > Sent: Monday, October 21, 2002 3:24 PM
> > > > To: rappleye at cse.Buffalo.EDU
> > > > Cc: linux-poweredge at exchange.dell.com
> > > > Subject: RE: PE600sc + RH 7.3, hangs under heavy IDE load?
> > > > 
> > > > 
> > > > Unfortunately, about 14 hours after my initial positive report re: the
> > > > switch to MDMA2 mode, I experienced another hard hang during a tar +
> > > > gzip of a large archive containing some particularly large individual
> > > > files.
> > > > 
> > > > Is anyone else experiencing system hangs with the 600sc (or other
> > > > system with ServerWorks chipset) during intense IDE operation?
> > > > 
> > > > FYI, just in case, I am pasting (below) the output of "hdparm -I -v
> > > > /dev/hda"...
> > > > 
> > > > Bryan
> > > > 
> > > > [root@mule root]# hdparm -I -v /dev/hda
> > > > 
> > > > /dev/hda:
> > > >  multcount    = 16 (on)
> > > >  I/O support  =  0 (default 16-bit)
> > > >  unmaskirq    =  0 (off)
> > > >  using_dma    =  1 (on)
> > > >  keepsettings =  0 (off)
> > > >  nowerr       =  0 (off)
> > > >  readonly     =  0 (off)
> > > >  readahead    =  8 (on)
> > > >  geometry     = 2431/255/63, sectors = 39062500, start = 0
> > > > 
> > > > non-removable ATA device, with non-removable media
> > > >         Model Number:           WDC WD200BB-18DEA0
> > > >         Serial Number:          WD-WMAD21182988
> > > >         Firmware Revision:      05.03E05
> > > > Standards:
> > > >         Supported: 1 2 3 4 5 
> > > >         Likely used: 5
> > > > Configuration:
> > > >         Logical         max     current
> > > >         cylinders       16383   16383
> > > >         heads           16      16
> > > >         sectors/track   63      63
> > > >         bytes/track:    57600           (obsolete)
> > > >         bytes/sector:   600             (obsolete)
> > > >         current sector capacity: 16514064
> > > >         LBA user addressable sectors = 39062500
> > > > Capabilities:
> > > >         LBA, IORDY(can be disabled)
> > > >         Buffer size: 2048.0kB   ECC bytes: 40   Queue depth: 1
> > > >         Standby timer values: spec'd by standard, with device specific minimum
> > > >         r/w multiple sector transfer: Max = 16  Current = 16
> > > >         DMA: mdma0 mdma1 *mdma2 udma0 udma1 udma2 udma3 udma4 udma5
> > > >              Cycle time: min=120ns recommended=120ns
> > > >         PIO: pio0 pio1 pio2 pio3 pio4 
> > > >              Cycle time: no flow control=120ns  IORDY flow control=120ns
> > > > Commands/features:
> > > >         Enabled Supported:
> > > >            *    READ BUFFER cmd
> > > >            *    WRITE BUFFER cmd
> > > >            *    Host Protected Area feature set
> > > >            *    look-ahead
> > > >            *    write cache
> > > >            *    Power Management feature set
> > > >            *    SMART feature set
> > > >                 SET MAX security extension
> > > >            *    DOWNLOAD MICROCODE cmd
> > > > HW reset results:
> > > >         CBLID- above Vih
> > > >         Device num = 0 determined by CSEL
> > > > Checksum: correct
> > > > 