RedHat 9 aacraid - system fails under extreme disk IO - Reproducable test case

Russell Stuart rstuart at lubemobile.com.au
Mon Oct 27 15:07:01 CST 2003


Just speaking for myself - no, I didn't do anything
with the firmware.  It was a backup server that was
causing problems.  If it had been my production server
I might have been desperate enough to try that.

On Mon, 2003-10-27 at 19:36, Joe Stevens wrote:
> Did we ever confirm whether there is a big problem with just running firmware
> 3157 until the fix is released?
> 
> 
> Russell Stuart wrote:
> 
> > Just to confirm, your workaround works.  I have been thrashing it for a
> > week.  No failures.  It would have failed by now under all the other tests I
> > have done.  SMP, SMT and RAID write caching are enabled.  There is one
> > odd message in /var/log/messages:
> > 
> > Oct 25 10:03:17 mephisto kernel: aacraid: Host adapter reset request. SCSI hang ?
> > Oct 25 10:03:17 mephisto kernel: aacraid: Outstanding commands on (0,0,0,0):
> > Oct 25 10:03:17 mephisto kernel:    0 C  2a 00 00 00 00 6f 00 00 08 00
> > Oct 25 10:03:17 mephisto kernel:    1 A* 2a 00 03 20 31 77 00 00 60 00
> > Oct 25 10:03:17 mephisto kernel:    2 C  2a 00 00 08 00 6f 00 00 08 00
> > Oct 25 10:03:17 mephisto kernel:    3 C  2a 00 00 00 00 47 00 00 08 00
> > Oct 25 10:03:17 mephisto kernel:    4 A  2a 00 02 46 e6 a7 00 00 80 00
> > Oct 25 10:03:17 mephisto kernel:    5 A  28 00 01 10 2a bf 00 00 08 00
> > Oct 25 10:03:17 mephisto kernel:    6 C  2a 00 00 10 00 87 00 00 10 00
> > Oct 25 10:03:17 mephisto kernel:    7 A  2a 00 02 46 e7 27 00 00 20 00
> > Oct 25 10:03:17 mephisto kernel:    8 C  2a 00 00 10 00 6f 00 00 10 00
> > Oct 25 10:03:17 mephisto kernel:    9 C  2a 00 02 c0 00 87 00 00 10 00
> > 
> > So thanks.  Best of luck with the firmware bug.
> > 
> > On Mon, 2003-10-20 at 23:33, Salyzyn, Mark wrote:
> > 
> >>This also deals with the long standing thread (Since April)
> >>
> >>Subject: kernel: aacraid: Host adapter reset request. SCSI hang ?
> >>
> >>We have a driver workaround!!!
> >>
> >>The root cause is traced to `something' triggering the Firmware to flush
> >>its cache at too high a priority, causing the adapter to be reticent on
> >>new commands until the flush has completed. That something could be
> >>management applications, device misbehavior, bus conditions or the position
> >>of the moon. I have clocked a worst case of 73 seconds where the adapter is
> >>too busy. A Firmware fix is in the works, but given the longer lead times
> >>for acceptance of new Firmware and Driver packaging, I am providing a driver
> >>source workaround for those who are experiencing this problem and have the
> >>savvy to build their own driver modules. Not all Firmware variants have this
> >>problem, so taking this driver is optional. This driver has only been unit
> >>tested on a handful of systems.
> >>
> >>The workaround is to wait up to an additional 60 seconds until all commands
> >>are complete in the aac_eh_reset handler, effectively waiting for the firmware
> >>to complete the cache flush. Upon return, the error recovery code in the
> >>SCSI layer can then issue its test unit ready and get a timely enough
> >>response so that it does not take the device(s) offline.
> >>
> >>Sincerely -- Mark Salyzyn
> >>
> >>-----Original Message-----
> >>From: Russell Stuart [mailto:rstuart at lubemobile.com.au]
> >>Sent: Wednesday, October 08, 2003 9:19 PM
> >>To: Salyzyn, Mark
> >>Cc: linux-poweredge at dell.com
> >>Subject: RE: RedHat 9 aacraid - system fails under extreme disk IO -
> >>Reproducable test case
> >>
> >>
> >>You describe my setup exactly.  I have two machines, one fails within 4
> >>hours and the other works perfectly.
> >>
> >>Are you saying then that if I revert the firmware to Build 3157 the
> >>problem will go away?  Is there some reason I should not do this (like
> >>other bugs in 3157)?
> >>
> >>On Wed, 2003-10-08 at 23:56, Salyzyn, Mark wrote:
> >>
> >>>I have not been able to duplicate this issue, so I am somewhat of a JAFO,
> >>>and am *not* a definitive resource.
> >>>
> >>>This issue is not just one problem. The noapic kernel option and turning off
> >>>HyperThreading have resolved some of the reported issues. Driver changes
> >>>thus far cannot eliminate the problem, but can delay the inevitable. Build
> >>>3157 of the Firmware appears to work fine; Build 3170 fails, but only with
> >>>certain Seagate 15K rpm U320 drives.
> >>>
> >>>I may be wrong ... any corrections to my assumptions above would be greatly
> >>>appreciated.
> >>>
> >>>Sincerely -- Mark Salyzyn
> >>>
> >>>-----Original Message-----
> >>>From: Thomas Petersen [mailto:tomp at securityminded.net]
> >>>Sent: Tuesday, October 07, 2003 8:52 PM
> >>>To: 'Andrew Mann'
> >>>Cc: linux-poweredge at dell.com; Salyzyn, Mark
> >>>Subject: RE: RedHat 9 aacraid - system fails under extreme disk IO -
> >>>Reproducable test case
> >>>
> >>>
> >>>I am pretty disappointed in Dell for failing to follow up on this and
> >>>resolve the issue once and for all.  This is not a new problem but it is
> >>>Dell's responsibility to rectify it as they -certify- Redhat on the 2650 --
> >>>regardless of whether it's a hardware or software issue, Dell is responsible
> >>>to their customers.
> >>>
> >>>If this were an issue on the Microsoft platform you can bet Dell would have
> >>>worked with Microsoft and issued a patch/update long before it became a
> >>>widespread problem.  I have always been a huge fan of Dell equipment but their
> >>>failure in this instance to support what they sell is very troubling.
> >>>
> >>>Don't get me wrong, I will probably purchase Dell servers again in the future
> >>>(though not the 2650), but can anyone name one problem of this magnitude,
> >>>affecting the Microsoft platform and related to Dell hardware, that went
> >>>unresolved for as long as this one has?  System lockups
> >>>are -totally- unacceptable.
> >>>
> >>>I guess when people start choosing with their checkbooks Dell might wake up.
> >>
> >>>Thomas Petersen
> >>>SecurityMinded Technologies 
> >>>
> >>>
> >>>>>-----Original Message-----
> >>>>>From: Andrew Mann [mailto:amann at mythicentertainment.com] 
> >>>>>Sent: Tuesday, October 07, 2003 6:20 PM
> >>>>>To: linux-poweredge at dell.com
> >>>>>Cc: mark_salyzyn at adaptec.com
> >>>>>Subject: Re: RedHat 9 aacraid - system fails under extreme 
> >>>>>disk IO - Reproducable test case
> >>>>>
> >>>>>
> >>>>>	Unfortunately we've got a good number of 2550s and 2650s in use, and
> >>>>>replacing the RAID cards isn't ideal.  Mostly we don't have enough load
> >>>>>to cause this problem, but every now and then we do get an unexplained
> >>>>>lockup that pulls someone out of bed at 2 AM.
> >>>>>	I searched back through the reports of this and found some posts from
> >>>>>Mark Salyzyn referencing AAC_NUM_FIB and AAC_NUM_IO_FIB settings.  The
> >>>>>last comment I see is on 9/9/2003:
> >>>>>"I am suggesting that this value be (AAC_NUM_IO_FIB+64), and limited to
> >>>>>below 512 (the maximum number of hardware FIBS the Firmware can absorb).
> >>>>>I will begin testing the stability and side effects of this input."
> >>>>>	However, I don't see any followup, nor does the latest patchset to the
> >>>>>2.4 series seem to contain any modifications in this area (or 2.5 or 2.6
> >>>>>since June 2003).
> >>>>>	Additionally, I've just rebuilt the aacraid module here from the RedHat
> >>>>>SRPM of 2.4.20-20.9 with AAC_NUM_FIB=512 and AAC_NUM_IO_FIB=448, rebuilt
> >>>>>the rdimage and such and got another crash within 5 minutes of starting
> >>>>>the test.
> >>>>>
> >>>>>	I also see a note from Mark on 8/27/2003:
> >>>>>-----
> >>>>>There is code that does the following in the driver:
> >>>>>
> >>>>>	scsicmd->result = DID_OK << 16 | COMMAND_COMPLETE << 8 | SAM_STAT_TASK_SET_FULL;
> >>>>>	aac_io_done(scsicmd);
> >>>>>	return -1;
> >>>>>
> >>>>>This is *wrong*, because the nonzero return causes the system to hold
> >>>>>the command in the queue due to the use of the new error handler, yet we
> >>>>>have also completed the command as `BUSY' *and*, as a result of the
> >>>>>constraints of the aac_io_done call which relocks (on io_request_lock),
> >>>>>the caller had to unlock, leaving a hole that SMP machines fill. By
> >>>>>dropping the result and done calls in these situations, and holding the
> >>>>>locks in the caller of such routines, I believe we will close this hole.
> >>>>>
> >>>>>....
> >>>>>
> >>>>>I will report back on my tests of these changes, but will need a 
> >>>>>volunteer with kernel compile experience to report on the success in 
> >>>>>resolving this issue in the field *please*.
> >>>>>-----
> >>>>>
> >>>>>	I'm not familiar enough with the aacraid driver or scsi in general to
> >>>>>gather the code changes necessary.  There also don't appear to be any
> >>>>>followups.
> >>>>>
> >>>>>	Mark, do you have any updates on this?  I can make code changes,
> >>>>>recompile, and run a test case that reliably reveals the problem here if
> >>>>>that's helpful.
> >>>>>
> >>>>>
> >>>>>I can't see the full panic message, but the parts I can see are 
> >>>>>basically (copied by hand):
> >>>>>
> >>>>>CPU 1
> >>>>>EFLAGS: 00010086
> >>>>>
> >>>>>EIP is at rmqueue [kernel] 0x127  (2.4.20-20.9smp)
> >>>>>eax: c0343400    ebx: c03445dc    ecx: 00000000
> >>>>>edx: b6d7ca63    esi: 00000000    edi: c03445d0
> >>>>>ebp: 00038000    esp: ee643e80     ds: 0068
> >>>>>es: 0068  ss: 0068
> >>>>>
> >>>>>Process dd (pid: 956, stack page = ee643000)
> >>>>>
> >>>>>Call trace:   wakeup_kswapd   0xfb (0xee643e90)
> >>>>>              __alloc_pages_limit  0x57
> >>>>>              __alloc_pages        0x101
> >>>>>              generic_file_write   0x394
> >>>>>              ext3_file_write      0x39
> >>>>>              sys_write            0x97
> >>>>>              system_call          0x33
> >>>>>
> >>>>>	Although aacraid isn't directly implicated here, I can reproduce this
> >>>>>on the 2550s and 2650s (aacraid) but not 1750s (megaraid).
> >>>>>
> >>>>>Andrew
> >>>>>
> >>>>>Paul Anderson wrote:
> >>>>>
> >>>>>
> >>>>>>We had this same issue with our 2650s running AS 2.1.  Don't know
> >>>>>>that this is the best answer, but it is the one that worked for
> >>>>>>us...Replace the on board adapter with a PERC 3/DC (LSI) adapter.
> >>>>>>Make sure that you put it on its own bus; we used slot three.  In 2 of
> >>>>>>our 2650s we are even running this with the HBAs for SAN
> >>>>>>connectivity.  That said, our solution is about 2 weeks old, though I
> >>>>>>did run similar tests on the systems after the new install for 8 days
> >>>>>>and was unable to make them crash.
> >>>>>>
> >>>>>>Paul
> >>>>>>
> >>>>>>-----Original Message-----
> >>>>>>From: Andrew Mann [mailto:amann at mythicentertainment.com]
> >>>>>>Sent: Tuesday, October 07, 2003 12:47 PM
> >>>>>>To: linux-poweredge at dell.com
> >>>>>>Cc: Matt Domsch; deanna_bonds at adaptec.com; alan at redhat.com
> >>>>>>Subject: RedHat 9 aacraid - system fails under extreme disk IO - 
> >>>>>>Reproducable test case
> >>>>>>
> >>>>>>
> >>>>>>	This has been brought up on the Dell Linux Poweredge list previously,
> >>>>>>but it doesn't appear that a definitive solution or reproducible
> >>>>>>situation has been presented.  It also seems like the previous reports
> >>>>>>involved both heavy disk IO as well as heavy network traffic, and so the
> >>>>>>NIC driver was suspect.
> >>>>>>	Since we have a number of 2550s and 2650s using the onboard PERC3/Di
> >>>>>>raid controller (aacraid driver), this issue concerns us.
> >>>>>>
> >>>>>>	The following script was run with 6 instances at once on two 2550s
> >>>>>>and one 2650.
> >>>>>>
> >>>>>>2550 configuration
> >>>>>>2 x P3 1.2 GHz  kernel: 2.4.20-20.9smp #1 SMP
> >>>>>>1GB of RAM, 2GB of swap, 2 x 18 GB drives in a raid 1 configuration
> >>>>>>
> >>>>>>2650 configuration
> >>>>>>2 x Xeon 2.2 GHz   kernel: 2.4.20-20.9smp #1 SMP
> >>>>>>2GB of RAM, 2GB of swap, 2 x 18 GB drives in a raid 1 configuration
> >>>>>>Hyperthreading enabled
> >>>>>>
> >>>>>>
> >>>>>>	The 2550s fail within 30 minutes of starting the tests each time
> >>>>>>(tests were run 6 times in a row).  The 2650 failed in under 2.5 days
> >>>>>>(only 1 test run due to duration before failure).  In some cases the 2550
> >>>>>>displayed a null pointer dereference in the kernel.  I'll copy down
> >>>>>>details next time I can catch it on screen.  It does not get logged to
> >>>>>>disk, which doesn't surprise me in this situation.  In most cases the
> >>>>>>screen was blank (due to APM I'd guess?).
> >>>>>>	The systems still respond to pings, but do not respond to keyboard
> >>>>>>actions and do not complete any tcp connections.  These systems do not
> >>>>>>have a graphical desktop installed, and in fact have a fairly minimal
> >>>>>>set of packages installed at all.
> >>>>>>	I don't know why the 2550 would consistently fail in such a brief
> >>>>>>period while the 2650 would take a much longer time before failure.
> >>>>>>I've been running the same tests on a 1750 (PERC4/Di - Megaraid based)
> >>>>>>for some days now without a failure.
> >>>>>>	I plan on testing a non-SMP kernel on the 2550 next - not because we
> >>>>>>can run things that way, but to maybe give some more clues.
> >>>>>>
> >>>>>>	The following script creates a 300 MB file, then rm's it, then does it
> >>>>>>all over again.  For my tests I ran 6 of these concurrently.  Don't
> >>>>>>expect the system to respond to much while these are running, though I
> >>>>>>was able to get decent updates from top.
> >>>>>>	Alter the script as you see fit, I'm no guru with bash scripting!
> >>>>>
> >>>>>>cat diskgrind.sh
> >>>>>>#!/bin/sh
> >>>>>>
> >>>>>>
> >>>>>>MEGS=300
> >>>>>>TOTAL=0
> >>>>>>
> >>>>>>while [ "1" != "0" ]; do
> >>>>>>         dd ibs=1048576 count=$MEGS if=/dev/zero of=/test/diskgrind.$$ 2>&1 | cat >/dev/null
> >>>>>>         rm -f /test/diskgrind.$$
> >>>>>>         TOTAL=`expr $TOTAL + $MEGS`
> >>>>>>         echo "[$$] Completed $TOTAL megs."
> >>>>>>done
> >>>>>>
> >>>>>>
> >>>>>>./diskgrind.sh &
> >>>>>>./diskgrind.sh &
> >>>>>>./diskgrind.sh &
> >>>>>>./diskgrind.sh &
> >>>>>>./diskgrind.sh &
> >>>>>>./diskgrind.sh &
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>Andrew
> >>>>>>
> >>>>>
> >>>>>-- 
> >>>>>Andrew Mann
> >>>>>Systems Administrator
> >>>>>Mythic Entertainment
> >>>>>703-934-0446 x 224
> >>>>>
> >>>>>_______________________________________________
> >>>>>Linux-PowerEdge mailing list
> >>>>>Linux-PowerEdge at dell.com
> >>>>>http://lists.us.dell.com/mailman/listinfo/linux-poweredge
> >>>>>Please read the FAQ at http://lists.us.dell.com/faq or search the list
> >>>>>archives at http://lists.us.dell.com/htdig/
> >>>
> >>>
> 
> _______________________________________________
> Linux-aacraid-devel mailing list
> Linux-aacraid-devel at dell.com
> http://lists.us.dell.com/mailman/listinfo/linux-aacraid-devel
> Please read the FAQ at http://lists.us.dell.com/faq or search the list archives at http://lists.us.dell.com/htdig/