RedHat 9 aacraid - system fails under extreme disk IO - Reproducible test case

Russell Stuart rstuart at lubemobile.com.au
Wed Oct 8 20:21:01 CDT 2003


You describe my setup exactly.  I have two machines: one fails within 4
hours and the other works perfectly.

Are you saying then that if I revert the firmware to Build 3157 the
problem will go away?  Is there some reason I should not do this (like
other bugs in 3157)?

On Wed, 2003-10-08 at 23:56, Salyzyn, Mark wrote:
> I have not been able to duplicate this issue, so I am somewhat of a JAFO,
> and am *not* a definitive resource.
> 
> This issue is not just one problem.  The noapic kernel option and turning
> off HyperThreading have resolved some of the reported issues.  Driver
> changes thus far cannot eliminate the problem, but can delay the
> inevitable.  Build 3157 of the Firmware appears to work fine; Build 3170
> fails, but only with certain Seagate 15K rpm U320 drives.
> 
> I may be wrong ... any corrections to my assumptions above would be greatly
> appreciated.
> 
> Sincerely -- Mark Salyzyn
> 
> -----Original Message-----
> From: Thomas Petersen [mailto:tomp at securityminded.net]
> Sent: Tuesday, October 07, 2003 8:52 PM
> To: 'Andrew Mann'
> Cc: linux-poweredge at dell.com; Salyzyn, Mark
> Subject: RE: RedHat 9 aacraid - system fails under extreme disk IO -
> Reproducible test case
> 
> 
> I am pretty disappointed in Dell for failing to follow up on this and
> resolve the issue once and for all.  This is not a new problem, but it is
> Dell's responsibility to rectify it, as they -certify- Redhat on the 2650 --
> regardless of whether it's a hardware or software issue, Dell is responsible
> to their customers.
> 
> If this were an issue on the Microsoft platform, you can bet Dell would have
> worked with Microsoft and issued a patch/update long before it became a
> widespread problem.  I have always been a huge fan of Dell equipment, but
> their failure in this instance to support what they sell is very troubling.
> 
> Don't get me wrong, I will probably purchase Dell servers again in the
> future (though not the 2650), but can anyone name one problem of this
> magnitude, affecting the Microsoft platform and related to Dell hardware,
> that went unresolved for as long as this one has?  System lockups are
> -totally- unacceptable.
> 
> I guess when people start choosing with their checkbooks Dell might wake up.
> 
> Thomas Petersen
> SecurityMinded Technologies 
> 
> >>-----Original Message-----
> >>From: Andrew Mann [mailto:amann at mythicentertainment.com] 
> >>Sent: Tuesday, October 07, 2003 6:20 PM
> >>To: linux-poweredge at dell.com
> >>Cc: mark_salyzyn at adaptec.com
> >>Subject: Re: RedHat 9 aacraid - system fails under extreme 
> >>disk IO - Reproducible test case
> >>
> >>
> >>	Unfortunately we've got a good number of 2550s and 2650s in use,
> >>and replacing the RAID cards isn't ideal.  Mostly we don't have enough
> >>load to cause this problem, but every now and then we do get an
> >>unexplained lockup that pulls someone out of bed at 2 AM.
> >>	I searched back through the reports of this and found some posts
> >>from Mark Salyzyn referencing AAC_NUM_FIB and AAC_NUM_IO_FIB
> >>settings.  The last comment I see is on 9/9/2003:
> >>"I am suggesting that this value be (AAC_NUM_IO_FIB+64), and limited
> >>to below 512 (the maximum number of hardware FIBs the Firmware can
> >>absorb).  I will begin testing the stability and side effects of this
> >>input."
> >>	However, I don't see any followup, nor does the latest patchset
> >>to the 2.4 series seem to contain any modifications in this area (or
> >>2.5 or 2.6 since June 2003).
> >>	Additionally, I've just rebuilt the aacraid module here from the
> >>RedHat SRPM of 2.4.20-20.9 with AAC_NUM_FIB=512 and AAC_NUM_IO_FIB=448,
> >>rebuilt the initrd image and such, and got another crash within
> >>5 minutes of starting the test.
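> >>
> >>	For anyone wanting to repeat that rebuild, here is a minimal
> >>sketch of the kind of define change involved; the header location and
> >>surrounding layout are assumptions on my part, not copied from the
> >>SRPM:
> >>
> >>/* Hypothetical edit to the aacraid driver header (e.g. aacraid.h).
> >> * 448 I/O FIBs plus 64 reserved keeps the total at 512, the maximum
> >> * number of hardware FIBs the firmware can absorb per Mark's note. */
> >>#define AAC_NUM_IO_FIB	448
> >>#define AAC_NUM_FIB	(AAC_NUM_IO_FIB + 64)	/* = 512 */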
> >>
> >>	I also see a note from Mark on 8/27/2003:
> >>-----
> >>There is code that does the following in the driver:
> >>
> >>	scsicmd->result = DID_OK << 16 | COMMAND_COMPLETE << 8
> >>	                  | SAM_STAT_TASK_SET_FULL;
> >>	aac_io_done(scsicmd);
> >>	return -1;
> >>
> >>This is *wrong*: the non-zero return causes the mid-layer to hold
> >>the command in the queue (due to the use of the new error handler),
> >>yet we have *also* completed the command as `BUSY'.  In addition,
> >>because the aac_io_done call relocks io_request_lock, the caller had
> >>to unlock it first, leaving a hole that SMP machines fall into.  By
> >>dropping the result and done calls in these situations, and holding
> >>the locks in the caller of such routines, I believe we will close
> >>this hole.
> >>
> >>....
> >>
> >>I will report back on my tests of these changes, but will need a 
> >>volunteer with kernel compile experience to report on the success in 
> >>resolving this issue in the field *please*.
> >>-----
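> >>
> >>	As a reading aid, here is a minimal sketch of the pattern Mark
> >>describes; the function name and context are hypothetical and are
> >>*not* taken from the actual aacraid source:
> >>
> >>/* Hypothetical queue-full path called from queuecommand context.
> >> * The old code completed the command as BUSY *and* returned non-zero,
> >> * so the mid-layer re-queued a command that had already been
> >> * completed, and aac_io_done() forced an unlock/relock of
> >> * io_request_lock in the caller, opening an SMP race window. */
> >>static int aac_handle_queue_full(Scsi_Cmnd *scsicmd)
> >>{
> >>	/* Do NOT set scsicmd->result or call aac_io_done() here.
> >>	 * Just return non-zero while the caller keeps io_request_lock
> >>	 * held; the SCSI mid-layer will retry the command later, so
> >>	 * the completed-and-requeued hole never opens. */
> >>	return -1;
> >>}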
> >>
> >>	I'm not familiar enough with the aacraid driver or SCSI in
> >>general to gather the code changes necessary.  There also don't
> >>appear to be any followups.
> >>
> >>	Mark, do you have any updates on this?  I can make code changes,
> >>recompile, and run a test case that reliably reveals the problem here
> >>if that's helpful.
> >>
> >>
> >>I can't see the full panic message, but the parts I can see are 
> >>basically (copied by hand):
> >>
> >>CPU 1
> >>EFLAGS: 00010086
> >>
> >>EIP is at rmqueue [kernel] 0x127  (2.4.20-20.9smp)
> >>eax: c0343400    ebx: c03445dc    ecx: 00000000
> >>edx: b6d7ca63    esi: 00000000    edi: c03445d0
> >>ebp: 00038000    esp: ee643e80     ds: 0068
> >>es: 0068  ss: 0068
> >>
> >>Process dd (pid: 956, stack page = ee643000)
> >>
> >>Call trace:   wakeup_kswapd   0xfb (0xee643e90)
> >>               __alloc_pages_limit  0x57
> >>               __alloc_pages        0x101
> >>               generic_file_write   0x394
> >>               ext3_file_write      0x39
> >>               sys_write            0x97
> >>               system_call          0x33
> >>
> >>	Although aacraid isn't directly implicated here, I can reproduce
> >>this on the 2550s and 2650s (aacraid) but not on the 1750s (megaraid).
> >>
> >>Andrew
> >>
> >>Paul Anderson wrote:
> >>
> >>> We had this same issue with our 2650's running AS 2.1.  Don't know
> >>> that this is the best answer, but it is the one that worked for
> >>> us... replace the onboard adapter with a PERC 3/DC (LSI) adapter.
> >>> Make sure that you put it on its own bus; we used slot three.  In 2
> >>> of our 2650's we are even running this with the HBAs for SAN
> >>> connectivity.  That said, our solution is about 2 weeks old, though
> >>> I did run similar tests on the systems after the new install for
> >>> 8 days and was unable to make them crash.
> >>> 
> >>> Paul
> >>> 
> >>> -----Original Message-----
> >>> From: Andrew Mann [mailto:amann at mythicentertainment.com]
> >>> Sent: Tuesday, October 07, 2003 12:47 PM
> >>> To: linux-poweredge at dell.com
> >>> Cc: Matt Domsch; deanna_bonds at adaptec.com; alan at redhat.com
> >>> Subject: RedHat 9 aacraid - system fails under extreme disk IO - 
> >>> Reproducible test case
> >>> 
> >>> 
> >>> 	This has been brought up on the Dell Linux PowerEdge list
> >>> previously, but it doesn't appear that a definitive solution or
> >>> reproducible situation has been presented.  It also seems like the
> >>> previous reports involved both heavy disk IO and heavy network
> >>> traffic, and so the NIC driver was suspect.
> >>> 	Since we have a number of 2550s and 2650s using the onboard
> >>> PERC3/Di RAID controller (aacraid driver), this issue concerns us.
> >>> 
> >>> 	The following script was run with 6 instances at once on two
> >>> 2550s and one 2650.
> >>> 
> >>> 2550 configuration
> >>> 2 x P3 1.2 GHz   kernel: 2.4.20-20.9smp #1 SMP
> >>> 1GB of RAM, 2GB of swap, 2 x 18 GB drives in a RAID 1 configuration
> >>> 
> >>> 2650 configuration
> >>> 2 x Xeon 2.2 GHz   kernel: 2.4.20-20.9smp #1 SMP
> >>> 2GB of RAM, 2GB of swap, 2 x 18 GB drives in a RAID 1 configuration
> >>> Hyperthreading enabled
> >>> 
> >>> 
> >>> 	The 2550s fail within 30 minutes of starting the tests each
> >>> time (tests were run 6 times in a row).  The 2650 failed within
> >>> 2.5 days (only 1 test run due to the duration before failure).  In
> >>> some cases the 2550 displayed a null pointer dereference in the
> >>> kernel.  I'll copy down details next time I can catch it on screen.
> >>> It does not get logged to disk, which doesn't surprise me in this
> >>> situation.  In most cases the screen was blank (due to APM I'd
> >>> guess?).
> >>> 	The systems still respond to pings, but do not respond to
> >>> keyboard actions and do not complete any TCP connections.  These
> >>> systems do not have a graphical desktop installed, and in fact have
> >>> a fairly minimal set of packages installed at all.
> >>> 	I don't know why the 2550 would consistently fail in such a
> >>> brief period while the 2650 would take a much longer time before
> >>> failure.  I've been running the same tests on a 1750 (PERC4/Di -
> >>> Megaraid based) for some days now without a failure.
> >>> 	I plan on testing a non-SMP kernel on the 2550 next - not
> >>> because we can run things that way, but because it may give some
> >>> more clues.
> >>> 
> >>> 	The following script creates a 300 MB file, then rm's it, then
> >>> does it all over again.  For my tests I ran 6 of these concurrently.
> >>> Don't expect the system to respond to much while these are running,
> >>> though I was able to get decent updates from top.
> >>> 	Alter the script as you see fit; I'm no guru with bash
> >>> scripting!
> >>> 
> >>> cat diskgrind.sh
> >>> #!/bin/sh
> >>> 
> >>> 
> >>> MEGS=300
> >>> TOTAL=0
> >>> 
> >>> # Repeatedly write a $MEGS MB file of zeroes to /test, remove it,
> >>> # and report the running total.
> >>> while [ "1" != "0" ]; do
> >>>          dd ibs=1048576 count=$MEGS if=/dev/zero \
> >>>             of=/test/diskgrind.$$ 2>&1 | cat >/dev/null
> >>>          rm -f /test/diskgrind.$$
> >>>          TOTAL=`expr $TOTAL + $MEGS`
> >>>          echo "[$$] Completed $TOTAL megs."
> >>> done
> >>> 
> >>> 
> >>> ./diskgrind.sh &
> >>> ./diskgrind.sh &
> >>> ./diskgrind.sh &
> >>> ./diskgrind.sh &
> >>> ./diskgrind.sh &
> >>> ./diskgrind.sh &
> >>> 
> >>> 
> >>> 
> >>> Andrew
> >>> 
> >>
> >>-- 
> >>Andrew Mann
> >>Systems Administrator
> >>Mythic Entertainment
> >>703-934-0446 x 224
> >>
> 
> 
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> http://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq or search the list archives at http://lists.us.dell.com/htdig/



