RedHat 9 aacraid - system fails under extreme disk IO - Reproducible test case

McDougall, Marshall (FSH) MarMcDouga at gov.mb.ca
Wed Oct 8 14:13:00 CDT 2003


If your question is asked purely within the limited context of this thread
then one would have to conclude that, yes, ES is better than 8/9 :-}  I am
in the process of running the same tests on a 7.3 box.  We'll see what it
looks like tomorrow.

Regards, Marshall

-----Original Message-----

From: Nick Nelson [mailto:nick at lunarpages.com] 
Sent: Wednesday, October 08, 2003 1:34 PM
To: McDougall, Marshall (FSH)
Cc: 'Salyzyn, Mark'; 'tomp at securityminded.net'; 'Andrew Mann';
linux-poweredge at dell.com
Subject: RE: RedHat 9 aacraid - system fails under extreme disk IO - Reproducible test case



> I ran 14 iterations of Andrew's script on one of my 2550s for about 20
> hours before I stopped it.  I ran it on a newly installed RHES2.1 with the
> 2.4.9-e.27smp kernel.  I have the 3/DI controller V2.7-1 build 3571 with
> mirrored 18 GB drives.

So in conclusion, RHES2.1 is significantly better than RH8/9?

> Regards, Marshall
>
> -----Original Message-----
> From: Salyzyn, Mark [mailto:mark_salyzyn at adaptec.com]
> Sent: Wednesday, October 08, 2003 8:57 AM
> To: 'tomp at securityminded.net'; 'Andrew Mann'
> Cc: linux-poweredge at dell.com
> Subject: RE: RedHat 9 aacraid - system fails under extreme disk IO - Reproducible test case
>
>
> I have not been able to duplicate this issue, so I am somewhat of a JAFO,
> and am *not* a definitive resource.
>
> This issue is not just one problem. The noapic kernel option and turning
> off HyperThreading have resolved some of the reported issues. Driver
> changes thus far cannot eliminate the problem, but they can delay the
> inevitable. Build 3157 of the firmware appears to work fine; Build 3170
> fails, but only with certain Seagate 15K rpm U320 drives.
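>
> (For anyone wanting to try the noapic workaround: it is passed on the
> kernel line of the GRUB boot entry.  The kernel version and root= value
> below are only examples -- use whatever your grub.conf already shows.)
>
> title Red Hat Linux (2.4.20-20.9smp, noapic)
>         root (hd0,0)
>         kernel /vmlinuz-2.4.20-20.9smp ro root=LABEL=/ noapic
>         initrd /initrd-2.4.20-20.9smp.img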
>
> I may be wrong ... any corrections to my assumptions above would be
> greatly
> appreciated.
>
> Sincerely -- Mark Salyzyn
>
> -----Original Message-----
> From: Thomas Petersen [mailto:tomp at securityminded.net]
> Sent: Tuesday, October 07, 2003 8:52 PM
> To: 'Andrew Mann'
> Cc: linux-poweredge at dell.com; Salyzyn, Mark
> Subject: RE: RedHat 9 aacraid - system fails under extreme disk IO - Reproducible test case
>
>
> I am pretty disappointed in Dell for failing to follow up on this and
> resolve the issue once and for all.  This is not a new problem, but it is
> Dell's responsibility to rectify it, as they -certify- Red Hat on the 2650
> -- regardless of whether it's a hardware or software issue, Dell is
> responsible to their customers.
>
> If this were an issue on the Microsoft platform, you can bet Dell would
> have worked with Microsoft and issued a patch/update long before it became
> a widespread problem.  I have always been a huge fan of Dell equipment,
> but their failure in this instance to support what they sell is very
> troubling.
>
> Don't get me wrong, I will probably purchase Dell servers again in the
> future (though not the 2650), but can anyone name one problem of this
> magnitude, affecting the Microsoft platform on Dell hardware, that went
> unresolved for as long as this one has?  System lockups are -totally-
> unacceptable.
>
> I guess when people start choosing with their checkbooks Dell might wake
> up.
>
> Thomas Petersen
> SecurityMinded Technologies
>
>>>-----Original Message-----
>>>From: Andrew Mann [mailto:amann at mythicentertainment.com]
>>>Sent: Tuesday, October 07, 2003 6:20 PM
>>>To: linux-poweredge at dell.com
>>>Cc: mark_salyzyn at adaptec.com
>>>Subject: Re: RedHat 9 aacraid - system fails under extreme disk IO -
>>>Reproducible test case
>>>
>>>
>>>	Unfortunately we've got a good number of 2550s and
>>>2650s in use, and
>>>replacing the RAID cards isn't ideal.  Mostly we don't have
>>>enough load
>>>to cause this problem, but every now and then we do get an
>>>unexplained
>>>lockup that pulls someone out of bed at 2 AM.
>>>	I searched back through the reports of this and found
>>>some posts from
>>>Mark Salyzyn referencing AAC_NUM_FIB and AAC_NUM_IO_FIB
>>>settings.  The
>>>last comment I see is on 9/9/2003:
>>>"I am suggesting that this value be (AAC_NUM_IO_FIB+64), and
>>>limited to
>>>below 512 (the maximum number of hardware FIBS the Firmware
>>>can absorb).
>>>I will begin testing the stability and side effects of this input."
>>>	However, I don't see any followup, nor does the latest
>>>patchset to the
>>>2.4 series seem to contain any modifications in this area (or
>>>2.5 or 2.6
>>>since June 2003).
>>>	Additionally, I've just rebuilt the aacraid module here
>>>from the RedHat
>>>SRPM of 2.4.20-20.9 with AAC_NUM_FIB=512 and
>>>AAC_NUM_IO_FIB=448, rebuilt
>>>the rdimage and such and got another crash within 5 minutes
>>>of starting
>>>the test.
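>>>
>>>	(For anyone wanting to repeat this, it amounts to something like
>>>the sequence below.  The spec file name, config file name, and the
>>>header holding the FIB defines may differ on your tree, so treat it
>>>as a rough sketch rather than a recipe:)
>>>
>>>	rpm -ivh kernel-2.4.20-20.9.src.rpm
>>>	cd /usr/src/redhat/SPECS
>>>	rpmbuild -bp --target=i686 kernel-2.4.spec   # unpack and patch
>>>	cd ../BUILD/kernel-2.4.20/linux-2.4.20
>>>	cp configs/kernel-2.4.20-i686-smp.config .config
>>>	vi drivers/scsi/aacraid/aacraid.h  # AAC_NUM_FIB 512, AAC_NUM_IO_FIB 448
>>>	make oldconfig && make dep && make modules
>>>	cp drivers/scsi/aacraid/aacraid.o \
>>>	   /lib/modules/2.4.20-20.9smp/kernel/drivers/scsi/aacraid/
>>>	mkinitrd -f /boot/initrd-2.4.20-20.9smp.img 2.4.20-20.9smp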
>>>
>>>	I also see a note from Mark on 8/27/2003:
>>>-----
>>>There is code that does the following in the driver:
>>>
>>>	scsicmd->result = DID_OK << 16 | COMMAND_COMPLETE << 8
>>>| SAM_STAT_TASK_SET_FULL;
>>>	aac_io_done(scsicmd);
>>>	return -1;
>>>
>>>This is *wrong*: the non-zero return causes the system to hold the
>>>command in the queue (because of the new error handler), yet we have
>>>also completed the command as `BUSY'.  In addition, because of the
>>>constraints of the aac_io_done call, which re-takes io_request_lock,
>>>the caller had to unlock it first, leaving a hole that SMP machines
>>>fill.  By dropping the result and done calls in these situations, and
>>>holding the locks in the caller of such routines, I believe we will
>>>close this hole.
>>>
>>>....
>>>
>>>I will report back on my tests of these changes, but will need a
>>>volunteer with kernel compile experience to report on the success in
>>>resolving this issue in the field *please*.
>>>-----
>>>
>>>	I'm not familiar enough with the aacraid driver or scsi
>>>in general to
>>>gather the code changes necessary.  There also don't appear to be any
>>>followups.
>>>
>>>	Mark, do you have any updates on this?  I can make code
>>>changes,
>>>recompile, and run a test case that reliably reveals the
>>>problem here if
>>>that's helpful.
>>>
>>>
>>>I can't see the full panic message, but the parts I can see are
>>>basically (copied by hand):
>>>
>>>CPU 1
>>>EFLAGS: 00010086
>>>
>>>EIP is at rmqueue [kernel] 0x127  (2.4.20-20.9smp)
>>>eax: c0343400    ebx: c03445dc    ecx: 00000000
>>>edx: b6d7ca63    esi: 00000000    edi: c03445d0
>>>ebp: 00038000    esp: ee643e80     ds: 0068
>>>es: 0068  ss: 0068
>>>
>>>Process dd (pid: 956, stack page = ee643000)
>>>
>>>Call trace:   wakeup_kswapd   0xfb (0xee643e90)
>>>               __alloc_pages_limit   0x57
>>>               __alloc_pages        0x101
>>>               generic_file_write   0x394
>>>               ext3_file_write      0x39
>>>               sys_write            0x97
>>>               system_call          0x33
>>>
>>>	Although aacraid isn't directly implicated here, I can
>>>reproduce this
>>>on the 2550s and 2650s (aacraid) but not 1750s (megaraid).
>>>
>>>Andrew
>>>
>>>Paul Anderson wrote:
>>>
>>>> We had this same issue with our 2650s running AS 2.1.  Don't know
>>>> that this is the best answer, but it is the one that worked for us:
>>>> replace the onboard adapter with a PERC 3/DC (LSI) adapter.  Make
>>>> sure that you put it on its own bus; we used slot three.  In 2 of
>>>> our 2650s we are even running this with the HBAs for SAN
>>>> connectivity.  That said, our solution is about 2 weeks old, though
>>>> I did run similar tests on the systems after the new install for 8
>>>> days and was unable to make them crash.
>>>>
>>>> Paul
>>>>
>>>> -----Original Message-----
>>>> From: Andrew Mann [mailto:amann at mythicentertainment.com]
>>>> Sent: Tuesday, October 07, 2003 12:47 PM
>>>> To: linux-poweredge at dell.com
>>>> Cc: Matt Domsch; deanna_bonds at adaptec.com; alan at redhat.com
>>>> Subject: RedHat 9 aacraid - system fails under extreme disk IO -
>>>> Reproducible test case
>>>>
>>>>
>>>> 	This has been brought up on the Dell Linux Poweredge
>>>list previously,
>>>> but it doesn't appear that a definitive solution or reproducible
>>>> situation has been presented.  It also seems like the
>>>previous reports
>>>> involved both heavy disk IO as well as heavy network
>>>traffic, and so the
>>>> NIC driver was suspect.
>>>> 	Since we have a number of 2550s and 2650s using the
>>>onboard PERC3/Di
>>>> raid controller (aacraid driver), this issue concerns us.
>>>>
>>>> 	The following script was run with 6 instances at once
>>>on two 2550s
>>>> and
>>>> one 2650.
>>>>
>>>> 2550 configuration
>>>> 2 x P3 1.2 Ghz  kernel: 2.4.20-20.9smp #1 SMP
>>>> 1GB of ram, 2GB of swap, 2 x 18 GB drives in a raid 1 configuration
>>>>
>>>> 2650 configuration
>>>> 2 x Xeon 2.2 Ghz   kernel: 2.4.20-20.9smp #1 SMP
>>>> 2GB of ram, 2GB of swap, 2 x 18 GB drives in a raid 1 configuration
>>>> Hyperthreading enabled
>>>>
>>>>
>>>> 	The 2550s fail within 30 minutes of starting the tests
>>>each time
>>>> (tests
>>>> were run 6 times in a row).  The 2650 failed in under 2.5 days (only
>>>> 1 test run, due to the duration before failure).  In some cases the 2550
>>>> displayed a null pointer dereference in the kernel.  I'll copy down
>>>> details next time I can catch it on screen.  It does not
>>>get logged to
>>>> disk, which doesn't surprise me in this situation.  In most
>>>cases the
>>>> screen was blank (due to APM I'd guess?).
>>>> 	The systems still respond to pings, but do not respond
>>>to keyboard
>>>> actions and do not complete any tcp connections.  These
>>>systems do not
>>>> have a graphical desktop installed, and in fact have a
>>>fairly minimal
>>>> set of packages installed at all.
>>>> 	I don't know why the 2550 would consistently fail in
>>>such a brief
>>>> period while the 2650 would take a much longer time before failure.
>>>> I've been running the same tests on a 1750 (PERC4/Di -
>>>Megaraid based)
>>>> for some days now without a failure.
>>>> 	I plan on testing a non-SMP kernel on the 2550 next -
>>>not because we
>>>> can run things that way, but to maybe give some more clues.
>>>>
>>>> 	The following script creates a 300 MB file, then rm's
>>>it, then does
>>>> it
>>>> all over again.  For my tests I ran 6 of these concurrently.  Don't
>>>> expect the system to respond to much while these are
>>>running, though I
>>>> was able to get decent updates from top.
>>>> 	Alter the script as you see fit, I'm no guru with bash
>>>scripting!
>>>>
>>>> cat diskgrind.sh
>>>> #!/bin/sh
>>>>
>>>>
>>>> MEGS=300
>>>> TOTAL=0
>>>>
>>>> while [ "1" != "0" ]; do
>>>>          dd ibs=1048576 count=$MEGS if=/dev/zero of=/test/diskgrind.$$ 2>&1 | cat >/dev/null
>>>>          rm -f /test/diskgrind.$$
>>>>          TOTAL=`expr $TOTAL + $MEGS`
>>>>          echo "[$$] Completed $TOTAL megs."
>>>> done
>>>>
>>>>
>>>> ./diskgrind.sh &
>>>> ./diskgrind.sh &
>>>> ./diskgrind.sh &
>>>> ./diskgrind.sh &
>>>> ./diskgrind.sh &
>>>> ./diskgrind.sh &
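>>>>
>>>> (If the interleaved output from six background copies gets hard to
>>>> follow, each instance can just as well be sent to its own log, e.g.:)
>>>>
>>>> for i in 1 2 3 4 5 6; do
>>>>         ./diskgrind.sh > /tmp/diskgrind.$i.log 2>&1 &
>>>> done
>>>> tail -f /tmp/diskgrind.*.log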
>>>>
>>>>
>>>>
>>>> Andrew
>>>>
>>>
>>>--
>>>Andrew Mann
>>>Systems Administrator
>>>Mythic Entertainment
>>>703-934-0446 x 224
>>>
>>>_______________________________________________
>>>Linux-PowerEdge mailing list
>>>Linux-PowerEdge at dell.com
>>>http://lists.us.dell.com/mailman/listinfo/linux-poweredge
>>>
>>>
>>>Please read the FAQ at
>>>http://lists.us.dell.com/faq or search the list archives at
> http://lists.us.dell.com/htdig/


nick
--
Nick Nelson            //   USA: 1-877-586-2772 ext. 223
Systems Engineer       //   UK: 0800 0729150
nick at lunarpages.com    //   INTL: 1-714-521-8150



