kernel: aacraid: Host adapter reset request. SCSI hang ?

Javier Rodriguez jlr at jlrconsulting.com
Wed Aug 27 08:20:01 CDT 2003


Hi,

Thank you for the feedback. For us, re-enabling hyperthreading on the 2650s
causes the problem to return, so at least we know that it is playing a role
in the SCSI hang problem. As for high sustained I/O, with hyperthreading
enabled, we've encountered the problem with both low and high I/O rates.
With hyperthreading disabled, we can now sustain high I/O loads without a
problem.
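
For anyone comparing notes across machines, it can help to confirm from the
running system whether hyperthreading is really on, rather than relying on
memory of the BIOS setting. The sketch below is only an illustration: it
assumes /proc/cpuinfo exposes both a "siblings" and a "cpu cores" field, which
newer kernels do but the 2.4 kernels discussed in this thread may not. The
function name hyperthreading_active is made up for this example.

```python
def hyperthreading_active(cpuinfo_text):
    """Guess hyperthreading state from the text of /proc/cpuinfo.

    Returns True if any processor entry reports more siblings than cores,
    False if every entry with both fields reports siblings <= cores, and
    None if no entry carries both fields (assumed to happen on old kernels).
    """
    seen_both_fields = False
    # Processor entries in /proc/cpuinfo are separated by blank lines.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        try:
            siblings = int(fields["siblings"])
            cores = int(fields["cpu cores"])
        except (KeyError, ValueError):
            continue  # this entry cannot tell us anything
        seen_both_fields = True
        if siblings > cores:
            return True
    return False if seen_both_fields else None
```

On a live system this would be called with the contents of /proc/cpuinfo,
for example hyperthreading_active(open("/proc/cpuinfo").read()). A result of
None means the kernel did not report enough to decide either way.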

Thanks,
Jav

-----Original Message-----
From: linux-aacraid-devel-admin at dell.com
[mailto:linux-aacraid-devel-admin at dell.com] On Behalf Of Stefano Turolla
Sent: Wednesday, August 27, 2003 6:27 AM
To: Javier Rodriguez
Cc: linux-aacraid-devel at dell.com; linux-poweredge at dell.com
Subject: RE: kernel: aacraid: Host adapter reset request. SCSI hang ?


Hello,
we have the same problem with PowerEdge 1650 and 2650.
We tried different kernel versions and Red Hat releases (7.3 and 9). Kernels
tried:
2.4.18-18.7.x
2.4.18-24.7.x
2.4.18-26.7.x
2.4.20-13.7
2.4.20-19.7
2.4.21.ac2-rc2
As a workaround I removed the RAID controller from some 1650s and reinstalled
the machines with only the SCSI interface connected. We have not had any more
crashes in the last month!

Besides, for most of our machines (1650) disabling hyperthreading makes no
sense, as they have one CPU (Pentium III from 1.4 to 1.7 GHz) with no
hyperthreading, of course. On the other hand, we have other 2650s that have
been running for two or three months without problems, some of them with
hyperthreading disabled. A couple of other 2650s had several crashes when
they were used as FTP servers. I don't know what it really means, but it seems
to be something not really related to hyperthreading, but only to high
sustained I/O.
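
To line up crash times against I/O load on different machines, one option is
to pull the reset events out of the message log automatically. The following
is a rough sketch, not a tested tool: it assumes syslog lines in the same
format as the excerpt quoted further down in this thread, with each entry on
a single unwrapped line, and the function name summarize is invented for the
example.

```python
import re
from collections import Counter

# Patterns taken from the log excerpt quoted in this thread.
RESET_RE = re.compile(r"aacraid: Host adapter reset request")
IO_ERROR_RE = re.compile(r"I/O error: dev ([^,\s]+), sector (\d+)")


def summarize(lines):
    """Return (reset_timestamps, io_errors_per_device) for the given log lines."""
    resets = []
    errors = Counter()
    for line in lines:
        if RESET_RE.search(line):
            # Classic syslog timestamps ("May 31 16:14:07") are 15 characters.
            resets.append(line[:15])
        match = IO_ERROR_RE.search(line)
        if match:
            errors[match.group(1)] += 1
    return resets, errors
```

Feeding it the lines of /var/log/messages (the usual location on Red Hat
systems) would give a list of reset timestamps and a per-device count of the
I/O errors that followed, which could then be compared with load figures for
the same period.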

On Fri, 2003-08-22 at 13:07, Javier Rodriguez wrote:
> Hello,
>  
> Does anyone developing the aacraid driver have an update regarding the 
> problem below? Disabling HyperThreading (Logical Processor) within the 
> Dell 2650 BIOS has without a doubt circumvented the problem for us (as 
> well as a few others), but it would be nice to reenable the feature.
>  
> For reference, with HyperThreading disabled, we've been able to 
> successfully execute Red Hat's distribution of Linux kernel-2.4.20-9, 
> kernel-smp-2.4.20-9, kernel-2.4.20-13.9, kernel-smp-2.4.20-13.9, 
> kernel-2.4.20-18.9 and kernel-smp-2.4.20-18.9. We currently have two 
> Dell 2650s executing kernel-smp-2.4.20-18.9 for 70 days without 
> incident. Prior to disabling HyperThreading, our systems would 
> normally crash within 24 hours (no longer than 48 hours) with both the 
> smp and non-smp version of the kernel.
>  
> Thanks,
> Javier
>         -----Original Message-----
>         From: linux-aacraid-devel-admin at dell.com
>         [mailto:linux-aacraid-devel-admin at dell.com] On Behalf Of
>         Javier Rodriguez
>         Sent: Saturday, May 31, 2003 7:43 PM
>         To: linux-aacraid-devel at dell.com
>         Subject: kernel: aacraid: Host adapter reset request. SCSI
>         hang ?
>         
>         
>         Hello,
>          
>         We recently purchased two Dell PowerEdge 2650 servers with
>         PERC3/Di controllers. Both servers are executing RedHat Linux
>         9.0. On both servers we are encountering the following error:
>          
>         <<< Portion of server message log >>>
>         May 31 16:14:07 server1 kernel: aacraid: Host adapter reset request. SCSI hang ?
>         May 31 16:14:17 server1 kernel: scsi: device set offline - command error recover failed: host 0 channel 0 id 0 lun 0
>         May 31 16:14:17 server1 kernel: SCSI disk error : host 0 channel 0 id 0 lun 0 return code = 6000000
>         May 31 16:14:17 server1 kernel:  I/O error: dev 08:03, sector 83200
>         May 31 16:14:17 server1 kernel:  I/O error: dev 08:03, sector 13568
>         May 31 16:14:17 server1 kernel:  I/O error: dev 08:03, sector 13616
>         May 31 16:14:17 server1 kernel:  I/O error: dev 08:03, sector 83200
>         May 31 16:14:17 server1 kernel:  I/O error: dev 08:03, sector 22030904
>         May 31 16:14:17 server1 kernel:  I/O error: dev 08:03, sector 88348712
>         May 31 16:14:17 server1 kernel:  I/O error: dev 08:03, sector 72976
>         May 31 16:14:17 server1 kernel:  I/O error: dev 08:03, sector 13624
>         May 31 16:14:17 server1 kernel:  I/O error: dev 08:03, sector 13752
>         May 31 16:14:17 server1 kernel:  I/O error: dev 08:03, sector 13768
>         May 31 16:14:17 server1 kernel:  I/O error: dev 08:03, sector 72976
>         <<< I/O error messages continue until the server is rebooted >>>
>          
>          
>         Here are a few notes regarding the error and operating
>         environment:
>          
>         - The error occurs with RedHat's kernel RPMs
>         kernel-smp-2.4.20-9 and kernel-smp-2.4.20-13.9. As of today,
>         we are testing kernel-2.4.20-9 to determine if the problem
>         occurs under a non-smp environment.
>         - The time between failures varies from several hours to
>         several days.
>         - The failures occur both during light and heavy system loads.
>         - PowerEdge 2650 BIOS is at 1.10 A10
>         - Backplane firmware is at 1.01
>         - PERC3/Di BIOS is at V2.7-1 (build 3170)
>         - Full system diagnostics have been successfully executed on
>         both servers.
>         - The RAID media has been successfully 'verified' on both
>         servers.
>          
>         Thank you in advance for your assistance in helping to get
>         this problem resolved.
>          
>         Javier
>          
>          
>         JLR Consulting, PO Box 638, Bernville, PA 19506-0638
>         mailto:jlr at jlrconsulting.com
>          
-- 
+------+---------+--------+--------+--------+---------+--------+-------+
| Stefano Turolla                             Phone : +49 89 32006537  |
| UNIX System Manager                         Fax   : +49 89 32006380  |
| European Southern Observatory (ESO):        E-Mail: sturolla at eso.org |
| Karl-Schwarzschild-strasse 2 D-85748 Garching bei Muenchen           |
+------+---------+--------+--------+--------+---------+--------+-------+
Computers are like airconditioners ,
they stop working properly if you open WINDOWS

