PowerEdge SC1435: problem with SATA disks

Davide Ferrari davide.ferrari at atrapalo.com
Fri Nov 23 04:22:49 CST 2007


Hi

We have two PowerEdge SC1435 servers that are giving us a lot of trouble with 
their SATA disks. From time to time, under disk activity, a disk simply 
"disconnects" from the SATA channel and the processes accessing it freeze for 
60-120 seconds. Here is the relevant kernel output:

Nov 22 00:51:03 static2 ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x40000000 action 0x2 frozen
Nov 22 00:51:03 static2 ata1.00: cmd ca/00:08:00:61:1f/00:00:00:00:00/e0 tag 0 cdb 0x0 data 4096 out
Nov 22 00:52:14 static2 ata1: port is slow to respond, please be patient (Status 0xd0)
Nov 22 00:52:14 static2 ata1: port failed to respond (30 secs, Status 0xd0)
Nov 22 00:52:14 static2 ata1: soft resetting port
Nov 22 00:52:14 static2 ata1: port is slow to respond, please be patient (Status 0xd0)
Nov 22 00:52:14 static2 ata1: port failed to respond (30 secs, Status 0xd0)
Nov 22 00:52:14 static2 ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Nov 22 00:52:14 static2 ata1.00: revalidation failed (errno=-2)
Nov 22 00:52:14 static2 ata1: failed to recover some devices, retrying in 5 secs
Nov 22 00:52:14 static2 ata1: hard resetting port
Nov 22 00:52:14 static2 ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Nov 22 00:52:14 static2 ata1.00: configured for UDMA/100
Nov 22 00:52:14 static2 ata1: EH complete
Nov 22 00:52:17 static2 ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x40000000 action 0x2 frozen
Nov 22 00:52:17 static2 ata1.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 cdb 0x0 data 512 in
Nov 22 00:52:19 static2 ata1: soft resetting port
Nov 22 00:52:19 static2 ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Nov 22 00:52:19 static2 ata1.00: configured for UDMA/100
Nov 22 00:52:19 static2 ata1: EH complete
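
To relate the log to the observed freezes, the time between each "exception" 
line and the following "EH complete" on the same port can be measured. The 
following is only a rough Python sketch: the function name stall_durations is 
hypothetical, the year 2007 is an assumption (syslog timestamps carry no year), 
and the sample lines are a subset of the output above with wrapped lines joined.

```python
import re
from datetime import datetime

# Subset of the kernel output quoted above (wrapped lines joined).
LOG = """\
Nov 22 00:51:03 static2 ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x40000000 action 0x2 frozen
Nov 22 00:52:14 static2 ata1: soft resetting port
Nov 22 00:52:14 static2 ata1: hard resetting port
Nov 22 00:52:14 static2 ata1: EH complete
Nov 22 00:52:17 static2 ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x40000000 action 0x2 frozen
Nov 22 00:52:19 static2 ata1: soft resetting port
Nov 22 00:52:19 static2 ata1: EH complete
"""

# timestamp, hostname, ataN (optional .NN device suffix), message
LINE_RE = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ (ata\d+)(?:\.\d+)?: (.*)$"
)

def stall_durations(text, year=2007):
    """Return (port, seconds) for each exception -> 'EH complete' pair.

    The year is an assumption; it only matters for parsing, not for
    the differences computed here.
    """
    started = {}  # port -> time of the first exception not yet recovered
    stalls = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        stamp, port, msg = m.groups()
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        msg = msg.strip()
        if msg.startswith("exception"):
            started.setdefault(port, when)
        elif msg == "EH complete" and port in started:
            delta = when - started.pop(port)
            stalls.append((port, int(delta.total_seconds())))
    return stalls

for port, seconds in stall_durations(LOG):
    print(f"{port}: stalled for about {seconds} s")
```

On the sample lines this reports one stall of 71 seconds and one of 2 seconds on 
ata1. The first figure falls within the 60-120 second freezes described above. 
Several messages share the same timestamp, so the figures are only as precise 
as the syslog timestamps themselves.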

And here is some information about the server hardware and configuration:

lspci output:
00:01.0 PCI bridge: Broadcom HT1000 PCI/PCI-X bridge
00:02.0 Host bridge: Broadcom HT1000 Legacy South Bridge
00:02.1 IDE interface: Broadcom HT1000 Legacy IDE controller
00:02.2 ISA bridge: Broadcom HT1000 LPC Bridge
00:03.0 USB Controller: Broadcom HT1000 USB Controller (rev 01)
00:03.1 USB Controller: Broadcom HT1000 USB Controller (rev 01)
00:03.2 USB Controller: Broadcom HT1000 USB Controller (rev 01)
00:04.0 VGA compatible controller: ATI Technologies Inc ES1000 (rev 02)
00:07.0 PCI bridge: Broadcom Unknown device 0140 (rev a2)
00:08.0 PCI bridge: Broadcom Unknown device 0142 (rev a2)
00:09.0 PCI bridge: Broadcom Unknown device 0144 (rev a2)
00:0a.0 PCI bridge: Broadcom Unknown device 0142 (rev a2)
00:0b.0 PCI bridge: Broadcom Unknown device 0144 (rev a2)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
00:19.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:19.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:19.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:19.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet PCI Express (rev 21)
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet PCI Express (rev 21)
03:0d.0 PCI bridge: Broadcom HT1000 PCI/PCI-X bridge (rev c0)
03:0e.0 IDE interface: Broadcom BCM5785 (HT1000) PATA/IDE Mode


bios version:
 RN50 A21 BIOS

uname -a:

Linux static2.atrapalo.com 2.6.20-gentoo-r8 #3 SMP Wed May 23 11:16:09 CEST 2007 x86_64 Dual-Core AMD Opteron(tm) Processor 2212 AuthenticAMD GNU/Linux

We have already tried:

- replacing the disks
- appending the "pci=noacpi" parameter to the kernel command line

but the problem persists.
I was wondering about this line:
03:0e.0 IDE interface: Broadcom BCM5785 (HT1000) PATA/IDE Mode

Does this really mean that the disk is running in PATA compatibility mode? We 
did not change anything in the BIOS; everything is at factory defaults.
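
One way to approach this question is to check which kernel driver is bound to 
the controller at 03:0e.0. The following is a minimal Python sketch, assuming 
the usual sysfs layout (a "driver" symlink under /sys/bus/pci/devices/) and a 
PCI domain of 0000, which the lspci output above does not show. The function 
name bound_driver is hypothetical.

```python
import os

def bound_driver(pci_address, sysfs_root="/sys/bus/pci/devices"):
    """Return the name of the kernel driver bound to a PCI device, or None.

    Assumes sysfs exposes <sysfs_root>/<address>/driver as a symlink to
    the driver directory, which is the usual layout on Linux.
    """
    link = os.path.join(sysfs_root, pci_address, "driver")
    if not os.path.islink(link):
        return None  # device absent, or no driver bound to it
    return os.path.basename(os.readlink(link))

# Address taken from the lspci line above; the 0000 domain is assumed.
print(bound_driver("0000:03:0e.0"))
```

The name printed shows whether a SATA driver or a PATA/IDE driver claimed the 
controller. It does not by itself say which mode the BIOS has configured.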

Thanks in advance

-- 
Davide Ferrari
System Administrator
Atrapalo.com
