strange I/O errors with SAN storage
Robert von Bismarck
robert.vonbismarck at vtx-telecom.ch
Thu Jul 31 09:56:16 CDT 2008
Hello,
Have you tried booting with only one HBA?
We have seen the same kind of errors with dual-ported SAN connections: the Linux kernel tried to access the SAN volumes itself because
PowerPath (the EMC failover software) did not load correctly after a kernel update.
We disabled one path so that we could perform the necessary maintenance, which was to install the latest release of PowerPath on the host. After a reboot and reconnecting the fiber, the system was back to its happy self again.
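A quick way to confirm this situation is to check whether the same LUN shows up under more than one sd device while nothing is managing the paths. Below is a minimal Python sketch, assuming you have already collected a unique identifier per device (for example from a tool such as scsi_id). The function name and the sample identifiers are made up for illustration and are not taken from our system.

```python
from collections import defaultdict


def group_paths_by_lun(device_ids):
    """Return LUN identifiers that are visible through more than one device.

    device_ids maps a device node (e.g. "/dev/sdb") to the unique
    identifier reported for it.
    """
    luns = defaultdict(list)
    for device, lun_id in device_ids.items():
        luns[lun_id].append(device)
    # Keep only LUNs reachable through more than one device node.
    return {lun_id: sorted(devs) for lun_id, devs in luns.items() if len(devs) > 1}


# Hypothetical identifiers, for illustration only.
sample = {
    "/dev/sdb": "lun-0001",
    "/dev/sdc": "lun-0002",
    "/dev/sdd": "lun-0001",
    "/dev/sde": "lun-0002",
}

for lun_id, devices in sorted(group_paths_by_lun(sample).items()):
    print(lun_id, "is visible as", ", ".join(devices))
```

If every LUN appears twice and the failover software is not loaded, the kernel is most likely talking to both paths directly.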
NB: we do not boot from the SAN as you do; we have a local OS installation and only the data storage on the SAN.
This was in a PE2850 with CentOS 4.5 and QLogic 2340 PCI-X adapters connected to a CLARiiON array.
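To see whether the errors in your report hit every LUN in the same way, it may help to tally them per device and per handler. The sketch below is only an illustration: the regular expression is my guess at the line format based on the messages quoted below, the sample lines are copied from that quote, and the function name is made up. As far as I know, ddf1, hpt37x, pdc and sil are RAID metadata formats that dmraid probes for on every disk, which would explain why "RAID" errors show up on SAN disks.

```python
import re
from collections import Counter

# Matches lines such as:
#   ERROR: pdc: reading /dev/sdb[Input/output error]
ERROR_RE = re.compile(r"ERROR: (\w+): reading (/dev/\w+)\[(.+?)\]")


def tally_errors(lines):
    """Count read errors per device and per handler name."""
    by_device = Counter()
    by_handler = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            handler, device, _reason = match.groups()
            by_device[device] += 1
            by_handler[handler] += 1
    return by_device, by_handler


# A few lines copied from the quoted report, quote markers included.
sample = """\
> ERROR: ddf1: reading /dev/sdb[Input/output error]
> ERROR: hpt37x: reading /dev/sdb[Input/output error]
> ERROR: pdc: reading /dev/sdb[Input/output error]
> ERROR: sil: reading /dev/sdb[Input/output error]
> ERROR: ddf1: reading /dev/sdc[Input/output error]
""".splitlines()

by_device, by_handler = tally_errors(sample)
print(dict(by_device))
print(dict(by_handler))
```

If every device fails with every handler, the probing tool itself is not the problem; the reads are failing underneath it.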
Kind regards,
Robert von Bismarck
> -----Original Message-----
> From: linux-poweredge-bounces at dell.com
> [mailto:linux-poweredge-bounces at dell.com] On behalf of
> christian.peper at kpn.com
> Sent: Thursday, 31 July 2008 11:36
> To: linux-poweredge at dell.com
> Subject: strange I/O errors with SAN storage
>
> Hi everyone,
> I hope someone can make a few suggestions as to where (what)
> to look (for). Because we're baffled and apart from creating
> new disks on the SAN, we have run out of things to check.
> Except booting from the same LUNs on a different server: no
> more hardware :( ...
>
> We've found some really strange I/O errors (PE2950, OEL/RHEL
> AS 4u5, 2x Qlogic qle2460, firmware 1.24) using LUNs on our
> DMX-3 SAN. One HBA was faulty so we replaced it. However upon
> restoring the OS and reinstalling it, more problems appeared.
> The new HBA would not boot at all using the existing disks.
> So we disabled it in the BIOS and booted from the other
> (original) HBA. Both HBAs have the same firmware, same settings.
>
> Upon booting, anything involving the disks (we boot from SAN
> and have data disks there as well) is extremely sluggish.
> Letting the server do its thing, I got a ton of I/O errors
> first during disk discovery, then again during mounting of
> file systems.
>
> ERROR: ddf1: reading /dev/sdb[Input/output error]
> ERROR: hpt37x: reading /dev/sdb[Input/output error]
> ERROR: pdc: reading /dev/sdb[Input/output error]
> ERROR: pdc: reading /dev/sdb[Input/output error]
> ERROR: pdc: reading /dev/sdb[Input/output error]
> ERROR: pdc: reading /dev/sdb[Input/output error]
> ERROR: pdc: reading /dev/sdb[Input/output error]
> ERROR: sil: reading /dev/sdb[Input/output error]
> ERROR: ddf1: reading /dev/sdc[Input/output error]
> ERROR: hpt37x: reading /dev/sdc[Input/output error]
> ERROR: pdc: reading /dev/sdc[Input/output error]
> ERROR: pdc: reading /dev/sdc[Input/output error]
> ERROR: pdc: reading /dev/sdc[Input/output error]
> ERROR: pdc: reading /dev/sdc[Input/output error]
> ERROR: pdc: reading /dev/sdc[Input/output error]
> ERROR: sil: reading /dev/sdc[Input/output error]
> ERROR: ddf1: reading /dev/sdd[Input/output error]
> ERROR: hpt37x: reading /dev/sdd[Input/output error]
> ERROR: pdc: reading /dev/sdd[Input/output error]
> ERROR: pdc: reading /dev/sdd[Input/output error]
> ERROR: pdc: reading /dev/sdd[Input/output error]
> ERROR: pdc: reading /dev/sdd[Input/output error]
> ERROR: pdc: reading /dev/sdd[Input/output error]
> ERROR: sil: reading /dev/sdd[Input/output error] ...
> and so on for all disks (LUNs) attached.
>
> Searching the web gave me a few hits but no solutions (see 1|2|3).
> However, all errors were related to local RAID setups using
> ATA/SATA disks. I am not using local RAID. We have Dell
> PowerEdge 2950 servers with 2 qle2460 HBAs. The internal
> PERC5/i is enabled as it provides the swap disk space, but it
> doesn't do anything. Furthermore, sdb, sdc and so on are SAN
> disks. So why do I get RAID errors from them? Could this
> point to motherboard errors? PCI bus errors? Broken FC
> cables? Bad FC switch configuration or simply damaged LUNs
> from the SAN?
>
> I'm keeping a blog of this updated with anything new I run into...
> http://breakablelinux.blogspot.com/2008/07/strange-io-errors-with-san.html
>
> thanks in advance,
> Chris.
>