strange I/O errors with SAN storage
Robert von Bismarck
robert.vonbismarck at vtx-telecom.ch
Mon Aug 4 04:35:17 CDT 2008
Sijis,
Yes, we have to reinstall PP after every kernel update. That is simply part of our update cycle now: we know that the PP-equipped boxes need an hour of work for a kernel update instead of 5 minutes :)
MPIO from the Device-Mapper seems like the way to go, but I still need to find some time to test it in our production environment.
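
A missed reinstall tends to show up as the failover module simply not being loaded after the reboot, so a quick post-update check could look like the sketch below. It is only an illustration: the helper names are made up for this example, and the module name "emcp" is an assumption that should be confirmed with lsmod on a host where PowerPath is known to work.

```python
from pathlib import Path


def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text.

    Each non-empty line of /proc/modules starts with the module name.
    """
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def module_is_loaded(name, proc_modules_text):
    """True if the named kernel module appears in the given text."""
    return name in loaded_modules(proc_modules_text)


def read_proc_modules(path="/proc/modules"):
    """Read the live module list; return an empty string if unavailable."""
    p = Path(path)
    return p.read_text() if p.exists() else ""
```

After a kernel update and reboot, `module_is_loaded("emcp", read_proc_modules())` returning False would be a hint that the reinstall is still pending, again assuming "emcp" is the right name on your systems.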
Cheers,
RvB
> -----Original Message-----
> From: linux-poweredge-bounces at dell.com
> [mailto:linux-poweredge-bounces at dell.com] On Behalf Of Sijis Aviles
> Sent: Thursday, 31 July 2008 21:08
> To: linux-poweredge at dell.com
> Subject: RE: strange I/O errors with SAN storage
>
> Robert,
>
> Did you have to reinstall PowerPath after a kernel
> upgrade? We are using PP 4.5.1 and every time we upgrade the
> kernel, I have to reinstall the software.
>
> Chris,
>
> I've seen similar errors on my RHEL4 servers when the path
> to the SAN dies (HBA, cable, switch port). From looking at
> your errors, it seems like it could be a connectivity issue.
> Try changing the FC cable, as it might be bad, or see if
> there are any errors on the switch.
> Does dmesg or /var/log/messages show anything useful?
>
> We are using PE2950's and we use local disk for OS and SAN
> for storage.
>
> As a note, in our documentation, this is our config for the
> HBA to connect to our EMC Symmetrix:
> a. Host Adapter BIOS = Enabled
> b. Connection Options = 1 (Point to point only)
> c. Data Rate = 1 (2 Gb/s)
> Everything else is left at the defaults.
>
> If I think of anything else, I'll pass it along.
>
> Sijis
> --------------
> Sijis Aviles | Systems Administrator | Empire Today, LLC
> -----Original Message-----
> From: linux-poweredge-bounces at dell.com
> [mailto:linux-poweredge-bounces at dell.com] On Behalf Of
> linux-poweredge-request at dell.com
> Sent: Thursday, July 31, 2008 12:00 PM
> To: linux-poweredge at dell.com
> Subject: Linux-PowerEdge Digest, Vol 47, Issue 55
>
>
> Today's Topics:
>
> 1. RE: Help with md3000i (Nick_Parrott at Dell.com)
> 2. RE: Help with md3000i (Harald_Jensas at Dell.com)
> 3. OMSA 5.4 Centos4.6: WARNING: srvadmin-storage configuration
> not performed; '/etc/omreg.cfg' is missing or damaged
> (Robert Hart)
> 4. RE: strange I/O errors with SAN storage (Robert von Bismarck)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Thu, 31 Jul 2008 10:47:34 +0100
> From: <Nick_Parrott at Dell.com>
> Subject: RE: Help with md3000i
> To: <Sidney.Young at globalx.com.au>, <matthew at acfr.usyd.edu.au>
> Cc: linux-poweredge at lists.us.dell.com
> Message-ID:
>
> <2AB12D8DC5E4564DB294A4C66F0C68F1DD677D at DUBX3M11.dub.emea.dell.com>
> Content-Type: text/plain; charset="us-ascii"
>
> Hi Guys,
>
> Close, but not close enough. Some MD3000s were shipped with
> the serial cable, some were not. I'm not sure of the reasons
> why; I just know that it's 50/50 as to whether a customer
> calling in has one.
>
> You can do a full reset (this needs a Dell tech to provide
> the syntax/credentials for the reset command) or you can
> simply look at the IP's of the management ports with a
> similar command to ifconfig, then use the Java software to
> fire in on that IP.
>
> I'm checking out the availability of this cable, as far as
> I'm aware it's not "for sale" but can be provided with a
> service call if there is another reason for us to be out to the unit.
>
> Regards,
>
> Nick
>
> -----Original Message-----
> From: linux-poweredge-bounces at dell.com
> [mailto:linux-poweredge-bounces at dell.com] On Behalf Of Sid Young
> Sent: 31 July 2008 04:51
> To: Matthew Geier
> Cc: linux-poweredge-Lists
> Subject: RE: Help with md3000i
>
>
> Yes I suspect that is the case, the serial cable activates
> something, however I am told there is a password that needs
> to be applied and instructions from a level 2/3 tech to
> resolve access.
>
> Could be something really simple like "wipe config all<enter>" ;)
>
> Sid
>
> -----Original Message-----
> From: Matthew Geier [mailto:matthew at acfr.usyd.edu.au]
> Sent: Thursday, July 31, 2008 12:15 PM
> To: Scott R. Ehrlich
> Cc: Sid Young; Linux-PowerEdge at dell.com
> Subject: Re: Help with md3000i
>
> Scott R. Ehrlich wrote:
> > Can you put it on an isolated LAN or use a crossover cable between it
> > and a PC, and use a network sniffer to monitor IP addresses? If so,
> > you should be able to get the IP from it through the sniffer.
> >
> > As for the unlock kit, I'd like to learn more about that, too -
> > exactly what it does, what it costs, etc.
> >
> >
> What's the bet the 'unlock kit' is the serial console cable :-)
>
> My MD3000i came with a serial cable which when plugged in
> gives you access to a service console. You can do all sorts
> of scary stuff from the service console - including
> determining the IP address of the controller :-)
>
>
>
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> http://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
>
>
>
> ------------------------------
>
> Message: 2
> Date: Thu, 31 Jul 2008 11:52:41 +0200
> From: <Harald_Jensas at Dell.com>
> Subject: RE: Help with md3000i
> To: <matthew at acfr.usyd.edu.au>, <johnh at comp.leeds.ac.uk>
> Cc: linux-poweredge at lists.us.dell.com
> Message-ID:
>
> <87C820D35C176D428A1A8A3B34F6FB86BACE06 at uppx3m1.upp.emea.dell.com>
> Content-Type: text/plain; charset="US-ASCII"
>
> > -----Original Message-----
> > From: linux-poweredge-bounces at dell.com [mailto:linux-poweredge-
> > bounces at dell.com] On Behalf Of Matthew Geier
> > Sent: 31 July 2008 04:08
> > To: John Hodrien
> > Cc: linux-poweredge-Lists
> > Subject: Re: Help with md3000i
> >
> > John Hodrien wrote:
> > > On Tue, 29 Jul 2008, Matthew Geier wrote:
> > >
> > >
> > >> There is a Windows app that somehow finds what IP address the
> > >> MD3000i has given itself. If you don't have control of your DHCP
> > >> server and thus the IP address it gets, this might be the only
> > >> easy way to find out what IP it got.
> > >>
> > >
> > > Does it not respond to a broadcast ping?
> > >
> >
> > It does, but so does a lot of other stuff, so it really doesn't help
> > a great deal unless your network is small, and if it's small you
> > probably can see what your DHCP server is up to if you have one.
>
> The MD3000i Configuration Utility has an option to
> Automatically detect MD3000i arrays in the subnet. It will
> take quite some time in a large network, but it should work.
> The utility is available on version 1.4 or later of the
> MD3000i resource CD.
>
>
> 1. Start MD3000i Configuration Utility. (Run 'MDconfig.sh' if
> you are on Linux.)
> 2. Select "Configure MD3000i" and click Next.
> 3. To discover available storage arrays, choose "Discover New
> Arrays" and click Next.
> 4. To perform an automatic discovery of storage arrays within
> the local subnet, choose "Automatic" and click Next.
> 5. Follow the wizard through to configure your MD3000i.
>
>
>
> --
> Harald
>
>
>
>
>
>
>
>
>
> ------------------------------
>
> Message: 3
> Date: Thu, 31 Jul 2008 10:02:52 -0400
> From: "Robert Hart" <meteobobdell at gmail.com>
> Subject: OMSA 5.4 Centos4.6: WARNING: srvadmin-storage configuration
> not performed; '/etc/omreg.cfg' is missing or damaged
> To: linux-poweredge at dell.com
> Message-ID:
> <c6551ca10807310702m399be5ccue5b0472ecea26787 at mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> I have searched through every prior post and via Google, and
> cannot find an answer to this. I apologize if it has been
> answered and I missed it.
>
> I have successfully run OMSA on seven different PEs. They have worked
> wonderfully. However, yesterday I got two more: a PE2900 and a PET300.
> Unfortunately, I cannot get OMSA to work on these. They all
> run CentOS 4.6, including the machines on which OMSA works.
>
> The only difference with the machines that don't work is that
> they are new hardware (PE2900 and PET300 for first time) and
> I believe the yum install is from OMSA 5.4 directly. I
> believe prior installs were 5.3 which then upgraded to 5.4 with yum.
>
> The specific error I am seeing on both machines is:
>
> WARNING: srvadmin-storage configuration not performed;
> '/etc/omreg.cfg' is missing or damaged
>
> [Yet of course the file in etc is there with no obvious
> errors when comparing to "good" installs].
>
> After srvadmin-services start, I cannot even get
> https:...1311 to bring up the web page.
>
> Is the problem that I went with 5.4 from the start? If so, is there
> an easy way to have the yum repo use 5.3 to start?
>
> Thanks again for your help.
>
> Bob
>
> ------------------------------
>
> Message: 4
> Date: Thu, 31 Jul 2008 16:56:16 +0200
> From: "Robert von Bismarck" <robert.vonbismarck at vtx-telecom.ch>
> Subject: RE: strange I/O errors with SAN storage
> To: <christian.peper at kpn.com>, <linux-poweredge at dell.com>
> Message-ID:
> <07C3015B9E21F949AB59B28F2D5B567F01EBC0C9 at exch-pul-01.interne.smart-telecom.ch>
>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hello,
>
> Have you tried booting with only one HBA ?
>
> We have seen the same kind of errors with dual-ported SAN
> connections: the Linux kernel tried to access the SAN
> volumes without PowerPath (the EMC failover software), which
> did not load correctly after a kernel update.
> We disabled one path so that we could perform the necessary
> maintenance, which was to get the latest release of PowerPath
> and install it on the host. Reboot, reconnect the fiber, and
> the system was back to being its happy self again.
> NB: we do not boot from the SAN as you do; we have a local OS
> installation and data storage on the SAN.
>
> This was in a PE2850 with CentOS 4.5 and QLogic 2340 PCI-X
> adapters connected to a Clariion array.
>
> Kind regards,
>
> Robert von Bismarck
>
>
>
>
> > -----Original Message-----
> > From: linux-poweredge-bounces at dell.com
> > [mailto:linux-poweredge-bounces at dell.com] On Behalf Of
> > christian.peper at kpn.com
> > Sent: Thursday, 31 July 2008 11:36
> > To: linux-poweredge at dell.com
> > Subject: strange I/O errors with SAN storage
> >
> > Hi everyone,
> > I hope someone can make a few suggestions as to where (what) to look
> > (for), because we're baffled and, apart from creating new disks on the
> > SAN, we have run out of things to check.
> > Except booting from the same LUNs on a different server: no more
> > hardware :( ...
> >
> > We've found some really strange I/O errors (PE2950, OEL/RHEL AS 4u5,
> > 2x Qlogic qle2460, firmware 1.24) using LUNs on our
> > DMX-3 SAN. One HBA was faulty so we replaced it. However, upon
> > restoring the OS and reinstalling it, more problems appeared.
> > The new HBA would not boot at all using the existing disks.
> > So we disabled it in the BIOS and booted from the other
> > (original) HBA. Both HBAs have the same firmware, same settings.
> >
> > Upon booting, anything involving the disks (we boot from SAN and have
> > data disks there as well) is extremely sluggish.
> > Letting the server do its thing, I got a ton of I/O errors first
> > during disk discovery, then again during mounting of file systems.
> >
> > ERROR: ddf1: reading /dev/sdb[Input/output error]
> > ERROR: hpt37x: reading /dev/sdb[Input/output error]
> > ERROR: pdc: reading /dev/sdb[Input/output error]
> > ERROR: pdc: reading /dev/sdb[Input/output error]
> > ERROR: pdc: reading /dev/sdb[Input/output error]
> > ERROR: pdc: reading /dev/sdb[Input/output error]
> > ERROR: pdc: reading /dev/sdb[Input/output error]
> > ERROR: sil: reading /dev/sdb[Input/output error]
> > ERROR: ddf1: reading /dev/sdc[Input/output error]
> > ERROR: hpt37x: reading /dev/sdc[Input/output error]
> > ERROR: pdc: reading /dev/sdc[Input/output error]
> > ERROR: pdc: reading /dev/sdc[Input/output error]
> > ERROR: pdc: reading /dev/sdc[Input/output error]
> > ERROR: pdc: reading /dev/sdc[Input/output error]
> > ERROR: pdc: reading /dev/sdc[Input/output error]
> > ERROR: sil: reading /dev/sdc[Input/output error]
> > ERROR: ddf1: reading /dev/sdd[Input/output error]
> > ERROR: hpt37x: reading /dev/sdd[Input/output error]
> > ERROR: pdc: reading /dev/sdd[Input/output error]
> > ERROR: pdc: reading /dev/sdd[Input/output error]
> > ERROR: pdc: reading /dev/sdd[Input/output error]
> > ERROR: pdc: reading /dev/sdd[Input/output error]
> > ERROR: pdc: reading /dev/sdd[Input/output error]
> > ERROR: sil: reading /dev/sdd[Input/output error] ...
> > and so on for all disks (LUNs) attached.
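
To see at a glance which devices fail and under which prefixes, the error lines above can be tallied with a short script. This is a rough sketch, not something used in the thread: the function name is made up, and reading the prefixes (ddf1, hpt37x, pdc, sil) as RAID metadata format probes is an assumption.

```python
import re
from collections import defaultdict

# Matches lines such as: ERROR: pdc: reading /dev/sdb[Input/output error]
# Leading quote markers ("> > ") are tolerated because search() is used.
ERROR_LINE = re.compile(
    r"ERROR:\s+(?P<handler>\w+):\s+reading\s+(?P<device>/dev/\w+)"
    r"\[(?P<reason>[^\]]+)\]"
)


def summarize(lines):
    """Map each device to the set of handler prefixes that failed on it."""
    failures = defaultdict(set)
    for line in lines:
        match = ERROR_LINE.search(line)
        if match:
            failures[match.group("device")].add(match.group("handler"))
    return dict(failures)
```

If every prefix fails on every device, as in the excerpt, that would suggest the reads themselves are failing on the path to the LUNs rather than one particular format check misbehaving.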
> >
> > Searching the web gave me a few hits but no solutions (see 1|2|3).
> > However, all errors were related to local RAID setups using ATA/SATA
> > disks. I am not using local RAID. We have Dell PowerEdge 2950 servers
> > with 2 qle2460 HBAs. The internal PERC5/i is enabled as it provides
> > the swap disk space, but it doesn't do anything. Furthermore, sdb, sdc
> > and so on are SAN disks. So why do I get RAID errors from them? Could
> > this point to motherboard errors? PCI bus errors? Broken FC cables?
> > Bad FC switch configuration or simply damaged LUNs from the SAN?
> >
> > I'm keeping a blog of this updated with anything new I run into...
> > http://breakablelinux.blogspot.com/2008/07/strange-io-errors-with-san.html
> >
> > thanks in advance,
> > Chris.
> >
> >
>
>
>
> ------------------------------
>
>
> End of Linux-PowerEdge Digest, Vol 47, Issue 55
> ***********************************************
>
>
>
>