From ksuehring at web.de Sun Jan 2 07:52:28 2011
From: ksuehring at web.de (Karsten Suehring)
Date: Sun, 02 Jan 2011 14:52:28 +0100
Subject: mptlinux for newer kernel
In-Reply-To: <4CC1CCA2.5030006@web.de>
References: <4CC1CCA2.5030006@web.de>
Message-ID: <4D20831C.1080805@web.de>
Just as information for everybody else who might have issues with the
mptlinux kernel module:
LSI has recently released version 4.24 (although the release date still
says 02-JUN-10) of the driver. The dkms package compiles without
problems on Ubuntu 10.04 and 10.10.
Best regards,
Karsten
On 22.10.2010 19:40, Karsten Suehring wrote:
> Hi,
>
> I'm using SAS controllers (SAS 6/iR and SAS 5/E) in some of my machines which use the mptlinux
> driver.
>
> It seems that even recent kernels still come with version 3.04.15 while OMSA complains that the
> minimum supported version is 3.12.29.00.
>
> I usually installed version 4.18 from LSI via dkms on my Ubuntu machines. Unfortunately even
> the newest release 4.22 does not compile with a newer kernel, e.g. 2.6.35 that comes with
> Ubuntu 10.10.
>
> I found a partial patch on the kernel.org bugzilla, but it does not resolve all problems:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=16010
>
> Are there any kernel hackers or LSI developers here that could help with that?
>
> Does anybody know if there is a new version of the driver planned?
>
> I'm attaching the build log.
>
> Best regards,
> Karsten
>
>
>
>
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
From matthew.garman at gmail.com Sun Jan 2 17:30:07 2011
From: matthew.garman at gmail.com (Matt Garman)
Date: Sun, 2 Jan 2011 17:30:07 -0600
Subject: spontaneous reboots on R610 w/CentOS 4.7
In-Reply-To: <0A6B7C731722FD46B7FDC5CBF3A2FF509F36DC48@AUSX7MCPC102.AMER.DELL.COM>
References:
<0A6B7C731722FD46B7FDC5CBF3A2FF509F36DC48@AUSX7MCPC102.AMER.DELL.COM>
Message-ID:
On Thu, Dec 30, 2010 at 1:51 PM, wrote:
> It seems folly trying to stabilize such an old OS version. ?If I'm reading this correctly -- Centos 4.7. ?R610 is modern hardware, that shouldn't be your problem. ?Such an old OS version could well be your problem. ?Or at least a contributor.
>
> Even relatively modern Linux versions have had NIC driver bugs that trigger under heavy network load. ?In 5.2, Intel NICs had the theoretical TSO bug. ?In 5.4, Broadcom NICs had the MSI-X bug. ?Both bugs triggered only under high network load.
>
> 1. Why don't you start w/ Centos 5.5, at a min? ?Or if you go w/ Centos 5.4, put in enable_msi=0 option in your /etc/modprobe.conf file for your bnx2 driver.
The server is in a remote facility, so installing a new OS is tricky.
I'm looking into creating a custom CentOS install ISO that I can use
over IPMI, or maybe doing a pxeboot-based install. But I've never
done either, so I have some learning to do before I get there.
[snip]
It appears that it's not just network load. I was trying to simply
update to CentOS 4.8, and got it to spontaneously reboot three times:
twice during heavy disk activity, and once during the actual booting
of the OS. In other words, the reboots also correlate with heavy disk
I/O.
And FWIW, when I initially posted the message, when I said "heavy
network load", what I was doing was backing up huge files. So while
there was a high network load, there was implicitly decent disk load
(enough to saturate a gigabit connection anyway).
Anyway, after relaying this information to the tech support person
with whom I was working, he had me update to the latest raid card
driver. I did that, but continued to experience reboots. At this
point, the support individual agreed to send a tech out and have the
RAID card replaced under warranty.
> BTW, even though my email address says "@dell.com", I'm not in any department providing official support to customers. ?So the steps outlined above are not any official communiqu? from Dell, they are just one [seasoned & scarred :-) ] Linux sysadmin speaking informally to another. ?Capeesh?
Understood, I appreciate any help!
From matthew.garman at gmail.com Sun Jan 2 17:37:53 2011
From: matthew.garman at gmail.com (Matt Garman)
Date: Sun, 2 Jan 2011 17:37:53 -0600
Subject: spontaneous reboots on R610 w/CentOS 4.7
In-Reply-To: <8B828B002113784DAA5045FD776438D3013E653D@EXCHANGE.ad.wsicorp.com>
References:
<8B828B002113784DAA5045FD776438D3013E653D@EXCHANGE.ad.wsicorp.com>
Message-ID:
On Thu, Dec 30, 2010 at 3:52 PM, Flaherty, Patrick wrote:
>> We have a Dell PE R610 server running CentOS 4.7 that spontaneously
>> reboots under heavy network load. ?When there is little or no network
>> load, the machine appears perfectly stable.
>
> I would bet heavy network load also means heavy cpu/memory load. What
> about if you just tar/bzip up a large directory? Does it happen then?
> Oh, and the standard questions, are you running a stock kernel? Did you
> change any system packages? Are you doing anything dumb/weird (cronjob
> running corrupt_my_memory.bash)?
Yup, you are right. When I initially wrote "heavy network load", I
was actually backing up a massive file, which implies big disk I/O as
well. As I did more experimenting, I found that heavy disk I/O alone
would trigger the reboots. So now we're getting the raid card
replaced under warranty.
> This log entry looks very scary to me.
> Critical ? ? ? ?Thu Dec 23 2010 17:25:11 ? ? ? ?CPU1 Status: Processor
> sensor for CPU1, IERR was asserted
>
> This forum post says memory problem?
> http://communities.vmware.com/message/1404777
>
> Can you reboot the box and run memtest86 for a bit? Mount a virtual iso
> of memtest thru the drac?
I ran one pass of it without any problems. But I just kicked off
memtest again; this time I'll let it run longer than 1 pass. (Still
not 100% sure the raid card is the problem, so might as well cover all
the bases.)
Thanks!
Matt
From xpoinsard at openpricer.com Mon Jan 3 08:13:32 2011
From: xpoinsard at openpricer.com (Xavier Poinsard)
Date: Mon, 03 Jan 2011 15:13:32 +0100
Subject: rpm error trying to do firmware update on poweredge 2950
Message-ID:
Hi all,
I tried to do yum install $(/usr/sbin/bootstrap_firmware)
And I got an error :
Dependencies Resolved
=====================================================================================================================================================================================================
Package
Arch Version
Repository Size
=====================================================================================================================================================================================================
Installing:
BCM5708_Copper_LOM_ven_0x14e4_dev_0x164c
noarch a02-1
dell-omsa-indep 1.9 M
PERC_5_i_Integrated_ven_0x1028_dev_0x0015_subven_0x1028_subdev_0x1f03
noarch a09-1
dell-omsa-indep 992 k
PERC_6_E_Adapter_ven_0x1000_dev_0x0060_subven_0x1028_subdev_0x1f0a
noarch a12-1
dell-omsa-indep 1.5 M
system_bios_PowerEdge_2950
noarch 55:2.7.0-20
fwupdate 422 k
Installing for dependencies:
dell_ft_ie_interface
noarch 1.0.10-1.3.el5
dell-omsa-indep 24 k
dell_ie_bios
x86_64 3.0.0-1.15.2.el5
dell-omsa-indep 48 k
dell_ie_nic_broadcom
x86_64 1.1.0-3
dell-omsa-indep 1.3 M
dell_ie_sas
x86_64 3.0.0-1.15.2.el5
dell-omsa-indep 165 k
libsmal0
x86_64 3.0.0-1.15.1.el5
dell-omsa-indep 786 k
Transaction Summary
=====================================================================================================================================================================================================
Install 9 Package(s)
Upgrade 0 Package(s)
Total download size: 7.2 M
Is this ok [y/N]: y
Downloading Packages:
(1/9): dell_ft_ie_interface-1.0.10-1.3.el5.noarch.rpm
| 24 kB 00:00
(2/9): dell_ie_bios-3.0.0-1.15.2.el5.x86_64.rpm
| 48 kB 00:00
(3/9): dell_ie_sas-3.0.0-1.15.2.el5.x86_64.rpm
| 165 kB 00:00
(4/9): system_bios_PowerEdge_2950-2.7.0-20.noarch.rpm
| 422 kB 00:00
(5/9): libsmal0-3.0.0-1.15.1.el5.x86_64.rpm
| 786 kB 00:00
(6/9):
PERC_5_i_Integrated_ven_0x1028_dev_0x0015_subven_0x1028_subdev_0x1f03-a09-1.noarch.rpm
| 992 kB 00:00
(7/9): dell_ie_nic_broadcom-1.1.0-3.x86_64.rpm
| 1.3 MB 00:00
(8/9):
PERC_6_E_Adapter_ven_0x1000_dev_0x0060_subven_0x1028_subdev_0x1f0a-a12-1.noarch.rpm
| 1.5 MB 00:00
(9/9): BCM5708_Copper_LOM_ven_0x14e4_dev_0x164c-a02-1.noarch.rpm
| 1.9 MB 00:00
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total
7.8 MB/s | 7.2 MB 00:00
Running rpm_check_debug
ERROR with rpm_check_debug vs depsolve:
rpmlib(FileDigests) is needed by system_bios_PowerEdge_2950-2.7.0-20.noarch
rpmlib(PayloadIsXz) is needed by system_bios_PowerEdge_2950-2.7.0-20.noarch
Complete!
(1, [u'Please report this error in
https://bugzilla.redhat.com/enter_bug.cgi?product=Red%20Hat%20Enterprise%20Linux%205&component=yum'])
Best regards,
Xavier Poinsard.
From ernst.pijper at sara.nl Mon Jan 3 10:33:24 2011
From: ernst.pijper at sara.nl (Ernst Pijper)
Date: Mon, 03 Jan 2011 17:33:24 +0100
Subject: mptlinux for newer kernel
In-Reply-To: <4D20831C.1080805@web.de>
References: <4CC1CCA2.5030006@web.de> <4D20831C.1080805@web.de>
Message-ID: <4D21FA54.2070705@sara.nl>
Hi,
I can't install the 4.24 version on Centos 5.5 because of a LZMA issue:
[root at speeltje ~]# rpm -Uvh mptlinux-4.24.00.00-1dkms.noarch.rpm
warning: mptlinux-4.24.00.00-1dkms.noarch.rpm: Header V4 DSA signature:
NOKEY, key ID fb780cbe
error: Failed dependencies:
rpmlib(PayloadIsLzma) <= 4.4.2-1 is needed by
mptlinux-4.24.00.00-1dkms.noarch
As i understand lzma is not supported until rpm version 4.7. On centos
5.5 version 4.4.2.3 is available. So, for now, i just stick with the
4.22 mptlinux version which installs and compiles just fine.
Ernst
Karsten Suehring wrote:
> Just as information for everybody else who might have issues with the
> mptlinux kernel module:
>
> LSI has recently released version 4.24 (although the release date still
> says 02-JUN-10) of the driver. The dkms package compiles without
> problems on Ubuntu 10.04 and 10.10.
>
>
>
> Best regards,
> Karsten
>
> On 22.10.2010 19:40, Karsten Suehring wrote:
>
>> Hi,
>>
>> I'm using SAS controllers (SAS 6/iR and SAS 5/E) in some of my machines which use the mptlinux
>> driver.
>>
>> It seems that even recent kernels still come with version 3.04.15 while OMSA complains that the
>> minimum supported version is 3.12.29.00.
>>
>> I usually installed version 4.18 from LSI via dkms on my Ubuntu machines. Unfortunately even
>> the newest release 4.22 does not compile with a newer kernel, e.g. 2.6.35 that comes with
>> Ubuntu 10.10.
>>
>> I found a partial patch on the kernel.org bugzilla, but it does not resolve all problems:
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=16010
>>
>> Are there any kernel hackers or LSI developers here that could help with that?
>>
>> Does anybody know if there is a new version of the driver planned?
>>
>> I'm attaching the build log.
>>
>> Best regards,
>> Karsten
>>
>>
>>
>>
>> _______________________________________________
>> Linux-PowerEdge mailing list
>> Linux-PowerEdge at dell.com
>> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
>> Please read the FAQ at http://lists.us.dell.com/faq
>>
>
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
>
From stroller at stellar.eclipse.co.uk Mon Jan 3 10:57:30 2011
From: stroller at stellar.eclipse.co.uk (Stroller)
Date: Mon, 3 Jan 2011 16:57:30 +0000
Subject: OT: upgrading 2850 to hardware RAID.
Message-ID: <0115A8AC-1481-4CDD-BD34-F2D70D25421B@stellar.eclipse.co.uk>
Hi there,
I've got a 2850 which I'm trying to upgrade to hardware RAID.
I kinda assumed that I'd just install the RAID key, RAM & battery, and on next boot-up the system would just recognise them. That's not happening for me at all.
All I see in BIOS, when the system boots up, is the same messages as before, referring to an LSI controller:
http://stuff.stroller.uk.eu.org/Poweredge%202850/LSI%20BIOS%201.jpg
http://stuff.stroller.uk.eu.org/Poweredge%202850/LSI%20BIOS%202.jpg
Pressing CTRL-A doesn't give any options to configure arrays, only regarding the order of the SCSI channels.
I think the hardware RAID BIOS is accessed with CTRL-M (??), but there is no mention of this option on the screen.
I have now tried two different RAID keys, and neither appear to work.
Am I missing an installation step, or is it possible the RAID hardware on the motherboard is knackered?
I assume the problem is not with the battery or RAM, because if that were the case I would expect the RAID BIOS stuff to be shown but give an error complaining about those.
The system simply boots to the operating system which was already installed on the non-RAID controller. Output of `lspci -vt` is attached - only the "LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 08)" is mentioned. I'm not actually sure what I should be expecting to see here for the RAID hardware, as I don't have another 2850 (or equivalent) up and running at the moment.
I have upgraded the 2850's BIOS to the latest version (A07), but this makes no difference. There are mentions of an "embedded RAID controller" in a couple of BIOS screens, but I can't help thinking this is actually just referring to the non-RAID LSI SCSI controller.
http://stuff.stroller.uk.eu.org/Poweredge%202850/RAID_screenshot.jpg
http://stuff.stroller.uk.eu.org/Poweredge%202850/RAID_screenshot2.jpg
Does anyone have any suggestions, please?
The only things I can think of is that the 2850's current BIOS is newer than that of the RAID-on-motherboard, and I maybe need to downgrade the 2850's BIOS in order for it to be recognised (then to upgrade both). Alternatively, maybe I need to use a DOS boot disk and flash the SCSI firmware to make it RAID. But I don't really want to be blundering around in the dark, so I'd appreciate some advice before I proceed further - I've spent a couple of hours on this already, and the answer isn't immediately obvious (nor apparent to my Google searches).
TIA for any suggestions,
Stroller.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: lspci.txt
Url: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110103/c762d3d0/attachment.txt
From stroller at stellar.eclipse.co.uk Mon Jan 3 11:46:42 2011
From: stroller at stellar.eclipse.co.uk (Stroller)
Date: Mon, 3 Jan 2011 17:46:42 +0000
Subject: OT: upgrading 2850 to hardware RAID. RESOLVED
In-Reply-To:
References: <0115A8AC-1481-4CDD-BD34-F2D70D25421B@stellar.eclipse.co.uk>
Message-ID: <0AF6059F-14C0-41E7-A06C-61A78A45E90D@stellar.eclipse.co.uk>
Blimey, isn't it stonkingly obvious when you know how?
I didn't realise that *was* an option.
Many thanks for your help,
Stroller.
On 3/1/2011, at 5:08pm, Ben wrote:
> Have you gone into the BIOS and changed the controller's setting from SCSI to RAID (or whatever the other option is)?
>
> Ben
>
>
> On Mon, 3 Jan 2011, Stroller wrote:
>> ...
>> I've got a 2850 which I'm trying to upgrade to hardware RAID.
>>
>> I kinda assumed that I'd just install the RAID key, RAM & battery, and on next boot-up the system would just recognise them. That's not happening for me at all.
>>
>> All I see in BIOS, when the system boots up, is the same messages as before, referring to an LSI controller:
>> http://stuff.stroller.uk.eu.org/Poweredge%202850/LSI%20BIOS%201.jpg
>> http://stuff.stroller.uk.eu.org/Poweredge%202850/LSI%20BIOS%202.jpg
From tim at seoss.co.uk Mon Jan 3 12:41:50 2011
From: tim at seoss.co.uk (Tim Small)
Date: Mon, 03 Jan 2011 18:41:50 +0000
Subject: mptlinux for newer kernel
In-Reply-To: <4D21FA54.2070705@sara.nl>
References: <4CC1CCA2.5030006@web.de> <4D20831C.1080805@web.de>
<4D21FA54.2070705@sara.nl>
Message-ID: <4D22186E.3060102@seoss.co.uk>
On 03/01/11 16:33, Ernst Pijper wrote:
> I can't install the 4.24 version on Centos 5.5 because of a LZMA issue:
>
It's worth noting that the version of the mpt fusion driver in the
official kernel.org tree is v3.04.17. The kernel.org maintainers don't
promote the maintenance of out-of-tree code. If this 4.x driver is so
great, then why don't LSI have it in the kernel.org tree, instead of the
3.x versions which they seem to sort-of-maintain.
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=drivers/message/fusion/mptbase.h;h=f71f2294847780472851bd17835937aaa8fce6f1;hb=HEAD
This is a bit of an odd situation, but it's worth noting that this 4.x
branch does still have this out-of-tree status, and the kernel.org
maintainers generally don't keep code out of the kernel unless there's a
good reason. So the possibilities seem to be that either LSI don't want
the code in the main kernel for some weird reason, or it isn't good
enough to go in.
Same with the Redhat maintainers, so even if there was some sort of
political trouble between LSI and the kernel.org folks - if Redhat
thought that this 4.x branch was the best driver, they'd probably be
shipping it in their enterprise kernels.
AFAIK Redhat don't advise it's use either.
I would be wary of running this 4.x code for that reason...
In fact, that whole weird-attitude by LSI, together with various
reliability problems I've seen with LSI mpt chips means that I normally
advise avoiding them entirely if at all possible. Intel ICHx AHCI
controllers seem to be better engineered, better supported, and are
certainly many-many times better tested just because there are probably
1000x more of them in circulation.
Now when is Dell going to start supporting Intel ICHx for hotplug too,
like the other vendors do?
Tim.
--
South East Open Source Solutions Limited
Registered in England and Wales with company number 06134732.
Registered Office: 2 Powell Gardens, Redhill, Surrey, RH1 1TQ
VAT number: 900 6633 53 http://seoss.co.uk/ +44-(0)1273-808309
From sdowdy at ucar.edu Mon Jan 3 13:24:08 2011
From: sdowdy at ucar.edu (Stephen Dowdy)
Date: Mon, 03 Jan 2011 12:24:08 -0700
Subject: Nautilus SAS/SATA firmware update disks: Inconsistencies and Quality
Control and missing f/w images
Message-ID: <4D222258.1070301@ucar.edu>
Anyone else have issues with the Nautilus offline Disk firmware
update utilities? (specifically i'm referring to A28, the latest)
The Readme file has several errors and inconsistencies, so i
can't be sure if things aren't working as expected because the
documentation is wrong or because of something else. (See bottom of
message for examples, i don't want to clutter my core issue)
Specifically, i have a problem with updating ST3300655SS...
There's an *URGENT* update for f/w S52C for these disks i want
to apply.
The Nautilus A28 shows the following firmware updates as documented:
$ grep ST3300655SS *.txt
R288929.txt:17.0 Seagate SAS 3.5", offline download version S51A, for drive models ST3300655SS, ST3146855SS and ST373455SS
R288929.txt:45.0 Seagate 15k5 SAS, 3.5" version S52C for drive models ST3300655SS, ST3146855SS and ST373455SS.
R288929.txt:Seagate 15k5 SAS, 3.5" version S52C for drive models ST3300655SS, ST3146855SS and ST373455SS.
So, S52C deprecates S51A later on. However, the system i just
updated had f/w S517 and....
Nautilus' interactive update mode showed
Disk XXX ST3300655SS firmware S519 != S517 (or something like that).
Okay, so there's another firmware image on the ISO for this drive
that isn't either version mentioned in the Release Notes, and it is
LOWER than the either version mentioned.
Fine, do the update. Then Nautilus reports:
Success: Disk XXX ST3300655SS firmware S51A = S51A (or something like that).
So, it appears that Nautilus didn't update to the version it first reported,
but a version slightly higher (one mentioned earlier in the release notes).
But what about S52C, the one it SHOULD be applying?
$ grep -Rail S519 .
./sas/seagate/15k5/S519.fwh
./sas/seagate/15k5/S51a.fwh
$ cd ./sas/seagate/15k5/
$ ls
15dps515.lod 15fps515.lod S517.fwh S519.fwh S51a.fwh S527.fwh S52C.fwh
As i understand it, there's a 256byte header on the FWH files Dell adds:
sdowdy at zia:/d2/VBOX_Share/SAS_SATA_A28/files/fw/sas/seagate/15k5$ od -a -N 256 S52C.fwh
0000000 sp sp sp sp D E L L ht S 5 2 C S 5 2
0000020 7 x soh nul nul nul nul nul nul nul nul nul nul nul nul nul
0000040 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
0000060 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul etx
0000100 sp sp sp 1 6 1 1 2 sp sp sp sp sp sp sp sp
0000120 sp sp sp sp sp sp sp sp sp sp sp sp sp sp sp sp
0000140 sp sp sp sp sp S T 3 3 0 0 6 5 5 S S
0000160 sp sp sp 1 6 1 1 1 sp sp sp sp sp sp sp sp
0000200 sp sp sp sp sp sp sp sp sp sp sp sp sp sp sp sp
0000220 sp sp sp sp sp S T 3 1 4 6 8 5 5 S S
0000240 sp sp sp 1 6 1 1 3 sp sp sp sp sp sp sp sp
0000260 sp sp sp sp sp sp sp sp sp sp sp sp sp sp sp sp
0000300 sp sp sp sp sp sp S T 3 7 3 4 5 5 S S
0000320 nul nul nul nul nul nul nul nul nul nul nul nul nul nul bel nul
0000340 nul soh nul nul nul nul nul nul nul nul nul nul nul nul nul nul
0000360 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
0000400
I don't know the structure of this header, but i'm guessing that maybe
this is saying S52C can only be applied to disks currently using S527
and higher?
Okay, check S527 Firmware header file...
0000000 sp sp sp sp D E L L ht S 5 2 7 S 5 2
0000020 0 nul soh nul nul nul nul nul nul nul nul nul nul nul nul nul
Looks like that can only be applied to disks running S520 or higher,
but there are NO firmware images in that directory that contain the
fw version S520.
sdowdy at zia:/d2/VBOX_Share/SAS_SATA_A28/files/fw/sas/seagate/15k5$ od -a -N 256 S51a.fwh
0000000 sp sp sp sp D E L L ht S 5 1 A S 5 1
0000020 9 x soh nul nul nul nul nul nul nul nul nul nul nul nul nul
...
sdowdy at zia:/d2/VBOX_Share/SAS_SATA_A28/files/fw/sas/seagate/15k5$ od -a -N 256 S519.fwh
0000000 sp sp sp sp D E L L ht S 5 1 9 S 5 1
0000020 5 x soh nul nul nul nul nul nul nul nul nul nul nul nul nul
(confirms that S51A requires S519 and S519 requires S515. So, i think
my suspicion that Nautilus only reports the first match it finds, but
iteratively does firmware updates that it CAN do, is accurate)
So, as best i can guess, there is a hole here in getting the drive
updated to the URGENT firmware release because no S520.fwh exists in
the Nautilus 'fw' directories.
Can someone at Dell (or elsewhere) confirm that my suspicions are
correct or not? (that a FWH file is simply missing here?)
And i know i'm hoping for too much, but a description of the Header
format would be awesome, because I don't know of any Dell ONLINE
tools/resources to identify disks with out of date firmware, so
i'm maintaining my own script based upon what i can glean from the
Nautilus release notes and files/fw/... files. E.G.
--------------------------------------------------------------------
# check-diskfw
This system has one or more disks that MAY require a firmware update.
DISK ST3300655SS firmware S517 is not S52C [Ref:Nautilus SAS/SATA Release A28 45.0 (urgent)]
DISK WDCWD5002ABYS-1 firmware 3B04 is not 3B05 [Ref:Nautilus SAS/SATA Release A28 40.0 (recommended)]
# DEBUG=1 /raid/check-diskfw
...
AWKDEBUG: MD1000,A.04,MD1000 RAID Enclosure Version should be A.04 or higher
AWKDEBUG: NO MATCH FOR HDS725050KLA360-AB5A
AWKDEBUG: ST3300655SS,S515,Nautilus SAS/SATA Release A07
AWKDEBUG: ST3300655SS,S51A,Nautilus SAS/SATA Release A28 45.0 (urgent) -- sdowdy: this is the version A28 installed, not S52C
AWKDEBUG: ST3300655SS,S528,http://ftp.us.dell.com/sas-hdd/FRMW_LX_R189919.BIN
AWKDEBUG: ST3300655SS,S52C,Nautilus SAS/SATA Release A28 45.0 (urgent)
AWKDEBUG: WDCWD5002ABYS-1,3B05,Nautilus SAS/SATA Release A28 40.0 (recommended)
DISK ST3300655SS firmware S517 is not S515 [Ref:Nautilus SAS/SATA Release A07]
DISK ST3300655SS firmware S517 is not S51A [Ref:Nautilus SAS/SATA Release A28 45.0 (urgent) -- sdowdy: this is the version A28 installed]
DISK ST3300655SS firmware S517 is not S528 [Ref:http://ftp.us.dell.com/sas-hdd/FRMW_LX_R189919.BIN]
DISK ST3300655SS firmware S517 is not S52C [Ref:Nautilus SAS/SATA Release A28 45.0 (urgent)]
DISK WDCWD5002ABYS-1 firmware 3B04 is not 3B05 [Ref:Nautilus SAS/SATA Release A28 40.0 (recommended)]
--------------------------------------------------------------------
If there is such a tool/resource, i would much appreciate a pointer.
Otherwise, if anyone wants the shell-script *AS-IS*, i can post it.
(it uses /proc/scsi/scsi and the output from megarc and megacli if
they can be found along with a big nasty inline awk table) Problem
is it requires continual manual updates/tweaks/presumptions.
Of course it would be great if Dell could manage the Nautilus
distribution in a way that could be run online (linux, of course)
for READ-ONLY verification so i wouldn't have to write my own tool.
--------------------------------------------------------------------
---- Inconsistencies and errors in Nautilus A28 release notes ------
--------------------------------------------------------------------
For example, there are several cases of probable typo errors like:
----------------------------------------------------------------------
===================================================
25.0 Seagate ES 7.2k SATA 3.5", Verison MA0D - Recommended
-----------------------------------^^^^^^^^^
===================================================
* Firmware Version MA0D *
Seagate ES 7.2k SATA 2.5", version MS0D for drive models ST3250310NS, ST3500320NS, ST31000340NS and ST31000340NS.
-----------------------------------^^^^
----------------------------------------------------------------------
(what is it, MA0D or MS0D?)
And this series of updates:
----------------------------------------------------------------------
$ grep ST3300555SS R288929.txt
26.0 Seagate T10 7.2K SAS 3.5", version T215, for drive models ST3300555SS (300GB), ST3146755SS (146GB) and ST373355SS (73GB).
52.0 Seagate SAS 2.5", 10K, offline download version T10D, for drive models ST3300555SS, ST3146755SS and ST373355SS.
53.0 Seagate T10 7.2K SAS 3.5", offline download version T210, for drive models ST3300555SS (300GB), ST3146755SS (146GB) and ST373355SS (73GB).
..
----------------------------------------------------------------------
T215->T10D->T210
Huh? Those don't increment lexicographically in my book.
But the middle one does say 2.5", the others 3.5", but Seagate
has always used different model designations for different
sized drives:
http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=204763
and ST9 is for 2.5" drives, so ST3 shouldn't apply.
Sure, this is nitpicking, but if the Release Notes are this badly
QC'd, how do i know the firmware images are any better? Don't get
me wrong, i'm willing to live with bad quality release notes if the
updates work properly, anyway....
thanks,
--stephen
--
Stephen Dowdy - Systems Administrator - NCAR/RAL
303.497.2869 - sdowdy at ucar.edu - http://www.ral.ucar.edu/~sdowdy/
From sgalyen at email.arizona.edu Mon Jan 3 14:41:15 2011
From: sgalyen at email.arizona.edu (Galyen, Sean M - (sgalyen))
Date: Mon, 3 Jan 2011 12:41:15 -0800
Subject: OpenManage 6.4 yum repository posted (Santosh_Gore@Dell.com) /
driver inquiry / PERC4 not detected
Message-ID: <04E70FD8A5C62C4B92F39BB05BB1EC9D02FFBB1573@VA3DIAXVS211.RED001.local>
I joined this list to eagerly await OM6.4 with RHEL6 x64 support and here it comes as a Christmas present! Just wanted to say it installed easily/beautifully (three systems today!) and looks sexy when you access it - well done! Even promptly told me my SAS firmware was out of date (dang it).
To throw a few questions out there...
* OM is complaining the SAS driver (not firmware) version being dated on two PE860s. Does anyone know if RedHat will eventually incorporate the updated driver or will I need to consider updating it myself?
** Driver Version 3.04.16
** Minimum Required Driver Version 3.12.29.00
* OM does not seem to be detecting the PERC4 on my PE2850, is that controller not supported with OM6.4? If it is supported, any suggestions on where to start digging?
** From OM: No storage controllers detected.
* On the same PE2850 it seems that OM6.4 was repeatedly spamming my snmpd trying to connect - I had the localhost community to something other than public and snmpd was spamming /var/log/messages with OM connection attempts. However, two PE860s aren't doing this (no snmp connects at all reported). I fixed it temporarily by changing the localhost community string back to public (ack), any thoughts about what OM is doing differently between the hosts? Can I disable the snmp polling or change the string it is trying to use?
** /var/log/messages (repeating these two lines over and over, port incrementing)
snmpd[1522]: [smux_accept] accepted fd 10 from 127.0.0.1:58287
snmpd[1522]: refused smux peer: oid SNMPv2-SMI::enterprises.674.10892.1, descr Systems Management SNMP MIB Plug-in Manager
* The last question may be a red herring as after changing the community string back to my normal one the error isn't repeating. Go figure. Though OM (seemingly) has crashed my snmpd a few times while I was troubleshooting...
snmpd[11039]: [smux_accept] accepted fd 10 from 127.0.0.1:44440
snmpd[11039]: accepted smux peer: oid SNMPv2-SMI::enterprises.674.10892.1, descr Systems Management SNMP MIB Plug-in Manager
kernel: snmpd[11039]: segfault at 820 ip 00007f3ab68c68d8 sp 00007fff04e76db0 error 4 in libnetsnmpagent.so.20.0.0[7f3ab68a3000+47000]
Thanks and congratulations on OM6.4! Happy new year!
Any and all feedback is appreciated!
Sean Galyen
Office of Instruction and Assessment
University of Arizona
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110103/9ef5b1de/attachment-0001.htm
From stephan.van.hienen at thevalley.nl Mon Jan 3 14:49:52 2011
From: stephan.van.hienen at thevalley.nl (Stephan van Hienen)
Date: Mon, 3 Jan 2011 20:49:52 +0000
Subject: OpenManage 6.4 yum repository posted (Santosh_Gore@Dell.com) /
driver inquiry / PERC4 not detected
In-Reply-To: <04E70FD8A5C62C4B92F39BB05BB1EC9D02FFBB1573@VA3DIAXVS211.RED001.local>
References: <04E70FD8A5C62C4B92F39BB05BB1EC9D02FFBB1573@VA3DIAXVS211.RED001.local>
Message-ID:
> -----Original Message-----
> From: linux-poweredge-bounces at dell.com [mailto:linux-poweredge-bounces at dell.com] On Behalf Of Galyen, Sean M - (sgalyen)
> Sent: maandag 3 januari 2011 21:41
> To: linux-poweredge at dell.com
> Subject: Re: OpenManage 6.4 yum repository posted (Santosh_Gore at Dell.com) / driver inquiry / PERC4 not detected
>
> * OM does not seem to be detecting the PERC4 on my PE2850, is that controller not supported with OM6.4? If it is supported, any
> suggestions on where to start digging?
The 64bit openmanage doesn't support the perc4 :
http://stevejenkins.com/blog/2010/10/no-controllers-found-fix-set-up-dell-omsa-6-3-32-bit-on-rhel-centos-5-5-64-bit/
Stephan
From ksuehring at web.de Mon Jan 3 17:54:43 2011
From: ksuehring at web.de (Karsten Suehring)
Date: Tue, 04 Jan 2011 00:54:43 +0100
Subject: mptlinux for newer kernel
In-Reply-To: <4D22186E.3060102@seoss.co.uk>
References: <4CC1CCA2.5030006@web.de> <4D20831C.1080805@web.de>
<4D21FA54.2070705@sara.nl> <4D22186E.3060102@seoss.co.uk>
Message-ID: <4D2261C3.3010409@web.de>
Yes, I was also hoping LSI would push a newer version into the mainline
kernel. But I was unable to find any contact/forum/mailing list on the
LSI web site which I could ask for the reasons.
I think at least the Dell people should be interested in that push since
OMSA is complaining for several versions now that 3.04 is to old. I also
don't know if the Dell version of the driver has any modification except
changing the strings to the Dell controller names.
Some weeks ago I had some trouble with a machine which had an internal
SAS RAID card. I was unable to generate proper error reports with that
controller. The support case ended after I installed Windows 2008 Server
(for which the driver constantly logged errors) and Dell had replaced
half of the machine. My guess is that the problem was caused by the
backplane. After that we have chosen PERC controller for our newer
machines because they have their own error logs.
Unfortunately so far nobody at Dell was able to suggest an alternative
controller for connecting a PowerVault MD3000.
Best regards,
Karsten
On 03.01.2011 19:41, Tim Small wrote:
> On 03/01/11 16:33, Ernst Pijper wrote:
>> I can't install the 4.24 version on Centos 5.5 because of a LZMA issue:
>>
>
> It's worth noting that the version of the mpt fusion driver in the
> official kernel.org tree is v3.04.17. The kernel.org maintainers don't
> promote the maintenance of out-of-tree code. If this 4.x driver is so
> great, then why don't LSI have it in the kernel.org tree, instead of the
> 3.x versions which they seem to sort-of-maintain.
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=drivers/message/fusion/mptbase.h;h=f71f2294847780472851bd17835937aaa8fce6f1;hb=HEAD
>
> This is a bit of an odd situation, but it's worth noting that this 4.x
> branch does still have this out-of-tree status, and the kernel.org
> maintainers generally don't keep code out of the kernel unless there's a
> good reason. So the possibilities seem to be that either LSI don't want
> the code in the main kernel for some weird reason, or it isn't good
> enough to go in.
>
> Same with the Redhat maintainers, so even if there was some sort of
> political trouble between LSI and the kernel.org folks - if Redhat
> thought that this 4.x branch was the best driver, they'd probably be
> shipping it in their enterprise kernels.
>
> AFAIK Redhat don't advise it's use either.
>
> I would be wary of running this 4.x code for that reason...
>
> In fact, that whole weird-attitude by LSI, together with various
> reliability problems I've seen with LSI mpt chips means that I normally
> advise avoiding them entirely if at all possible. Intel ICHx AHCI
> controllers seem to be better engineered, better supported, and are
> certainly many-many times better tested just because there are probably
> 1000x more of them in circulation.
>
> Now when is Dell going to start supporting Intel ICHx for hotplug too,
> like the other vendors do?
>
> Tim.
>
From Matt_Domsch at dell.com Mon Jan 3 20:24:54 2011
From: Matt_Domsch at dell.com (Matt Domsch)
Date: Mon, 3 Jan 2011 20:24:54 -0600
Subject: rpm error trying to do firmware update on poweredge 2950
In-Reply-To:
References:
Message-ID: <20110104022454.GA31054@auslistsprd01.us.dell.com>
On Mon, Jan 03, 2011 at 03:13:32PM +0100, Xavier Poinsard wrote:
> Hi all,
>
> I tried to do yum install $(/usr/sbin/bootstrap_firmware)
> And I got an error :
[snip]
> ERROR with rpm_check_debug vs depsolve:
> rpmlib(FileDigests) is needed by system_bios_PowerEdge_2950-2.7.0-20.noarch
> rpmlib(PayloadIsXz) is needed by system_bios_PowerEdge_2950-2.7.0-20.noarch
Arrggh. That's the builder, which is running on Fedora 14, which
apparently picked up the new SHA2 digests and XZ compression method,
which older rpm versions don't know.
Looks like I'll have to force the builder to use older checksum and
compression methods, and rebuild the newer packages. Arrggh....
Thanks,
Matt
--
Matt Domsch
Technology Strategist
Dell | Office of the CTO
From Matt_Domsch at dell.com Tue Jan 4 08:50:30 2011
From: Matt_Domsch at dell.com (Matt Domsch)
Date: Tue, 4 Jan 2011 08:50:30 -0600
Subject: rpm error trying to do firmware update on poweredge 2950
In-Reply-To: <20110104022454.GA31054@auslistsprd01.us.dell.com>
References:
<20110104022454.GA31054@auslistsprd01.us.dell.com>
Message-ID: <20110104145030.GA11889@auslistsprd01.us.dell.com>
On Mon, Jan 03, 2011 at 08:24:54PM -0600, Matt Domsch wrote:
> On Mon, Jan 03, 2011 at 03:13:32PM +0100, Xavier Poinsard wrote:
> > Hi all,
> >
> > I tried to do yum install $(/usr/sbin/bootstrap_firmware)
> > And I got an error :
> [snip]
> > ERROR with rpm_check_debug vs depsolve:
> > rpmlib(FileDigests) is needed by system_bios_PowerEdge_2950-2.7.0-20.noarch
> > rpmlib(PayloadIsXz) is needed by system_bios_PowerEdge_2950-2.7.0-20.noarch
>
> Arrggh. That's the builder, which is running on Fedora 14, which
> apparently picked up the new SHA2 digests and XZ compression method,
> which older rpm versions don't know.
>
> Looks like I'll have to force the builder to use older checksum and
> compression methods, and rebuild the newer packages. Arrggh....
All BIOS packages in the firmware repository have been rebuilt using
the old checksums and compression methods. You'll see all -21.noarch
packages now. Please test and report success/failure now.
Thanks,
Matt
--
Matt Domsch
Technology Strategist
Dell | Office of the CTO
From orion at cora.nwra.com Tue Jan 4 09:29:10 2011
From: orion at cora.nwra.com (Orion Poplawski)
Date: Tue, 04 Jan 2011 08:29:10 -0700
Subject: update_firmware traceback
Message-ID: <4D233CC6.2010106@cora.nwra.com>
I'm getting this now:
# update_firmware
Traceback (most recent call last):
File "/usr/lib64/python2.4/logging/config.py", line 191, in fileConfig
logger.addHandler(handlers[hand])
KeyError: 'updatelog'
Running system inventory...
Searching storage directory for available BIOS updates...
Checking System BIOS for PowerEdge SC1435 - 2.2.5
Available: system_bios(ven_0x1028_dev_0x01eb) - 2.2.5
Did not find a newer package to install that meets all installation
checks.
This system does not appear to have any updates available.
No action necessary.
CentOS 5.5. Doesn't appear to cause any problems other than the ugly traceback.
--
Orion Poplawski
Technical Manager 303-415-9701 x222
NWRA/CoRA Division FAX: 303-415-9702
3380 Mitchell Lane orion at cora.nwra.com
Boulder, CO 80301 http://www.cora.nwra.com
From Matt_Domsch at dell.com Tue Jan 4 10:51:53 2011
From: Matt_Domsch at dell.com (Matt Domsch)
Date: Tue, 4 Jan 2011 10:51:53 -0600
Subject: update_firmware traceback
In-Reply-To: <4D233CC6.2010106@cora.nwra.com>
References: <4D233CC6.2010106@cora.nwra.com>
Message-ID: <20110104165153.GA31313@auslistsprd01.us.dell.com>
On Tue, Jan 04, 2011 at 08:29:10AM -0700, Orion Poplawski wrote:
> I'm getting this now:
>
> # update_firmware
> Traceback (most recent call last):
> File "/usr/lib64/python2.4/logging/config.py", line 191, in fileConfig
> logger.addHandler(handlers[hand])
> KeyError: 'updatelog'
>
> Running system inventory...
>
> Searching storage directory for available BIOS updates...
> Checking System BIOS for PowerEdge SC1435 - 2.2.5
> Available: system_bios(ven_0x1028_dev_0x01eb) - 2.2.5
> Did not find a newer package to install that meets all installation
> checks.
>
> This system does not appear to have any updates available.
> No action necessary.
>
> CentOS 5.5. Doesn't appear to cause any problems other than the ugly traceback.
Yep. 'updatelog' is a newer feature in the python logging module,
which RHEL5 doesn't have. It's harmless but annoying. Michael Brown
was going to look into it - we spoke before the holiday break.
Thanks,
Matt
--
Matt Domsch
Technology Strategist
Dell | Office of the CTO
From pflaherty at wsi.com Tue Jan 4 14:54:18 2011
From: pflaherty at wsi.com (Flaherty, Patrick)
Date: Tue, 4 Jan 2011 15:54:18 -0500
Subject: OpenManage 6.4 is lying to me about my driver version...
Message-ID: <8B828B002113784DAA5045FD776438D3014EBC49@EXCHANGE.ad.wsicorp.com>
...or I'm an idiot. Both seem plausible.
32bit CentOS 5.5 box, all yum upgraded to OMSA 6.4, upgraded firmware
with the 6.4 suu disk (had trouble updating the bios, had to use the
system build disk).
omreport and dmesg/modinfo disagree on what version of the driver I'm
running. It's upsetting the dell hardware check I use, and making me
trust OMSA just a little less.
Any guidance?
Patrick
# uname -a
Linux somehost.someplace.com 2.6.18-194.26.1.el5 #1 SMP Tue Nov 9
12:54:40 EST 2010 i686 i686 i386 GNU/Linux
# uptime && lsmod | grep mega
20:27:24 up 2 min, 1 user, load average: 0.65, 0.32, 0.12
megaraid_sas 50236 2
scsi_mod 141973 12
mptctl,scsi_dh,sr_mod,sg,megaraid_sas,mptspi,scsi_transport_spi,libata,m
ptsas,mptscsih,scsi_transport_sas,sd_mod
# dmesg | grep mega
megasas: 00.00.04.31.2 Thur July 08 14:13:02 EST 2010
# omreport storage controller
Controller SAS 6/iR Integrated (Embedded)
Controllers
ID : 0
Status : Non-Critical
Name : SAS 6/iR Integrated
Slot ID : Embedded
State : Degraded
Firmware Version : 00.25.47.00.06.22.03.00
Minimum Required Firmware Version : Not Applicable
Driver Version : 3.04.13rh
Minimum Required Driver Version : 3.12.29.00
Storport Driver Version : Not Applicable
Minimum Required Storport Driver Version : Not Applicable
Number of Connectors : 1
Rebuild Rate : Not Applicable
BGI Rate : Not Applicable
Check Consistency Rate : Not Applicable
Reconstruct Rate : Not Applicable
Alarm State : Not Applicable
Cluster Mode : Not Applicable
SCSI Initiator ID : Not Applicable
Cache Memory Size : Not Applicable
Patrol Read Mode : Not Applicable
Patrol Read State : Not Applicable
Patrol Read Rate : Not Applicable
Patrol Read Iterations : Not Applicable
Abort Check Consistency on Error : Not Applicable
Allow Revertible Hot Spare and Replace Member : Not Applicable
Load Balance : Not Applicable
Auto Replace Member on Predictive Failure : Not Applicable
Redundant Path view : Not Applicable
CacheCade Capable : Not Applicable
Persistent Hot Spare : Not Applicable
Encryption Capable : Not Applicable
Encryption Key Present : Not Applicable
Encryption Mode : Not Applicable
Spin Down Unconfigured Drives : Not Applicable
Spin Down Hot Spares : Not Applicable
# uname -r
2.6.18-194.26.1.el5
# modinfo megaraid_sas
filename: /lib/modules/2.6.18-194.26.1.el5/extra/megaraid_sas.ko
description: LSI MegaRAID SAS Driver
author: megaraidlinux at lsi.com
version: 00.00.04.31.2
license: GPL
srcversion: 26F6E879192F1EF88B1FB53
alias: pci:v00001028d00000015sv*sd*bc*sc*i*
alias: pci:v00001000d00000413sv*sd*bc*sc*i*
alias: pci:v00001000d00000071sv*sd*bc*sc*i*
alias: pci:v00001000d00000073sv*sd*bc*sc*i*
alias: pci:v00001000d00000079sv*sd*bc*sc*i*
alias: pci:v00001000d00000078sv*sd*bc*sc*i*
alias: pci:v00001000d0000007Csv*sd*bc*sc*i*
alias: pci:v00001000d00000060sv*sd*bc*sc*i*
alias: pci:v00001000d00000411sv*sd*bc*sc*i*
depends: scsi_mod
vermagic: 2.6.18-194.26.1.el5 SMP mod_unload 686 REGPARM 4KSTACKS
gcc-4.1
parm: poll_mode_io:Complete cmds from IO path, (default=0)
(int)
parm: max_sectors:Maximum number of sectors per IO command
(int)
# here's after I unpacked my initrd and looked at it.
# modinfo ./lib/megaraid_sas.ko
filename: ./lib/megaraid_sas.ko
description: LSI MegaRAID SAS Driver
author: megaraidlinux at lsi.com
version: 00.00.04.31.2
license: GPL
srcversion: 26F6E879192F1EF88B1FB53
alias: pci:v00001028d00000015sv*sd*bc*sc*i*
alias: pci:v00001000d00000413sv*sd*bc*sc*i*
alias: pci:v00001000d00000071sv*sd*bc*sc*i*
alias: pci:v00001000d00000073sv*sd*bc*sc*i*
alias: pci:v00001000d00000079sv*sd*bc*sc*i*
alias: pci:v00001000d00000078sv*sd*bc*sc*i*
alias: pci:v00001000d0000007Csv*sd*bc*sc*i*
alias: pci:v00001000d00000060sv*sd*bc*sc*i*
alias: pci:v00001000d00000411sv*sd*bc*sc*i*
depends: scsi_mod
vermagic: 2.6.18-194.26.1.el5 SMP mod_unload 686 REGPARM 4KSTACKS
gcc-4.1
parm: poll_mode_io:Complete cmds from IO path, (default=0)
(int)
parm: max_sectors:Maximum number of sectors per IO command
(int)
From Shyam_Iyer at Dell.com Tue Jan 4 15:08:16 2011
From: Shyam_Iyer at Dell.com (Shyam_Iyer at Dell.com)
Date: Wed, 5 Jan 2011 02:38:16 +0530
Subject: OpenManage 6.4 is lying to me about my driver version...
In-Reply-To: <8B828B002113784DAA5045FD776438D3014EBC49@EXCHANGE.ad.wsicorp.com>
References: <8B828B002113784DAA5045FD776438D3014EBC49@EXCHANGE.ad.wsicorp.com>
Message-ID:
The SAS6 uses the mptscsih driver. Don't check the megaraid_sas driver version.
> -----Original Message-----
> From: linux-poweredge-bounces-Lists On Behalf Of Flaherty, Patrick
> Sent: Tuesday, January 04, 2011 3:54 PM
> To: linux-poweredge-Lists
> Subject: OpenManage 6.4 is lying to me about my driver version...
>
> ...or I'm an idiot. Both seem plausible.
>
> 32bit CentOS 5.5 box, all yum upgraded to OMSA 6.4, upgraded firmware
> with the 6.4 suu disk (had trouble updating the bios, had to use the
> system build disk).
>
> omreport and dmesg/modinfo disagree on what version of the driver I'm
> running. It's upsetting the dell hardware check I use, and making me
> trust OMSA just a little less.
>
> Any guidance?
>
> Patrick
>
> # uname -a
> Linux somehost.someplace.com 2.6.18-194.26.1.el5 #1 SMP Tue Nov 9
> 12:54:40 EST 2010 i686 i686 i386 GNU/Linux
>
> # uptime && lsmod | grep mega
> 20:27:24 up 2 min, 1 user, load average: 0.65, 0.32, 0.12
> megaraid_sas 50236 2
> scsi_mod 141973 12
> mptctl,scsi_dh,sr_mod,sg,megaraid_sas,mptspi,scsi_transport_spi,libata,
> m
> ptsas,mptscsih,scsi_transport_sas,sd_mod
>
> # dmesg | grep mega
> megasas: 00.00.04.31.2 Thur July 08 14:13:02 EST 2010
>
> # omreport storage controller
> Controller SAS 6/iR Integrated (Embedded)
>
> Controllers
> ID : 0
> Status : Non-Critical
> Name : SAS 6/iR Integrated
> Slot ID : Embedded
> State : Degraded
> Firmware Version : 00.25.47.00.06.22.03.00
> Minimum Required Firmware Version : Not Applicable
> Driver Version : 3.04.13rh
> Minimum Required Driver Version : 3.12.29.00
> Storport Driver Version : Not Applicable
> Minimum Required Storport Driver Version : Not Applicable
> Number of Connectors : 1
> Rebuild Rate : Not Applicable
> BGI Rate : Not Applicable
> Check Consistency Rate : Not Applicable
> Reconstruct Rate : Not Applicable
> Alarm State : Not Applicable
> Cluster Mode : Not Applicable
> SCSI Initiator ID : Not Applicable
> Cache Memory Size : Not Applicable
> Patrol Read Mode : Not Applicable
> Patrol Read State : Not Applicable
> Patrol Read Rate : Not Applicable
> Patrol Read Iterations : Not Applicable
> Abort Check Consistency on Error : Not Applicable
> Allow Revertible Hot Spare and Replace Member : Not Applicable
> Load Balance : Not Applicable
> Auto Replace Member on Predictive Failure : Not Applicable
> Redundant Path view : Not Applicable
> CacheCade Capable : Not Applicable
> Persistent Hot Spare : Not Applicable
> Encryption Capable : Not Applicable
> Encryption Key Present : Not Applicable
> Encryption Mode : Not Applicable
> Spin Down Unconfigured Drives : Not Applicable
> Spin Down Hot Spares : Not Applicable
>
> # uname -r
> 2.6.18-194.26.1.el5
>
> # modinfo megaraid_sas
> filename: /lib/modules/2.6.18-194.26.1.el5/extra/megaraid_sas.ko
> description: LSI MegaRAID SAS Driver
> author: megaraidlinux at lsi.com
> version: 00.00.04.31.2
> license: GPL
> srcversion: 26F6E879192F1EF88B1FB53
> alias: pci:v00001028d00000015sv*sd*bc*sc*i*
> alias: pci:v00001000d00000413sv*sd*bc*sc*i*
> alias: pci:v00001000d00000071sv*sd*bc*sc*i*
> alias: pci:v00001000d00000073sv*sd*bc*sc*i*
> alias: pci:v00001000d00000079sv*sd*bc*sc*i*
> alias: pci:v00001000d00000078sv*sd*bc*sc*i*
> alias: pci:v00001000d0000007Csv*sd*bc*sc*i*
> alias: pci:v00001000d00000060sv*sd*bc*sc*i*
> alias: pci:v00001000d00000411sv*sd*bc*sc*i*
> depends: scsi_mod
> vermagic: 2.6.18-194.26.1.el5 SMP mod_unload 686 REGPARM 4KSTACKS
> gcc-4.1
> parm: poll_mode_io:Complete cmds from IO path, (default=0)
> (int)
> parm: max_sectors:Maximum number of sectors per IO command
> (int)
>
> # here's after I unpacked my initrd and looked at it.
> # modinfo ./lib/megaraid_sas.ko
> filename: ./lib/megaraid_sas.ko
> description: LSI MegaRAID SAS Driver
> author: megaraidlinux at lsi.com
> version: 00.00.04.31.2
> license: GPL
> srcversion: 26F6E879192F1EF88B1FB53
> alias: pci:v00001028d00000015sv*sd*bc*sc*i*
> alias: pci:v00001000d00000413sv*sd*bc*sc*i*
> alias: pci:v00001000d00000071sv*sd*bc*sc*i*
> alias: pci:v00001000d00000073sv*sd*bc*sc*i*
> alias: pci:v00001000d00000079sv*sd*bc*sc*i*
> alias: pci:v00001000d00000078sv*sd*bc*sc*i*
> alias: pci:v00001000d0000007Csv*sd*bc*sc*i*
> alias: pci:v00001000d00000060sv*sd*bc*sc*i*
> alias: pci:v00001000d00000411sv*sd*bc*sc*i*
> depends: scsi_mod
> vermagic: 2.6.18-194.26.1.el5 SMP mod_unload 686 REGPARM 4KSTACKS
> gcc-4.1
> parm: poll_mode_io:Complete cmds from IO path, (default=0)
> (int)
> parm: max_sectors:Maximum number of sectors per IO command
> (int)
>
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
From pflaherty at wsi.com Tue Jan 4 15:56:00 2011
From: pflaherty at wsi.com (Flaherty, Patrick)
Date: Tue, 4 Jan 2011 16:56:00 -0500
Subject: OpenManage 6.4 is lying to me about my driver version...
In-Reply-To:
References: <8B828B002113784DAA5045FD776438D3014EBC49@EXCHANGE.ad.wsicorp.com>
Message-ID: <8B828B002113784DAA5045FD776438D3014EBC74@EXCHANGE.ad.wsicorp.com>
That fixed it. Thank you Shyam.
Patrick
> -----Original Message-----
> From: Shyam_Iyer at Dell.com [mailto:Shyam_Iyer at Dell.com]
> Sent: Tuesday, January 04, 2011 4:08 PM
> To: Flaherty, Patrick; linux-poweredge at lists.us.dell.com
> Subject: RE: OpenManage 6.4 is lying to me about my driver version...
>
> The SAS6 uses the mptscsih driver. Don't check the megaraid_sas driver
> version.
>
> > -----Original Message-----
> > From: linux-poweredge-bounces-Lists On Behalf Of Flaherty, Patrick
> > Sent: Tuesday, January 04, 2011 3:54 PM
> > To: linux-poweredge-Lists
> > Subject: OpenManage 6.4 is lying to me about my driver version...
> >
> > ...or I'm an idiot. Both seem plausible.
> >
> > 32bit CentOS 5.5 box, all yum upgraded to OMSA 6.4, upgraded
firmware
> > with the 6.4 suu disk (had trouble updating the bios, had to use the
> > system build disk).
> >
> > omreport and dmesg/modinfo disagree on what version of the driver
I'm
> > running. It's upsetting the dell hardware check I use, and making me
> > trust OMSA just a little less.
> >
> > Any guidance?
> >
> > Patrick
> >
> > # uname -a
> > Linux somehost.someplace.com 2.6.18-194.26.1.el5 #1 SMP Tue Nov 9
> > 12:54:40 EST 2010 i686 i686 i386 GNU/Linux
> >
> > # uptime && lsmod | grep mega
> > 20:27:24 up 2 min, 1 user, load average: 0.65, 0.32, 0.12
> > megaraid_sas 50236 2
> > scsi_mod 141973 12
> >
>
mptctl,scsi_dh,sr_mod,sg,megaraid_sas,mptspi,scsi_transport_spi,libata,
> > m
> > ptsas,mptscsih,scsi_transport_sas,sd_mod
> >
> > # dmesg | grep mega
> > megasas: 00.00.04.31.2 Thur July 08 14:13:02 EST 2010
> >
> > # omreport storage controller
> > Controller SAS 6/iR Integrated (Embedded)
> >
> > Controllers
> > ID : 0
> > Status : Non-Critical
> > Name : SAS 6/iR Integrated
> > Slot ID : Embedded
> > State : Degraded
> > Firmware Version :
> 00.25.47.00.06.22.03.00
> > Minimum Required Firmware Version : Not Applicable
> > Driver Version : 3.04.13rh
> > Minimum Required Driver Version : 3.12.29.00
> > Storport Driver Version : Not Applicable
> > Minimum Required Storport Driver Version : Not Applicable
> > Number of Connectors : 1
> > Rebuild Rate : Not Applicable
> > BGI Rate : Not Applicable
> > Check Consistency Rate : Not Applicable
> > Reconstruct Rate : Not Applicable
> > Alarm State : Not Applicable
> > Cluster Mode : Not Applicable
> > SCSI Initiator ID : Not Applicable
> > Cache Memory Size : Not Applicable
> > Patrol Read Mode : Not Applicable
> > Patrol Read State : Not Applicable
> > Patrol Read Rate : Not Applicable
> > Patrol Read Iterations : Not Applicable
> > Abort Check Consistency on Error : Not Applicable
> > Allow Revertible Hot Spare and Replace Member : Not Applicable
> > Load Balance : Not Applicable
> > Auto Replace Member on Predictive Failure : Not Applicable
> > Redundant Path view : Not Applicable
> > CacheCade Capable : Not Applicable
> > Persistent Hot Spare : Not Applicable
> > Encryption Capable : Not Applicable
> > Encryption Key Present : Not Applicable
> > Encryption Mode : Not Applicable
> > Spin Down Unconfigured Drives : Not Applicable
> > Spin Down Hot Spares : Not Applicable
> >
> > # uname -r
> > 2.6.18-194.26.1.el5
> >
> > # modinfo megaraid_sas
> > filename: /lib/modules/2.6.18-
> 194.26.1.el5/extra/megaraid_sas.ko
> > description: LSI MegaRAID SAS Driver
> > author: megaraidlinux at lsi.com
> > version: 00.00.04.31.2
> > license: GPL
> > srcversion: 26F6E879192F1EF88B1FB53
> > alias: pci:v00001028d00000015sv*sd*bc*sc*i*
> > alias: pci:v00001000d00000413sv*sd*bc*sc*i*
> > alias: pci:v00001000d00000071sv*sd*bc*sc*i*
> > alias: pci:v00001000d00000073sv*sd*bc*sc*i*
> > alias: pci:v00001000d00000079sv*sd*bc*sc*i*
> > alias: pci:v00001000d00000078sv*sd*bc*sc*i*
> > alias: pci:v00001000d0000007Csv*sd*bc*sc*i*
> > alias: pci:v00001000d00000060sv*sd*bc*sc*i*
> > alias: pci:v00001000d00000411sv*sd*bc*sc*i*
> > depends: scsi_mod
> > vermagic: 2.6.18-194.26.1.el5 SMP mod_unload 686 REGPARM
> 4KSTACKS
> > gcc-4.1
> > parm: poll_mode_io:Complete cmds from IO path, (default=0)
> > (int)
> > parm: max_sectors:Maximum number of sectors per IO command
> > (int)
> >
> > # here's after I unpacked my initrd and looked at it.
> > # modinfo ./lib/megaraid_sas.ko
> > filename: ./lib/megaraid_sas.ko
> > description: LSI MegaRAID SAS Driver
> > author: megaraidlinux at lsi.com
> > version: 00.00.04.31.2
> > license: GPL
> > srcversion: 26F6E879192F1EF88B1FB53
> > alias: pci:v00001028d00000015sv*sd*bc*sc*i*
> > alias: pci:v00001000d00000413sv*sd*bc*sc*i*
> > alias: pci:v00001000d00000071sv*sd*bc*sc*i*
> > alias: pci:v00001000d00000073sv*sd*bc*sc*i*
> > alias: pci:v00001000d00000079sv*sd*bc*sc*i*
> > alias: pci:v00001000d00000078sv*sd*bc*sc*i*
> > alias: pci:v00001000d0000007Csv*sd*bc*sc*i*
> > alias: pci:v00001000d00000060sv*sd*bc*sc*i*
> > alias: pci:v00001000d00000411sv*sd*bc*sc*i*
> > depends: scsi_mod
> > vermagic: 2.6.18-194.26.1.el5 SMP mod_unload 686 REGPARM
> 4KSTACKS
> > gcc-4.1
> > parm: poll_mode_io:Complete cmds from IO path, (default=0)
> > (int)
> > parm: max_sectors:Maximum number of sectors per IO command
> > (int)
> >
> > _______________________________________________
> > Linux-PowerEdge mailing list
> > Linux-PowerEdge at dell.com
> > https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> > Please read the FAQ at http://lists.us.dell.com/faq
From Spike_White at Dell.com Tue Jan 4 16:56:57 2011
From: Spike_White at Dell.com (Spike_White at Dell.com)
Date: Tue, 4 Jan 2011 16:56:57 -0600
Subject: Proper RHEL OS version/ixgbe driver version that allows Dell SFP+'s
Message-ID: <0A6B7C731722FD46B7FDC5CBF3A2FF509F3CF39A@AUSX7MCPC102.AMER.DELL.COM>
All,
We are try to configure Intel 10 GBe NICs in several Dell PowerEdge R710's. They came w/o SFP's, so we slapped in some Dell SFP's that we had. This is the 2.6.18 kernel of RHEL 5.4. (But we could reimage to RHEL5.5 if needed).
We get link, no problem. But we get the following error during ixgbe driver initialization:
# dmesg | grep -i ixgbe
ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 2.0.44-k2
ixgbe: Copyright (c) 1999-2010 Intel Corporation.
ixgbe 0000:0c:00.0: failed to initialize because an unsupported SFP+ module type was detected.
ixgbe 0000:0c:00.1: failed to initialize because an unsupported SFP+ module type was detected.
ixgbe: 0000:0e:00.0: ixgbe_init_interrupt_scheme: Multiqueue Enabled: Rx Queue count = 4, Tx Queue count = 4
ixgbe 0000:0e:00.0: (PCI Express:2.5Gb/s:Width x8) 00:1b:21:0e:c0:56
ixgbe 0000:0e:00.0: MAC: 1, PHY: 3, PBA No: d99083-006
ixgbe 0000:0e:00.0: Intel(R) 10 Gigabit Network Connection
Interestingly, we get these errors only on these newer Intel 82599 chipset NICs.
The older Intel 82598 chipset NICs worked fine. (But we're out of those). In fact, that's why we thought of using these SFP+'s, they work in the previous Intel 10 GBe NICs, no problem.
Specifically we're trying to install a Dell model FTLX8571D3BCL SFP+ module. (which I think is merely a re-branded Finistar SFP+ module).
In the ixgbe readme (http://downloadmirror.intel.com/14687/eng/README.txt ), it lists under tested & supported 3rd party SFP+ modules:
The following is a list of 3rd party SFP+ modules that have received some
testing. Not all modules are applicable to all devices.
Supplier Type Part Numbers
Finisar SFP+ SR bailed, 10g single rate FTLX8571D3BCL
Several questions:
1. I believe Dell just re-brands these SFP+ modules, that they're really made by Finistar. True?
2. If so, I'm guessing the ixgbe driver is getting confused by the PCI signature, true?
3. Is this fixed in RHEL5.5? Or maybe a later ixgbe driver version?
Spike
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110104/fd3c2254/attachment-0001.htm
From noc at defenderhosting.com Tue Jan 4 20:11:08 2011
From: noc at defenderhosting.com (Defender NOC)
Date: Tue, 4 Jan 2011 21:11:08 -0500 (EST)
Subject: 146GB drive pretenting to be a 300GB
In-Reply-To: <2121948662.125098.1294193310972.JavaMail.root@mail.dtgmail.com>
Message-ID: <112137805.125102.1294193468470.JavaMail.root@mail.dtgmail.com>
Hi Everyone-
We just purchased a new R510 and while i've had a lot of experience with 10g and 11g systems other than the R510 ( mostly R610 and R710 ) i've yet to see this one before. We purchased the system with 12 x 146gb 15k SAS drives in them which all showed as such in the BIOS. Upon loading OMSA on the system I get the below results however looking at the drive itself:
ID 0:0:11
Status OK
Name Physical Disk 0:0:11
State Online
Power Status Spun Up
Bus Protocol SAS
Media HDD
Revision EH02
Failure Predicted No
Certified Yes
Capacity 136.12GB
Used RAID Disk Space 136.12GB
Available RAID Disk Space 0.00GB
Hot Spare No
Vendor ID DELL(tm)
Product ID ST3300657SS-H
Serial No. 3SJ2QJD8
Part Number SG01DKVF125310AU01ALA00
Negotiated Speed 3.00 Gbps
Capable Speed 3.00 Gbps
Manufacture Day 02
Manufacture Week 44
Manufacture Year 2010
SAS Address 5000C50028AA2161
The Seagate part number listed belongs to the below drive:
http://www.seagate.com/ww/v/index.jsp?locale=en-US&vgnextoid=6664470bd8cc1210VgnVCM1000001a48090aRCRD
which is a 300GB drive, not a 146GB drive. Not only that, but for some reason it's running at 3Gb/s instead of the 6Gb/s rating on Seagate's site ( not that this matters as I won't get that performance out of it, but noting it anyway since its another anomaly ). The -H at the end is also unusual for a Seagate part number. I've never seen it on any of their drives.
Attempting to update the firmware with the listed ES62 update on Dell's site results in the below:
sh FRMW_LX_R245074.BIN
Collecting inventory...
...
Running validation...
This Update Package is not compatible with your system configuration.
-----------------------
This is even after changing /etc/redhat-release so the packages play nice ( as indicated in http://lists.us.dell.com/pipermail/linux-poweredge/2010-December/043864.html )
Is there some sort of firmware limiting going on here that is dropping the drive down to 3Gb/s and 146GB?
Jon Wolberg
Systems Engineer
Virtacore Systems Inc.
"We Virtualize IT!"
From Yen_Onn_Hiu at dell.com Tue Jan 4 20:28:35 2011
From: Yen_Onn_Hiu at dell.com (Yen_Onn_Hiu at dell.com)
Date: Wed, 5 Jan 2011 10:28:35 +0800
Subject: storage controller OM6.4 is showing degraded
Message-ID: <02B75F9811B5BA4497B032FA7AC381AF03036617FA@PENX7MCDC102.APAC.DELL.COM>
Hi all,
I have a blade server, M600 that installed with OM6.4 recently. And I noticed that the storage controller is showing "Degraded" state. May I know what is the problem with this?
[root at labtest11 ~]# omreport about
Product name : Server Administrator
Version : 6.4.0
Copyright : Copyright (C) Dell Inc. 1995-2010 All rights reserved.
Company : Dell Inc.
[root at labtest11 ~]# uname -r
2.6.18-194.el5
[root at labtest11 ~]# dmidecode | grep Product
Product Name: PowerEdge M600
[root at labtest11 ~]# omreport storage controller
Controller CERC 6/i Integrated (Embedded)
Controllers
ID : 0
Status : Non-Critical
Name : CERC 6/i Integrated
Slot ID : Embedded
State : Degraded
Firmware Version : 6.2.0-0013
Minimum Required Firmware Version : 6.3.0-0001
Driver Version : 00.00.04.17-RH1
Minimum Required Driver Version : Not Applicable
Storport Driver Version : Not Applicable
Minimum Required Storport Driver Version : Not Applicable
Number of Connectors : 1
Rebuild Rate : 30%
BGI Rate : 30%
Check Consistency Rate : 30%
Reconstruct Rate : 30%
Alarm State : Not Applicable
Cluster Mode : Not Applicable
SCSI Initiator ID : Not Applicable
Cache Memory Size : 128 MB
Patrol Read Mode : Auto
Patrol Read State : Stopped
Patrol Read Rate : 30%
Patrol Read Iterations : 86
Abort Check Consistency on Error : Disabled
Allow Revertible Hot Spare and Replace Member : Enabled
Load Balance : Not Applicable
Auto Replace Member on Predictive Failure : Disabled
Redundant Path view : Not Applicable
CacheCade Capable : Not Applicable
Persistent Hot Spare : Not Applicable
Encryption Capable : Not Applicable
Encryption Key Present : Not Applicable
Encryption Mode : Not Applicable
Spin Down Unconfigured Drives : Not Applicable
Spin Down Hot Spares : Not Applicable
[root at penlabbldtest11 ~]# modinfo megaraid
filename: /lib/modules/2.6.18-194.el5/kernel/drivers/scsi/megaraid.ko
version: 2.00.4
license: GPL
description: LSI Logic MegaRAID legacy driver
author: sju at lsil.com
srcversion: 9D37C018C9932B2293C8E97
alias: pci:v00008086d00001960sv*sd*bc*sc*i*
alias: pci:v0000101Ed00009060sv*sd*bc*sc*i*
alias: pci:v0000101Ed00009010sv*sd*bc*sc*i*
depends: scsi_mod
vermagic: 2.6.18-194.el5 SMP mod_unload gcc-4.1
parm: max_cmd_per_lun:Maximum number of commands which can be issued to a single LUN (default=DEF_CMD_PER_LUN=63) (uint)
parm: max_sectors_per_io:Maximum number of sectors per I/O request (default=MAX_SECTORS_PER_IO=128) (ushort)
parm: max_mbox_busy_wait:Maximum wait for mailbox in microseconds if busy (default=MBOX_BUSY_WAIT=10) (ushort)
module_sig: 883f3504bb15f8660a06a7eaf1ea8d2112e9bd09f55467fd975f51b96c7032a06d532b709d9c694209d164da27ad1451ac6396e994ead475402e517c8c
[root at penlabbldtest11 ~]# modinfo megaraid_sas
filename: /lib/modules/2.6.18-194.el5/kernel/drivers/scsi/megaraid/megaraid_sas.ko
description: LSI MegaRAID SAS Driver
author: megaraidlinux at lsi.com
version: 00.00.04.17-RH1
license: GPL
srcversion: 04AF5F5C6BA1B7EFD29FB99
alias: pci:v00001028d00000015sv*sd*bc*sc*i*
alias: pci:v00001000d00000413sv*sd*bc*sc*i*
alias: pci:v00001000d0000007Csv*sd*bc*sc*i*
alias: pci:v00001000d00000071sv*sd*bc*sc*i*
alias: pci:v00001000d00000073sv*sd*bc*sc*i*
alias: pci:v00001000d00000079sv*sd*bc*sc*i*
alias: pci:v00001000d00000078sv*sd*bc*sc*i*
alias: pci:v00001000d00000060sv*sd*bc*sc*i*
alias: pci:v00001000d00000411sv*sd*bc*sc*i*
depends: scsi_mod
vermagic: 2.6.18-194.el5 SMP mod_unload gcc-4.1
parm: poll_mode_io:Complete cmds from IO path, (default=0) (int)
module_sig: 883f3504bb15f8760a06a7eaf1ea8d211253ef09f7e467ed74624ee164e4ec14bb85f718c8e4f9be09e25802cc1435c0ba11eb2cd2cf8b7942e5789620
[root at penlabbldtest11 ~]# omreport about
Product name : Server Administrator
Version : 6.4.0
Copyright : Copyright (C) Dell Inc. 1995-2010 All rights reserved.
Company : Dell Inc.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110104/9ecc9d38/attachment-0001.htm
From wls at romanus.ca Tue Jan 4 22:46:10 2011
From: wls at romanus.ca (Winston Sorfleet)
Date: Tue, 4 Jan 2011 23:46:10 -0500
Subject: Debian on PE2500 - OMSA or AM available?
Message-ID: <532D6AB2BDA94740B729D2C0AF2ED1E6@caesar>
On Thu, December 16, 2010 21:07, Igor Cicimov wrote:
> Hi all,
>
> I have a SCSI disk failure on my RAID5 and the server flashing light is
> on as well as the flashing light on the disk it self. I would like to
> install OMSA or Array Manager for troubleshooting but since I have Debian
> installed on the server wonder if there is any OMSA or AM deb repository
> available somewhere?
>
Maybe looking for this?
http://hwraid.le-vert.net/wiki/DebianPackages
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110104/cdf31736/attachment.htm
From david.ribeiro76 at gmail.com Wed Jan 5 02:03:07 2011
From: david.ribeiro76 at gmail.com (David RIBEIRO)
Date: Wed, 5 Jan 2011 09:03:07 +0100
Subject: Debian on PE2500 - OMSA or AM available?
In-Reply-To: <532D6AB2BDA94740B729D2C0AF2ED1E6@caesar>
References: <532D6AB2BDA94740B729D2C0AF2ED1E6@caesar>
Message-ID:
Hello you can install dellomsa from sara repository, i give you a link, its
function very on ubuntu server 10.04.1 x32 and i have a problem in my x64
version of ubuntu server
http://www.learnosity.com/techblog/index.cfm/2009/8/4/Installing-Dell-OpenManage-Server-Administrator-on-Ubuntu-32bit
Good Luck ;)
2011/1/5 Winston Sorfleet
> On Thu, December 16, 2010 21:07, Igor Cicimov wrote:
> >* Hi all,*>**>* I have a SCSI disk failure on my RAID5 and the server flashing light is*>* on as well as the flashing light on the disk it self. I would like to*>* install OMSA or Array Manager for troubleshooting but since I have Debian*>* installed on the server wonder if there is any OMSA or AM deb repository*>* available somewhere?*>**
>
>
> Maybe looking for this?
>
> http://hwraid.le-vert.net/wiki/DebianPackages
>
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110105/81d71205/attachment.htm
From Vaibhav_Kumar at Dell.com Wed Jan 5 02:42:09 2011
From: Vaibhav_Kumar at Dell.com (Vaibhav_Kumar at Dell.com)
Date: Wed, 5 Jan 2011 00:42:09 -0800
Subject: storage controller OM6.4 is showing degraded
In-Reply-To: <02B75F9811B5BA4497B032FA7AC381AF03036617FA@PENX7MCDC102.APAC.DELL.COM>
References: <02B75F9811B5BA4497B032FA7AC381AF03036617FA@PENX7MCDC102.APAC.DELL.COM>
Message-ID: <46F6103325A0C04E99DF2ECDA75808031D54E76ED1@BLRX7MCDC202.AMER.DELL.COM>
Firmware version you have currently seems to be the problem as it is less than the minimum required Firmware version. Upgrade the Firmware to latest available.
Regards
-Vaibhav
From: linux-poweredge-bounces-Lists On Behalf Of Hiu, Yen Onn
Sent: Wednesday, January 05, 2011 7:59 AM
To: linux-poweredge-Lists
Subject: storage controller OM6.4 is showing degraded
Hi all,
I have a blade server, M600 that installed with OM6.4 recently. And I noticed that the storage controller is showing "Degraded" state. May I know what is the problem with this?
[root at labtest11 ~]# omreport about
Product name : Server Administrator
Version : 6.4.0
Copyright : Copyright (C) Dell Inc. 1995-2010 All rights reserved.
Company : Dell Inc.
[root at labtest11 ~]# uname -r
2.6.18-194.el5
[root at labtest11 ~]# dmidecode | grep Product
Product Name: PowerEdge M600
[root at labtest11 ~]# omreport storage controller
Controller CERC 6/i Integrated (Embedded)
Controllers
ID : 0
Status : Non-Critical
Name : CERC 6/i Integrated
Slot ID : Embedded
State : Degraded
Firmware Version : 6.2.0-0013
Minimum Required Firmware Version : 6.3.0-0001
Driver Version : 00.00.04.17-RH1
Minimum Required Driver Version : Not Applicable
Storport Driver Version : Not Applicable
Minimum Required Storport Driver Version : Not Applicable
Number of Connectors : 1
Rebuild Rate : 30%
BGI Rate : 30%
Check Consistency Rate : 30%
Reconstruct Rate : 30%
Alarm State : Not Applicable
Cluster Mode : Not Applicable
SCSI Initiator ID : Not Applicable
Cache Memory Size : 128 MB
Patrol Read Mode : Auto
Patrol Read State : Stopped
Patrol Read Rate : 30%
Patrol Read Iterations : 86
Abort Check Consistency on Error : Disabled
Allow Revertible Hot Spare and Replace Member : Enabled
Load Balance : Not Applicable
Auto Replace Member on Predictive Failure : Disabled
Redundant Path view : Not Applicable
CacheCade Capable : Not Applicable
Persistent Hot Spare : Not Applicable
Encryption Capable : Not Applicable
Encryption Key Present : Not Applicable
Encryption Mode : Not Applicable
Spin Down Unconfigured Drives : Not Applicable
Spin Down Hot Spares : Not Applicable
[root at penlabbldtest11 ~]# modinfo megaraid
filename: /lib/modules/2.6.18-194.el5/kernel/drivers/scsi/megaraid.ko
version: 2.00.4
license: GPL
description: LSI Logic MegaRAID legacy driver
author: sju at lsil.com
srcversion: 9D37C018C9932B2293C8E97
alias: pci:v00008086d00001960sv*sd*bc*sc*i*
alias: pci:v0000101Ed00009060sv*sd*bc*sc*i*
alias: pci:v0000101Ed00009010sv*sd*bc*sc*i*
depends: scsi_mod
vermagic: 2.6.18-194.el5 SMP mod_unload gcc-4.1
parm: max_cmd_per_lun:Maximum number of commands which can be issued to a single LUN (default=DEF_CMD_PER_LUN=63) (uint)
parm: max_sectors_per_io:Maximum number of sectors per I/O request (default=MAX_SECTORS_PER_IO=128) (ushort)
parm: max_mbox_busy_wait:Maximum wait for mailbox in microseconds if busy (default=MBOX_BUSY_WAIT=10) (ushort)
module_sig: 883f3504bb15f8660a06a7eaf1ea8d2112e9bd09f55467fd975f51b96c7032a06d532b709d9c694209d164da27ad1451ac6396e994ead475402e517c8c
[root at penlabbldtest11 ~]# modinfo megaraid_sas
filename: /lib/modules/2.6.18-194.el5/kernel/drivers/scsi/megaraid/megaraid_sas.ko
description: LSI MegaRAID SAS Driver
author: megaraidlinux at lsi.com
version: 00.00.04.17-RH1
license: GPL
srcversion: 04AF5F5C6BA1B7EFD29FB99
alias: pci:v00001028d00000015sv*sd*bc*sc*i*
alias: pci:v00001000d00000413sv*sd*bc*sc*i*
alias: pci:v00001000d0000007Csv*sd*bc*sc*i*
alias: pci:v00001000d00000071sv*sd*bc*sc*i*
alias: pci:v00001000d00000073sv*sd*bc*sc*i*
alias: pci:v00001000d00000079sv*sd*bc*sc*i*
alias: pci:v00001000d00000078sv*sd*bc*sc*i*
alias: pci:v00001000d00000060sv*sd*bc*sc*i*
alias: pci:v00001000d00000411sv*sd*bc*sc*i*
depends: scsi_mod
vermagic: 2.6.18-194.el5 SMP mod_unload gcc-4.1
parm: poll_mode_io:Complete cmds from IO path, (default=0) (int)
module_sig: 883f3504bb15f8760a06a7eaf1ea8d211253ef09f7e467ed74624ee164e4ec14bb85f718c8e4f9be09e25802cc1435c0ba11eb2cd2cf8b7942e5789620
[root at penlabbldtest11 ~]# omreport about
Product name : Server Administrator
Version : 6.4.0
Copyright : Copyright (C) Dell Inc. 1995-2010 All rights reserved.
Company : Dell Inc.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110105/ad9ad5cf/attachment-0001.htm
From linux-poweredge.lists at tisc.de Wed Jan 5 03:59:20 2011
From: linux-poweredge.lists at tisc.de (Tino Schwarze)
Date: Wed, 5 Jan 2011 10:59:20 +0100
Subject: 146GB drive pretenting to be a 300GB
In-Reply-To: <112137805.125102.1294193468470.JavaMail.root@mail.dtgmail.com>
References: <2121948662.125098.1294193310972.JavaMail.root@mail.dtgmail.com>
<112137805.125102.1294193468470.JavaMail.root@mail.dtgmail.com>
Message-ID: <20110105095920.GC26493@easy5.in-chemnitz.de>
Hi Jon,
On Tue, Jan 04, 2011 at 09:11:08PM -0500, Defender NOC wrote:
> We just purchased a new R510 and while i've had a lot of experience
> with 10g and 11g systems other than the R510 ( mostly R610 and R710 )
> i've yet to see this one before. We purchased the system with 12 x
> 146gb 15k SAS drives in them which all showed as such in the BIOS.
> Upon loading OMSA on the system I get the below results however
> looking at the drive itself:
We've recently purchases an R515 with the same drives in it.
> ID 0:0:11
> Status OK
> Name Physical Disk 0:0:11
> State Online
> Power Status Spun Up
> Bus Protocol SAS
> Media HDD
> Revision EH02
> Failure Predicted No
> Certified Yes
> Capacity 136.12GB
> Used RAID Disk Space 136.12GB
> Available RAID Disk Space 0.00GB
> Hot Spare No
> Vendor ID DELL(tm)
> Product ID ST3300657SS-H
> Serial No. 3SJ2QJD8
> Part Number SG01DKVF125310AU01ALA00
> Negotiated Speed 3.00 Gbps
> Capable Speed 3.00 Gbps
> Manufacture Day 02
> Manufacture Week 44
> Manufacture Year 2010
> SAS Address 5000C50028AA2161
I'm seeing similar output:
Bus Protocol : SAS
Media : HDD
Revision : EH02
Certified : Yes
Capacity : 136.13 GB (146163105792 bytes)
Vendor ID : DELL(tm)
Product ID : ST3300657SS-H
Serial No. : 3SJ2QGQY
Negotiated Speed : 3.00 Gbps
Capable Speed : 3.00 Gbps
Manufacture Day : 02
Manufacture Week : 44
Manufacture Year : 2010
SAS Address : 5000C50028AA638D
> The Seagate part number listed belongs to the below drive:
>
> http://www.seagate.com/ww/v/index.jsp?locale=en-US&vgnextoid=6664470bd8cc1210VgnVCM1000001a48090aRCRD
>
> which is a 300GB drive, not a 146GB drive. Not only that, but for
> some reason it's running at 3Gb/s instead of the 6Gb/s rating on
> Seagate's site ( not that this matters as I won't get that performance
> out of it, but noting it anyway since its another anomaly ). The -H
> at the end is also unusual for a Seagate part number. I've never seen
> it on any of their drives.
I guess -H means "half capacity"?
> Attempting to update the firmware with the listed ES62 update on
> Dell's site results in the below:
>
>
> sh FRMW_LX_R245074.BIN
> Collecting inventory...
> ...
> Running validation...
>
> This Update Package is not compatible with your system configuration.
I'm seeing a similar message when running update_firmware:
Checking ST3300657SS-H Firmware - eh02
Did not find a newer package to install that meets all installation
checks.
> Is there some sort of firmware limiting going on here that is dropping
> the drive down to 3Gb/s and 146GB?
Looks like it is. Maybe Dell will sell Upgrade Packages one day? :-|
I'll update the H700 BIOS, then take a look on the reported speed again.
Thanks for pointing that out,
Tino.
--
"What we nourish flourishes." - "Was wir n?hren erbl?ht."
www.tisc.de
From cupertino at gmx.net Wed Jan 5 05:34:07 2011
From: cupertino at gmx.net (cupertino at gmx.net)
Date: Wed, 05 Jan 2011 12:34:07 +0100
Subject: Nautilus SAS/SATA firmware update disks: Inconsistencies and
Quality Control and missing f/w images
In-Reply-To: <4D222258.1070301@ucar.edu>
References: <4D222258.1070301@ucar.edu>
Message-ID: <20110105113407.147540@gmx.net>
As far as I know the Seagate S52x FW are for DUP enabled HDDs only, the S51x are for regular ones.
There are Dell hard drives out there with same Seagate STxxxxxxx numbers, but different FWs (those for DUP enabled and those w/o DUP). You can't mix FWs btw. those drives.
The most recent FW for DUP enabled drives would then be S52C and for regular drives S51A. You can go higher then S51A on those.
Correct me if I'm wrong.
-------- Original-Nachricht --------
> Datum: Mon, 03 Jan 2011 12:24:08 -0700
> Von: Stephen Dowdy
> An: Dell Poweredge list server
> Betreff: *** GMX Spamverdacht *** Nautilus SAS/SATA firmware update disks: Inconsistencies and Quality Control and missing f/w images
> Anyone else have issues with the Nautilus offline Disk firmware
> update utilities? (specifically i'm referring to A28, the latest)
>
> The Readme file has several errors and inconsistencies, so i
> can't be sure if things aren't working as expected because the
> documentation is wrong or because of something else. (See bottom of
> message for examples, i don't want to clutter my core issue)
>
> Specifically, i have a problem with updating ST3300655SS...
> There's an *URGENT* update for f/w S52C for these disks i want
> to apply.
>
>
> The Nautilus A28 shows the following firmware updates as documented:
>
> $ grep ST3300655SS *.txt
> R288929.txt:17.0 Seagate SAS 3.5", offline download version S51A, for
> drive models ST3300655SS, ST3146855SS and ST373455SS
> R288929.txt:45.0 Seagate 15k5 SAS, 3.5" version S52C for drive models
> ST3300655SS, ST3146855SS and ST373455SS.
> R288929.txt:Seagate 15k5 SAS, 3.5" version S52C for drive models
> ST3300655SS, ST3146855SS and ST373455SS.
>
>
> So, S52C deprecates S51A later on. However, the system i just
> updated had f/w S517 and....
>
> Nautilus' interactive update mode showed
> Disk XXX ST3300655SS firmware S519 != S517 (or something like that).
>
> Okay, so there's another firmware image on the ISO for this drive
> that isn't either version mentioned in the Release Notes, and it is
> LOWER than the either version mentioned.
>
> Fine, do the update. Then Nautilus reports:
> Success: Disk XXX ST3300655SS firmware S51A = S51A (or something like
> that).
>
> So, it appears that Nautilus didn't update to the version it first
> reported,
> but a version slightly higher (one mentioned earlier in the release
> notes).
> But what about S52C, the one it SHOULD be applying?
>
>
> $ grep -Rail S519 .
> ./sas/seagate/15k5/S519.fwh
> ./sas/seagate/15k5/S51a.fwh
>
> $ cd ./sas/seagate/15k5/
> $ ls
> 15dps515.lod 15fps515.lod S517.fwh S519.fwh S51a.fwh S527.fwh
> S52C.fwh
>
> As i understand it, there's a 256byte header on the FWH files Dell adds:
>
> sdowdy at zia:/d2/VBOX_Share/SAS_SATA_A28/files/fw/sas/seagate/15k5$ od -a -N
> 256 S52C.fwh
> 0000000 sp sp sp sp D E L L ht S 5 2 C S 5 2
> 0000020 7 x soh nul nul nul nul nul nul nul nul nul nul nul nul nul
> 0000040 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
> 0000060 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul etx
> 0000100 sp sp sp 1 6 1 1 2 sp sp sp sp sp sp sp sp
> 0000120 sp sp sp sp sp sp sp sp sp sp sp sp sp sp sp sp
> 0000140 sp sp sp sp sp S T 3 3 0 0 6 5 5 S S
> 0000160 sp sp sp 1 6 1 1 1 sp sp sp sp sp sp sp sp
> 0000200 sp sp sp sp sp sp sp sp sp sp sp sp sp sp sp sp
> 0000220 sp sp sp sp sp S T 3 1 4 6 8 5 5 S S
> 0000240 sp sp sp 1 6 1 1 3 sp sp sp sp sp sp sp sp
> 0000260 sp sp sp sp sp sp sp sp sp sp sp sp sp sp sp sp
> 0000300 sp sp sp sp sp sp S T 3 7 3 4 5 5 S S
> 0000320 nul nul nul nul nul nul nul nul nul nul nul nul nul nul bel nul
> 0000340 nul soh nul nul nul nul nul nul nul nul nul nul nul nul nul nul
> 0000360 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
> 0000400
>
> I don't know the structure of this header, but i'm guessing that maybe
> this is saying S52C can only be applied to disks currently using S527
> and higher?
>
> Okay, check S527 Firmware header file...
> 0000000 sp sp sp sp D E L L ht S 5 2 7 S 5 2
> 0000020 0 nul soh nul nul nul nul nul nul nul nul nul nul nul nul nul
>
> Looks like that can only be applied to disks running S520 or higher,
> but there are NO firmware images in that directory that contain the
> fw version S520.
>
> sdowdy at zia:/d2/VBOX_Share/SAS_SATA_A28/files/fw/sas/seagate/15k5$ od -a -N
> 256 S51a.fwh
> 0000000 sp sp sp sp D E L L ht S 5 1 A S 5 1
> 0000020 9 x soh nul nul nul nul nul nul nul nul nul nul nul nul nul
> ...
>
> sdowdy at zia:/d2/VBOX_Share/SAS_SATA_A28/files/fw/sas/seagate/15k5$ od -a -N
> 256 S519.fwh
> 0000000 sp sp sp sp D E L L ht S 5 1 9 S 5 1
> 0000020 5 x soh nul nul nul nul nul nul nul nul nul nul nul nul nul
>
> (confirms that S51A requires S519 and S519 requires S515. So, i think
> my suspicion that Nautilus only reports the first match it finds, but
> iteratively does firmware updates that it CAN do, is accurate)
>
>
> So, as best i can guess, there is a hole here in getting the drive
> updated to the URGENT firmware release because no S520.fwh exists in
> the Nautilus 'fw' directories.
>
> Can someone at Dell (or elsewhere) confirm that my suspicions are
> correct or not? (that a FWH file is simply missing here?)
>
> And i know i'm hoping for too much, but a description of the Header
> format would be awesome, because I don't know of any Dell ONLINE
> tools/resources to identify disks with out of date firmware, so
> i'm maintaining my own script based upon what i can glean from the
> Nautilus release notes and files/fw/... files. E.G.
>
> --------------------------------------------------------------------
> # check-diskfw
>
> This system has one or more disks that MAY require a firmware update.
>
> DISK ST3300655SS firmware S517 is not S52C [Ref:Nautilus SAS/SATA Release
> A28 45.0 (urgent)]
> DISK WDCWD5002ABYS-1 firmware 3B04 is not 3B05 [Ref:Nautilus SAS/SATA
> Release A28 40.0 (recommended)]
>
> # DEBUG=1 /raid/check-diskfw
> ...
> AWKDEBUG: MD1000,A.04,MD1000 RAID Enclosure Version should be A.04 or
> higher
> AWKDEBUG: NO MATCH FOR HDS725050KLA360-AB5A
> AWKDEBUG: ST3300655SS,S515,Nautilus SAS/SATA Release A07
> AWKDEBUG: ST3300655SS,S51A,Nautilus SAS/SATA Release A28 45.0 (urgent) --
> sdowdy: this is the version A28 installed, not S52C
> AWKDEBUG:
> ST3300655SS,S528,http://ftp.us.dell.com/sas-hdd/FRMW_LX_R189919.BIN
> AWKDEBUG: ST3300655SS,S52C,Nautilus SAS/SATA Release A28 45.0 (urgent)
> AWKDEBUG: WDCWD5002ABYS-1,3B05,Nautilus SAS/SATA Release A28 40.0
> (recommended)
> DISK ST3300655SS firmware S517 is not S515 [Ref:Nautilus SAS/SATA Release
> A07]
> DISK ST3300655SS firmware S517 is not S51A [Ref:Nautilus SAS/SATA Release
> A28 45.0 (urgent) -- sdowdy: this is the version A28 installed]
> DISK ST3300655SS firmware S517 is not S528
> [Ref:http://ftp.us.dell.com/sas-hdd/FRMW_LX_R189919.BIN]
> DISK ST3300655SS firmware S517 is not S52C [Ref:Nautilus SAS/SATA Release
> A28 45.0 (urgent)]
> DISK WDCWD5002ABYS-1 firmware 3B04 is not 3B05 [Ref:Nautilus SAS/SATA
> Release A28 40.0 (recommended)]
> --------------------------------------------------------------------
>
>
> If there is such a tool/resource, i would much appreciate a pointer.
> Otherwise, if anyone wants the shell-script *AS-IS*, i can post it.
> (it uses /proc/scsi/scsi and the output from megarc and megacli if
> they can be found along with a big nasty inline awk table) Problem
> is it requires continual manual updates/tweaks/presumptions.
>
> Of course it would be great if Dell could manage the Nautilus
> distribution in a way that could be run online (linux, of course)
> for READ-ONLY verification so i wouldn't have to write my own tool.
>
>
> --------------------------------------------------------------------
> ---- Inconsistencies and errors in Nautilus A28 release notes ------
> --------------------------------------------------------------------
>
> For example, there are several cases of probable typo errors like:
> ----------------------------------------------------------------------
> ===================================================
> 25.0 Seagate ES 7.2k SATA 3.5", Verison MA0D - Recommended
> -----------------------------------^^^^^^^^^
> ===================================================
>
> * Firmware Version MA0D *
>
> Seagate ES 7.2k SATA 2.5", version MS0D for drive models ST3250310NS,
> ST3500320NS, ST31000340NS and ST31000340NS.
> -----------------------------------^^^^
> ----------------------------------------------------------------------
> (what is it, MA0D or MS0D?)
>
> And this series of updates:
> ----------------------------------------------------------------------
> $ grep ST3300555SS R288929.txt
> 26.0 Seagate T10 7.2K SAS 3.5", version T215, for drive models ST3300555SS
> (300GB), ST3146755SS (146GB) and ST373355SS (73GB).
> 52.0 Seagate SAS 2.5", 10K, offline download version T10D, for drive
> models ST3300555SS, ST3146755SS and ST373355SS.
> 53.0 Seagate T10 7.2K SAS 3.5", offline download version T210, for drive
> models ST3300555SS (300GB), ST3146755SS (146GB) and ST373355SS (73GB).
> ..
> ----------------------------------------------------------------------
>
> T215->T10D->T210
>
> Huh? Those don't increment lexicographically in my book.
> But the middle one does say 2.5", the others 3.5", but Seagate
> has always used different model designations for different
> sized drives:
> http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=204763
> and ST9 is for 2.5" drives, so ST3 shouldn't apply.
>
> Sure, this is nitpicking, but if the Release Notes are this badly
> QC'd, how do i know the firmware images are any better? Don't get
> me wrong, i'm willing to live with bad quality release notes if the
> updates work properly, anyway....
>
>
> thanks,
> --stephen
> --
> Stephen Dowdy - Systems Administrator - NCAR/RAL
> 303.497.2869 - sdowdy at ucar.edu -
> http://www.ral.ucar.edu/~sdowdy/
>
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
From cupertino at gmx.net Wed Jan 5 05:42:58 2011
From: cupertino at gmx.net (cupertino at gmx.net)
Date: Wed, 05 Jan 2011 12:42:58 +0100
Subject: Nautilus SAS/SATA firmware update disks: Inconsistencies and
Quality Control and missing f/w images
In-Reply-To: <20110105113407.147540@gmx.net>
References: <4D222258.1070301@ucar.edu> <20110105113407.147540@gmx.net>
Message-ID: <20110105114258.160570@gmx.net>
Sorry for the type-o.
You _can't_ update a HDD w/o DUP to a DUP enabled.
-------- Original-Nachricht --------
> Datum: Wed, 05 Jan 2011 12:34:07 +0100
> Von: cupertino at gmx.net
> An: Stephen Dowdy , linux-poweredge at dell.com
> Betreff: *** GMX Spamverdacht *** Re: Nautilus SAS/SATA firmware update disks: Inconsistencies and Quality Control and missing f/w images
> As far as I know the Seagate S52x FW are for DUP enabled HDDs only, the
> S51x are for regular ones.
> There are Dell hard drives out there with same Seagate STxxxxxxx numbers,
> but different FWs (those for DUP enabled and those w/o DUP). You can't mix
> FWs btw. those drives.
> The most recent FW for DUP enabled drives would then be S52C and for
> regular drives S51A. You can go higher then S51A on those.
> Correct me if I'm wrong.
>
>
> -------- Original-Nachricht --------
> > Datum: Mon, 03 Jan 2011 12:24:08 -0700
> > Von: Stephen Dowdy
> > An: Dell Poweredge list server
> > Betreff: *** GMX Spamverdacht *** Nautilus SAS/SATA firmware update
> disks: Inconsistencies and Quality Control and missing f/w images
>
> > Anyone else have issues with the Nautilus offline Disk firmware
> > update utilities? (specifically i'm referring to A28, the latest)
> >
> > The Readme file has several errors and inconsistencies, so i
> > can't be sure if things aren't working as expected because the
> > documentation is wrong or because of something else. (See bottom of
> > message for examples, i don't want to clutter my core issue)
> >
> > Specifically, i have a problem with updating ST3300655SS...
> > There's an *URGENT* update for f/w S52C for these disks i want
> > to apply.
> >
> >
> > The Nautilus A28 shows the following firmware updates as documented:
> >
> > $ grep ST3300655SS *.txt
> > R288929.txt:17.0 Seagate SAS 3.5", offline download version S51A, for
> > drive models ST3300655SS, ST3146855SS and ST373455SS
> > R288929.txt:45.0 Seagate 15k5 SAS, 3.5" version S52C for drive models
> > ST3300655SS, ST3146855SS and ST373455SS.
> > R288929.txt:Seagate 15k5 SAS, 3.5" version S52C for drive models
> > ST3300655SS, ST3146855SS and ST373455SS.
> >
> >
> > So, S52C deprecates S51A later on. However, the system i just
> > updated had f/w S517 and....
> >
> > Nautilus' interactive update mode showed
> > Disk XXX ST3300655SS firmware S519 != S517 (or something like that).
> >
> > Okay, so there's another firmware image on the ISO for this drive
> > that isn't either version mentioned in the Release Notes, and it is
> > LOWER than the either version mentioned.
> >
> > Fine, do the update. Then Nautilus reports:
> > Success: Disk XXX ST3300655SS firmware S51A = S51A (or something like
> > that).
> >
> > So, it appears that Nautilus didn't update to the version it first
> > reported,
> > but a version slightly higher (one mentioned earlier in the release
> > notes).
> > But what about S52C, the one it SHOULD be applying?
> >
> >
> > $ grep -Rail S519 .
> > ./sas/seagate/15k5/S519.fwh
> > ./sas/seagate/15k5/S51a.fwh
> >
> > $ cd ./sas/seagate/15k5/
> > $ ls
> > 15dps515.lod 15fps515.lod S517.fwh S519.fwh S51a.fwh S527.fwh
> > S52C.fwh
> >
> > As i understand it, there's a 256byte header on the FWH files Dell adds:
> >
> > sdowdy at zia:/d2/VBOX_Share/SAS_SATA_A28/files/fw/sas/seagate/15k5$ od -a
> -N
> > 256 S52C.fwh
> > 0000000 sp sp sp sp D E L L ht S 5 2 C S 5 2
> > 0000020 7 x soh nul nul nul nul nul nul nul nul nul nul nul nul nul
> > 0000040 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
> > 0000060 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul etx
> > 0000100 sp sp sp 1 6 1 1 2 sp sp sp sp sp sp sp sp
> > 0000120 sp sp sp sp sp sp sp sp sp sp sp sp sp sp sp sp
> > 0000140 sp sp sp sp sp S T 3 3 0 0 6 5 5 S S
> > 0000160 sp sp sp 1 6 1 1 1 sp sp sp sp sp sp sp sp
> > 0000200 sp sp sp sp sp sp sp sp sp sp sp sp sp sp sp sp
> > 0000220 sp sp sp sp sp S T 3 1 4 6 8 5 5 S S
> > 0000240 sp sp sp 1 6 1 1 3 sp sp sp sp sp sp sp sp
> > 0000260 sp sp sp sp sp sp sp sp sp sp sp sp sp sp sp sp
> > 0000300 sp sp sp sp sp sp S T 3 7 3 4 5 5 S S
> > 0000320 nul nul nul nul nul nul nul nul nul nul nul nul nul nul bel nul
> > 0000340 nul soh nul nul nul nul nul nul nul nul nul nul nul nul nul nul
> > 0000360 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
> > 0000400
> >
> > I don't know the structure of this header, but i'm guessing that maybe
> > this is saying S52C can only be applied to disks currently using S527
> > and higher?
> >
> > Okay, check S527 Firmware header file...
> > 0000000 sp sp sp sp D E L L ht S 5 2 7 S 5 2
> > 0000020 0 nul soh nul nul nul nul nul nul nul nul nul nul nul nul nul
> >
> > Looks like that can only be applied to disks running S520 or higher,
> > but there are NO firmware images in that directory that contain the
> > fw version S520.
> >
> > sdowdy at zia:/d2/VBOX_Share/SAS_SATA_A28/files/fw/sas/seagate/15k5$ od -a
> -N
> > 256 S51a.fwh
> > 0000000 sp sp sp sp D E L L ht S 5 1 A S 5 1
> > 0000020 9 x soh nul nul nul nul nul nul nul nul nul nul nul nul nul
> > ...
> >
> > sdowdy at zia:/d2/VBOX_Share/SAS_SATA_A28/files/fw/sas/seagate/15k5$ od -a
> -N
> > 256 S519.fwh
> > 0000000 sp sp sp sp D E L L ht S 5 1 9 S 5 1
> > 0000020 5 x soh nul nul nul nul nul nul nul nul nul nul nul nul nul
> >
> > (confirms that S51A requires S519 and S519 requires S515. So, i think
> > my suspicion that Nautilus only reports the first match it finds, but
> > iteratively does firmware updates that it CAN do, is accurate)
> >
> >
> > So, as best i can guess, there is a hole here in getting the drive
> > updated to the URGENT firmware release because no S520.fwh exists in
> > the Nautilus 'fw' directories.
> >
> > Can someone at Dell (or elsewhere) confirm that my suspicions are
> > correct or not? (that a FWH file is simply missing here?)
> >
> > And i know i'm hoping for too much, but a description of the Header
> > format would be awesome, because I don't know of any Dell ONLINE
> > tools/resources to identify disks with out of date firmware, so
> > i'm maintaining my own script based upon what i can glean from the
> > Nautilus release notes and files/fw/... files. E.G.
> >
> > --------------------------------------------------------------------
> > # check-diskfw
> >
> > This system has one or more disks that MAY require a firmware update.
> >
> > DISK ST3300655SS firmware S517 is not S52C [Ref:Nautilus SAS/SATA
> Release
> > A28 45.0 (urgent)]
> > DISK WDCWD5002ABYS-1 firmware 3B04 is not 3B05 [Ref:Nautilus SAS/SATA
> > Release A28 40.0 (recommended)]
> >
> > # DEBUG=1 /raid/check-diskfw
> > ...
> > AWKDEBUG: MD1000,A.04,MD1000 RAID Enclosure Version should be A.04 or
> > higher
> > AWKDEBUG: NO MATCH FOR HDS725050KLA360-AB5A
> > AWKDEBUG: ST3300655SS,S515,Nautilus SAS/SATA Release A07
> > AWKDEBUG: ST3300655SS,S51A,Nautilus SAS/SATA Release A28 45.0 (urgent)
> --
> > sdowdy: this is the version A28 installed, not S52C
> > AWKDEBUG:
> > ST3300655SS,S528,http://ftp.us.dell.com/sas-hdd/FRMW_LX_R189919.BIN
> > AWKDEBUG: ST3300655SS,S52C,Nautilus SAS/SATA Release A28 45.0 (urgent)
> > AWKDEBUG: WDCWD5002ABYS-1,3B05,Nautilus SAS/SATA Release A28 40.0
> > (recommended)
> > DISK ST3300655SS firmware S517 is not S515 [Ref:Nautilus SAS/SATA
> Release
> > A07]
> > DISK ST3300655SS firmware S517 is not S51A [Ref:Nautilus SAS/SATA
> Release
> > A28 45.0 (urgent) -- sdowdy: this is the version A28 installed]
> > DISK ST3300655SS firmware S517 is not S528
> > [Ref:http://ftp.us.dell.com/sas-hdd/FRMW_LX_R189919.BIN]
> > DISK ST3300655SS firmware S517 is not S52C [Ref:Nautilus SAS/SATA
> Release
> > A28 45.0 (urgent)]
> > DISK WDCWD5002ABYS-1 firmware 3B04 is not 3B05 [Ref:Nautilus SAS/SATA
> > Release A28 40.0 (recommended)]
> > --------------------------------------------------------------------
> >
> >
> > If there is such a tool/resource, i would much appreciate a pointer.
> > Otherwise, if anyone wants the shell-script *AS-IS*, i can post it.
> > (it uses /proc/scsi/scsi and the output from megarc and megacli if
> > they can be found along with a big nasty inline awk table) Problem
> > is it requires continual manual updates/tweaks/presumptions.
> >
> > Of course it would be great if Dell could manage the Nautilus
> > distribution in a way that could be run online (linux, of course)
> > for READ-ONLY verification so i wouldn't have to write my own tool.
> >
> >
> > --------------------------------------------------------------------
> > ---- Inconsistencies and errors in Nautilus A28 release notes ------
> > --------------------------------------------------------------------
> >
> > For example, there are several cases of probable typo errors like:
> > ----------------------------------------------------------------------
> > ===================================================
> > 25.0 Seagate ES 7.2k SATA 3.5", Verison MA0D - Recommended
> > -----------------------------------^^^^^^^^^
> > ===================================================
> >
> > * Firmware Version MA0D *
> >
> > Seagate ES 7.2k SATA 2.5", version MS0D for drive models ST3250310NS,
> > ST3500320NS, ST31000340NS and ST31000340NS.
> > -----------------------------------^^^^
> > ----------------------------------------------------------------------
> > (what is it, MA0D or MS0D?)
> >
> > And this series of updates:
> > ----------------------------------------------------------------------
> > $ grep ST3300555SS R288929.txt
> > 26.0 Seagate T10 7.2K SAS 3.5", version T215, for drive models
> ST3300555SS
> > (300GB), ST3146755SS (146GB) and ST373355SS (73GB).
> > 52.0 Seagate SAS 2.5", 10K, offline download version T10D, for drive
> > models ST3300555SS, ST3146755SS and ST373355SS.
> > 53.0 Seagate T10 7.2K SAS 3.5", offline download version T210, for drive
> > models ST3300555SS (300GB), ST3146755SS (146GB) and ST373355SS (73GB).
> > ..
> > ----------------------------------------------------------------------
> >
> > T215->T10D->T210
> >
> > Huh? Those don't increment lexicographically in my book.
> > But the middle one does say 2.5", the others 3.5", but Seagate
> > has always used different model designations for different
> > sized drives:
> >
> http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=204763
> > and ST9 is for 2.5" drives, so ST3 shouldn't apply.
> >
> > Sure, this is nitpicking, but if the Release Notes are this badly
> > QC'd, how do i know the firmware images are any better? Don't get
> > me wrong, i'm willing to live with bad quality release notes if the
> > updates work properly, anyway....
> >
> >
> > thanks,
> > --stephen
> > --
> > Stephen Dowdy - Systems Administrator - NCAR/RAL
> > 303.497.2869 - sdowdy at ucar.edu -
> > http://www.ral.ucar.edu/~sdowdy/
> >
> > _______________________________________________
> > Linux-PowerEdge mailing list
> > Linux-PowerEdge at dell.com
> > https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> > Please read the FAQ at http://lists.us.dell.com/faq
>
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
From wls at romanus.ca Wed Jan 5 07:18:56 2011
From: wls at romanus.ca (Winston Sorfleet)
Date: Wed, 5 Jan 2011 08:18:56 -0500
Subject: Debian on PE2500 - OMSA or AM available?
In-Reply-To:
References: <532D6AB2BDA94740B729D2C0AF2ED1E6@caesar>
Message-ID: <002901cbacdb$1e70b7f0$5b5227d0$@ca>
You can, but it doesn't do storage. Just chassis hardware.
Cheers,
From: David RIBEIRO [mailto:david.ribeiro76 at gmail.com]
Sent: January-05-11 3:03 AM
To: Winston Sorfleet
Cc: linux-poweredge at dell.com
Subject: Re: Debian on PE2500 - OMSA or AM available?
Hello you can install dellomsa from sara repository, i give you a link, its
function very on ubuntu server 10.04.1 x32 and i have a problem in my x64
version of ubuntu server
http://www.learnosity.com/techblog/index.cfm/2009/8/4/Installing-Dell-OpenMa
nage-Server-Administrator-on-Ubuntu-32bit
Good Luck ;)
2011/1/5 Winston Sorfleet
On Thu, December 16, 2010 21:07, Igor Cicimov wrote:
> Hi all,
>
> I have a SCSI disk failure on my RAID5 and the server flashing light is
> on as well as the flashing light on the disk it self. I would like to
> install OMSA or Array Manager for troubleshooting but since I have Debian
> installed on the server wonder if there is any OMSA or AM deb repository
> available somewhere?
>
Maybe looking for this?
http://hwraid.le-vert.net/wiki/DebianPackages
_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge at dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110105/a9d176fe/attachment.htm
From mcclnx at yahoo.com.tw Wed Jan 5 08:46:33 2011
From: mcclnx at yahoo.com.tw (mcclnx mcc)
Date: Wed, 5 Jan 2011 22:46:33 +0800 (CST)
Subject: way to replace PERC 6/E card???
Message-ID: <454325.11256.qm@web73905.mail.tp2.yahoo.com>
we have several R900 servers with PERC 6/E card in it. Recently it we getting some message on /var/log/message say change to "write back" and change back to "write through".
We figure out it is PERC 6/E card battery weak. My questions are:
1. some battery only use one year (some even shorter) and already this problem.
2. How come PERC 5/E does not have this issue?
3. where is DISK array information store for PERC 6/E? I did NOT see NVRAM on PERC 6/E card.
4. when replace PERC 6/E card and power on server sometime it will ask you "import configuration" ans some time it won't. Why?
5. When replace PERC 6/E card and power on. Server ask "import configuration" should I answer "yes"?
6. where can I find documentation which mention "import configuration"?
Thanks.
From tim at seoss.co.uk Wed Jan 5 09:47:49 2011
From: tim at seoss.co.uk (Tim Small)
Date: Wed, 05 Jan 2011 15:47:49 +0000
Subject: way to replace PERC 6/E card???
In-Reply-To: <454325.11256.qm@web73905.mail.tp2.yahoo.com>
References: <454325.11256.qm@web73905.mail.tp2.yahoo.com>
Message-ID: <4D2492A5.2010004@seoss.co.uk>
On 05/01/11 14:46, mcclnx mcc wrote:
> we have several R900 servers with PERC 6/E card in it. Recently it we getting some message on /var/log/message say change to "write back" and change back to "write through".
>
> We figure out it is PERC 6/E card battery weak. My questions are:
>
> 1. some battery only use one year (some even shorter) and already this problem.
>
Faulty batteries? Continuous high battery temperatures when
fully-charged kill lithium-ion batteries quickly, so maybe you should
check the battery temperature (e.g. using a infra-red thermometer etc.),
so maybe it's that?
Same reason you shouldn't leave the batteries in usually-mains-powered
laptops....
Are you sure this isn't just the automatic battery charge/discharge
which these cards do to periodically check the battery capacity? Stupid
design that it changes the performance of the array whilst doing this
IMO, but that's LSI for you... How about using two batteries, and
checking each of them in turn instead, then it wouldn't need to disable
the write caching?
> 3. where is DISK array information store for PERC 6/E? I did NOT see NVRAM on PERC 6/E card.
>
On the card and also on disk, I believe.
> 4. when replace PERC 6/E card and power on server sometime it will ask you "import configuration" ans some time it won't. Why?
>
Maybe this depends on the firmware version installed? Certainly it did
with PERC5 cards. Newer firmware will ask you to import the
configuration I think.
> 5. When replace PERC 6/E card and power on. Server ask "import configuration" should I answer "yes"?
>
I think so, yes - this crappy message means something like "I don't
think the on-disk configuration data was written by this RAID card -
should I use it anyway?"
> 6. where can I find documentation which mention "import configuration"?
>
Good luck, it's an LSI product. There are some vendor docs around from
both Dell and Sun. Try google?
HTH...
Tim.
--
South East Open Source Solutions Limited
Registered in England and Wales with company number 06134732.
Registered Office: 2 Powell Gardens, Redhill, Surrey, RH1 1TQ
VAT number: 900 6633 53 http://seoss.co.uk/ +44-(0)1273-808309
From Jens_Heinz at Dell.com Wed Jan 5 10:07:45 2011
From: Jens_Heinz at Dell.com (Jens_Heinz at Dell.com)
Date: Wed, 5 Jan 2011 17:07:45 +0100
Subject: way to replace PERC 6/E card???
In-Reply-To: <454325.11256.qm@web73905.mail.tp2.yahoo.com>
References: <454325.11256.qm@web73905.mail.tp2.yahoo.com>
Message-ID: <399212640037934A8F25742DA026AFD7021BBD93@LEJX7ADC103.EMEA.DELL.COM>
Hi,
here some answers:
1. please check the Perc are not doing their regular battery learn cycles. It happens every 90days and during the cycle cache will be disabled.
2. Perc5/E would have a learn cycle too. If that comes out to be the issue.
3. only on your HDDs. It is stored in DDF (format).
4. if you have a new controller and there are no conflicting information on your HDDs you won't be asked for 'import' it just takes the configuration from your drives. If the configuration stored on your drives does differ for whatever reason you will be asked which one to import.
5. It depends. I would always recommend to enter CTRL+R BIOS and check what exactly the configuration is the controller found. With most recent RAID FW you'll have even a tab for 'foreign config'
6. Perc6/E documentation can be found here. Don't know if 'import configuration' is specifically mentioned there:
http://support.dell.com/support/edocs/storage/RAID/PERC6/en/index.htm
Regards,
Jens.
-----Original Message-----
From: linux-poweredge-bounces-Lists On Behalf Of mcclnx mcc
Sent: 05 January 2011 15:47
To: linux-poweredge-Lists
Subject: way to replace PERC 6/E card???
we have several R900 servers with PERC 6/E card in it. Recently it we getting some message on /var/log/message say change to "write back" and change back to "write through".
We figure out it is PERC 6/E card battery weak. My questions are:
1. some battery only use one year (some even shorter) and already this problem.
2. How come PERC 5/E does not have this issue?
3. where is DISK array information store for PERC 6/E? I did NOT see NVRAM on PERC 6/E card.
4. when replace PERC 6/E card and power on server sometime it will ask you "import configuration" ans some time it won't. Why?
5. When replace PERC 6/E card and power on. Server ask "import configuration" should I answer "yes"?
6. where can I find documentation which mention "import configuration"?
Thanks.
_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge at dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq
From Dell at epperson.homelinux.net Wed Jan 5 10:15:38 2011
From: Dell at epperson.homelinux.net (J. Epperson)
Date: Wed, 5 Jan 2011 11:15:38 -0500
Subject: way to replace PERC 6/E card???
In-Reply-To: <4D2492A5.2010004@seoss.co.uk>
References: <454325.11256.qm@web73905.mail.tp2.yahoo.com>
<4D2492A5.2010004@seoss.co.uk>
Message-ID: <57a0f4e18706fc643b59311822e4c794.squirrel@epperson.homelinux.net>
On Wed, January 5, 2011 10:47, Tim Small wrote:
> On 05/01/11 14:46, mcclnx mcc wrote:
>
>> 3. where is DISK array information store for PERC 6/E? I did NOT see
>> NVRAM on PERC 6/E card.
>>
>
> On the card and also on disk, I believe.
>
>> 4. when replace PERC 6/E card and power on server sometime it will ask
>> you "import configuration" ans some time it won't. Why?
>>
>
> Maybe this depends on the firmware version installed? Certainly it did
> with PERC5 cards. Newer firmware will ask you to import the
> configuration I think.
>
>> 5. When replace PERC 6/E card and power on. Server ask "import
>> configuration" should I answer "yes"?
>>
>
> I think so, yes - this crappy message means something like "I don't
> think the on-disk configuration data was written by this RAID card -
> should I use it anyway?"
>
I don't have any 6/E controllers, but historically with the LSI RAIDs the
most reliable way to import a config to a replacement card has been to
boot into the card bios with the drives not present, clear the NVRAM
config, then shut down, attach the drives and start back up. That way the
only config available is the one on disk.
From mcclnx at yahoo.com.tw Wed Jan 5 10:36:53 2011
From: mcclnx at yahoo.com.tw (mcclnx mcc)
Date: Thu, 6 Jan 2011 00:36:53 +0800 (CST)
Subject: way to replace PERC 6/E card???
In-Reply-To: <399212640037934A8F25742DA026AFD7021BBD93@LEJX7ADC103.EMEA.DELL.COM>
Message-ID: <330599.56026.qm@web73901.mail.tp2.yahoo.com>
Thank you for your answer.
I have been checked several times and confirm PERC 6?E is NOT on "battery lean cycle".
you mention configuration store on HDD and using DDF format. Where is Hard disk location on Redhat Linux?
I already check PERC 6/E documention and NO "import " information there.
--- 11/1/5 (?)?Jens_Heinz at Dell.com ???
> ???: Jens_Heinz at Dell.com
> ??: RE: way to replace PERC 6/E card???
> ???: mcclnx at yahoo.com.tw, linux-poweredge at lists.us.dell.com
> ??: 2011?1?5?,?,??11:07
> Hi,
>
> here some answers:
>
> 1. please check the Perc are not doing their regular
> battery learn cycles. It happens every 90days and during the
> cycle cache will be disabled.
>
> 2. Perc5/E would have a learn cycle too. If that comes out
> to be the issue.
>
> 3. only on your HDDs. It is stored in DDF (format).
>
> 4. if you have a new controller and there are no
> conflicting information on your HDDs you won't be asked for
> 'import' it just takes the configuration from your drives.
> If the configuration stored on your drives does differ for
> whatever reason you will be asked which one to import.
>
> 5. It depends. I would always recommend to enter CTRL+R
> BIOS and check what exactly the configuration is the
> controller found. With most recent RAID FW you'll have even
> a tab for 'foreign config'
>
> 6. Perc6/E documentation can be found here. Don't know if
> 'import configuration' is specifically mentioned there:
> ??? http://support.dell.com/support/edocs/storage/RAID/PERC6/en/index.htm
>
>
> Regards,
> Jens.
>
> -----Original Message-----
> From: linux-poweredge-bounces-Lists On Behalf Of mcclnx
> mcc
> Sent: 05 January 2011 15:47
> To: linux-poweredge-Lists
> Subject: way to replace PERC 6/E card???
>
> we have several R900 servers with PERC 6/E card in
> it.? Recently it we getting some message on
> /var/log/message say change to "write back" and change back
> to "write through".???
>
> We figure out it is PERC 6/E card battery weak.? My
> questions are:
>
> 1. some battery only use one year (some even shorter) and
> already this problem.
>
> 2. How come PERC 5/E does not have this issue?
>
> 3. where is DISK array information store for PERC
> 6/E?? I did NOT see NVRAM on PERC 6/E card.
>
> 4. when replace PERC 6/E card and power on server sometime
> it will ask you "import configuration" ans some time it
> won't.? Why?
>
> 5. When replace PERC 6/E card and power on.? Server
> ask "import configuration" should I answer "yes"?
>
> 6. where can I find documentation which mention "import
> configuration"?
>
> Thanks.
>
>
> ? ? ?
>
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
>
From tim at seoss.co.uk Wed Jan 5 11:09:09 2011
From: tim at seoss.co.uk (Tim Small)
Date: Wed, 05 Jan 2011 17:09:09 +0000
Subject: way to replace PERC 6/E card???
In-Reply-To: <330599.56026.qm@web73901.mail.tp2.yahoo.com>
References: <330599.56026.qm@web73901.mail.tp2.yahoo.com>
Message-ID: <4D24A5B5.8030402@seoss.co.uk>
On 05/01/11 16:36, mcclnx mcc wrote:
> you mention configuration store on HDD and using DDF format. Where is Hard disk location on Redhat Linux?
>
It's not possible to directly read or write the DDF format metadata from
within Linux via the PERC cards - the PERC cards do not make those areas
of the drives visible to Linux.
If you attach the drives to a Linux machine using a different controller
(e.g. plain SATA, or SAS controller, or a RAID card which supports JBOD
mode), then you can use mdadm version 3.0+, or dmraid to parse, and use
DDF arrays. This is often useful for data recovery.
Tim.
--
South East Open Source Solutions Limited
Registered in England and Wales with company number 06134732.
Registered Office: 2 Powell Gardens, Redhill, Surrey, RH1 1TQ
VAT number: 900 6633 53 http://seoss.co.uk/ +44-(0)1273-808309
From xpoinsard at openpricer.com Wed Jan 5 11:28:32 2011
From: xpoinsard at openpricer.com (Xavier Poinsard)
Date: Wed, 05 Jan 2011 18:28:32 +0100
Subject: rpm error trying to do firmware update on poweredge 2950
In-Reply-To: <20110104145030.GA11889@auslistsprd01.us.dell.com>
References: <20110104022454.GA31054@auslistsprd01.us.dell.com>
<20110104145030.GA11889@auslistsprd01.us.dell.com>
Message-ID:
That's much better.
RPM were installed.
Now I am curious why it doesn't propose the ST3500620SS firmware update:
update_firmware
Running system inventory...
Searching storage directory for available BIOS updates...
Checking BIOS - 2.6.1
Available: dell_dup_componentid_00159 - 2.6.1
Did not find a newer package to install that meets all installation checks.
Checking SAS/SATA Backplane 0:0 Backplane Firmware - 1.05
Available: dell_dup_componentid_11204 - 1.05
Did not find a newer package to install that meets all installation checks.
Checking NetXtreme II BCM5708 Gigabit Ethernet rev 12 (eth0) - 2.9.1
Available: pci_firmware(ven_0x14e4_dev_0x164c) - 6.0.1
Found Update: pci_firmware(ven_0x14e4_dev_0x164c) - 6.0.1
Checking PowerVault MD1000-0 EMM-1 Firmware - a.04
Available: dell_dup_componentid_08529 - a.04
Did not find a newer package to install that meets all installation checks.
Checking ST3146755SS Firmware - t109
Did not find a newer package to install that meets all installation checks.
Checking PERC 5/i Integrated Controller 0 Firmware - 5.2.2-0072
Available:
pci_firmware(ven_0x1028_dev_0x0015_subven_0x1028_subdev_0x1f03) - 5.2.2-0072
Did not find a newer package to install that meets all installation checks.
Checking PERC 6/E Adapter Controller 1 Firmware - 6.2.0-0013
Available:
pci_firmware(ven_0x1000_dev_0x0060_subven_0x1028_subdev_0x1f0a) - 6.3.0-0001
Found Update:
pci_firmware(ven_0x1000_dev_0x0060_subven_0x1028_subdev_0x1f0a) - 6.3.0-0001
Checking NetXtreme II BCM5708 Gigabit Ethernet rev 12 (eth1) - 2.9.1
Available: pci_firmware(ven_0x14e4_dev_0x164c) - 6.0.1
Found Update: pci_firmware(ven_0x14e4_dev_0x164c) - 6.0.1
Checking System BIOS for PowerEdge 2950 - 2.6.1
Available: system_bios(ven_0x1028_dev_0x01b2) - 2.7.0
Found Update: system_bios(ven_0x1028_dev_0x01b2) - 2.7.0
Checking ST3500620SS Firmware - ms04
Available: dell_dup_componentid_16861 - ms0c
Did not find a newer package to install that meets all installation checks.
Found firmware which needs to be updated.
Please run the program with the '--yes' switch to enable BIOS update.
UPDATE NOT COMPLETED!
Best regards,
Xavier.
Le 04/01/2011 15:50, Matt Domsch a ?crit :
> On Mon, Jan 03, 2011 at 08:24:54PM -0600, Matt Domsch wrote:
>> On Mon, Jan 03, 2011 at 03:13:32PM +0100, Xavier Poinsard wrote:
>>> Hi all,
>>>
>>> I tried to do yum install $(/usr/sbin/bootstrap_firmware)
>>> And I got an error :
>> [snip]
>>> ERROR with rpm_check_debug vs depsolve:
>>> rpmlib(FileDigests) is needed by system_bios_PowerEdge_2950-2.7.0-20.noarch
>>> rpmlib(PayloadIsXz) is needed by system_bios_PowerEdge_2950-2.7.0-20.noarch
>>
>> Arrggh. That's the builder, which is running on Fedora 14, which
>> apparently picked up the new SHA2 digests and XZ compression method,
>> which older rpm versions don't know.
>>
>> Looks like I'll have to force the builder to use older checksum and
>> compression methods, and rebuild the newer packages. Arrggh....
>
> All BIOS packages in the firmware repository have been rebuilt using
> the old checksums and compression methods. You'll see all -21.noarch
> packages now. Please test and report success/failure now.
>
> Thanks,
> Matt
>
From noc at defenderhosting.com Wed Jan 5 11:44:45 2011
From: noc at defenderhosting.com (Jon Wolberg)
Date: Wed, 5 Jan 2011 10:44:45 -0700
Subject: 146GB drive pretenting to be a 300GB
In-Reply-To: <20110105095920.GC26493@easy5.in-chemnitz.de>
References: <2121948662.125098.1294193310972.JavaMail.root@mail.dtgmail.com>
<112137805.125102.1294193468470.JavaMail.root@mail.dtgmail.com>
<20110105095920.GC26493@easy5.in-chemnitz.de>
Message-ID:
Hi Tino-
Since you mentioned the H700 firmware, I should note that I am already
running the latest as well.
Jon Wolberg
Systems Engineer
Virtacore Systems Inc.
"We Virtualize IT!"
On Wed, Jan 5, 2011 at 2:59 AM, Tino Schwarze wrote:
> Hi Jon,
>
> On Tue, Jan 04, 2011 at 09:11:08PM -0500, Defender NOC wrote:
>
> > We just purchased a new R510 and while i've had a lot of experience
> > with 10g and 11g systems other than the R510 ( mostly R610 and R710 )
> > i've yet to see this one before. We purchased the system with 12 x
> > 146gb 15k SAS drives in them which all showed as such in the BIOS.
> > Upon loading OMSA on the system I get the below results however
> > looking at the drive itself:
>
> We've recently purchases an R515 with the same drives in it.
>
> > ID 0:0:11
> > Status OK
> > Name Physical Disk 0:0:11
> > State Online
> > Power Status Spun Up
> > Bus Protocol SAS
> > Media HDD
> > Revision EH02
> > Failure Predicted No
> > Certified Yes
> > Capacity 136.12GB
> > Used RAID Disk Space 136.12GB
> > Available RAID Disk Space 0.00GB
> > Hot Spare No
> > Vendor ID DELL(tm)
> > Product ID ST3300657SS-H
> > Serial No. 3SJ2QJD8
> > Part Number SG01DKVF125310AU01ALA00
> > Negotiated Speed 3.00 Gbps
> > Capable Speed 3.00 Gbps
> > Manufacture Day 02
> > Manufacture Week 44
> > Manufacture Year 2010
> > SAS Address 5000C50028AA2161
>
> I'm seeing similar output:
> Bus Protocol : SAS
> Media : HDD
> Revision : EH02
> Certified : Yes
> Capacity : 136.13 GB (146163105792 bytes)
> Vendor ID : DELL(tm)
> Product ID : ST3300657SS-H
> Serial No. : 3SJ2QGQY
> Negotiated Speed : 3.00 Gbps
> Capable Speed : 3.00 Gbps
> Manufacture Day : 02
> Manufacture Week : 44
> Manufacture Year : 2010
> SAS Address : 5000C50028AA638D
>
> > The Seagate part number listed belongs to the below drive:
> >
> >
> http://www.seagate.com/ww/v/index.jsp?locale=en-US&vgnextoid=6664470bd8cc1210VgnVCM1000001a48090aRCRD
> >
> > which is a 300GB drive, not a 146GB drive. Not only that, but for
> > some reason it's running at 3Gb/s instead of the 6Gb/s rating on
> > Seagate's site ( not that this matters as I won't get that performance
> > out of it, but noting it anyway since its another anomaly ). The -H
> > at the end is also unusual for a Seagate part number. I've never seen
> > it on any of their drives.
>
> I guess -H means "half capacity"?
>
> > Attempting to update the firmware with the listed ES62 update on
> > Dell's site results in the below:
> >
> >
> > sh FRMW_LX_R245074.BIN
> > Collecting inventory...
> > ...
> > Running validation...
> >
> > This Update Package is not compatible with your system configuration.
>
> I'm seeing a similar message when running update_firmware:
>
> Checking ST3300657SS-H Firmware - eh02
> Did not find a newer package to install that meets all installation
> checks.
>
> > Is there some sort of firmware limiting going on here that is dropping
> > the drive down to 3Gb/s and 146GB?
>
> Looks like it is. Maybe Dell will sell Upgrade Packages one day? :-|
>
> I'll update the H700 BIOS, then take a look on the reported speed again.
>
> Thanks for pointing that out,
>
> Tino.
>
> --
> "What we nourish flourishes." - "Was wir n?hren erbl?ht."
>
> www.tisc.de
>
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110105/e239ad1e/attachment.htm
From sdowdy at ucar.edu Wed Jan 5 12:54:40 2011
From: sdowdy at ucar.edu (Stephen Dowdy)
Date: Wed, 05 Jan 2011 11:54:40 -0700
Subject: Nautilus SAS/SATA firmware update disks: Inconsistencies and
Quality Control and missing f/w images
In-Reply-To: <20110105113407.147540@gmx.net>
References: <4D222258.1070301@ucar.edu> <20110105113407.147540@gmx.net>
Message-ID: <4D24BE70.4050309@ucar.edu>
cupertino at gmx.net wrote, On 01/05/2011 04:34 AM:
> As far as I know the Seagate S52x FW are for DUP enabled HDDs only, the S51x are for regular ones.
> There are Dell hard drives out there with same Seagate STxxxxxxx numbers, but different FWs (those for DUP enabled and those w/o DUP). You can't mix FWs btw. those drives.
> The most recent FW for DUP enabled drives would then be S52C and for regular drives S51A. You can go higher then S51A on those.
> Correct me if I'm wrong.
Thanks for that response.
I'm presuming that DUP means "Dell Update Package" and not some
other acronym. Going on this, i found:
Dell Update Package Frequently Asked Questions - Technical Assistance Bulletin (TAB) - 337778
http://support.dell.com/support/topics/global.aspx/support/kcs/document?c=us&docid=124202&doclang=en&l=en&s=gen&cs=
Which seems to confirm what you have said.
-----------------------
Can I update the firmware on a non-DUP HDD to make it fully compatible with DUP?
No. DUP enabled drives require a minimum firmware and related part number to take advantage of DUP functionality. Non-DUP part numbers will continue to be supported with the Nautilus firmware download utility.
-----------------------
(I'm surprised by this, but that's apparently reality. I guess the
firmware gap from S51A to S520 is simply to keep the non-DUP-enabled
drives from getting updated to a DUP-enabled drive) It's too bad
Dell uses the SAME model numbers with differing firmware tracks.
And i have to presume from this that the Update processes (Nautilus
and DUPs) are not entirely disjoint, but that Nautilus appears to
update both DUP-enabled and non-DUP-enabled drives. (i.e. Nautilus
would then appear to be a superset of all SAS+SATA firmware updates?)
I still have a continued concern over the inconsistencies and typos
in the release notes making it hard to determine if something
applies. (again, references to ST3* devices as 2.5" doesn't
jive w/ Seagate model numbering schemes, etc)
Q: is it possible (without OpenManage) to obtain the Dell
PartNumber for a Dell drive? I'm *guessing* that OM simply has a
giant Model+Firmware lookup table and that the partnumber isn't
stored in some log page on the controller.
thanks again,
--stephen
> -------- Original-Nachricht --------
>> Datum: Mon, 03 Jan 2011 12:24:08 -0700
>> Von: Stephen Dowdy
>> An: Dell Poweredge list server
>> Betreff: *** GMX Spamverdacht *** Nautilus SAS/SATA firmware update disks: Inconsistencies and Quality Control and missing f/w images
>
>> Anyone else have issues with the Nautilus offline Disk firmware
>> update utilities? (specifically i'm referring to A28, the latest)
...
Hmm, "GMX Spamverdacht" (Suspicion of Spam) ? Oh well. :-(
--stephen
--
Stephen Dowdy - Systems Administrator - NCAR/RAL
303.497.2869 - sdowdy at ucar.edu - http://www.ral.ucar.edu/~sdowdy/
From cupertino at gmx.net Wed Jan 5 14:15:28 2011
From: cupertino at gmx.net (cupertino at gmx.net)
Date: Wed, 05 Jan 2011 21:15:28 +0100
Subject: Nautilus SAS/SATA firmware update disks: Inconsistencies
and Quality Control and missing f/w images
In-Reply-To: <4D24BE70.4050309@ucar.edu>
References: <4D222258.1070301@ucar.edu> <20110105113407.147540@gmx.net>
<4D24BE70.4050309@ucar.edu>
Message-ID: <1294258528.3304.38.camel@E6400.jshweb.info>
Yes, DUP is the Dell Update Package which enables those drives to be
updated from within a Dell supported OS. No Nautilus is needed.
But Nautilus is of course able to update those DUP enabled hard drives
too. At least it should be ;-)
I don't know of a way to get Dell part# directly from those hard drives.
Maybe someone else can help with that.
On Wed, 2011-01-05 at 11:54 -0700, Stephen Dowdy wrote:
> cupertino at gmx.net wrote, On 01/05/2011 04:34 AM:
> > As far as I know the Seagate S52x FW are for DUP enabled HDDs only, the S51x are for regular ones.
> > There are Dell hard drives out there with same Seagate STxxxxxxx numbers, but different FWs (those for DUP enabled and those w/o DUP). You can't mix FWs btw. those drives.
> > The most recent FW for DUP enabled drives would then be S52C and for regular drives S51A. You can go higher then S51A on those.
> > Correct me if I'm wrong.
>
> Thanks for that response.
>
> I'm presuming that DUP means "Dell Update Package" and not some
> other acronym. Going on this, i found:
>
> Dell Update Package Frequently Asked Questions - Technical Assistance Bulletin (TAB) - 337778
> http://support.dell.com/support/topics/global.aspx/support/kcs/document?c=us&docid=124202&doclang=en&l=en&s=gen&cs=
>
> Which seems to confirm what you have said.
>
> -----------------------
> Can I update the firmware on a non-DUP HDD to make it fully compatible with DUP?
> No. DUP enabled drives require a minimum firmware and related part number to take advantage of DUP functionality. Non-DUP part numbers will continue to be supported with the Nautilus firmware download utility.
> -----------------------
> (I'm surprised by this, but that's apparently reality. I guess the
> firmware gap from S51A to S520 is simply to keep the non-DUP-enabled
> drives from getting updated to a DUP-enabled drive) It's too bad
> Dell uses the SAME model numbers with differing firmware tracks.
>
> And i have to presume from this that the Update processes (Nautilus
> and DUPs) are not entirely disjoint, but that Nautilus appears to
> update both DUP-enabled and non-DUP-enabled drives. (i.e. Nautilus
> would then appear to be a superset of all SAS+SATA firmware updates?)
>
> I still have a continued concern over the inconsistencies and typos
> in the release notes making it hard to determine if something
> applies. (again, references to ST3* devices as 2.5" doesn't
> jive w/ Seagate model numbering schemes, etc)
>
> Q: is it possible (without OpenManage) to obtain the Dell
> PartNumber for a Dell drive? I'm *guessing* that OM simply has a
> giant Model+Firmware lookup table and that the partnumber isn't
> stored in some log page on the controller.
>
> thanks again,
> --stephen
>
> > -------- Original-Nachricht --------
> >> Datum: Mon, 03 Jan 2011 12:24:08 -0700
> >> Von: Stephen Dowdy
> >> An: Dell Poweredge list server
> >> Betreff: *** GMX Spamverdacht *** Nautilus SAS/SATA firmware update disks: Inconsistencies and Quality Control and missing f/w images
> >
> >> Anyone else have issues with the Nautilus offline Disk firmware
> >> update utilities? (specifically i'm referring to A28, the latest)
> ...
>
> Hmm, "GMX Spamverdacht" (Suspicion of Spam) ? Oh well. :-(
>
> --stephen
From jacob.perkin at gmail.com Wed Jan 5 14:42:17 2011
From: jacob.perkin at gmail.com (Jacob Perkins)
Date: Wed, 5 Jan 2011 14:42:17 -0600
Subject: SMART / PERC HDD replacement criteria
Message-ID:
Good morning everyone!
I was wondering if anyone would like to share there 'criteria' for swapping out
bad hard drives on their PE's. We currently run a farm of right under 10k
servers, mostly PE's ranging from 840's to 2950's to R710's.
We have gotten into the pattern of swapping out drives as soon as we notice
issues with them, and not waiting for the array to degrade. I was wondering if
others did the same? Do you swap the drives as soon as they report
media errors?
What about SMART errors such as Current_Pending_Sector and
Offline_Uncorrectable? Do you wait for the drive to actually fall into a
degraded state before any hardware is swapped out?
Thanks for your time!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110105/8fce15bd/attachment.htm
From rostetter at mail.utexas.edu Wed Jan 5 19:41:00 2011
From: rostetter at mail.utexas.edu (Eric Rostetter)
Date: Wed, 05 Jan 2011 19:41:00 -0600
Subject: SMART / PERC HDD replacement criteria
In-Reply-To:
References:
Message-ID: <20110105194100.31825a388a45zc84@mail.ph.utexas.edu>
Quoting Jacob Perkins :
> I was wondering if anyone would like to share there 'criteria' for
> swapping out
> bad hard drives on their PE's.
We used to wait until server disks failed and then replace them. Then we
put monitoring in place, which relies on dell's OMSA for the data. Since
Dell's OMSA goes to warning when they reach "predictive failure" or
such, we now swap them as soon as OMSA goes warning, which is almost
always before they actually fail...
On non-servers we don't monitor, we wait until smart errors are greater
than 5 before we swap them... We've had drives with 1 or 2 smart errors
last forever, but once they get 5-10 they usually degrade fast so we take
it seriously...
All our machines now run either smart (e-mail reports) or are monitored
via OMSA (snmp). So we no longer wait for a failure. In the last 3 years,
we had one actual failure, and the rest were smart/omsa reports.
--
Eric Rostetter
The Department of Physics
The University of Texas at Austin
Go Longhorns!
From blake at ispn.net Thu Jan 6 10:04:07 2011
From: blake at ispn.net (Blake Hudson)
Date: Thu, 06 Jan 2011 10:04:07 -0600
Subject: 146GB drive pretenting to be a 300GB
In-Reply-To: <112137805.125102.1294193468470.JavaMail.root@mail.dtgmail.com>
References: <112137805.125102.1294193468470.JavaMail.root@mail.dtgmail.com>
Message-ID: <4D25E7F7.2070409@ispn.net>
-------- Original Message --------
Subject: 146GB drive pretenting to be a 300GB
From: Defender NOC
To: linux-poweredge
Date: Tuesday, January 04, 2011 8:11:08 PM
> Hi Everyone-
>
> We just purchased a new R510 and while i've had a lot of experience with 10g and 11g systems other than the R510 ( mostly R610 and R710 ) i've yet to see this one before. We purchased the system with 12 x 146gb 15k SAS drives in them which all showed as such in the BIOS. Upon loading OMSA on the system I get the below results however looking at the drive itself:
>
> ID 0:0:11
> Status OK
> Name Physical Disk 0:0:11
> State Online
> Power Status Spun Up
> Bus Protocol SAS
> Media HDD
> Revision EH02
> Failure Predicted No
> Certified Yes
> Capacity 136.12GB
> Used RAID Disk Space 136.12GB
> Available RAID Disk Space 0.00GB
> Hot Spare No
> Vendor ID DELL(tm)
> Product ID ST3300657SS-H
> Serial No. 3SJ2QJD8
> Part Number SG01DKVF125310AU01ALA00
> Negotiated Speed 3.00 Gbps
> Capable Speed 3.00 Gbps
> Manufacture Day 02
> Manufacture Week 44
> Manufacture Year 2010
> SAS Address 5000C50028AA2161
>
Based on the information on Seagate's website, Seagate no longer
manufactures 146GB drives. Vendors, like Dell, want to continue to offer
146GB drives for various reasons. This model is probably made
specifically for those vendors. It is essentially a 1 platter 15k.7
Cheetah. However, I see no reason it wouldn't support the same features
as other 15k.7 drives - 6Gbps SAS, encryption, etc
From dtrainor at toolbox.com Thu Jan 6 10:44:38 2011
From: dtrainor at toolbox.com (Dan Trainor)
Date: Thu, 6 Jan 2011 09:44:38 -0700
Subject: Dell SAS 5/e, EL5/CentOS5, MD3000
Message-ID:
Hi -
It would seem that everyone else has gotten this to work, except for me (but with varying degrees of success). I'm starting to think that its hardware related now, unfortunately. However, this being my first time trying to use any Dell (Or LSI?) storage products, maybe I'm simply doing it wrong.
I'm using a SAS 5/e HBA connected to an MD3000. Pretty sure the MD3k checks out fine because the other controller on the MD3k is connected to a Windows-based backup machine that also uses the MD3k. I carved out a 60G virtual disk and exposed it to this single HBA that I'm trying to use, all under CentOS 5.5.
>From reading http://www.delltechcenter.com/page/Linux+RAID+and+Storage, I understand that I'm supposed to use "mptsas, part of the mptfusion driver family". I've not found this in later EL5/CentOS5 kernels, as the article suggests. I went ahead and used LSI's provided drivers in kmod form (mptbase), and their mptlinux util/sysv script. Since I wasn't able to find any clear-cut instructions, these steps seemed logical from what I figured.
I'm able to see the exposed slice from the BIOS of the SAS5/e card before the machine boots, and I've tried different methods of verification and even gone so far as to switch between the card being enabled in BIOS, OS, or both etc etc. Whenever I try to access the slice by any means, however, I get i/o errors. This is an example of a simple mkfs.ext3 that I attempted. I didn't expect it to work - but I wanted to debug some more and figured this was the most intrusive way to do so:
Jan 5 15:32:39 vmserver02 kernel: printk: 29826 messages suppressed.
Jan 5 15:32:39 vmserver02 kernel: Buffer I/O error on device sda, logical block 14319618
Jan 5 15:32:39 vmserver02 kernel: lost page write due to I/O error on sda
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114557968
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114558992
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114560016
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114819088
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114820112
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114821136
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114822160
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 115081232
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 115082256
And, here's what I'ma ctually using:
[root at vmserver02.ops.az.domain.local x86_64]# lspci|grep "SCSI storage"
02:08.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X Fusion-MPT SAS (rev 01)
03:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 08)
I'm kind of stumped at this point. This is my second go at this over a six month period. Last time I got the same results, just got frustrated and then gave up. Hopefully with some help, though, I'll be able to get this working as I expect it to.
And, just out of curiosity, what is the difference between these two devices that lspci shows? There's only one card in there. Is one of these the actual onboard SAS controller?
Thanks!
-dant
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110106/8f0b9321/attachment-0001.htm
From richard.ems at cape-horn-eng.com Thu Jan 6 10:59:07 2011
From: richard.ems at cape-horn-eng.com (Richard Ems)
Date: Thu, 06 Jan 2011 17:59:07 +0100
Subject: Dell SAS 5/e, EL5/CentOS5, MD3000
In-Reply-To:
References:
Message-ID: <4D25F4DB.7020909@cape-horn-eng.com>
Hi Dan,
probably you already did it, but I have to ask: Could you have bad
cabling issues? Did u try other cables?
Richard
--
Richard Ems mail: Richard.Ems at Cape-Horn-Eng.com
Cape Horn Engineering S.L.
C/ Dr. J.J. D?mine 1, 5? piso
46011 Valencia
Tel : +34 96 3242923 / Fax 924
http://www.cape-horn-eng.com
From Jens_Heinz at Dell.com Thu Jan 6 11:05:17 2011
From: Jens_Heinz at Dell.com (Jens_Heinz at Dell.com)
Date: Thu, 6 Jan 2011 18:05:17 +0100
Subject: Dell SAS 5/e, EL5/CentOS5, MD3000
In-Reply-To:
References:
Message-ID: <399212640037934A8F25742DA026AFD7021BC143@LEJX7ADC103.EMEA.DELL.COM>
How do you access the virtual disk (ie. using /dev/sdX device oder /dev/mapper)?
Which raid controller module the SAS cable is attached to?
Is the preferred owner of your virtual disk (on the MD3000) the same raid controller the cable is attached to?
Are you using the Dell provided linuxrdac driver?
If any possible send me a MD3000 support log to have a look at.
Regards,
Jens.
From: linux-poweredge-bounces-Lists On Behalf Of Dan Trainor
Sent: 06 January 2011 17:45
To: linux-poweredge-Lists
Subject: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi -
It would seem that everyone else has gotten this to work, except for me (but with varying degrees of success). I'm starting to think that its hardware related now, unfortunately. However, this being my first time trying to use any Dell (Or LSI?) storage products, maybe I'm simply doing it wrong.
I'm using a SAS 5/e HBA connected to an MD3000. Pretty sure the MD3k checks out fine because the other controller on the MD3k is connected to a Windows-based backup machine that also uses the MD3k. I carved out a 60G virtual disk and exposed it to this single HBA that I'm trying to use, all under CentOS 5.5.
>From reading http://www.delltechcenter.com/page/Linux+RAID+and+Storage, I understand that I'm supposed to use "mptsas, part of the mptfusion driver family". I've not found this in later EL5/CentOS5 kernels, as the article suggests. I went ahead and used LSI's provided drivers in kmod form (mptbase), and their mptlinux util/sysv script. Since I wasn't able to find any clear-cut instructions, these steps seemed logical from what I figured.
I'm able to see the exposed slice from the BIOS of the SAS5/e card before the machine boots, and I've tried different methods of verification and even gone so far as to switch between the card being enabled in BIOS, OS, or both etc etc. Whenever I try to access the slice by any means, however, I get i/o errors. This is an example of a simple mkfs.ext3 that I attempted. I didn't expect it to work - but I wanted to debug some more and figured this was the most intrusive way to do so:
Jan 5 15:32:39 vmserver02 kernel: printk: 29826 messages suppressed.
Jan 5 15:32:39 vmserver02 kernel: Buffer I/O error on device sda, logical block 14319618
Jan 5 15:32:39 vmserver02 kernel: lost page write due to I/O error on sda
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114557968
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114558992
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114560016
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114819088
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114820112
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114821136
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114822160
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 115081232
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 115082256
And, here's what I'ma ctually using:
[root at vmserver02.ops.az.domain.local x86_64]# lspci|grep "SCSI storage"
02:08.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X Fusion-MPT SAS (rev 01)
03:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 08)
I'm kind of stumped at this point. This is my second go at this over a six month period. Last time I got the same results, just got frustrated and then gave up. Hopefully with some help, though, I'll be able to get this working as I expect it to.
And, just out of curiosity, what is the difference between these two devices that lspci shows? There's only one card in there. Is one of these the actual onboard SAS controller?
Thanks!
-dant
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110106/6c170286/attachment.htm
From dtrainor at toolbox.com Thu Jan 6 11:19:38 2011
From: dtrainor at toolbox.com (Dan Trainor)
Date: Thu, 6 Jan 2011 10:19:38 -0700
Subject: Dell SAS 5/e, EL5/CentOS5, MD3000
In-Reply-To: <399212640037934A8F25742DA026AFD7021BC143@LEJX7ADC103.EMEA.DELL.COM>
References:
<399212640037934A8F25742DA026AFD7021BC143@LEJX7ADC103.EMEA.DELL.COM>
Message-ID:
Hi Jens, Richard -
Richard, I have two cables, one of which is in a production environment and I cannot steal it right now. I sent in a PO for a few more. You can't really pick these things up at Best Buy, and if I need one, I need it fast, ya know? Thanks for the tip. I'll try it as soon as I can.
I'm trying to access the virtual disk, in this case, by /dev/sda.
The SAS cable is connected to the SAS5/e controller card. I've tried both ports. IIRC this SAS5/e card is of an older type and does not participate in RAID? I think I remember reading that. The other card in the machine (can't identify if its onboard or expansion) is a SAS6/ir. If that's the one you're inquiring about, it has no external ports.
The preferred owner of the virtual disk is that of the same raid controller attaching to the MD3000. Even changing different ports, which changes ownership, they are still one and the same now.
I am not using the Dell provided linuxdrac driver. Again, as I understood, I was to use the mptsas driver. Did I misunderstand?
I've just asked for a support log, I'll go ahead and throw that somewhere in a few minutes when I get it.
Thanks!
-dant
From: Jens_Heinz at Dell.com [mailto:Jens_Heinz at Dell.com]
Sent: Thursday, January 06, 2011 10:05 AM
To: Dan Trainor; linux-poweredge at lists.us.dell.com
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
How do you access the virtual disk (ie. using /dev/sdX device oder /dev/mapper)?
Which raid controller module the SAS cable is attached to?
Is the preferred owner of your virtual disk (on the MD3000) the same raid controller the cable is attached to?
Are you using the Dell provided linuxrdac driver?
If any possible send me a MD3000 support log to have a look at.
Regards,
Jens.
From: linux-poweredge-bounces-Lists On Behalf Of Dan Trainor
Sent: 06 January 2011 17:45
To: linux-poweredge-Lists
Subject: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi -
It would seem that everyone else has gotten this to work, except for me (but with varying degrees of success). I'm starting to think that its hardware related now, unfortunately. However, this being my first time trying to use any Dell (Or LSI?) storage products, maybe I'm simply doing it wrong.
I'm using a SAS 5/e HBA connected to an MD3000. Pretty sure the MD3k checks out fine because the other controller on the MD3k is connected to a Windows-based backup machine that also uses the MD3k. I carved out a 60G virtual disk and exposed it to this single HBA that I'm trying to use, all under CentOS 5.5.
>From reading http://www.delltechcenter.com/page/Linux+RAID+and+Storage, I understand that I'm supposed to use "mptsas, part of the mptfusion driver family". I've not found this in later EL5/CentOS5 kernels, as the article suggests. I went ahead and used LSI's provided drivers in kmod form (mptbase), and their mptlinux util/sysv script. Since I wasn't able to find any clear-cut instructions, these steps seemed logical from what I figured.
I'm able to see the exposed slice from the BIOS of the SAS5/e card before the machine boots, and I've tried different methods of verification and even gone so far as to switch between the card being enabled in BIOS, OS, or both etc etc. Whenever I try to access the slice by any means, however, I get i/o errors. This is an example of a simple mkfs.ext3 that I attempted. I didn't expect it to work - but I wanted to debug some more and figured this was the most intrusive way to do so:
Jan 5 15:32:39 vmserver02 kernel: printk: 29826 messages suppressed.
Jan 5 15:32:39 vmserver02 kernel: Buffer I/O error on device sda, logical block 14319618
Jan 5 15:32:39 vmserver02 kernel: lost page write due to I/O error on sda
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114557968
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114558992
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114560016
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114819088
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114820112
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114821136
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114822160
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 115081232
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 115082256
And, here's what I'ma ctually using:
[root at vmserver02.ops.az.domain.local x86_64]# lspci|grep "SCSI storage"
02:08.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X Fusion-MPT SAS (rev 01)
03:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 08)
I'm kind of stumped at this point. This is my second go at this over a six month period. Last time I got the same results, just got frustrated and then gave up. Hopefully with some help, though, I'll be able to get this working as I expect it to.
And, just out of curiosity, what is the difference between these two devices that lspci shows? There's only one card in there. Is one of these the actual onboard SAS controller?
Thanks!
-dant
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110106/55ee60e4/attachment-0001.htm
From dtrainor at toolbox.com Thu Jan 6 12:19:00 2011
From: dtrainor at toolbox.com (Dan Trainor)
Date: Thu, 6 Jan 2011 11:19:00 -0700
Subject: Dell SAS 5/e, EL5/CentOS5, MD3000
In-Reply-To: <399212640037934A8F25742DA026AFD7021BC143@LEJX7ADC103.EMEA.DELL.COM>
References:
<399212640037934A8F25742DA026AFD7021BC143@LEJX7ADC103.EMEA.DELL.COM>
Message-ID:
Hi, Jens -
So I'm reading what I interpret as conflicting information. I've seen a lot of places talk about using the mptsas driver, and a lot of people saying I need to use the linuxrdac driver. The latter, I believe, is only if I'm using any kind of multipath. Eventually that would be nice, sure, but right now I just want to get a simple configuration working and expand from there.
I'm correct in going the mptsas/mptlinux route, right?
Thanks
-dant
From: Jens_Heinz at Dell.com [mailto:Jens_Heinz at Dell.com]
Sent: Thursday, January 06, 2011 10:05 AM
To: Dan Trainor; linux-poweredge at lists.us.dell.com
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
How do you access the virtual disk (ie. using /dev/sdX device oder /dev/mapper)?
Which raid controller module the SAS cable is attached to?
Is the preferred owner of your virtual disk (on the MD3000) the same raid controller the cable is attached to?
Are you using the Dell provided linuxrdac driver?
If any possible send me a MD3000 support log to have a look at.
Regards,
Jens.
From: linux-poweredge-bounces-Lists On Behalf Of Dan Trainor
Sent: 06 January 2011 17:45
To: linux-poweredge-Lists
Subject: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi -
It would seem that everyone else has gotten this to work, except for me (but with varying degrees of success). I'm starting to think that its hardware related now, unfortunately. However, this being my first time trying to use any Dell (Or LSI?) storage products, maybe I'm simply doing it wrong.
I'm using a SAS 5/e HBA connected to an MD3000. Pretty sure the MD3k checks out fine because the other controller on the MD3k is connected to a Windows-based backup machine that also uses the MD3k. I carved out a 60G virtual disk and exposed it to this single HBA that I'm trying to use, all under CentOS 5.5.
>From reading http://www.delltechcenter.com/page/Linux+RAID+and+Storage, I understand that I'm supposed to use "mptsas, part of the mptfusion driver family". I've not found this in later EL5/CentOS5 kernels, as the article suggests. I went ahead and used LSI's provided drivers in kmod form (mptbase), and their mptlinux util/sysv script. Since I wasn't able to find any clear-cut instructions, these steps seemed logical from what I figured.
I'm able to see the exposed slice from the BIOS of the SAS5/e card before the machine boots, and I've tried different methods of verification and even gone so far as to switch between the card being enabled in BIOS, OS, or both etc etc. Whenever I try to access the slice by any means, however, I get i/o errors. This is an example of a simple mkfs.ext3 that I attempted. I didn't expect it to work - but I wanted to debug some more and figured this was the most intrusive way to do so:
Jan 5 15:32:39 vmserver02 kernel: printk: 29826 messages suppressed.
Jan 5 15:32:39 vmserver02 kernel: Buffer I/O error on device sda, logical block 14319618
Jan 5 15:32:39 vmserver02 kernel: lost page write due to I/O error on sda
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114557968
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114558992
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114560016
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114819088
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114820112
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114821136
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114822160
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 115081232
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 115082256
And, here's what I'ma ctually using:
[root at vmserver02.ops.az.domain.local x86_64]# lspci|grep "SCSI storage"
02:08.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X Fusion-MPT SAS (rev 01)
03:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 08)
I'm kind of stumped at this point. This is my second go at this over a six month period. Last time I got the same results, just got frustrated and then gave up. Hopefully with some help, though, I'll be able to get this working as I expect it to.
And, just out of curiosity, what is the difference between these two devices that lspci shows? There's only one card in there. Is one of these the actual onboard SAS controller?
Thanks!
-dant
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110106/3926a7ab/attachment.htm
From hchan at mail.ewind.com Thu Jan 6 12:34:56 2011
From: hchan at mail.ewind.com (Hoover Chan)
Date: Thu, 6 Jan 2011 10:34:56 -0800 (PST)
Subject: poweredge 2650/2850 options
Message-ID: <25092420.665.1294338895992.JavaMail.root@lincoln.ewind.com>
I just received these servers in a donation and was wondering what my options are. SCSI based. Dual Xeon but it doesn't look like it'll do VMWare which is what I'm trying to migrate to.
Any other thoughts from the collected wisdom here?
Thanks in advance.
-----------------------------------------------------------------
Hoover Chan hchan at mail.ewind.com -or- hchan at well.com
From Jens_Heinz at Dell.com Thu Jan 6 12:41:42 2011
From: Jens_Heinz at Dell.com (Jens_Heinz at Dell.com)
Date: Thu, 6 Jan 2011 19:41:42 +0100
Subject: Dell SAS 5/e, EL5/CentOS5, MD3000
In-Reply-To:
References:
<399212640037934A8F25742DA026AFD7021BC143@LEJX7ADC103.EMEA.DELL.COM>,
Message-ID: <399212640037934A8F25742DA026AFD70201C977@LEJX7ADC103.EMEA.DELL.COM>
Sorry, missed that reply.
I was referring to the RAID controller modules in your MD3000. There are usually 2 (unless you have a single controller edition). SAS5/e has no RAID functionality, that's right.
As I pointed out in my other reply, you are perfectly right, mptsas is only the driver for your SAS5/e HBA, linuxrdac is usually only required for multipath. But it may have side effects if you install linuxrdac and then use only 1x path. That's why I asked.
I'm working from Germany and won't be in till tomorrow morning. But I'll check your log then.
Bye,
Jens.
________________________________________
From: linux-poweredge-bounces-Lists On Behalf Of Dan Trainor [dtrainor at toolbox.com]
Sent: Thursday, January 06, 2011 6:19 PM
To: linux-poweredge-Lists
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi Jens, Richard ?
Richard, I have two cables, one of which is in a production environment and I cannot steal it right now. I sent in a PO for a few more. You can?t really pick these things up at Best Buy, and if I need one, I need it fast, ya know? Thanks for the tip. I?ll try it as soon as I can.
I?m trying to access the virtual disk, in this case, by /dev/sda.
The SAS cable is connected to the SAS5/e controller card. I?ve tried both ports. IIRC this SAS5/e card is of an older type and does not participate in RAID? I think I remember reading that. The other card in the machine (can?t identify if its onboard or expansion) is a SAS6/ir. If that?s the one you?re inquiring about, it has no external ports.
The preferred owner of the virtual disk is that of the same raid controller attaching to the MD3000. Even changing different ports, which changes ownership, they are still one and the same now.
I am not using the Dell provided linuxdrac driver. Again, as I understood, I was to use the mptsas driver. Did I misunderstand?
I?ve just asked for a support log, I?ll go ahead and throw that somewhere in a few minutes when I get it.
Thanks!
-dant
From: Jens_Heinz at Dell.com [mailto:Jens_Heinz at Dell.com]
Sent: Thursday, January 06, 2011 10:05 AM
To: Dan Trainor; linux-poweredge at lists.us.dell.com
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
How do you access the virtual disk (ie. using /dev/sdX device oder /dev/mapper)?
Which raid controller module the SAS cable is attached to?
Is the preferred owner of your virtual disk (on the MD3000) the same raid controller the cable is attached to?
Are you using the Dell provided linuxrdac driver?
If any possible send me a MD3000 support log to have a look at.
Regards,
Jens.
From: linux-poweredge-bounces-Lists On Behalf Of Dan Trainor
Sent: 06 January 2011 17:45
To: linux-poweredge-Lists
Subject: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi ?
It would seem that everyone else has gotten this to work, except for me (but with varying degrees of success). I?m starting to think that its hardware related now, unfortunately. However, this being my first time trying to use any Dell (Or LSI?) storage products, maybe I?m simply doing it wrong.
I?m using a SAS 5/e HBA connected to an MD3000. Pretty sure the MD3k checks out fine because the other controller on the MD3k is connected to a Windows-based backup machine that also uses the MD3k. I carved out a 60G virtual disk and exposed it to this single HBA that I?m trying to use, all under CentOS 5.5.
>From reading http://www.delltechcenter.com/page/Linux+RAID+and+Storage, I understand that I?m supposed to use ?mptsas, part of the mptfusion driver family?. I?ve not found this in later EL5/CentOS5 kernels, as the article suggests. I went ahead and used LSI?s provided drivers in kmod form (mptbase), and their mptlinux util/sysv script. Since I wasn?t able to find any clear-cut instructions, these steps seemed logical from what I figured.
I?m able to see the exposed slice from the BIOS of the SAS5/e card before the machine boots, and I?ve tried different methods of verification and even gone so far as to switch between the card being enabled in BIOS, OS, or both etc etc. Whenever I try to access the slice by any means, however, I get i/o errors. This is an example of a simple mkfs.ext3 that I attempted. I didn?t expect it to work ? but I wanted to debug some more and figured this was the most intrusive way to do so:
Jan 5 15:32:39 vmserver02 kernel: printk: 29826 messages suppressed.
Jan 5 15:32:39 vmserver02 kernel: Buffer I/O error on device sda, logical block 14319618
Jan 5 15:32:39 vmserver02 kernel: lost page write due to I/O error on sda
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114557968
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114558992
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114560016
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114819088
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114820112
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114821136
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114822160
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 115081232
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 115082256
And, here?s what I?ma ctually using:
[root at vmserver02.ops.az.domain.local x86_64]# lspci|grep "SCSI storage"
02:08.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X Fusion-MPT SAS (rev 01)
03:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 08)
I?m kind of stumped at this point. This is my second go at this over a six month period. Last time I got the same results, just got frustrated and then gave up. Hopefully with some help, though, I?ll be able to get this working as I expect it to.
And, just out of curiosity, what is the difference between these two devices that lspci shows? There?s only one card in there. Is one of these the actual onboard SAS controller?
Thanks!
-dant
From dtrainor at toolbox.com Thu Jan 6 12:43:09 2011
From: dtrainor at toolbox.com (Dan Trainor)
Date: Thu, 6 Jan 2011 11:43:09 -0700
Subject: Dell SAS 5/e, EL5/CentOS5, MD3000
In-Reply-To: <399212640037934A8F25742DA026AFD70201C976@LEJX7ADC103.EMEA.DELL.COM>
References:
<399212640037934A8F25742DA026AFD7021BC143@LEJX7ADC103.EMEA.DELL.COM>,
<399212640037934A8F25742DA026AFD70201C976@LEJX7ADC103.EMEA.DELL.COM>
Message-ID:
Hi, Jens -
Wow, that's a lot of good information, thank you. I have a better understanding of MD now.
So in my situation, using RHEL/CentOS, I understand that I can use either linuxrdac or mptsas/mptlinux. Which is recommended more over the other? I suppose you're obligated to say that the linuxrdac driver works better, is better supported etc etc, but if I could get your opinion :)
What causes these "shadow" drives? Funny you should mention. I also did see one of these shadow drives, it was a 20M virtual that did not exist in the MD3000 itself. Don't know if that has any significance.
I'm familiar with the ramifications of using right/wrong device names in Linux, and the errors that I was getting were not indicative of this.
I juggled the cable around a little bit, switched to a different SAS port on the SAS5/e card, and I see:
Jan 6 11:20:35 vmserver02 kernel: mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 16, phy 0, sas_addr 0x50022194925cb504
Jan 6 11:20:35 vmserver02 kernel: target0:0:1: mptsas: ioc0: add device: fw_channel 0, fw_id 16, phy 0, sas_addr 0x50022194925cb504
Jan 6 11:20:35 vmserver02 kernel: Vendor: DELL Model: MD3000 Rev: 0670
Jan 6 11:20:35 vmserver02 kernel: Type: Direct-Access ANSI SCSI revision: 05
Jan 6 11:20:35 vmserver02 kernel: scsi 0:0:1:0: mptscsih: ioc0: qdepth=64, tagged=1, simple=1, ordered=0, scsi_level=6, cmd_que=1
Jan 6 11:20:35 vmserver02 kernel: scsi 0:0:1:0: Attached scsi generic sg0 type 0
Jan 6 11:20:35 vmserver02 kernel: Vendor: DELL Model: MD3000 Rev: 0670
Jan 6 11:20:35 vmserver02 kernel: Type: Direct-Access ANSI SCSI revision: 05
Jan 6 11:20:36 vmserver02 kernel: scsi 0:0:1:2: mptscsih: ioc0: qdepth=64, tagged=1, simple=1, ordered=0, scsi_level=6, cmd_que=1
Jan 6 11:20:36 vmserver02 kernel: SCSI device sda: 125829120 512-byte hdwr sectors (64425 MB)
Jan 6 11:20:36 vmserver02 kernel: sda: Write Protect is off
Jan 6 11:20:36 vmserver02 kernel: SCSI device sda: drive cache: write back w/ FUA
Jan 6 11:20:36 vmserver02 kernel: SCSI device sda: 125829120 512-byte hdwr sectors (64425 MB)
Jan 6 11:20:36 vmserver02 kernel: sda: Write Protect is off
Jan 6 11:20:36 vmserver02 kernel: SCSI device sda: drive cache: write back w/ FUA
Jan 6 11:20:36 vmserver02 kernel: sda: unknown partition table
Jan 6 11:20:36 vmserver02 kernel: sd 0:0:1:2: Attached scsi disk sda
Jan 6 11:20:36 vmserver02 kernel: sd 0:0:1:2: Attached scsi generic sg1 type 0
Jan 6 11:20:36 vmserver02 kernel: Vendor: DELL Model: Universal Xport Rev: 0670
Jan 6 11:20:36 vmserver02 kernel: Type: Direct-Access ANSI SCSI revision: 05
Jan 6 11:20:36 vmserver02 kernel: scsi 0:0:1:31: mptscsih: ioc0: qdepth=64, tagged=1, simple=1, ordered=0, scsi_level=6, cmd_que=1
Jan 6 11:20:36 vmserver02 kernel: SCSI device sdb: 40960 512-byte hdwr sectors (21 MB)
Jan 6 11:20:36 vmserver02 kernel: sdb: Write Protect is off
Jan 6 11:20:36 vmserver02 kernel: SCSI device sdb: drive cache: write through w/ FUA
Jan 6 11:20:36 vmserver02 kernel: SCSI device sdb: 40960 512-byte hdwr sectors (21 MB)
Jan 6 11:20:36 vmserver02 kernel: sdb: Write Protect is off
Jan 6 11:20:36 vmserver02 kernel: SCSI device sdb: drive cache: write through w/ FUA
Jan 6 11:20:36 vmserver02 kernel: sdb: unknown partition table
Jan 6 11:20:36 vmserver02 kernel: sd 0:0:1:31: Attached scsi disk sdb
Jan 6 11:20:36 vmserver02 kernel: sd 0:0:1:31: Attached scsi generic sg6 type 0
You'll see both drives - my 60G volume, and that mysterious 20M volume.
I'm using the mptsas driver right now. I think forcing a reset, but unplugging/plugging in the SAS cable, might have shaken things up enough so that the new driver now understands how to talk to the MD3000. I'm going to reproduce all this on a clean install to see what the deal is. Either I misunderstood how setting/resetting the SAS controller/path works, or I have a bad cable and just happened to temporarily fix it by moving it around a little bit.
Jens, I sincerely appreciate your help. I hope this also serves as some good information for others trying to figure this out, as well.
Thanks
-dant
-----Original Message-----
From: Jens_Heinz at Dell.com [mailto:Jens_Heinz at Dell.com]
Sent: Thursday, January 06, 2011 11:35 AM
To: Dan Trainor
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi,
in fact Dell supports only the linuxrdac driver for all MD3000(i) systems, but it is only available for RHEL/SLES based Linux. Debian has to use Linux multipath with RDAC device handler. The mptsas/mptlinux is only dealing as driver for the SAS HBA (SAS5/e).
But the important thing is that your MD3000 drive is sometimes presented to your OS multiple times (even if only 1x path is connected), like sdb and sdc. One or more of those sd devices are then 'shadow' devices which can't be used to access the drive. If you now use the 'wrong' device it'll result in errors. Same could happen if you have only a SAS path to the controller which is currently not owning the MD virtual disk drive.
The other post might be right, we saw problems like this caused by faulty HW aswell, but most of the time it is a misunderstanding on how MD works.
If you want me to, I could have a look at your configuration to check for those issues. I would need a MD support log (in MDSM -> Tools -> Gather Support Information) and an output of fdisk -l would be great.
Bye,
Jens.
________________________________________
From: linux-poweredge-bounces-Lists On Behalf Of Dan Trainor [dtrainor at toolbox.com]
Sent: Thursday, January 06, 2011 7:19 PM
To: linux-poweredge-Lists
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi, Jens -
So I'm reading what I interpret as conflicting information. I've seen a lot of places talk about using the mptsas driver, and a lot of people saying I need to use the linuxrdac driver. The latter, I believe, is only if I'm using any kind of multipath. Eventually that would be nice, sure, but right now I just want to get a simple configuration working and expand from there.
I'm correct in going the mptsas/mptlinux route, right?
Thanks
-dant
From: Jens_Heinz at Dell.com [mailto:Jens_Heinz at Dell.com]
Sent: Thursday, January 06, 2011 10:05 AM
To: Dan Trainor; linux-poweredge at lists.us.dell.com
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
How do you access the virtual disk (ie. using /dev/sdX device oder /dev/mapper)?
Which raid controller module the SAS cable is attached to?
Is the preferred owner of your virtual disk (on the MD3000) the same raid controller the cable is attached to?
Are you using the Dell provided linuxrdac driver?
If any possible send me a MD3000 support log to have a look at.
Regards,
Jens.
From: linux-poweredge-bounces-Lists On Behalf Of Dan Trainor
Sent: 06 January 2011 17:45
To: linux-poweredge-Lists
Subject: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi -
It would seem that everyone else has gotten this to work, except for me (but with varying degrees of success). I'm starting to think that its hardware related now, unfortunately. However, this being my first time trying to use any Dell (Or LSI?) storage products, maybe I'm simply doing it wrong.
I'm using a SAS 5/e HBA connected to an MD3000. Pretty sure the MD3k checks out fine because the other controller on the MD3k is connected to a Windows-based backup machine that also uses the MD3k. I carved out a 60G virtual disk and exposed it to this single HBA that I'm trying to use, all under CentOS 5.5.
>From reading http://www.delltechcenter.com/page/Linux+RAID+and+Storage, I understand that I'm supposed to use "mptsas, part of the mptfusion driver family". I've not found this in later EL5/CentOS5 kernels, as the article suggests. I went ahead and used LSI's provided drivers in kmod form (mptbase), and their mptlinux util/sysv script. Since I wasn't able to find any clear-cut instructions, these steps seemed logical from what I figured.
I'm able to see the exposed slice from the BIOS of the SAS5/e card before the machine boots, and I've tried different methods of verification and even gone so far as to switch between the card being enabled in BIOS, OS, or both etc etc. Whenever I try to access the slice by any means, however, I get i/o errors. This is an example of a simple mkfs.ext3 that I attempted. I didn't expect it to work - but I wanted to debug some more and figured this was the most intrusive way to do so:
Jan 5 15:32:39 vmserver02 kernel: printk: 29826 messages suppressed.
Jan 5 15:32:39 vmserver02 kernel: Buffer I/O error on device sda, logical block 14319618
Jan 5 15:32:39 vmserver02 kernel: lost page write due to I/O error on sda
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114557968
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114558992
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114560016
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114819088
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114820112
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114821136
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114822160
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 115081232
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 115082256
And, here's what I'ma ctually using:
[root at vmserver02.ops.az.domain.local x86_64]# lspci|grep "SCSI storage"
02:08.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X Fusion-MPT SAS (rev 01)
03:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 08)
I'm kind of stumped at this point. This is my second go at this over a six month period. Last time I got the same results, just got frustrated and then gave up. Hopefully with some help, though, I'll be able to get this working as I expect it to.
And, just out of curiosity, what is the difference between these two devices that lspci shows? There's only one card in there. Is one of these the actual onboard SAS controller?
Thanks!
-dant
From Dell at epperson.homelinux.net Thu Jan 6 14:20:14 2011
From: Dell at epperson.homelinux.net (J. Epperson)
Date: Thu, 6 Jan 2011 15:20:14 -0500
Subject: poweredge 2650/2850 options
In-Reply-To: <25092420.665.1294338895992.JavaMail.root@lincoln.ewind.com>
References: <25092420.665.1294338895992.JavaMail.root@lincoln.ewind.com>
Message-ID: <55592d8b6637e23bc616376060e863da.squirrel@epperson.homelinux.net>
On Thu, January 6, 2011 13:34, Hoover Chan wrote:
> I just received these servers in a donation and was wondering what my
> options are. SCSI based. Dual Xeon but it doesn't look like it'll do
> VMWare which is what I'm trying to migrate to.
>
> Any other thoughts from the collected wisdom here?
>
> Thanks in advance.
>
They'll run RHEL/Centos 5 with no problems, but the 2650's are a bit long
in the tooth. I have about twenty 2850s running RHEL 4&5 and a variety of
Oracle products. Have to replace a drive every once in a while. Got rid
of my last 2650 last year.
From matthew at acfr.usyd.edu.au Thu Jan 6 14:51:47 2011
From: matthew at acfr.usyd.edu.au (Matthew Geier)
Date: Fri, 07 Jan 2011 07:51:47 +1100
Subject: Dell SAS 5/e, EL5/CentOS5, MD3000
In-Reply-To:
References: <399212640037934A8F25742DA026AFD7021BC143@LEJX7ADC103.EMEA.DELL.COM>,
<399212640037934A8F25742DA026AFD70201C976@LEJX7ADC103.EMEA.DELL.COM>
Message-ID: <4D262B63.3010104@acfr.usyd.edu.au>
On 07/01/11 05:43, Dan Trainor wrote:
> What causes these "shadow" drives? Funny you should mention. I also did see one of these shadow drives, it was a 20M virtual that did not exist in the MD3000 itself. Don't know if that has any significance.
It appears to be a idiosyncrasy of the MD3000 that the controller
creates two dummy block devices of 21mb. (1 per controller). I gather
they are the SCSI device you talk to to control the arrays.
As to why they implemented that device as a block device and not as an
enclosure device or some other non disk SCSI device I can't say.
From dtrainor at toolbox.com Thu Jan 6 15:00:45 2011
From: dtrainor at toolbox.com (Dan Trainor)
Date: Thu, 6 Jan 2011 14:00:45 -0700
Subject: Dell SAS 5/e, EL5/CentOS5, MD3000
In-Reply-To: <4D262B63.3010104@acfr.usyd.edu.au>
References:
<399212640037934A8F25742DA026AFD7021BC143@LEJX7ADC103.EMEA.DELL.COM>,
<399212640037934A8F25742DA026AFD70201C976@LEJX7ADC103.EMEA.DELL.COM>
<4D262B63.3010104@acfr.usyd.edu.au>
Message-ID:
On 07/01/11 05:43, Dan Trainor wrote:
>> What causes these "shadow" drives? Funny you should mention. I also did see one of these shadow drives, it was a 20M virtual that did not exist in the MD3000 itself. Don't know if that has any significance.
> It appears to be a idiosyncrasy of the MD3000 that the controller
creates two dummy block devices of 21mb. (1 per controller). I gather
they are the SCSI device you talk to to control the arrays.
> As to why they implemented that device as a block device and not as an
enclosure device or some other non disk SCSI device I can't say.
Hi, Matthew -
Interesting. So When you talk about "controller", is that the controller on the HBA, or a controller on the MD3000 itself? I'm guessing the latter based on the way you explained your response.
How many virtual disks can I expose to a single HBA from the MD3000? Are there any hard/theoretical limits?
Thanks
-dant
From Matt_Domsch at dell.com Thu Jan 6 15:06:54 2011
From: Matt_Domsch at dell.com (Matt Domsch)
Date: Thu, 6 Jan 2011 15:06:54 -0600
Subject: mailing list bounces / unsubscribes
Message-ID: <20110106210654.GA11643@auslistsprd01.us.dell.com>
We're experiencing problems with the mailing lists on
lists.us.dell.com, including linux-poweredge, where people may be
unsubscribed automatically due to bounces, even when then bounces are
incorrect. I'm investigating. In the meantime, I've disabled mailman
bounce processing to keep people from being automatically
unsubscribed.
Thanks,
Matt
--
Matt Domsch
Technology Strategist
Dell | Office of the CTO
From johan.sjoberg at deltamanagement.se Thu Jan 6 15:12:56 2011
From: johan.sjoberg at deltamanagement.se (=?iso-8859-1?Q?Johan_Sj=F6berg?=)
Date: Thu, 6 Jan 2011 22:12:56 +0100
Subject: poweredge 2650/2850 options
In-Reply-To: <25092420.665.1294338895992.JavaMail.root@lincoln.ewind.com>
References: <25092420.665.1294338895992.JavaMail.root@lincoln.ewind.com>
Message-ID:
The 2850:s should be able to run VMware. We are running ESXi on an 1850, and it works, but it's not a performance monster.
/Johan
-----Original Message-----
From: linux-poweredge-bounces at dell.com [mailto:linux-poweredge-bounces at dell.com] On Behalf Of Hoover Chan
Sent: den 6 januari 2011 19:35
To: linux-poweredge at lists.us.dell.com
Subject: poweredge 2650/2850 options
I just received these servers in a donation and was wondering what my options are. SCSI based. Dual Xeon but it doesn't look like it'll do VMWare which is what I'm trying to migrate to.
Any other thoughts from the collected wisdom here?
Thanks in advance.
-----------------------------------------------------------------
Hoover Chan hchan at mail.ewind.com -or- hchan at well.com
_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge at dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq
From stroller at stellar.eclipse.co.uk Thu Jan 6 17:43:04 2011
From: stroller at stellar.eclipse.co.uk (Stroller)
Date: Thu, 6 Jan 2011 23:43:04 +0000
Subject: poweredge 2650/2850 options
In-Reply-To: <25092420.665.1294338895992.JavaMail.root@lincoln.ewind.com>
References: <25092420.665.1294338895992.JavaMail.root@lincoln.ewind.com>
Message-ID: <1CD2B7A0-81AB-4B4D-9642-235A6FCC75CA@stellar.eclipse.co.uk>
On 6/1/2011, at 6:34pm, Hoover Chan wrote:
> I just received these servers in a donation and was wondering what my options are. SCSI based. Dual Xeon but it doesn't look like it'll do VMWare which is what I'm trying to migrate to.
On two 2650s I've seen problems similar to this [1] [2] on recent (c 2.6.35) vanilla kernels. It really does not take much effort to create the disk throughput which will cause the crash, and the array will offline, needing a reboot and a rebuild. Sometimes the rebuild fails, but most often it completes perfectly, with no errors, then verifies ok, only to show the same problem the next time I perform a recursive grep. This has persisted through several disks, which now seem to be working fine on my 2850.
The cause appears to be more complex than in the links, which patch the driver "removing aac_handle_aif entirely". The driver has undergone considerable development in the intervening 5 years, and appears now to carry much more sophisticated error handling for such conditions. Troubleshooting it would be way beyond my abilities, and I doubt if anyone knowledgable is now interested in doing so.
For this reason I can't recommend the 2650 at all. I'm getting rid of mine (hopefully) this week. I'll bet they're stable under Windows 2003, though.
I really like the 2850. They're 64-bit and you can pick them up on eBay for c ?125 fully loaded. DRAC4 cards are nice and there are loads on eBay for less than ?20! By the standards of my other rescued-from-the-scrapheap home servers they're very fast.
I'm not sure either are worthwhile if you have to pay for electricity, though. Even if they were donated to you! Both are Pentium 4 technology, and considering power-consumption a low-end current model (with virtualisation extensions in the CPU?) might well pay for itself in a year or two.
Stroller.
[1] http://lists.us.dell.com/pipermail/linux-poweredge/2006-June/026064.html
[2] http://marc.info/?l=linux-scsi&m=110252243627410&w=2
From Jens_Heinz at Dell.com Fri Jan 7 02:44:24 2011
From: Jens_Heinz at Dell.com (Jens_Heinz at Dell.com)
Date: Fri, 7 Jan 2011 09:44:24 +0100
Subject: Dell SAS 5/e, EL5/CentOS5, MD3000
In-Reply-To:
References:
<399212640037934A8F25742DA026AFD7021BC143@LEJX7ADC103.EMEA.DELL.COM>,
<399212640037934A8F25742DA026AFD70201C976@LEJX7ADC103.EMEA.DELL.COM>,
Message-ID: <399212640037934A8F25742DA026AFD7021BC227@LEJX7ADC103.EMEA.DELL.COM>
Good morning,
always glad to help ;-)
I'll try to shine some light on everything.
First thing, you always need mptsas/mptlinux if you're using a MD3000, because it is the kernel module/driver for your SAS5/e HBA. Of course you don't need it with MD3000i (iSCSI version).
Then you'll have to decide whether to use linuxrdac or device mapper for multipath. The requirement for linuxrdac comes from LSI and honestly I would prefer it over device mapper. RDAC is made for Engenio/LSI controllers. We too see much less issues here in regards of failover. (But the reason might be, that customers don't call in with device mapper issues since we don't support it. So I would be glad to hear other opinions on that.)
The 20MB device you mentioned (should have LUN ID 31 be default), is the MD's access virtual disk. It is essential for the MD's LUN mappings to work and shall never be touched! W2k8 and ESX are already able to hide that from being visible to users.
What I call a 'shadow' device (I use that name in lack of a proper name) is the way how the MD presents its virtual disks to the host. It always tells the host how many paths are available to access the particular virtual disk. (You can say it depends on the number of host ports on your MD). If the host in return is able to interpret that information, it'll create device nodes for those 'virtual usable' paths too. As far as I recall with Linux that should only happen if more than one path is connected or a rdac driver is active.
Another detail to point out is the following. If you have a MD with 2x controller modules (I'm still not sure if that is the case with yours), a virtual disk can only be owned by one of the both controllers (though both controllers are active and marketing is often referring to it as active/active, which is not entirely true). In MDSM the owner would be called 'preferred path'.
Let's say your particular virtual disk's preferred path is raid controller 0, but your host is connected to raid controller 1 (I'm not talking about the ports on SAS HBA), then MD can tell Linux there is a drive for you, but in fact it isn't because the physical path doesn't exist (it is on the other controller => 0). On access of course the MD internally tries to move the ownership to the required controller, but that not always succeeds and results in I/O errors like yours. I saw that a lot. Usually you just have to change the preferred ownership in MDSM for that virtual drive.
Once I had a look at the MD log I can tell you more,
Jens.
________________________________________
From: linux-poweredge-bounces-Lists On Behalf Of Dan Trainor [dtrainor at toolbox.com]
Sent: Thursday, January 06, 2011 7:43 PM
To: linux-poweredge-Lists
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi, Jens -
Wow, that's a lot of good information, thank you. I have a better understanding of MD now.
So in my situation, using RHEL/CentOS, I understand that I can use either linuxrdac or mptsas/mptlinux. Which is recommended more over the other? I suppose you're obligated to say that the linuxrdac driver works better, is better supported etc etc, but if I could get your opinion :)
What causes these "shadow" drives? Funny you should mention. I also did see one of these shadow drives, it was a 20M virtual that did not exist in the MD3000 itself. Don't know if that has any significance.
I'm familiar with the ramifications of using right/wrong device names in Linux, and the errors that I was getting were not indicative of this.
I juggled the cable around a little bit, switched to a different SAS port on the SAS5/e card, and I see:
Jan 6 11:20:35 vmserver02 kernel: mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 16, phy 0, sas_addr 0x50022194925cb504
Jan 6 11:20:35 vmserver02 kernel: target0:0:1: mptsas: ioc0: add device: fw_channel 0, fw_id 16, phy 0, sas_addr 0x50022194925cb504
Jan 6 11:20:35 vmserver02 kernel: Vendor: DELL Model: MD3000 Rev: 0670
Jan 6 11:20:35 vmserver02 kernel: Type: Direct-Access ANSI SCSI revision: 05
Jan 6 11:20:35 vmserver02 kernel: scsi 0:0:1:0: mptscsih: ioc0: qdepth=64, tagged=1, simple=1, ordered=0, scsi_level=6, cmd_que=1
Jan 6 11:20:35 vmserver02 kernel: scsi 0:0:1:0: Attached scsi generic sg0 type 0
Jan 6 11:20:35 vmserver02 kernel: Vendor: DELL Model: MD3000 Rev: 0670
Jan 6 11:20:35 vmserver02 kernel: Type: Direct-Access ANSI SCSI revision: 05
Jan 6 11:20:36 vmserver02 kernel: scsi 0:0:1:2: mptscsih: ioc0: qdepth=64, tagged=1, simple=1, ordered=0, scsi_level=6, cmd_que=1
Jan 6 11:20:36 vmserver02 kernel: SCSI device sda: 125829120 512-byte hdwr sectors (64425 MB)
Jan 6 11:20:36 vmserver02 kernel: sda: Write Protect is off
Jan 6 11:20:36 vmserver02 kernel: SCSI device sda: drive cache: write back w/ FUA
Jan 6 11:20:36 vmserver02 kernel: SCSI device sda: 125829120 512-byte hdwr sectors (64425 MB)
Jan 6 11:20:36 vmserver02 kernel: sda: Write Protect is off
Jan 6 11:20:36 vmserver02 kernel: SCSI device sda: drive cache: write back w/ FUA
Jan 6 11:20:36 vmserver02 kernel: sda: unknown partition table
Jan 6 11:20:36 vmserver02 kernel: sd 0:0:1:2: Attached scsi disk sda
Jan 6 11:20:36 vmserver02 kernel: sd 0:0:1:2: Attached scsi generic sg1 type 0
Jan 6 11:20:36 vmserver02 kernel: Vendor: DELL Model: Universal Xport Rev: 0670
Jan 6 11:20:36 vmserver02 kernel: Type: Direct-Access ANSI SCSI revision: 05
Jan 6 11:20:36 vmserver02 kernel: scsi 0:0:1:31: mptscsih: ioc0: qdepth=64, tagged=1, simple=1, ordered=0, scsi_level=6, cmd_que=1
Jan 6 11:20:36 vmserver02 kernel: SCSI device sdb: 40960 512-byte hdwr sectors (21 MB)
Jan 6 11:20:36 vmserver02 kernel: sdb: Write Protect is off
Jan 6 11:20:36 vmserver02 kernel: SCSI device sdb: drive cache: write through w/ FUA
Jan 6 11:20:36 vmserver02 kernel: SCSI device sdb: 40960 512-byte hdwr sectors (21 MB)
Jan 6 11:20:36 vmserver02 kernel: sdb: Write Protect is off
Jan 6 11:20:36 vmserver02 kernel: SCSI device sdb: drive cache: write through w/ FUA
Jan 6 11:20:36 vmserver02 kernel: sdb: unknown partition table
Jan 6 11:20:36 vmserver02 kernel: sd 0:0:1:31: Attached scsi disk sdb
Jan 6 11:20:36 vmserver02 kernel: sd 0:0:1:31: Attached scsi generic sg6 type 0
You'll see both drives - my 60G volume, and that mysterious 20M volume.
I'm using the mptsas driver right now. I think forcing a reset, but unplugging/plugging in the SAS cable, might have shaken things up enough so that the new driver now understands how to talk to the MD3000. I'm going to reproduce all this on a clean install to see what the deal is. Either I misunderstood how setting/resetting the SAS controller/path works, or I have a bad cable and just happened to temporarily fix it by moving it around a little bit.
Jens, I sincerely appreciate your help. I hope this also serves as some good information for others trying to figure this out, as well.
Thanks
-dant
-----Original Message-----
From: Jens_Heinz at Dell.com [mailto:Jens_Heinz at Dell.com]
Sent: Thursday, January 06, 2011 11:35 AM
To: Dan Trainor
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi,
in fact Dell supports only the linuxrdac driver for all MD3000(i) systems, but it is only available for RHEL/SLES based Linux. Debian has to use Linux multipath with RDAC device handler. The mptsas/mptlinux is only dealing as driver for the SAS HBA (SAS5/e).
But the important thing is that your MD3000 drive is sometimes presented to your OS multiple times (even if only 1x path is connected), like sdb and sdc. One or more of those sd devices are then 'shadow' devices which can't be used to access the drive. If you now use the 'wrong' device it'll result in errors. Same could happen if you have only a SAS path to the controller which is currently not owning the MD virtual disk drive.
The other post might be right, we saw problems like this caused by faulty HW aswell, but most of the time it is a misunderstanding on how MD works.
If you want me to, I could have a look at your configuration to check for those issues. I would need a MD support log (in MDSM -> Tools -> Gather Support Information) and an output of fdisk -l would be great.
Bye,
Jens.
________________________________________
From: linux-poweredge-bounces-Lists On Behalf Of Dan Trainor [dtrainor at toolbox.com]
Sent: Thursday, January 06, 2011 7:19 PM
To: linux-poweredge-Lists
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi, Jens -
So I'm reading what I interpret as conflicting information. I've seen a lot of places talk about using the mptsas driver, and a lot of people saying I need to use the linuxrdac driver. The latter, I believe, is only if I'm using any kind of multipath. Eventually that would be nice, sure, but right now I just want to get a simple configuration working and expand from there.
I'm correct in going the mptsas/mptlinux route, right?
Thanks
-dant
From: Jens_Heinz at Dell.com [mailto:Jens_Heinz at Dell.com]
Sent: Thursday, January 06, 2011 10:05 AM
To: Dan Trainor; linux-poweredge at lists.us.dell.com
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
How do you access the virtual disk (ie. using /dev/sdX device oder /dev/mapper)?
Which raid controller module the SAS cable is attached to?
Is the preferred owner of your virtual disk (on the MD3000) the same raid controller the cable is attached to?
Are you using the Dell provided linuxrdac driver?
If any possible send me a MD3000 support log to have a look at.
Regards,
Jens.
From: linux-poweredge-bounces-Lists On Behalf Of Dan Trainor
Sent: 06 January 2011 17:45
To: linux-poweredge-Lists
Subject: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi -
It would seem that everyone else has gotten this to work, except for me (but with varying degrees of success). I'm starting to think that its hardware related now, unfortunately. However, this being my first time trying to use any Dell (Or LSI?) storage products, maybe I'm simply doing it wrong.
I'm using a SAS 5/e HBA connected to an MD3000. Pretty sure the MD3k checks out fine because the other controller on the MD3k is connected to a Windows-based backup machine that also uses the MD3k. I carved out a 60G virtual disk and exposed it to this single HBA that I'm trying to use, all under CentOS 5.5.
>From reading http://www.delltechcenter.com/page/Linux+RAID+and+Storage, I understand that I'm supposed to use "mptsas, part of the mptfusion driver family". I've not found this in later EL5/CentOS5 kernels, as the article suggests. I went ahead and used LSI's provided drivers in kmod form (mptbase), and their mptlinux util/sysv script. Since I wasn't able to find any clear-cut instructions, these steps seemed logical from what I figured.
I'm able to see the exposed slice from the BIOS of the SAS5/e card before the machine boots, and I've tried different methods of verification and even gone so far as to switch between the card being enabled in BIOS, OS, or both etc etc. Whenever I try to access the slice by any means, however, I get i/o errors. This is an example of a simple mkfs.ext3 that I attempted. I didn't expect it to work - but I wanted to debug some more and figured this was the most intrusive way to do so:
Jan 5 15:32:39 vmserver02 kernel: printk: 29826 messages suppressed.
Jan 5 15:32:39 vmserver02 kernel: Buffer I/O error on device sda, logical block 14319618
Jan 5 15:32:39 vmserver02 kernel: lost page write due to I/O error on sda
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114557968
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114558992
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114560016
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114819088
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114820112
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114821136
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114822160
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 115081232
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 115082256
And, here's what I'ma ctually using:
[root at vmserver02.ops.az.domain.local x86_64]# lspci|grep "SCSI storage"
02:08.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X Fusion-MPT SAS (rev 01)
03:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 08)
I'm kind of stumped at this point. This is my second go at this over a six month period. Last time I got the same results, just got frustrated and then gave up. Hopefully with some help, though, I'll be able to get this working as I expect it to.
And, just out of curiosity, what is the difference between these two devices that lspci shows? There's only one card in there. Is one of these the actual onboard SAS controller?
Thanks!
-dant
_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge at dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq
From dtrainor at toolbox.com Fri Jan 7 10:53:35 2011
From: dtrainor at toolbox.com (Dan Trainor)
Date: Fri, 7 Jan 2011 09:53:35 -0700
Subject: Dell SAS 5/e, EL5/CentOS5, MD3000
In-Reply-To: <399212640037934A8F25742DA026AFD7021BC227@LEJX7ADC103.EMEA.DELL.COM>
References:
<399212640037934A8F25742DA026AFD7021BC143@LEJX7ADC103.EMEA.DELL.COM>,
<399212640037934A8F25742DA026AFD70201C976@LEJX7ADC103.EMEA.DELL.COM>,
<399212640037934A8F25742DA026AFD7021BC227@LEJX7ADC103.EMEA.DELL.COM>
Message-ID:
Hi, Jens -
Once again - fantastic information. I'm going to compile a Wiki article somewhere of your brain dumps here, not only for my reference, but for others as well. This is really good stuff.
I'm going to try working with both linuxrdac and DM. I throw DM in there because I've used it before with multipathd, also for storage. That was different though - I was using all Brocade and QLogic components. I'd expect this to work the same way.
Regarding controller ownership, active/active, etc etc. If I was connected to controller 1 on the MD3000, and the volume was assigned to controller 0, no devices on controller 1 would be able to use that volume? Would I at least be able to see it - just not use it? The more you describe this, the more I think that the volume was not actually assigned properly, via preferred owner.
I think I understand that linuxrdac is preferred over DM because it seems that linuxrdac can communicate directly with the MD3000, and listen for signals such as that which would be good to know in the event that some controller died and linuxrdac needs to do its multipath thing to switch over to the other port/controller. I don't see how, using DM, a host running Linux would *know* to agree with the MD3000 that a controller or component is failing, unless both sides could agree on this fact. Given the fact that an active/active implementation does not truly exist, I think this is a very important point as to why linuxrdac would be preferred over DM. In a perfect world where nothing broke, I'd use DM, I guess.
Thanks again for your help. I'm still waiting on that log, along with some hardware. I'm thinking that the volume was not set up properly, unfortunately I did not set it up. I think I can confirm this without the log by knowing the answers to a few questions above re: ownership/assignment/visibility etc etc.
Thanks!
-dant
-----Original Message-----
From: Jens_Heinz at Dell.com [mailto:Jens_Heinz at Dell.com]
Sent: Friday, January 07, 2011 1:44 AM
To: Dan Trainor; linux-poweredge at lists.us.dell.com
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
Good morning,
always glad to help ;-)
I'll try to shine some light on everything.
First thing, you always need mptsas/mptlinux if you're using a MD3000, because it is the kernel module/driver for your SAS5/e HBA. Of course you don't need it with MD3000i (iSCSI version).
Then you'll have to decide whether to use linuxrdac or device mapper for multipath. The requirement for linuxrdac comes from LSI and honestly I would prefer it over device mapper. RDAC is made for Engenio/LSI controllers. We too see much less issues here in regards of failover. (But the reason might be, that customers don't call in with device mapper issues since we don't support it. So I would be glad to hear other opinions on that.)
The 20MB device you mentioned (should have LUN ID 31 be default), is the MD's access virtual disk. It is essential for the MD's LUN mappings to work and shall never be touched! W2k8 and ESX are already able to hide that from being visible to users.
What I call a 'shadow' device (I use that name in lack of a proper name) is the way how the MD presents its virtual disks to the host. It always tells the host how many paths are available to access the particular virtual disk. (You can say it depends on the number of host ports on your MD). If the host in return is able to interpret that information, it'll create device nodes for those 'virtual usable' paths too. As far as I recall with Linux that should only happen if more than one path is connected or a rdac driver is active.
Another detail to point out is the following. If you have a MD with 2x controller modules (I'm still not sure if that is the case with yours), a virtual disk can only be owned by one of the both controllers (though both controllers are active and marketing is often referring to it as active/active, which is not entirely true). In MDSM the owner would be called 'preferred path'.
Let's say your particular virtual disk's preferred path is raid controller 0, but your host is connected to raid controller 1 (I'm not talking about the ports on SAS HBA), then MD can tell Linux there is a drive for you, but in fact it isn't because the physical path doesn't exist (it is on the other controller => 0). On access of course the MD internally tries to move the ownership to the required controller, but that not always succeeds and results in I/O errors like yours. I saw that a lot. Usually you just have to change the preferred ownership in MDSM for that virtual drive.
Once I had a look at the MD log I can tell you more,
Jens.
________________________________________
From: linux-poweredge-bounces-Lists On Behalf Of Dan Trainor [dtrainor at toolbox.com]
Sent: Thursday, January 06, 2011 7:43 PM
To: linux-poweredge-Lists
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi, Jens -
Wow, that's a lot of good information, thank you. I have a better understanding of MD now.
So in my situation, using RHEL/CentOS, I understand that I can use either linuxrdac or mptsas/mptlinux. Which is recommended more over the other? I suppose you're obligated to say that the linuxrdac driver works better, is better supported etc etc, but if I could get your opinion :)
What causes these "shadow" drives? Funny you should mention. I also did see one of these shadow drives, it was a 20M virtual that did not exist in the MD3000 itself. Don't know if that has any significance.
I'm familiar with the ramifications of using right/wrong device names in Linux, and the errors that I was getting were not indicative of this.
I juggled the cable around a little bit, switched to a different SAS port on the SAS5/e card, and I see:
Jan 6 11:20:35 vmserver02 kernel: mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 16, phy 0, sas_addr 0x50022194925cb504
Jan 6 11:20:35 vmserver02 kernel: target0:0:1: mptsas: ioc0: add device: fw_channel 0, fw_id 16, phy 0, sas_addr 0x50022194925cb504
Jan 6 11:20:35 vmserver02 kernel: Vendor: DELL Model: MD3000 Rev: 0670
Jan 6 11:20:35 vmserver02 kernel: Type: Direct-Access ANSI SCSI revision: 05
Jan 6 11:20:35 vmserver02 kernel: scsi 0:0:1:0: mptscsih: ioc0: qdepth=64, tagged=1, simple=1, ordered=0, scsi_level=6, cmd_que=1
Jan 6 11:20:35 vmserver02 kernel: scsi 0:0:1:0: Attached scsi generic sg0 type 0
Jan 6 11:20:35 vmserver02 kernel: Vendor: DELL Model: MD3000 Rev: 0670
Jan 6 11:20:35 vmserver02 kernel: Type: Direct-Access ANSI SCSI revision: 05
Jan 6 11:20:36 vmserver02 kernel: scsi 0:0:1:2: mptscsih: ioc0: qdepth=64, tagged=1, simple=1, ordered=0, scsi_level=6, cmd_que=1
Jan 6 11:20:36 vmserver02 kernel: SCSI device sda: 125829120 512-byte hdwr sectors (64425 MB)
Jan 6 11:20:36 vmserver02 kernel: sda: Write Protect is off
Jan 6 11:20:36 vmserver02 kernel: SCSI device sda: drive cache: write back w/ FUA
Jan 6 11:20:36 vmserver02 kernel: SCSI device sda: 125829120 512-byte hdwr sectors (64425 MB)
Jan 6 11:20:36 vmserver02 kernel: sda: Write Protect is off
Jan 6 11:20:36 vmserver02 kernel: SCSI device sda: drive cache: write back w/ FUA
Jan 6 11:20:36 vmserver02 kernel: sda: unknown partition table
Jan 6 11:20:36 vmserver02 kernel: sd 0:0:1:2: Attached scsi disk sda
Jan 6 11:20:36 vmserver02 kernel: sd 0:0:1:2: Attached scsi generic sg1 type 0
Jan 6 11:20:36 vmserver02 kernel: Vendor: DELL Model: Universal Xport Rev: 0670
Jan 6 11:20:36 vmserver02 kernel: Type: Direct-Access ANSI SCSI revision: 05
Jan 6 11:20:36 vmserver02 kernel: scsi 0:0:1:31: mptscsih: ioc0: qdepth=64, tagged=1, simple=1, ordered=0, scsi_level=6, cmd_que=1
Jan 6 11:20:36 vmserver02 kernel: SCSI device sdb: 40960 512-byte hdwr sectors (21 MB)
Jan 6 11:20:36 vmserver02 kernel: sdb: Write Protect is off
Jan 6 11:20:36 vmserver02 kernel: SCSI device sdb: drive cache: write through w/ FUA
Jan 6 11:20:36 vmserver02 kernel: SCSI device sdb: 40960 512-byte hdwr sectors (21 MB)
Jan 6 11:20:36 vmserver02 kernel: sdb: Write Protect is off
Jan 6 11:20:36 vmserver02 kernel: SCSI device sdb: drive cache: write through w/ FUA
Jan 6 11:20:36 vmserver02 kernel: sdb: unknown partition table
Jan 6 11:20:36 vmserver02 kernel: sd 0:0:1:31: Attached scsi disk sdb
Jan 6 11:20:36 vmserver02 kernel: sd 0:0:1:31: Attached scsi generic sg6 type 0
You'll see both drives - my 60G volume, and that mysterious 20M volume.
I'm using the mptsas driver right now. I think forcing a reset, but unplugging/plugging in the SAS cable, might have shaken things up enough so that the new driver now understands how to talk to the MD3000. I'm going to reproduce all this on a clean install to see what the deal is. Either I misunderstood how setting/resetting the SAS controller/path works, or I have a bad cable and just happened to temporarily fix it by moving it around a little bit.
Jens, I sincerely appreciate your help. I hope this also serves as some good information for others trying to figure this out, as well.
Thanks
-dant
-----Original Message-----
From: Jens_Heinz at Dell.com [mailto:Jens_Heinz at Dell.com]
Sent: Thursday, January 06, 2011 11:35 AM
To: Dan Trainor
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi,
in fact Dell supports only the linuxrdac driver for all MD3000(i) systems, but it is only available for RHEL/SLES based Linux. Debian has to use Linux multipath with RDAC device handler. The mptsas/mptlinux is only dealing as driver for the SAS HBA (SAS5/e).
But the important thing is that your MD3000 drive is sometimes presented to your OS multiple times (even if only 1x path is connected), like sdb and sdc. One or more of those sd devices are then 'shadow' devices which can't be used to access the drive. If you now use the 'wrong' device it'll result in errors. Same could happen if you have only a SAS path to the controller which is currently not owning the MD virtual disk drive.
The other post might be right, we saw problems like this caused by faulty HW aswell, but most of the time it is a misunderstanding on how MD works.
If you want me to, I could have a look at your configuration to check for those issues. I would need a MD support log (in MDSM -> Tools -> Gather Support Information) and an output of fdisk -l would be great.
Bye,
Jens.
________________________________________
From: linux-poweredge-bounces-Lists On Behalf Of Dan Trainor [dtrainor at toolbox.com]
Sent: Thursday, January 06, 2011 7:19 PM
To: linux-poweredge-Lists
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi, Jens -
So I'm reading what I interpret as conflicting information. I've seen a lot of places talk about using the mptsas driver, and a lot of people saying I need to use the linuxrdac driver. The latter, I believe, is only if I'm using any kind of multipath. Eventually that would be nice, sure, but right now I just want to get a simple configuration working and expand from there.
I'm correct in going the mptsas/mptlinux route, right?
Thanks
-dant
From: Jens_Heinz at Dell.com [mailto:Jens_Heinz at Dell.com]
Sent: Thursday, January 06, 2011 10:05 AM
To: Dan Trainor; linux-poweredge at lists.us.dell.com
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
How do you access the virtual disk (ie. using /dev/sdX device oder /dev/mapper)?
Which raid controller module the SAS cable is attached to?
Is the preferred owner of your virtual disk (on the MD3000) the same raid controller the cable is attached to?
Are you using the Dell provided linuxrdac driver?
If any possible send me a MD3000 support log to have a look at.
Regards,
Jens.
From: linux-poweredge-bounces-Lists On Behalf Of Dan Trainor
Sent: 06 January 2011 17:45
To: linux-poweredge-Lists
Subject: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi -
It would seem that everyone else has gotten this to work, except for me (but with varying degrees of success). I'm starting to think that its hardware related now, unfortunately. However, this being my first time trying to use any Dell (Or LSI?) storage products, maybe I'm simply doing it wrong.
I'm using a SAS 5/e HBA connected to an MD3000. Pretty sure the MD3k checks out fine because the other controller on the MD3k is connected to a Windows-based backup machine that also uses the MD3k. I carved out a 60G virtual disk and exposed it to this single HBA that I'm trying to use, all under CentOS 5.5.
>From reading http://www.delltechcenter.com/page/Linux+RAID+and+Storage, I understand that I'm supposed to use "mptsas, part of the mptfusion driver family". I've not found this in later EL5/CentOS5 kernels, as the article suggests. I went ahead and used LSI's provided drivers in kmod form (mptbase), and their mptlinux util/sysv script. Since I wasn't able to find any clear-cut instructions, these steps seemed logical from what I figured.
I'm able to see the exposed slice from the BIOS of the SAS5/e card before the machine boots, and I've tried different methods of verification and even gone so far as to switch between the card being enabled in BIOS, OS, or both etc etc. Whenever I try to access the slice by any means, however, I get i/o errors. This is an example of a simple mkfs.ext3 that I attempted. I didn't expect it to work - but I wanted to debug some more and figured this was the most intrusive way to do so:
Jan 5 15:32:39 vmserver02 kernel: printk: 29826 messages suppressed.
Jan 5 15:32:39 vmserver02 kernel: Buffer I/O error on device sda, logical block 14319618
Jan 5 15:32:39 vmserver02 kernel: lost page write due to I/O error on sda
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114557968
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114558992
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114560016
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114819088
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114820112
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114821136
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114822160
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 115081232
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 115082256
And, here's what I'ma ctually using:
[root at vmserver02.ops.az.domain.local x86_64]# lspci|grep "SCSI storage"
02:08.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X Fusion-MPT SAS (rev 01)
03:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 08)
I'm kind of stumped at this point. This is my second go at this over a six month period. Last time I got the same results, just got frustrated and then gave up. Hopefully with some help, though, I'll be able to get this working as I expect it to.
And, just out of curiosity, what is the difference between these two devices that lspci shows? There's only one card in there. Is one of these the actual onboard SAS controller?
Thanks!
-dant
_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge at dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq
From dtrainor at toolbox.com Fri Jan 7 10:57:00 2011
From: dtrainor at toolbox.com (Dan Trainor)
Date: Fri, 7 Jan 2011 09:57:00 -0700
Subject: Dell SAS 5/e, EL5/CentOS5, MD3000
In-Reply-To: <399212640037934A8F25742DA026AFD7021BC227@LEJX7ADC103.EMEA.DELL.COM>
References:
<399212640037934A8F25742DA026AFD7021BC143@LEJX7ADC103.EMEA.DELL.COM>,
<399212640037934A8F25742DA026AFD70201C976@LEJX7ADC103.EMEA.DELL.COM>,
<399212640037934A8F25742DA026AFD7021BC227@LEJX7ADC103.EMEA.DELL.COM>
Message-ID:
Hi, Jens -
One more, sorry.
Being that only one volume can be assigned to one host at a given time.... I was going in to this project thinking that I could have a volume of say read only data, such as content for a web site or media etc etc that all the hosts (web servers) in my configuration could read from at any given time. It doesn't sound like this is a method that would work, would it?
If not, then I'll probably go the GFS2/COFS2 route, or even iSCSI - but I'm not exactly there yet. I've done all three in the past, without the advantage of any DAS/SAN hardware.
Thanks!
-dant
-----Original Message-----
From: Jens_Heinz at Dell.com [mailto:Jens_Heinz at Dell.com]
Sent: Friday, January 07, 2011 1:44 AM
To: Dan Trainor; linux-poweredge at lists.us.dell.com
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
Good morning,
always glad to help ;-)
I'll try to shine some light on everything.
First thing, you always need mptsas/mptlinux if you're using a MD3000, because it is the kernel module/driver for your SAS5/e HBA. Of course you don't need it with MD3000i (iSCSI version).
Then you'll have to decide whether to use linuxrdac or device mapper for multipath. The requirement for linuxrdac comes from LSI and honestly I would prefer it over device mapper. RDAC is made for Engenio/LSI controllers. We too see much less issues here in regards of failover. (But the reason might be, that customers don't call in with device mapper issues since we don't support it. So I would be glad to hear other opinions on that.)
The 20MB device you mentioned (should have LUN ID 31 be default), is the MD's access virtual disk. It is essential for the MD's LUN mappings to work and shall never be touched! W2k8 and ESX are already able to hide that from being visible to users.
What I call a 'shadow' device (I use that name in lack of a proper name) is the way how the MD presents its virtual disks to the host. It always tells the host how many paths are available to access the particular virtual disk. (You can say it depends on the number of host ports on your MD). If the host in return is able to interpret that information, it'll create device nodes for those 'virtual usable' paths too. As far as I recall with Linux that should only happen if more than one path is connected or a rdac driver is active.
Another detail to point out is the following. If you have a MD with 2x controller modules (I'm still not sure if that is the case with yours), a virtual disk can only be owned by one of the both controllers (though both controllers are active and marketing is often referring to it as active/active, which is not entirely true). In MDSM the owner would be called 'preferred path'.
Let's say your particular virtual disk's preferred path is raid controller 0, but your host is connected to raid controller 1 (I'm not talking about the ports on SAS HBA), then MD can tell Linux there is a drive for you, but in fact it isn't because the physical path doesn't exist (it is on the other controller => 0). On access of course the MD internally tries to move the ownership to the required controller, but that not always succeeds and results in I/O errors like yours. I saw that a lot. Usually you just have to change the preferred ownership in MDSM for that virtual drive.
Once I had a look at the MD log I can tell you more,
Jens.
________________________________________
From: linux-poweredge-bounces-Lists On Behalf Of Dan Trainor [dtrainor at toolbox.com]
Sent: Thursday, January 06, 2011 7:43 PM
To: linux-poweredge-Lists
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi, Jens -
Wow, that's a lot of good information, thank you. I have a better understanding of MD now.
So in my situation, using RHEL/CentOS, I understand that I can use either linuxrdac or mptsas/mptlinux. Which is recommended more over the other? I suppose you're obligated to say that the linuxrdac driver works better, is better supported etc etc, but if I could get your opinion :)
What causes these "shadow" drives? Funny you should mention. I also did see one of these shadow drives, it was a 20M virtual that did not exist in the MD3000 itself. Don't know if that has any significance.
I'm familiar with the ramifications of using right/wrong device names in Linux, and the errors that I was getting were not indicative of this.
I juggled the cable around a little bit, switched to a different SAS port on the SAS5/e card, and I see:
Jan 6 11:20:35 vmserver02 kernel: mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 16, phy 0, sas_addr 0x50022194925cb504
Jan 6 11:20:35 vmserver02 kernel: target0:0:1: mptsas: ioc0: add device: fw_channel 0, fw_id 16, phy 0, sas_addr 0x50022194925cb504
Jan 6 11:20:35 vmserver02 kernel: Vendor: DELL Model: MD3000 Rev: 0670
Jan 6 11:20:35 vmserver02 kernel: Type: Direct-Access ANSI SCSI revision: 05
Jan 6 11:20:35 vmserver02 kernel: scsi 0:0:1:0: mptscsih: ioc0: qdepth=64, tagged=1, simple=1, ordered=0, scsi_level=6, cmd_que=1
Jan 6 11:20:35 vmserver02 kernel: scsi 0:0:1:0: Attached scsi generic sg0 type 0
Jan 6 11:20:35 vmserver02 kernel: Vendor: DELL Model: MD3000 Rev: 0670
Jan 6 11:20:35 vmserver02 kernel: Type: Direct-Access ANSI SCSI revision: 05
Jan 6 11:20:36 vmserver02 kernel: scsi 0:0:1:2: mptscsih: ioc0: qdepth=64, tagged=1, simple=1, ordered=0, scsi_level=6, cmd_que=1
Jan 6 11:20:36 vmserver02 kernel: SCSI device sda: 125829120 512-byte hdwr sectors (64425 MB)
Jan 6 11:20:36 vmserver02 kernel: sda: Write Protect is off
Jan 6 11:20:36 vmserver02 kernel: SCSI device sda: drive cache: write back w/ FUA
Jan 6 11:20:36 vmserver02 kernel: SCSI device sda: 125829120 512-byte hdwr sectors (64425 MB)
Jan 6 11:20:36 vmserver02 kernel: sda: Write Protect is off
Jan 6 11:20:36 vmserver02 kernel: SCSI device sda: drive cache: write back w/ FUA
Jan 6 11:20:36 vmserver02 kernel: sda: unknown partition table
Jan 6 11:20:36 vmserver02 kernel: sd 0:0:1:2: Attached scsi disk sda
Jan 6 11:20:36 vmserver02 kernel: sd 0:0:1:2: Attached scsi generic sg1 type 0
Jan 6 11:20:36 vmserver02 kernel: Vendor: DELL Model: Universal Xport Rev: 0670
Jan 6 11:20:36 vmserver02 kernel: Type: Direct-Access ANSI SCSI revision: 05
Jan 6 11:20:36 vmserver02 kernel: scsi 0:0:1:31: mptscsih: ioc0: qdepth=64, tagged=1, simple=1, ordered=0, scsi_level=6, cmd_que=1
Jan 6 11:20:36 vmserver02 kernel: SCSI device sdb: 40960 512-byte hdwr sectors (21 MB)
Jan 6 11:20:36 vmserver02 kernel: sdb: Write Protect is off
Jan 6 11:20:36 vmserver02 kernel: SCSI device sdb: drive cache: write through w/ FUA
Jan 6 11:20:36 vmserver02 kernel: SCSI device sdb: 40960 512-byte hdwr sectors (21 MB)
Jan 6 11:20:36 vmserver02 kernel: sdb: Write Protect is off
Jan 6 11:20:36 vmserver02 kernel: SCSI device sdb: drive cache: write through w/ FUA
Jan 6 11:20:36 vmserver02 kernel: sdb: unknown partition table
Jan 6 11:20:36 vmserver02 kernel: sd 0:0:1:31: Attached scsi disk sdb
Jan 6 11:20:36 vmserver02 kernel: sd 0:0:1:31: Attached scsi generic sg6 type 0
You'll see both drives - my 60G volume, and that mysterious 20M volume.
I'm using the mptsas driver right now. I think forcing a reset, but unplugging/plugging in the SAS cable, might have shaken things up enough so that the new driver now understands how to talk to the MD3000. I'm going to reproduce all this on a clean install to see what the deal is. Either I misunderstood how setting/resetting the SAS controller/path works, or I have a bad cable and just happened to temporarily fix it by moving it around a little bit.
Jens, I sincerely appreciate your help. I hope this also serves as some good information for others trying to figure this out, as well.
Thanks
-dant
-----Original Message-----
From: Jens_Heinz at Dell.com [mailto:Jens_Heinz at Dell.com]
Sent: Thursday, January 06, 2011 11:35 AM
To: Dan Trainor
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi,
in fact Dell supports only the linuxrdac driver for all MD3000(i) systems, but it is only available for RHEL/SLES based Linux. Debian has to use Linux multipath with RDAC device handler. The mptsas/mptlinux is only dealing as driver for the SAS HBA (SAS5/e).
But the important thing is that your MD3000 drive is sometimes presented to your OS multiple times (even if only 1x path is connected), like sdb and sdc. One or more of those sd devices are then 'shadow' devices which can't be used to access the drive. If you now use the 'wrong' device it'll result in errors. Same could happen if you have only a SAS path to the controller which is currently not owning the MD virtual disk drive.
The other post might be right, we saw problems like this caused by faulty HW aswell, but most of the time it is a misunderstanding on how MD works.
If you want me to, I could have a look at your configuration to check for those issues. I would need a MD support log (in MDSM -> Tools -> Gather Support Information) and an output of fdisk -l would be great.
Bye,
Jens.
________________________________________
From: linux-poweredge-bounces-Lists On Behalf Of Dan Trainor [dtrainor at toolbox.com]
Sent: Thursday, January 06, 2011 7:19 PM
To: linux-poweredge-Lists
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi, Jens -
So I'm reading what I interpret as conflicting information. I've seen a lot of places talk about using the mptsas driver, and a lot of people saying I need to use the linuxrdac driver. The latter, I believe, is only if I'm using any kind of multipath. Eventually that would be nice, sure, but right now I just want to get a simple configuration working and expand from there.
I'm correct in going the mptsas/mptlinux route, right?
Thanks
-dant
From: Jens_Heinz at Dell.com [mailto:Jens_Heinz at Dell.com]
Sent: Thursday, January 06, 2011 10:05 AM
To: Dan Trainor; linux-poweredge at lists.us.dell.com
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
How do you access the virtual disk (ie. using /dev/sdX device oder /dev/mapper)?
Which raid controller module the SAS cable is attached to?
Is the preferred owner of your virtual disk (on the MD3000) the same raid controller the cable is attached to?
Are you using the Dell provided linuxrdac driver?
If any possible send me a MD3000 support log to have a look at.
Regards,
Jens.
From: linux-poweredge-bounces-Lists On Behalf Of Dan Trainor
Sent: 06 January 2011 17:45
To: linux-poweredge-Lists
Subject: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi -
It would seem that everyone else has gotten this to work, except for me (but with varying degrees of success). I'm starting to think that its hardware related now, unfortunately. However, this being my first time trying to use any Dell (Or LSI?) storage products, maybe I'm simply doing it wrong.
I'm using a SAS 5/e HBA connected to an MD3000. Pretty sure the MD3k checks out fine because the other controller on the MD3k is connected to a Windows-based backup machine that also uses the MD3k. I carved out a 60G virtual disk and exposed it to this single HBA that I'm trying to use, all under CentOS 5.5.
>From reading http://www.delltechcenter.com/page/Linux+RAID+and+Storage, I understand that I'm supposed to use "mptsas, part of the mptfusion driver family". I've not found this in later EL5/CentOS5 kernels, as the article suggests. I went ahead and used LSI's provided drivers in kmod form (mptbase), and their mptlinux util/sysv script. Since I wasn't able to find any clear-cut instructions, these steps seemed logical from what I figured.
I'm able to see the exposed slice from the BIOS of the SAS5/e card before the machine boots, and I've tried different methods of verification and even gone so far as to switch between the card being enabled in BIOS, OS, or both etc etc. Whenever I try to access the slice by any means, however, I get i/o errors. This is an example of a simple mkfs.ext3 that I attempted. I didn't expect it to work - but I wanted to debug some more and figured this was the most intrusive way to do so:
Jan 5 15:32:39 vmserver02 kernel: printk: 29826 messages suppressed.
Jan 5 15:32:39 vmserver02 kernel: Buffer I/O error on device sda, logical block 14319618
Jan 5 15:32:39 vmserver02 kernel: lost page write due to I/O error on sda
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114557968
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114558992
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114560016
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114819088
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114820112
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114821136
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114822160
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 115081232
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 115082256
And, here's what I'ma ctually using:
[root at vmserver02.ops.az.domain.local x86_64]# lspci|grep "SCSI storage"
02:08.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X Fusion-MPT SAS (rev 01)
03:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 08)
I'm kind of stumped at this point. This is my second go at this over a six month period. Last time I got the same results, just got frustrated and then gave up. Hopefully with some help, though, I'll be able to get this working as I expect it to.
And, just out of curiosity, what is the difference between these two devices that lspci shows? There's only one card in there. Is one of these the actual onboard SAS controller?
Thanks!
-dant
_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge at dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq
From bond.masuda at jlbond.com Fri Jan 7 15:21:57 2011
From: bond.masuda at jlbond.com (Bond Masuda)
Date: Fri, 07 Jan 2011 13:21:57 -0800
Subject: 64GB showing up as 63.75GB on PE2900.. is this correct?
Message-ID: <1294435317.11516.14.camel@tokyo.bbky.org>
Hi List,
I have a PE2900 that runs CentOS 5.5... i just upgraded from 48GB to
64GB and had a few strange issues. These are 8GB 2Rx4 FB-DIMM PC2-5300F
modules from Kingston.
At first, when I plugged in the 8 modules, I got:
System MEmory Size: 48.0GB, System Memory Speed: 667Mhz
Error: Memory failure detected. Memory size reduced. Replace
the faulty DIMM as soon as possible.
DIMM Slot: 1 2 3 4 5 6 7 8 9 10 11 12
8GB 8GB 8GB 8GB 8GB 8GB 8GB 8GB Empty Empty Empty Empty
The FBD link to the following DIMM failed to train: DIMM 5
So, I removed DIMM 5 and decided to try to swap it with DIMM8. This time
it didn't say anything about memory failure, but it shows memory size as
63.75GB. Is this correct? I'm still questioning the stability of what
was DIMM5 (now in DIMM8), but I wanted to know if this 63.75GB is
expected behavior or not? I'm going to run a few days of memtest86+ on
this....
Any insights would be appreciated...
-Bond
From smooge at gmail.com Fri Jan 7 15:28:15 2011
From: smooge at gmail.com (Stephen John Smoogen)
Date: Fri, 7 Jan 2011 14:28:15 -0700
Subject: 64GB showing up as 63.75GB on PE2900.. is this correct?
In-Reply-To: <1294435317.11516.14.camel@tokyo.bbky.org>
References: <1294435317.11516.14.camel@tokyo.bbky.org>
Message-ID:
On Fri, Jan 7, 2011 at 14:21, Bond Masuda wrote:
> Hi List,
>
> I have a PE2900 that runs CentOS 5.5... i just upgraded from 48GB to
> 64GB and had a few strange issues. These are 8GB 2Rx4 FB-DIMM PC2-5300F
> modules from Kingston.
>
> At first, when I plugged in the 8 modules, I got:
>
> System MEmory Size: 48.0GB, System Memory Speed: 667Mhz
> Error: Memory failure detected. Memory size reduced. Replace
> the faulty DIMM as soon as possible.
>
> DIMM Slot: 1 ?2 ?3 ?4 ?5 ?6 ?7 ?8 ?9 ?10 ?11 ?12
> ? ? ? ?8GB 8GB 8GB 8GB 8GB 8GB 8GB 8GB Empty Empty Empty Empty
>
> The FBD link to the following DIMM failed to train: DIMM 5
Well the simplest test.
Take out all DIMMS except one tested good and see what the system size is.
Replace with the questionable DIMM and see what the size is.
Normally if you get something like that.. the DIMM is bad and just
replace it. The time and effort spending with memtest would be better
doing it elsewhere.
>
> So, I removed DIMM 5 and decided to try to swap it with DIMM8. This time
> it didn't say anything about memory failure, but it shows memory size as
> 63.75GB. Is this correct? I'm still questioning the stability of what
> was DIMM5 (now in DIMM8), but I wanted to know if this 63.75GB is
> expected behavior or not? I'm going to run a few days of memtest86+ on
> this....
>
> Any insights would be appreciated...
> -Bond
>
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
>
--
Stephen J Smoogen.
"The core skill of innovators is error recovery, not failure avoidance."
Randy Nelson, President of Pixar University.
"Let us be kind, one to another, for most of us are fighting a hard
battle." -- Ian MacLaren
From jgoddard at gmi-mr.com Fri Jan 7 17:29:54 2011
From: jgoddard at gmi-mr.com (Jim Goddard)
Date: Fri, 07 Jan 2011 15:29:54 -0800
Subject: SUU updates to R710 ESXi hosts failing
Message-ID: <1294442994.1717.21.camel@jgoddard>
I am running into an issue trying to update some esxi servers using the
SUU utility.
I have burned the OM 6.4 Firmware Updates DVD, and mounted the SUU
repository via nfs. The process works fine on non-vmware hosts.
However, when trying to update the firmware on the esxi hosts, the
update is failing when trying to update the "Dell Embedded OpenManage
for ESXi 4.0" package. I am wondering if this is due to the fact that
these esxi servers are 4.1, whether the OpenManage "firmware" package is
ver. 6.3-0000 (not 6.4-xxxx), both of these, or something else entirely.
I manually installed the
OM-SrvAdmin-Dell-Web-6.3.0-2075.VIB-ESX41i_A00.8.zip package to these
servers at setup time, but this does not seem to be reflected in the gui
(shows version 0 installed).
Unfortunately, this error stops the process dead in its tracks. I am
thinking I should have just ignored this update and proceeded on to see
if the rest of the updates would apply, but I panicked thinking I had
messed up the updates on the first server this happened, and was outside
of my maintenance window before I got my head straightened... :(
Any pointers on what is going on with this package?
Thanks,
Jim
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110107/4347edd4/attachment.htm
From Jens_Heinz at Dell.com Sat Jan 8 06:07:51 2011
From: Jens_Heinz at Dell.com (Jens_Heinz at Dell.com)
Date: Sat, 8 Jan 2011 13:07:51 +0100
Subject: Dell SAS 5/e, EL5/CentOS5, MD3000
In-Reply-To:
References:
<399212640037934A8F25742DA026AFD7021BC143@LEJX7ADC103.EMEA.DELL.COM>,
<399212640037934A8F25742DA026AFD70201C976@LEJX7ADC103.EMEA.DELL.COM>,
<399212640037934A8F25742DA026AFD7021BC227@LEJX7ADC103.EMEA.DELL.COM>,
Message-ID: <399212640037934A8F25742DA026AFD70201C979@LEJX7ADC103.EMEA.DELL.COM>
Well, I might have to dig a little bit deeper.
On first sight it seems very complex, but it isn't. If you know 2 basic facts about MD systems, everything will make more sense.
What I'm going to explain now does apply to all controller types, SAS, iSCSI and FC (though Dell doesn't offer FC controllers in MD arrays).
1. Every virtual disk (VD) created on a MD array is accessible/visible through both raid controller modules (RCM), BUT not at the same time. That concludes each VD can only have ONE RCM which does it own at any given time. If a request to access a VD comes trough a RCM which is not currently the owner then MD starts an attempt to move the VD to the requesting RCM, called controller failover.
2. There are 2 ways to initiate such a failover. The 1st is I/O-based (called AVT failover for Automatic Volume Transfer), the 2nd is driver based, called RDAC failover. I might be out of date on that, but Dell currently offers only RDAC failover, while other LSI partners have AVT implemented as well. The failover mode is determined by the chosen host type (see last couple of lines in your storage array profile).
Now the details ;-)
Point 2 does explain why linuxrdac is superior over DM, because it can deal with there mentioned 'limitations' out of box.
Let's say you have regular dual controller MD3000 with 4 cables connected to a host and DM enabled, one VD configured with RCM0 as preferred path and AVT enabled. In case load balancing is enabled, DM usually starts sending I/O to 1st known path, then the 2nd, 3rd etc. (I know it isn't necessarily the exact order, just to simplify my explanation.)
That is no problem as long the I/O goes to RCM0. But once the 1st I/O goes to RCM1 it forces the VD to move ownership to RCM1. By the time your VD completed its failover, the I/O cycle might be back to RCM0 and you end up with a ping-pong situation.
You will object that can only happen with AVT which isn't available on Dell's MD series, but I can assure you will run into trouble on MDs too.
In contrast the linuxrdac driver is aware of that situation and considers all non-owning paths as pure standby. Once all active paths become unavailable the driver checks for standby paths and if there is at least one it send a command (call 2C - if applicable you can see those in your MD's Major Event Log - MEL) across that path to tell the array to move the VD to the other RCM. The MD then turns amber and recovery guru tells you that event: 'VD not on preferred path'.
That is why you have to make sure to use RDAC HW handler when using DM. And also don't try any fancy load balancing policies w/o considering the consequences.
So, how we want you to setup a MD array for optimal functionality?
Have at least one path to every RCM configured and a RDAC aware multipath driver active. (Of course we only support the Dell provided drivers. But many customers are out there using Linux DM successfully.)
Hope that unravels some mysteries about MD arrays...
Last but not least to answer your questions:
- Yes, if a VD is owned by RCM0, access via RCM1 is not possible at the same time, but it can be seen from there.
- True, every VD can always be seen from both controllers (see above). So to say, that is a requirement for MD to work. Your host/driver must be aware there would be another path(s) to access the VD.
At this point I have to throw in a BUT again. We also saw situations where VDs are not initially visible when single path connected to wrong RCM, especially with Citrix XenServer. You have to configure access via owning RCM then.
- It is the other way around. The MD does not tell any driver anything. Drivers do detect that a certain path is no longer available. And if RDAC aware it sends the 2C command to move the particular VD. In non-RDAC configuration the multipath driver is just doing whatever it is supposed to do, usually sending I/O to one of the other paths and hopefully AVT is enabled then ;-)
I think that is an important thing to mention. Let's say you have many VDs configured on a MD and the owning RCM dies for some reason, the MD will do nothing in regards of moving VDs (of course it does report the failed RCM), but the VDs' ownership stay with the failed RCM until MD receives either I/O or 2C command for a particular VD. To be precise, if a VD is not used at all, it stays with the broken RCM forever.
- No, the number of hosts a VD can exposed to is limited only by the number of your licensed Storage Partitions. By default you have 16 Storage Partitions (which means 16 hosts). You can create a host group in Modular Disk Storage Manager (MDSM) and assign 2 or more hosts to it and of course one or more VDs (I don't recall the maximum number of VDs per host/host group now, but I think it is 256).
All the hosts in that host group share same VD(s) then and are able to access it. BUT keep in mind MD arrays are SANs and provide block devices to your hosts, they just don't care what happens on filesystem level. That is why we only support cluster filesystems (like OCFS, VMFS, GFS, W2k8's CSV etc.) for host groups. But if you can make 100% sure that every single byte of data on that drive is consistent at any point in time, feel free to use it whatever way suits you. Personally I strongly advise against any such constellations. And I have to tell that here ;-) It is dangerous, might cause data loss and Dell doesn't support it!
- One more thing. In situations, where a host group contains multiple non-redundant connected hosts, those connections MUST go to the same RCM. Otherwise the above ping-pong effect will kick in. (ie. host0 with a single path to RCM0 reads from VD + forces VD to move to RCM0 -> at the same time host1 with a single path to RCM1 reads from same VD + initiates a failback to RCM1, which makes VD unavailable for host0 -> and so forth and so on ...)
I really hope I covered all your questions and everything is understandable and didn't confuse you even more ;-)
Bye,
Jens.
________________________________________
From: linux-poweredge-bounces-Lists On Behalf Of Dan Trainor [dtrainor at toolbox.com]
Sent: Friday, January 07, 2011 5:53 PM
To: linux-poweredge-Lists
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi, Jens -
Once again - fantastic information. I'm going to compile a Wiki article somewhere of your brain dumps here, not only for my reference, but for others as well. This is really good stuff.
I'm going to try working with both linuxrdac and DM. I throw DM in there because I've used it before with multipathd, also for storage. That was different though - I was using all Brocade and QLogic components. I'd expect this to work the same way.
Regarding controller ownership, active/active, etc etc. If I was connected to controller 1 on the MD3000, and the volume was assigned to controller 0, no devices on controller 1 would be able to use that volume? Would I at least be able to see it - just not use it? The more you describe this, the more I think that the volume was not actually assigned properly, via preferred owner.
I think I understand that linuxrdac is preferred over DM because it seems that linuxrdac can communicate directly with the MD3000, and listen for signals such as that which would be good to know in the event that some controller died and linuxrdac needs to do its multipath thing to switch over to the other port/controller. I don't see how, using DM, a host running Linux would *know* to agree with the MD3000 that a controller or component is failing, unless both sides could agree on this fact. Given the fact that an active/active implementation does not truly exist, I think this is a very important point as to why linuxrdac would be preferred over DM. In a perfect world where nothing broke, I'd use DM, I guess.
Thanks again for your help. I'm still waiting on that log, along with some hardware. I'm thinking that the volume was not set up properly, unfortunately I did not set it up. I think I can confirm this without the log by knowing the answers to a few questions above re: ownership/assignment/visibility etc etc.
Thanks!
-dant
-----Original Message-----
From: Jens_Heinz at Dell.com [mailto:Jens_Heinz at Dell.com]
Sent: Friday, January 07, 2011 1:44 AM
To: Dan Trainor; linux-poweredge at lists.us.dell.com
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
Good morning,
always glad to help ;-)
I'll try to shine some light on everything.
First thing, you always need mptsas/mptlinux if you're using a MD3000, because it is the kernel module/driver for your SAS5/e HBA. Of course you don't need it with MD3000i (iSCSI version).
Then you'll have to decide whether to use linuxrdac or device mapper for multipath. The requirement for linuxrdac comes from LSI and honestly I would prefer it over device mapper. RDAC is made for Engenio/LSI controllers. We too see much less issues here in regards of failover. (But the reason might be, that customers don't call in with device mapper issues since we don't support it. So I would be glad to hear other opinions on that.)
The 20MB device you mentioned (should have LUN ID 31 be default), is the MD's access virtual disk. It is essential for the MD's LUN mappings to work and shall never be touched! W2k8 and ESX are already able to hide that from being visible to users.
What I call a 'shadow' device (I use that name in lack of a proper name) is the way how the MD presents its virtual disks to the host. It always tells the host how many paths are available to access the particular virtual disk. (You can say it depends on the number of host ports on your MD). If the host in return is able to interpret that information, it'll create device nodes for those 'virtual usable' paths too. As far as I recall with Linux that should only happen if more than one path is connected or a rdac driver is active.
Another detail to point out is the following. If you have a MD with 2x controller modules (I'm still not sure if that is the case with yours), a virtual disk can only be owned by one of the both controllers (though both controllers are active and marketing is often referring to it as active/active, which is not entirely true). In MDSM the owner would be called 'preferred path'.
Let's say your particular virtual disk's preferred path is raid controller 0, but your host is connected to raid controller 1 (I'm not talking about the ports on SAS HBA), then MD can tell Linux there is a drive for you, but in fact it isn't because the physical path doesn't exist (it is on the other controller => 0). On access of course the MD internally tries to move the ownership to the required controller, but that not always succeeds and results in I/O errors like yours. I saw that a lot. Usually you just have to change the preferred ownership in MDSM for that virtual drive.
Once I had a look at the MD log I can tell you more,
Jens.
________________________________________
From: linux-poweredge-bounces-Lists On Behalf Of Dan Trainor [dtrainor at toolbox.com]
Sent: Thursday, January 06, 2011 7:43 PM
To: linux-poweredge-Lists
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi, Jens -
Wow, that's a lot of good information, thank you. I have a better understanding of MD now.
So in my situation, using RHEL/CentOS, I understand that I can use either linuxrdac or mptsas/mptlinux. Which is recommended more over the other? I suppose you're obligated to say that the linuxrdac driver works better, is better supported etc etc, but if I could get your opinion :)
What causes these "shadow" drives? Funny you should mention. I also did see one of these shadow drives, it was a 20M virtual that did not exist in the MD3000 itself. Don't know if that has any significance.
I'm familiar with the ramifications of using right/wrong device names in Linux, and the errors that I was getting were not indicative of this.
I juggled the cable around a little bit, switched to a different SAS port on the SAS5/e card, and I see:
Jan 6 11:20:35 vmserver02 kernel: mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 16, phy 0, sas_addr 0x50022194925cb504
Jan 6 11:20:35 vmserver02 kernel: target0:0:1: mptsas: ioc0: add device: fw_channel 0, fw_id 16, phy 0, sas_addr 0x50022194925cb504
Jan 6 11:20:35 vmserver02 kernel: Vendor: DELL Model: MD3000 Rev: 0670
Jan 6 11:20:35 vmserver02 kernel: Type: Direct-Access ANSI SCSI revision: 05
Jan 6 11:20:35 vmserver02 kernel: scsi 0:0:1:0: mptscsih: ioc0: qdepth=64, tagged=1, simple=1, ordered=0, scsi_level=6, cmd_que=1
Jan 6 11:20:35 vmserver02 kernel: scsi 0:0:1:0: Attached scsi generic sg0 type 0
Jan 6 11:20:35 vmserver02 kernel: Vendor: DELL Model: MD3000 Rev: 0670
Jan 6 11:20:35 vmserver02 kernel: Type: Direct-Access ANSI SCSI revision: 05
Jan 6 11:20:36 vmserver02 kernel: scsi 0:0:1:2: mptscsih: ioc0: qdepth=64, tagged=1, simple=1, ordered=0, scsi_level=6, cmd_que=1
Jan 6 11:20:36 vmserver02 kernel: SCSI device sda: 125829120 512-byte hdwr sectors (64425 MB)
Jan 6 11:20:36 vmserver02 kernel: sda: Write Protect is off
Jan 6 11:20:36 vmserver02 kernel: SCSI device sda: drive cache: write back w/ FUA
Jan 6 11:20:36 vmserver02 kernel: SCSI device sda: 125829120 512-byte hdwr sectors (64425 MB)
Jan 6 11:20:36 vmserver02 kernel: sda: Write Protect is off
Jan 6 11:20:36 vmserver02 kernel: SCSI device sda: drive cache: write back w/ FUA
Jan 6 11:20:36 vmserver02 kernel: sda: unknown partition table
Jan 6 11:20:36 vmserver02 kernel: sd 0:0:1:2: Attached scsi disk sda
Jan 6 11:20:36 vmserver02 kernel: sd 0:0:1:2: Attached scsi generic sg1 type 0
Jan 6 11:20:36 vmserver02 kernel: Vendor: DELL Model: Universal Xport Rev: 0670
Jan 6 11:20:36 vmserver02 kernel: Type: Direct-Access ANSI SCSI revision: 05
Jan 6 11:20:36 vmserver02 kernel: scsi 0:0:1:31: mptscsih: ioc0: qdepth=64, tagged=1, simple=1, ordered=0, scsi_level=6, cmd_que=1
Jan 6 11:20:36 vmserver02 kernel: SCSI device sdb: 40960 512-byte hdwr sectors (21 MB)
Jan 6 11:20:36 vmserver02 kernel: sdb: Write Protect is off
Jan 6 11:20:36 vmserver02 kernel: SCSI device sdb: drive cache: write through w/ FUA
Jan 6 11:20:36 vmserver02 kernel: SCSI device sdb: 40960 512-byte hdwr sectors (21 MB)
Jan 6 11:20:36 vmserver02 kernel: sdb: Write Protect is off
Jan 6 11:20:36 vmserver02 kernel: SCSI device sdb: drive cache: write through w/ FUA
Jan 6 11:20:36 vmserver02 kernel: sdb: unknown partition table
Jan 6 11:20:36 vmserver02 kernel: sd 0:0:1:31: Attached scsi disk sdb
Jan 6 11:20:36 vmserver02 kernel: sd 0:0:1:31: Attached scsi generic sg6 type 0
You'll see both drives - my 60G volume, and that mysterious 20M volume.
I'm using the mptsas driver right now. I think forcing a reset, but unplugging/plugging in the SAS cable, might have shaken things up enough so that the new driver now understands how to talk to the MD3000. I'm going to reproduce all this on a clean install to see what the deal is. Either I misunderstood how setting/resetting the SAS controller/path works, or I have a bad cable and just happened to temporarily fix it by moving it around a little bit.
Jens, I sincerely appreciate your help. I hope this also serves as some good information for others trying to figure this out, as well.
Thanks
-dant
-----Original Message-----
From: Jens_Heinz at Dell.com [mailto:Jens_Heinz at Dell.com]
Sent: Thursday, January 06, 2011 11:35 AM
To: Dan Trainor
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi,
in fact Dell supports only the linuxrdac driver for all MD3000(i) systems, but it is only available for RHEL/SLES based Linux. Debian has to use Linux multipath with RDAC device handler. The mptsas/mptlinux is only dealing as driver for the SAS HBA (SAS5/e).
But the important thing is that your MD3000 drive is sometimes presented to your OS multiple times (even if only 1x path is connected), like sdb and sdc. One or more of those sd devices are then 'shadow' devices which can't be used to access the drive. If you now use the 'wrong' device it'll result in errors. Same could happen if you have only a SAS path to the controller which is currently not owning the MD virtual disk drive.
The other post might be right, we saw problems like this caused by faulty HW aswell, but most of the time it is a misunderstanding on how MD works.
If you want me to, I could have a look at your configuration to check for those issues. I would need a MD support log (in MDSM -> Tools -> Gather Support Information) and an output of fdisk -l would be great.
Bye,
Jens.
________________________________________
From: linux-poweredge-bounces-Lists On Behalf Of Dan Trainor [dtrainor at toolbox.com]
Sent: Thursday, January 06, 2011 7:19 PM
To: linux-poweredge-Lists
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi, Jens -
So I'm reading what I interpret as conflicting information. I've seen a lot of places talk about using the mptsas driver, and a lot of people saying I need to use the linuxrdac driver. The latter, I believe, is only if I'm using any kind of multipath. Eventually that would be nice, sure, but right now I just want to get a simple configuration working and expand from there.
I'm correct in going the mptsas/mptlinux route, right?
Thanks
-dant
From: Jens_Heinz at Dell.com [mailto:Jens_Heinz at Dell.com]
Sent: Thursday, January 06, 2011 10:05 AM
To: Dan Trainor; linux-poweredge at lists.us.dell.com
Subject: RE: Dell SAS 5/e, EL5/CentOS5, MD3000
How do you access the virtual disk (ie. using /dev/sdX device oder /dev/mapper)?
Which raid controller module the SAS cable is attached to?
Is the preferred owner of your virtual disk (on the MD3000) the same raid controller the cable is attached to?
Are you using the Dell provided linuxrdac driver?
If any possible send me a MD3000 support log to have a look at.
Regards,
Jens.
From: linux-poweredge-bounces-Lists On Behalf Of Dan Trainor
Sent: 06 January 2011 17:45
To: linux-poweredge-Lists
Subject: Dell SAS 5/e, EL5/CentOS5, MD3000
Hi -
It would seem that everyone else has gotten this to work, except for me (but with varying degrees of success). I'm starting to think that its hardware related now, unfortunately. However, this being my first time trying to use any Dell (Or LSI?) storage products, maybe I'm simply doing it wrong.
I'm using a SAS 5/e HBA connected to an MD3000. Pretty sure the MD3k checks out fine because the other controller on the MD3k is connected to a Windows-based backup machine that also uses the MD3k. I carved out a 60G virtual disk and exposed it to this single HBA that I'm trying to use, all under CentOS 5.5.
>From reading http://www.delltechcenter.com/page/Linux+RAID+and+Storage, I understand that I'm supposed to use "mptsas, part of the mptfusion driver family". I've not found this in later EL5/CentOS5 kernels, as the article suggests. I went ahead and used LSI's provided drivers in kmod form (mptbase), and their mptlinux util/sysv script. Since I wasn't able to find any clear-cut instructions, these steps seemed logical from what I figured.
I'm able to see the exposed slice from the BIOS of the SAS5/e card before the machine boots, and I've tried different methods of verification and even gone so far as to switch between the card being enabled in BIOS, OS, or both etc etc. Whenever I try to access the slice by any means, however, I get i/o errors. This is an example of a simple mkfs.ext3 that I attempted. I didn't expect it to work - but I wanted to debug some more and figured this was the most intrusive way to do so:
Jan 5 15:32:39 vmserver02 kernel: printk: 29826 messages suppressed.
Jan 5 15:32:39 vmserver02 kernel: Buffer I/O error on device sda, logical block 14319618
Jan 5 15:32:39 vmserver02 kernel: lost page write due to I/O error on sda
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114557968
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114558992
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114560016
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114819088
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114820112
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114821136
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 114822160
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 115081232
Jan 5 15:32:39 vmserver02 kernel: end_request: I/O error, dev sda, sector 115082256
And, here's what I'ma ctually using:
[root at vmserver02.ops.az.domain.local x86_64]# lspci|grep "SCSI storage"
02:08.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X Fusion-MPT SAS (rev 01)
03:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 08)
I'm kind of stumped at this point. This is my second go at this over a six month period. Last time I got the same results, just got frustrated and then gave up. Hopefully with some help, though, I'll be able to get this working as I expect it to.
And, just out of curiosity, what is the difference between these two devices that lspci shows? There's only one card in there. Is one of these the actual onboard SAS controller?
Thanks!
-dant
_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge at dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq
_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge at dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq
From Yen_Onn_Hiu at dell.com Mon Jan 10 01:24:04 2011
From: Yen_Onn_Hiu at dell.com (Yen_Onn_Hiu at dell.com)
Date: Mon, 10 Jan 2011 15:24:04 +0800
Subject: bootstrap_firmware fails
Message-ID: <02B75F9811B5BA4497B032FA7AC381AF03036EE491@PENX7MCDC102.APAC.DELL.COM>
Hi Admin,
I installed OMSA 6.4 recently and I found out that the $bootstrap_firmware is failed to return any values. Is that a bug in the python scripts? Thanks!
[root at labbldtest11 post_install]# omreport about
Product name : Server Administrator
Version : 6.4.0
Copyright : Copyright (C) Dell Inc. 1995-2010 All rights reserved.
Company : Dell Inc.
[root at labbldtest11 post_install]# cat /etc/issue
Enterprise Linux Enterprise Linux Server release 5.5 (Tikanga)
[root at penlabbldtest11 post_install]# uname -r
2.6.18-194.el5
[root at labbldtest11 post_install]# bootstrap_firmware
Traceback (most recent call last):
File "/usr/sbin/bootstrap_firmware", line 23, in ?
ftmain.main(sys.argv[1:])
File "/usr/share/firmware-tools/ftmain.py", line 89, in main
base.getOptionsConfig(args)
File "/usr/share/firmware-tools/cli.py", line 106, in getOptionsConfig
disabledPlugins=self.opts.disabledPlugins)
File "/usr/lib/python2.4/site-packages/firmwaretools/__init__.py", line 114, in _getConfig
self.setupLogging(self.loggingConfig, self.verbosity, self.trace)
File "/usr/lib/python2.4/site-packages/firmwaretools/__init__.py", line 127, in setupLogging
logging.config.fileConfig(configFile)
File "/usr/lib64/python2.4/logging/config.py", line 76, in fileConfig
flist = cp.get("formatters", "keys")
File "/usr/lib64/python2.4/ConfigParser.py", line 511, in get
raise NoSectionError(section)
[root at labbldtest11 post_install]# inventory_firmware
Traceback (most recent call last):
File "/usr/sbin/inventory_firmware", line 23, in ?
ftmain.main(sys.argv[1:])
File "/usr/share/firmware-tools/ftmain.py", line 89, in main
base.getOptionsConfig(args)
File "/usr/share/firmware-tools/cli.py", line 106, in getOptionsConfig
disabledPlugins=self.opts.disabledPlugins)
File "/usr/lib/python2.4/site-packages/firmwaretools/__init__.py", line 114, in _getConfig
self.setupLogging(self.loggingConfig, self.verbosity, self.trace)
File "/usr/lib/python2.4/site-packages/firmwaretools/__init__.py", line 127, in setupLogging
logging.config.fileConfig(configFile)
File "/usr/lib64/python2.4/logging/config.py", line 76, in fileConfig
flist = cp.get("formatters", "keys")
File "/usr/lib64/python2.4/ConfigParser.py", line 511, in get
raise NoSectionError(section)
ConfigParser.NoSectionError: No section: 'formatters'
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110110/8d178918/attachment.htm
From tr.ml at gmx.de Mon Jan 10 03:28:52 2011
From: tr.ml at gmx.de (Rainer Traut)
Date: Mon, 10 Jan 2011 10:28:52 +0100
Subject: megaraid_sas driver version on RHEL5
Message-ID: <4D2AD154.6060701@gmx.de>
Hi,
just recently I asked nearly the same question regarding megaraid_sas
driver version.
now Dell released version 00.00.04.31.2-1, A12
and in recent RHEL5.5 kernels 00.00.04.17-4.31.z-RH1
The description for the Dell driver states:
NOTE: newer releases (than EL5.5) of the above operating systems should
using the native "in-box" driver version instead.
Am I still fine with the RHEL shipped driver?
Btw:
# omreport storage controller
...
Driver Version : 00.00.04.17-4.31.z-RH1
Minimum Required Driver Version : Not Applicable
...
does not seem to check megaraid_sas driver?
Thx
Rainer
From davide.ferrari at atrapalo.com Mon Jan 10 03:45:55 2011
From: davide.ferrari at atrapalo.com (Davide Ferrari)
Date: Mon, 10 Jan 2011 10:45:55 +0100
Subject: Controller degraded but disks are ok
In-Reply-To:
References:
Message-ID: <1294652755.2603.6.camel@pc-0707-007>
On Tue, 2010-12-28 at 04:51 -0800, Mike Drzal wrote:
> Davide,
>
> This is actually the driver within the OS. Something like
> http://support.dell.com/support/downloads/format.aspx?c=us&cs=555&l=en&s=bi
> z&deviceid=13514&libid=46&releaseid=R282637&vercnt=6&formatcnt=0&SystemID=P
> WE_1950&servicetag=&os=RHEL5&osl=en&catid=-1&dateid=-1&typeid=-1&formatid=-
> 1&impid=-1&checkFormat=true is an example.
Mike,
indeed it was a kernel device driver problem. We updated to latest
2.6.26 lenny revision and now it works as expected.
Thanks.
--
Davide Ferrari
System Administrator
Atrapalo S.L.
From Vaibhav_Kumar at Dell.com Mon Jan 10 05:43:41 2011
From: Vaibhav_Kumar at Dell.com (Vaibhav_Kumar at Dell.com)
Date: Mon, 10 Jan 2011 03:43:41 -0800
Subject: megaraid_sas driver version on RHEL5
In-Reply-To: <4D2AD154.6060701@gmx.de>
References: <4D2AD154.6060701@gmx.de>
Message-ID: <46F6103325A0C04E99DF2ECDA75808031D54E77CFD@BLRX7MCDC202.AMER.DELL.COM>
----- Snip -----
# omreport storage controller
...
Driver Version : 00.00.04.17-4.31.z-RH1
Minimum Required Driver Version : Not Applicable
...
does not seem to check megaraid_sas driver?
---- Snip ------
Answer to above snip :
1. Minimum Required Driver Version will populate only when you are not meeting minimum required level. If you are already meeting the required level, it will display as "Not Applicable"
2. "Minimum Required Driver Version" in OMSS will ask customer to have that Driver _at least_ to support the features OpenManage has shipped with. There can be drivers greater than minimum one in the support site. These drivers will have fixes not directly linked with OpenManage ex- some minor performance fix. But if you have latest driver, OpenManage will behave similarly as that of Minimum Required one.
BTW, it is always good to have latest driver.
Regards
-Vaibhav
-----Original Message-----
From: linux-poweredge-bounces-Lists On Behalf Of Rainer Traut
Sent: Monday, January 10, 2011 2:59 PM
To: linux-poweredge-Lists
Subject: megaraid_sas driver version on RHEL5
Hi,
just recently I asked nearly the same question regarding megaraid_sas
driver version.
now Dell released version 00.00.04.31.2-1, A12
and in recent RHEL5.5 kernels 00.00.04.17-4.31.z-RH1
The description for the Dell driver states:
NOTE: newer releases (than EL5.5) of the above operating systems should
using the native "in-box" driver version instead.
Am I still fine with the RHEL shipped driver?
Btw:
# omreport storage controller
...
Driver Version : 00.00.04.17-4.31.z-RH1
Minimum Required Driver Version : Not Applicable
...
does not seem to check megaraid_sas driver?
Thx
Rainer
_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge at dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq
From Vaibhav_Kumar at Dell.com Mon Jan 10 05:56:06 2011
From: Vaibhav_Kumar at Dell.com (Vaibhav_Kumar at Dell.com)
Date: Mon, 10 Jan 2011 17:26:06 +0530
Subject: SUU updates to R710 ESXi hosts failing
References: <1294442994.1717.21.camel@jgoddard>
Message-ID: <46F6103325A0C04E99DF2ECDA75808031D54E77D0B@BLRX7MCDC202.AMER.DELL.COM>
Please see the below link for more info on F/W updates (DUP) support in case of ESXi
http://support.dell.com/support/edocs/software/smsom/6.3/en/peosom/change_m.htm#wp1037988
Regards
-Vaibhav
From: linux-poweredge-bounces-Lists On Behalf Of Jim Goddard
Sent: Saturday, January 08, 2011 5:00 AM
To: linux-poweredge-Lists
Subject: SUU updates to R710 ESXi hosts failing
I am running into an issue trying to update some esxi servers using the SUU utility.
I have burned the OM 6.4 Firmware Updates DVD, and mounted the SUU repository via nfs. The process works fine on non-vmware hosts. However, when trying to update the firmware on the esxi hosts, the update is failing when trying to update the "Dell Embedded OpenManage for ESXi 4.0" package. I am wondering if this is due to the fact that these esxi servers are 4.1, whether the OpenManage "firmware" package is ver. 6.3-0000 (not 6.4-xxxx), both of these, or something else entirely. I manually installed the OM-SrvAdmin-Dell-Web-6.3.0-2075.VIB-ESX41i_A00.8.zip package to these servers at setup time, but this does not seem to be reflected in the gui (shows version 0 installed).
Unfortunately, this error stops the process dead in its tracks. I am thinking I should have just ignored this update and proceeded on to see if the rest of the updates would apply, but I panicked thinking I had messed up the updates on the first server this happened, and was outside of my maintenance window before I got my head straightened... :(
Any pointers on what is going on with this package?
Thanks,
Jim
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110110/c22e655f/attachment-0001.htm
From Pietro.Abate at pps.jussieu.fr Mon Jan 10 08:22:28 2011
From: Pietro.Abate at pps.jussieu.fr (Pietro Abate)
Date: Mon, 10 Jan 2011 15:22:28 +0100
Subject: PE 1850 [OT]
Message-ID: <20110110142228.GA12472@uranium.pps.jussieu.fr>
Hello,
I'm loking after a 1850 and the other day one of the two
disks went offline. I've called DELL and the told me that
there are no more replacement parts for this model (A SCSI
disk 320Gb - the old disk is a seegate ST3146807LC).
Does anybody have experience about refurbishing these
servers with other disks ? if so, what do you reccommend ?
sorry for the offtopic here, but I cannot think of any better
forum then the linux forum for this kind of questions :)
pp
--
----
http://en.wikipedia.org/wiki/Posting_style
From Jeffrey_L_Mendoza at Dell.com Mon Jan 10 09:39:25 2011
From: Jeffrey_L_Mendoza at Dell.com (Jeffrey_L_Mendoza at Dell.com)
Date: Mon, 10 Jan 2011 09:39:25 -0600
Subject: bootstrap_firmware fails
In-Reply-To: <02B75F9811B5BA4497B032FA7AC381AF03036EE491@PENX7MCDC102.APAC.DELL.COM>
References: <02B75F9811B5BA4497B032FA7AC381AF03036EE491@PENX7MCDC102.APAC.DELL.COM>
Message-ID:
> I installed OMSA 6.4 recently and I found out that the $bootstrap_firmware
> is failed to return any values. Is that a bug in the python scripts? Thanks!
> ? File "/usr/lib64/python2.4/ConfigParser.py", line 511, in get
> ??? raise NoSectionError(section)
> ConfigParser.NoSectionError: No section: 'formatters'
Looks like your /etc/firmware/firmware.conf file is missing or corrupted. It should be installed with the firmware-tools package.
-Jeff
From Yen_Onn_Hiu at dell.com Mon Jan 10 10:14:15 2011
From: Yen_Onn_Hiu at dell.com (Yen_Onn_Hiu at dell.com)
Date: Tue, 11 Jan 2011 00:14:15 +0800
Subject: bootstrap_firmware fails
In-Reply-To:
References: <02B75F9811B5BA4497B032FA7AC381AF03036EE491@PENX7MCDC102.APAC.DELL.COM>
Message-ID: <02B75F9811B5BA4497B032FA7AC381AF03036EE815@PENX7MCDC102.APAC.DELL.COM>
Thanks for your reply. Does firmware.conf come from dell_ft_install package?
-----Original Message-----
From: Mendoza, Jeff
Sent: Monday, January 10, 2011 11:39 PM
To: Hiu, Yen Onn; linux-poweredge-Lists
Subject: RE: bootstrap_firmware fails
> I installed OMSA 6.4 recently and I found out that the
> $bootstrap_firmware is failed to return any values. Is that a bug in the python scripts? Thanks!
> ? File "/usr/lib64/python2.4/ConfigParser.py", line 511, in get
> ??? raise NoSectionError(section)
> ConfigParser.NoSectionError: No section: 'formatters'
Looks like your /etc/firmware/firmware.conf file is missing or corrupted. It should be installed with the firmware-tools package.
-Jeff
From stroller at stellar.eclipse.co.uk Mon Jan 10 11:19:31 2011
From: stroller at stellar.eclipse.co.uk (Stroller)
Date: Mon, 10 Jan 2011 17:19:31 +0000
Subject: PE 1850 [OT]
In-Reply-To: <20110110142228.GA12472@uranium.pps.jussieu.fr>
References: <20110110142228.GA12472@uranium.pps.jussieu.fr>
Message-ID: <1FDD003A-9E0C-42FA-AAE9-2ED4DFEC1A8F@stellar.eclipse.co.uk>
On 10/1/2011, at 2:22pm, Pietro Abate wrote:
> ...
> I'm loking after a 1850 and the other day one of the two
> disks went offline. I've called DELL and the told me that
> there are no more replacement parts for this model (A SCSI
> disk 320Gb - the old disk is a seegate ST3146807LC).
>
> Does anybody have experience about refurbishing these
> servers with other disks ? if so, what do you reccommend ?
Just grab any Ultra320 / 80pin SCA SCSI drive.
It's prolly much cheaper to buy two secondhand disks of this type than one new one; all SCSI are "enterprise grade" disks, so pretty reliable.
You may find that drives of another model are .5 gig smaller than the one you're replacing. In this case boot from a LiveCD, reduce the filesystem size then the partition, and post back if you have any problems.
http://www.scsishop.co.uk/acatalog/Seagate_ST3146807LC.html
http://shop.ebay.co.uk/i.html?_nkw=ST3146807LC
Stroller.
From jlar310 at gmail.com Mon Jan 10 12:40:18 2011
From: jlar310 at gmail.com (Jeff)
Date: Mon, 10 Jan 2011 12:40:18 -0600
Subject: PE 1850 [OT]
In-Reply-To: <1FDD003A-9E0C-42FA-AAE9-2ED4DFEC1A8F@stellar.eclipse.co.uk>
References: <20110110142228.GA12472@uranium.pps.jussieu.fr>
<1FDD003A-9E0C-42FA-AAE9-2ED4DFEC1A8F@stellar.eclipse.co.uk>
Message-ID:
On Mon, Jan 10, 2011 at 11:19 AM, Stroller
wrote:
>
> On 10/1/2011, at 2:22pm, Pietro Abate wrote:
>> ...
>> I'm loking after a 1850 and the other day one of the two
>> disks went offline. I've called DELL and the told me that
>> there are no more replacement parts for this model (A SCSI
>> disk 320Gb - the old disk is a seegate ST3146807LC).
>>
>> Does anybody have experience about refurbishing these
>> servers with other disks ? if so, what do you reccommend ?
>
> Just grab any Ultra320 / 80pin SCA SCSI drive.
>
+1.
I just replaced both drives in an 1850 with brand new SCSI drives
twice the size purchased from CDW (USA). The 1850 is from before Dell
started doing firmware checks in their drive controller card bios to
verify that it's a Dell branded drive.
Did they ever back off that hardware check on the newest servers?
--
Jeff
From jim at broadtime.com Mon Jan 10 18:09:47 2011
From: jim at broadtime.com (Jim Nelson)
Date: Mon, 10 Jan 2011 19:09:47 -0500
Subject: PE 1850 [OT]
In-Reply-To:
References: <20110110142228.GA12472@uranium.pps.jussieu.fr> <1FDD003A-9E0C-42FA-AAE9-2ED4DFEC1A8F@stellar.eclipse.co.uk>
Message-ID: <4D2B9FCB.7050103@broadtime.com>
On 1/10/2011 1:40 PM, Jeff wrote:
> On Mon, Jan 10, 2011 at 11:19 AM, Stroller
> wrote:
>>
>> On 10/1/2011, at 2:22pm, Pietro Abate wrote:
>>> ...
>>> I'm loking after a 1850 and the other day one of the two
>>> disks went offline. I've called DELL and the told me that
>>> there are no more replacement parts for this model (A SCSI
>>> disk 320Gb - the old disk is a seegate ST3146807LC).
>>>
>>> Does anybody have experience about refurbishing these
>>> servers with other disks ? if so, what do you reccommend ?
>>
>> Just grab any Ultra320 / 80pin SCA SCSI drive.
>>
>
>
> +1.
>
> I just replaced both drives in an 1850 with brand new SCSI drives
> twice the size purchased from CDW (USA). The 1850 is from before Dell
> started doing firmware checks in their drive controller card bios to
> verify that it's a Dell branded drive.
>
> Did they ever back off that hardware check on the newest servers?
>
> --
> Jeff
>
The 1750s we ran at my old job took generic refurbished drives IIRC. We only had to replace one before we retired all of them...
Jim
From Dell at epperson.homelinux.net Mon Jan 10 18:57:50 2011
From: Dell at epperson.homelinux.net (J. Epperson)
Date: Mon, 10 Jan 2011 19:57:50 -0500
Subject: PE 1850 [OT]
In-Reply-To: <4D2B9FCB.7050103@broadtime.com>
References: <20110110142228.GA12472@uranium.pps.jussieu.fr>
<1FDD003A-9E0C-42FA-AAE9-2ED4DFEC1A8F@stellar.eclipse.co.uk>
<4D2B9FCB.7050103@broadtime.com>
Message-ID:
> On 1/10/2011 1:40 PM, Jeff wrote:
>> On Mon, Jan 10, 2011 at 11:19 AM, Stroller
>> wrote:
>>>
>>> On 10/1/2011, at 2:22pm, Pietro Abate wrote:
>>>> ...
>>>> I'm loking after a 1850 and the other day one of the two
>>>> disks went offline. I've called DELL and the told me that
>>>> there are no more replacement parts for this model (A SCSI
>>>> disk 320Gb - the old disk is a seegate ST3146807LC).
>>>>
>>>> Does anybody have experience about refurbishing these
>>>> servers with other disks ? if so, what do you reccommend ?
>>>
>>> Just grab any Ultra320 / 80pin SCA SCSI drive.
>>>
>>
>>
>> +1.
>>
>> I just replaced both drives in an 1850 with brand new SCSI drives
>> twice the size purchased from CDW (USA). The 1850 is from before Dell
>> started doing firmware checks in their drive controller card bios to
>> verify that it's a Dell branded drive.
>>
>> Did they ever back off that hardware check on the newest servers?
>>
I don't think they backed off the check, but they made it possible to
override the restriction against other-branded drives. Permitted but not
supported, IIRC.
From stroller at stellar.eclipse.co.uk Mon Jan 10 23:44:35 2011
From: stroller at stellar.eclipse.co.uk (Stroller)
Date: Tue, 11 Jan 2011 05:44:35 +0000
Subject: DRAC 5 console with Firefox 3.0.5 on Red Hat
In-Reply-To: <47073A5E92271A409F44D18958C1AAA6AD5B7A@server13.PatechSolutions.local>
References: <47073A5E92271A409F44D18958C1AAA6AD5B7A@server13.PatechSolutions.local>
Message-ID:
On 11/3/2010, at 12:53pm, Nick Lunt wrote:
> Hi
>
> DRAC 5 console with Firefox 3.0.5 on Red Hat keeps saying I need to
> install the console re-direction plugin. I click yes, install it,
> restart firefox, but it says I need to install the console re-direction
> plugin again.
>
> Firmware Version = 1.45
> Firmware Build = 09.01.16
> Last Firmware Update = NA
> Hardware Version = A04
>
> Any way to fix this ?
>
> Cheers
> Nick
A bit of an update on this old thread: I've just managed to get the DRAC4 viewer working with Firefox 3.6.13.
I'm actually quite pleased with myself for managing this. :)
Initially I was getting the "you do not have a Java Virtual Machine (JVM) installed, or you did not accept the security credentials" message. In the terminal (starting `firefox https://192.168.1.103/`) I was getting a message about "java.util.zip.ZipException: error in opening zip file".
Looking through the source for the DRAC4's admin webpages (opening the console viewer start frame in a separate tab) I managed to find that the viewer package is at and I eventually realised to run `javaws -verbose https://192.168.1.103/vkvm.jar`.
This gave me a new error message to search for, a page full of stuff ending in "netx: Unexpected net.sourceforge.jnlp.ParseException: Invalid XML document syntax. at net.sourceforge.jnlp.Parser.getRootNode(Parser.java:1203)"
This quickly led me to this blog post: http://www.walkingrandomly.com/?p=2218
I use a different distro from that blogger, but it turns out that my distro also, by default, installs OpenJDK6 / IcedTea6 to meet the Java dependency of Firefox. Following the blogger's lead I was able to install the Sun binary JRE 1.6.0.22 and set it as my default Java browser plugin (also as my system Java VM at the same time - not sure that this is necessary).
The DRAC4 viewer worked immediately the next time I tried it!
(obviously I exited & restarted Firefox first)
I don't really use Linux on the desktop, only for servers, so this all wasn't really my normal territory - I've been futzing about with it for a couple of days and I was starting to think the DRAC4 was obsolete and that I'd never get it working.
I really REALLY wish that Dell would release the DRAC software as open-source &/or use a standard VNC viewing protocol (as discussed originally in this thread 10 months ago). I guess I understand why they don't do the former but the latter - more compatibility in general and less of these proprietary closed-source java viewer blobs - just makes sense. I should be able to point my desktop's VNC viewer at the DRAC and not have to jump through these hoops.
I'm using 32-bit x86 on the PC I'm using as a viewer; there are some notes in the DRAC4 firmware README [1] if you're on 64-bit. These refer specifically to the virtual media plug-in, but I guess they might give you some clues if you're having some other problem; then again, they might not.
TL;DR: if you're having trouble with the DRAC viewer on Linux, check you're using the Sun Java Runtime Environment, not OpenJDK, IcedTea or anything else.
Stroller.
http://ftp.us.dell.com/sysman/readme_175_A01.txt
From Richard_Walraven at inmarsat.com Tue Jan 11 00:28:03 2011
From: Richard_Walraven at inmarsat.com (Richard Walraven)
Date: Tue, 11 Jan 2011 06:28:03 +0000
Subject: Minimum Required Driver Version for PERC H700 Integrated
Message-ID: <45C69F838C2FE44295A56904EC465EE40CCA0FE950@S2035.inmarsat.local>
I had to download and un-extract the firmware drive to get my firmware upgraded because im running Fedora 13, however when do i need to due next to have omreport report that im using at least the minimum required driver?
Controller PERC H700 Integrated (Embedded)
Controllers
ID : 0
Status : Non-Critical
Name : PERC H700 Integrated
Slot ID : Embedded
State : Degraded
Firmware Version : 12.10.0-0025
Minimum Required Firmware Version : Not Applicable
Driver Version : 00.00.04.12-rc1
Minimum Required Driver Version : 00.00.04.17
Storport Driver Version : Not Applicable
Minimum Required Storport Driver Version : Not Applicable
Number of Connectors : 2
Rebuild Rate : 30%
BGI Rate : 30%
Check Consistency Rate : 30%
Reconstruct Rate : 30%
Alarm State : Not Applicable
Cluster Mode : Not Applicable
SCSI Initiator ID : Not Applicable
Cache Memory Size : 512 MB
Patrol Read Mode : Auto
Patrol Read State : Stopped
Patrol Read Rate : 30%
Patrol Read Iterations : 30
Abort check consistency on error : Disabled
Allow Revertible Hot Spare and Replace Member : Enabled
Auto replace member on predictive failure : Disabled
Load balance : Auto
Security Capable : Yes
Security Key Present : No
Redundant Path view : Not Applicable
Richard T. Walraven
This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the system manager. In accordance with Inmarsat Information Security Policy and Guidelines on Computer use, emails sent or received may be monitored. Inmarsat plc, Registered No 4886072 and Inmarsat Global Limited, Registered No. 3675885. Both Registered in England and Wales with Registered Office at 99 City Road, London EC1Y 1AX.
_____________________________________________________________________
This e-mail has been scanned for viruses by Verizon Business Internet Managed Scanning Services - powered by MessageLabs. For further information visit http://www.verizonbusiness.com/uk
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110111/958d1e04/attachment-0001.htm
From Santosh_Gore at Dell.com Tue Jan 11 00:54:35 2011
From: Santosh_Gore at Dell.com (Santosh_Gore at Dell.com)
Date: Tue, 11 Jan 2011 12:24:35 +0530
Subject: Firmware updates gone wrong
In-Reply-To: <4D1E3539.4030202@gmail.com>
References: <4D1BAB09.3020907@gmail.com>
<75F7F7632819D94BA80703D8B1F10B6D1D102276DC@BLRX7MCDC201.AMER.DELL.COM>
<4D1CBFF9.8090602@gmail.com>
<75F7F7632819D94BA80703D8B1F10B6D1D102279A6@BLRX7MCDC201.AMER.DELL.COM>
<4D1E3539.4030202@gmail.com>
Message-ID: <75F7F7632819D94BA80703D8B1F10B6D1D102C985A@BLRX7MCDC201.AMER.DELL.COM>
Hi Erinn,
Thanks for providing all information.
We have fixed rpm version number issue and NIC firmware update failure from OMSA_6.4 repository http://linux.dell.com/repo/hardware/OMSA_6.4/
Thanks
Santosh
-----Original Message-----
From: Erinn Looney-Triggs [mailto:erinn.looneytriggs at gmail.com]
Sent: Saturday, January 01, 2011 1:26 AM
To: Gore, Santosh
Cc: linux-poweredge-Lists
Subject: Re: Firmware updates gone wrong
On 12/30/2010 7:28 PM, Santosh_Gore at Dell.com wrote:
> Hi,
>
> Thanks for sharing the log messages.
>
> Version number are upward for most of rpm's only few payload rpm are having lower or same version number. We are working to resolve this as soon as possible.
>
> We will try to reproduce NIC Broadcom failure in our lab. Can you please give us platform and operating system details.
>
> Thanks
> Santosh
> -----Original Message-----
> From: Erinn Looney-Triggs [mailto:erinn.looneytriggs at gmail.com]
> Sent: Thursday, December 30, 2010 10:53 PM
> To: Gore, Santosh
> Cc: linux-poweredge-Lists
> Subject: Re: Firmware updates gone wrong
>
> On 12/29/2010 09:32 PM, Santosh_Gore at Dell.com wrote:
>> Hi Erinn
>>
>> Because of firmware rpm version number, latest firmware rpm package's were not updated on your system. Please remove existing firmware payload rpm's from your system and run bootstrap_firmware command again.
>>
>> Command to remove existing firmware packages:
>> yum remove iDRAC* 32_Bit_Diagnostics* Lifecycle_Controller* SAS_Backplane_Firmware* *ven_0x14e4*
>>
>> Please share firmware update log file "/var/log/firmware-updates.log" from system where NIC firmware update is failed.
>>
>> Thanks
>> Santosh
>> -----Original Message-----
>> From: linux-poweredge-bounces-Lists On Behalf Of Erinn Looney-Triggs
>> Sent: Thursday, December 30, 2010 3:11 AM
>> To: linux-poweredge-Lists
>> Subject: Firmware updates gone wrong
>>
>> Do many people use the update_firmware process? So far my results have
>> been spotty at best:
>>
>> The bootstrap_firmware part often has to be run twice on a system before
>> it will pick up most of the firmwares, even then it often misses stuff
>> like the DRAC cards, lifecycle controller, and 32-bit diagnostics.
>>
>> On the subject of the DRAC cards, a newer firmware version will not get
>> installed automatically because the version number is wrong in the RPM,
>> I wrote about this a bit earlier but I think I was a bit too wordy :):
>> http://lists.us.dell.com/pipermail/linux-poweredge/2010-August/042997.html.
>> The crux is that the version is listed as a00 or a02 and a new package
>> comes out that doesn't rev that version number, so you won't get new
>> DRAC firmware installs.
>>
>> Has anyone had any success at all updating the NIC firmwares via
>> update_firmware? I never have, across hundreds of systems, so just
>> wondering if I was really doing this wrong. Often this will cause
>> update_firmware to choke until I manually install the firmwares and then
>> update_firmware can continue.
>>
>> Any tips, pointers, etc. appreciated,
>>
>> -Erinn
>>
>> _______________________________________________
>> Linux-PowerEdge mailing list
>> Linux-PowerEdge at dell.com
>> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
>> Please read the FAQ at http://lists.us.dell.com/faq
> Here is the bit of the logs that is pertinent:
> 2010-08-09 13:44:32,859: Device version: 5.0.13
> 2010-08-09 13:44:32,859: Device name:
> pci_firmware(ven_0x14e4_dev_0x1639_subven_0x1028_subdev_0x0235)
> 2010-08-09 13:44:32,859: Device display name: NetXtreme II BCM5709
> Gigabit Ethernet rev 20 (eth0)
> 2010-08-09 13:44:32,859: Device version: 5.0.13
> 2010-08-09 13:44:32,860: Device name:
> pci_firmware(ven_0x14e4_dev_0x1639_subven_0x1028_subdev_0x0235)
> 2010-08-09 13:44:32,860: Device display name: NetXtreme II BCM5709
> Gigabit Ethernet rev 20 (eth2)
> 2010-08-09 13:44:32,860: Device version: 5.0.13
> 2010-08-09 13:44:32,860: Device name:
> pci_firmware(ven_0x14e4_dev_0x1639_subven_0x1028_subdev_0x0235)
> 2010-08-09 13:44:32,860: Device display name: NetXtreme II BCM5709
> Gigabit Ethernet rev 20 (eth3)
> 2010-08-09 13:44:32,860: Device version: 5.0.13
> 2010-08-09 13:48:24,043: Update result: Update failure. Firmware
> programming utility returned an error.
>
> So what I don't understand about your RPM solutions is what is the point
> of having a yum repository if all I am going to have to do every time is
> remove the RPMs manually and re-install them because the version numbers
> aren't changed up? How about changing the version number upward for
> those packages and then viola problem solved for everyone and you use
> RPMs as they were intended.
>
> -Erinn
>
OS is RHEL 5.5 x64, platform is r610s and r710s.
-Erinn
From tr.ml at gmx.de Tue Jan 11 04:27:21 2011
From: tr.ml at gmx.de (Rainer Traut)
Date: Tue, 11 Jan 2011 11:27:21 +0100
Subject: megaraid_sas driver version on RHEL5
In-Reply-To: <46F6103325A0C04E99DF2ECDA75808031D54E77CFD@BLRX7MCDC202.AMER.DELL.COM>
References: <4D2AD154.6060701@gmx.de>
<46F6103325A0C04E99DF2ECDA75808031D54E77CFD@BLRX7MCDC202.AMER.DELL.COM>
Message-ID: <4D2C3089.7060009@gmx.de>
Am 10.01.2011 12:43, schrieb Vaibhav_Kumar at Dell.com:
> ----- Snip -----
>
> # omreport storage controller
> ...
> Driver Version : 00.00.04.17-4.31.z-RH1
> Minimum Required Driver Version : Not Applicable
> ...
>
> does not seem to check megaraid_sas driver?
>
> ---- Snip ------
>
>
>
> Answer to above snip :
>
> 1. Minimum Required Driver Version will populate only when you are not meeting minimum required level. If you are already meeting the required level, it will display as "Not Applicable"
>
> 2. "Minimum Required Driver Version" in OMSS will ask customer to have that Driver _at least_ to support the features OpenManage has shipped with. There can be drivers greater than minimum one in the support site. These drivers will have fixes not directly linked with OpenManage ex- some minor performance fix. But if you have latest driver, OpenManage will behave similarly as that of Minimum Required one.
>
>
> BTW, it is always good to have latest driver.
>
>
> Regards
> -Vaibhav
>
Thx Vaibhav for making this clear.
so 1) means, omss recognises the stock RHEL5.5 driver as sufficient.
Regards
Rainer
From Vaibhav_Kumar at Dell.com Tue Jan 11 04:37:00 2011
From: Vaibhav_Kumar at Dell.com (Vaibhav_Kumar at Dell.com)
Date: Tue, 11 Jan 2011 02:37:00 -0800
Subject: megaraid_sas driver version on RHEL5
In-Reply-To: <4D2C3089.7060009@gmx.de>
References: <4D2AD154.6060701@gmx.de>
<46F6103325A0C04E99DF2ECDA75808031D54E77CFD@BLRX7MCDC202.AMER.DELL.COM>
<4D2C3089.7060009@gmx.de>
Message-ID: <46F6103325A0C04E99DF2ECDA75808031D54EDA97A@BLRX7MCDC202.AMER.DELL.COM>
Yes, here the native driver is sufficient for the features that installed OM is supporting.
-----Original Message-----
From: Rainer Traut [mailto:tr.ml at gmx.de]
Sent: Tuesday, January 11, 2011 3:57 PM
To: Kumar, Vaibhav
Cc: linux-poweredge-Lists
Subject: Re: megaraid_sas driver version on RHEL5
Am 10.01.2011 12:43, schrieb Vaibhav_Kumar at Dell.com:
> ----- Snip -----
>
> # omreport storage controller
> ...
> Driver Version : 00.00.04.17-4.31.z-RH1
> Minimum Required Driver Version : Not Applicable
> ...
>
> does not seem to check megaraid_sas driver?
>
> ---- Snip ------
>
>
>
> Answer to above snip :
>
> 1. Minimum Required Driver Version will populate only when you are not meeting minimum required level. If you are already meeting the required level, it will display as "Not Applicable"
>
> 2. "Minimum Required Driver Version" in OMSS will ask customer to have that Driver _at least_ to support the features OpenManage has shipped with. There can be drivers greater than minimum one in the support site. These drivers will have fixes not directly linked with OpenManage ex- some minor performance fix. But if you have latest driver, OpenManage will behave similarly as that of Minimum Required one.
>
>
> BTW, it is always good to have latest driver.
>
>
> Regards
> -Vaibhav
>
Thx Vaibhav for making this clear.
so 1) means, omss recognises the stock RHEL5.5 driver as sufficient.
Regards
Rainer
From brian.omahony at curamsoftware.com Tue Jan 11 04:52:46 2011
From: brian.omahony at curamsoftware.com (Brian O'Mahony)
Date: Tue, 11 Jan 2011 10:52:46 +0000
Subject: PE 1850 [OT]
In-Reply-To:
References: <20110110142228.GA12472@uranium.pps.jussieu.fr>
<1FDD003A-9E0C-42FA-AAE9-2ED4DFEC1A8F@stellar.eclipse.co.uk>
<4D2B9FCB.7050103@broadtime.com>
Message-ID: <86E8DA9E18BC2344BD0218BF23C88DF301435EE48912@MAIL06.curamsoftware.com>
-----Original Message-----
From: linux-poweredge-bounces at dell.com [mailto:linux-poweredge-bounces at dell.com] On Behalf Of J. Epperson
Sent: Tuesday, January 11, 2011 12:58 AM
To: linux-poweredge at dell.com
Subject: Re: PE 1850 [OT]
>I don't think they backed off the check, but they made it possible to override the restriction against other-branded drives. Permitted but not supported, IIRC.
>From what I understand, they introduced it, then saw the uproar and removed it a few weeks later. Then they released FW fixes for the affected controllers, removing the restriction.
B
The information in this email is confidential and may be legally privileged.
It is intended solely for the addressee. Access to this email by anyone else
is unauthorized. If you are not the intended recipient, any disclosure,
copying, distribution or any action taken or omitted to be taken in reliance
on it, is prohibited and may be unlawful. If you are not the intended
addressee please contact the sender and dispose of this e-mail. Thank you.
From ml at nicole-haehnel.de Tue Jan 11 06:27:10 2011
From: ml at nicole-haehnel.de (=?ISO-8859-1?Q?Nicole_H=E4hnel?=)
Date: Tue, 11 Jan 2011 13:27:10 +0100
Subject: Firmware updates gone wrong
In-Reply-To: <4D1BAB09.3020907@gmail.com>
References: <4D1BAB09.3020907@gmail.com>
Message-ID: <4D2C4C9E.3010600@nicole-haehnel.de>
Hi,
since yesterday I am using the update_firmware process too.
First I thought there are no firmware updates for DRAC, lifecycle
controller and 32-bit diagnostics till I read this post.
I tested with some servers.
Only after 4 - 5 times calling the bootstrap_firmware process I got no
more updates to download.
Servers are R710 and 2950 with sles10sp3, x86_64 and i586.
DRAC updates are still missing on all servers.
No updates for the 2950, although bios version is 2.6.1 and PERC 6/i is
6.2.0-0013.
Any thoughts why I have to run bootstrap_firmware a couple of times to
get all packages?
Using inventory_firmware shows different outputs and no DRAC cards:
R710 sles10sp3, x86_64, server 1:
System inventory:
BIOS = 2.2.10
SAS/SATA Backplane 0:0 Backplane Firmware = 1.07
PowerVault MD1000-0 EMM-1 Firmware = a.04
ST3146356SS Firmware = hs10
NetXtreme II BCM5709 Gigabit Ethernet rev 20 (eth1) = 6.0.1
NetXtreme II BCM5709 Gigabit Ethernet rev 20 (eth0) = 6.0.1
ST3300655SS Firmware = s52a
MBA3300RC Firmware = d306
ST3300657SS Firmware = es62
PERC 6/E Adapter Controller 1 Firmware = 6.3.0-0001
ST3450856SS Firmware = hs10
PERC 6/i Integrated Controller 0 Firmware = 6.3.0-0001
NetXtreme II BCM5709 Gigabit Ethernet rev 20 (eth2) = 6.0.1
NetXtreme II BCM5709 Gigabit Ethernet rev 20 (eth3) = 6.0.1
System BIOS for PowerEdge R710 = 2.2.10
R710 sles10sp3, x86_64, server 2:
System inventory:
BIOS = 2.2.11
SAS/SATA Backplane 0:0 Backplane Firmware = 1.07
PowerVault MD1000-0 EMM-1 Firmware = a.04
HUS154530VLS300 Firmware = b598
NetXtreme II BCM5709 Gigabit Ethernet rev 20 (eth1) = 6.0.1
NetXtreme II BCM5709 Gigabit Ethernet rev 20 (eth0) = 6.0.1
HUS154545VLS300 Firmware = d598
PERC 6/E Adapter Controller 1 Firmware = 6.3.0-0001
PERC 6/i Integrated Controller 0 Firmware = 6.3.0-0001
NetXtreme II BCM5709 Gigabit Ethernet rev 20 (eth2) = 6.0.1
NetXtreme II BCM5709 Gigabit Ethernet rev 20 (eth3) = 6.0.1
System BIOS for PowerEdge R710 = 2.2.11
2950 sles10sp3, i586, server 3:
System inventory:
System BIOS for PowerEdge 2950 = 2.6.1
R710 rehl5.5, x86_64, server 5:
System inventory:
System BIOS for PowerEdge R710 = 2.0.13
Why does inventory_firmware shows sometimes only bios?
Thanks!
Nicole
From heskin at gmail.com Tue Jan 11 07:06:08 2011
From: heskin at gmail.com (Hank)
Date: Tue, 11 Jan 2011 08:06:08 -0500
Subject: DRAC 5 console with Firefox 3.0.5 on Red Hat
In-Reply-To:
References: <47073A5E92271A409F44D18958C1AAA6AD5B7A@server13.PatechSolutions.local>
Message-ID:
I just saw this thread. In my experience using 32bit Win XP, I've had very
little problems (none, essentially) using DRAC on my 2605s, 2950, and R610s
using either Opera or Chrome and Sun Java. The plug-ins just install
themselves, and maybe a couple click-throughs on the plug-in and certificate
security warnings, but I've never had to change any security settings to get
them to work. Opera can be picky from time to time with DRAC and OMSA, but
Chrome works pretty much every time. YMMV.
-Hank
On Tue, Jan 11, 2011 at 12:44 AM, Stroller
wrote:
>
> On 11/3/2010, at 12:53pm, Nick Lunt wrote:
>
> > Hi
> >
> > DRAC 5 console with Firefox 3.0.5 on Red Hat keeps saying I need to
> > install the console re-direction plugin. I click yes, install it,
> > restart firefox, but it says I need to install the console re-direction
> > plugin again.
> >
> > Firmware Version = 1.45
> > Firmware Build = 09.01.16
> > Last Firmware Update = NA
> > Hardware Version = A04
> >
> > Any way to fix this ?
> >
> > Cheers
> > Nick
>
> A bit of an update on this old thread: I've just managed to get the DRAC4
> viewer working with Firefox 3.6.13.
> I'm actually quite pleased with myself for managing this. :)
>
> Initially I was getting the "you do not have a Java Virtual Machine (JVM)
> installed, or you did not accept the security credentials" message. In the
> terminal (starting `firefox https://192.168.1.103/`) I was getting a
> message about "java.util.zip.ZipException: error in opening zip file".
>
> Looking through the source for the DRAC4's admin webpages (opening the
> console viewer start frame in a separate tab) I managed to find that the
> viewer package is at and I eventually
> realised to run `javaws -verbose https://192.168.1.103/vkvm.jar`.
>
> This gave me a new error message to search for, a page full of stuff ending
> in "netx: Unexpected net.sourceforge.jnlp.ParseException: Invalid XML
> document syntax. at
> net.sourceforge.jnlp.Parser.getRootNode(Parser.java:1203)"
> This quickly led me to this blog post:
> http://www.walkingrandomly.com/?p=2218
>
> I use a different distro from that blogger, but it turns out that my distro
> also, by default, installs OpenJDK6 / IcedTea6 to meet the Java dependency
> of Firefox. Following the blogger's lead I was able to install the Sun
> binary JRE 1.6.0.22 and set it as my default Java browser plugin (also as my
> system Java VM at the same time - not sure that this is necessary).
>
> The DRAC4 viewer worked immediately the next time I tried it!
> (obviously I exited & restarted Firefox first)
>
> I don't really use Linux on the desktop, only for servers, so this all
> wasn't really my normal territory - I've been futzing about with it for a
> couple of days and I was starting to think the DRAC4 was obsolete and that
> I'd never get it working.
>
> I really REALLY wish that Dell would release the DRAC software as
> open-source &/or use a standard VNC viewing protocol (as discussed
> originally in this thread 10 months ago). I guess I understand why they
> don't do the former but the latter - more compatibility in general and less
> of these proprietary closed-source java viewer blobs - just makes sense. I
> should be able to point my desktop's VNC viewer at the DRAC and not have to
> jump through these hoops.
>
> I'm using 32-bit x86 on the PC I'm using as a viewer; there are some notes
> in the DRAC4 firmware README [1] if you're on 64-bit. These refer
> specifically to the virtual media plug-in, but I guess they might give you
> some clues if you're having some other problem; then again, they might not.
>
> TL;DR: if you're having trouble with the DRAC viewer on Linux, check you're
> using the Sun Java Runtime Environment, not OpenJDK, IcedTea or anything
> else.
>
> Stroller.
>
>
> http://ftp.us.dell.com/sysman/readme_175_A01.txt
>
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110111/3a413e25/attachment-0001.htm
From tr.ml at gmx.de Tue Jan 11 07:19:12 2011
From: tr.ml at gmx.de (Rainer Traut)
Date: Tue, 11 Jan 2011 14:19:12 +0100
Subject: PERC 4e/Si firmware / openmanage 6.4
In-Reply-To: <46F6103325A0C04E99DF2ECDA75808031D54E0FC96@BLRX7MCDC202.AMER.DELL.COM>
References:
<46F6103325A0C04E99DF2ECDA75808031D54E0FC96@BLRX7MCDC202.AMER.DELL.COM>
Message-ID: <4D2C58D0.2050704@gmx.de>
Am 29.12.2010 07:09, schrieb Vaibhav_Kumar at dell.com:
> Yes, your box has the latest Firmware i.e. 5B2D
>
> It's a OMSA bug getting fixed in upcoming releases.
>
> Regards
> -Vaibhav
>
> -----Original Message-----
> From: linux-poweredge-bounces-Lists On Behalf Of Stephan van Hienen
> Sent: Tuesday, December 28, 2010 10:38 PM
> To: linux-poweredge-Lists
> Subject: PERC 4e/Si firmware / openmanage 6.4
>
> After I upgraded a PowerEdge 1850 to Openmanage 6.4 I get the warning the firmware from the perc 4e/Si is outdated :
>
> ----
> Firmware version is out of date.
>
> Firmware Version 5B2D
> Minimum Required Firmware Version 522D
> ----
>
> Looking at the Dell supportsite, the latest version is 5B2D. (And I think 522D is older ?)
>
> Any hints ?
>
> Stephan
>
This does not seem to be the only bug.
PE2850 here with PERC 4e/DC; OMSA 6.4 32bit on EL5.5 x86_64.
Apart from the wrong firmware warning and showing controller as degraded
- omreport does not show virtual disks:
# omreport storage vdisk controller=0
No virtual disks found
while in Webinterface:
Virtuelle Laufwerke
Virtual Disk 0 Bereit Ausf?hren RAID-10 273.24GB
Regards
Rainer
From heskin at gmail.com Tue Jan 11 07:39:28 2011
From: heskin at gmail.com (Hank)
Date: Tue, 11 Jan 2011 08:39:28 -0500
Subject: DRAC 5 console with Firefox 3.0.5 on Red Hat
In-Reply-To:
References: <47073A5E92271A409F44D18958C1AAA6AD5B7A@server13.PatechSolutions.local>
Message-ID:
Sorry, just saw the subject that this was about RedHat. Nevermind.
-Hank
On Tue, Jan 11, 2011 at 8:06 AM, Hank wrote:
>
> I just saw this thread. In my experience using 32bit Win XP, I've had very
> little problems (none, essentially) using DRAC on my 2605s, 2950, and R610s
> using either Opera or Chrome and Sun Java. The plug-ins just install
> themselves, and maybe a couple click-throughs on the plug-in and certificate
> security warnings, but I've never had to change any security settings to get
> them to work. Opera can be picky from time to time with DRAC and OMSA, but
> Chrome works pretty much every time. YMMV.
>
> -Hank
> \
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110111/06fb679a/attachment.htm
From Vaibhav_Kumar at Dell.com Tue Jan 11 07:42:22 2011
From: Vaibhav_Kumar at Dell.com (Vaibhav_Kumar at Dell.com)
Date: Tue, 11 Jan 2011 19:12:22 +0530
Subject: PERC 4e/Si firmware / openmanage 6.4
In-Reply-To: <4D2C58D0.2050704@gmx.de>
References:
<46F6103325A0C04E99DF2ECDA75808031D54E0FC96@BLRX7MCDC202.AMER.DELL.COM>
<4D2C58D0.2050704@gmx.de>
Message-ID: <46F6103325A0C04E99DF2ECDA75808031D54EDAA7A@BLRX7MCDC202.AMER.DELL.COM>
Will check this behavior at my end and update soon.
I hope the controller ID for PERC 4e/DC you used is correct.
Regards
-Vaibhav
-----Original Message-----
From: Rainer Traut [mailto:tr.ml at gmx.de]
Sent: Tuesday, January 11, 2011 6:49 PM
To: Kumar, Vaibhav
Cc: stephan.van.hienen at thevalley.nl; linux-poweredge-Lists
Subject: Re: PERC 4e/Si firmware / openmanage 6.4
Am 29.12.2010 07:09, schrieb Vaibhav_Kumar at dell.com:
> Yes, your box has the latest Firmware i.e. 5B2D
>
> It's a OMSA bug getting fixed in upcoming releases.
>
> Regards
> -Vaibhav
>
> -----Original Message-----
> From: linux-poweredge-bounces-Lists On Behalf Of Stephan van Hienen
> Sent: Tuesday, December 28, 2010 10:38 PM
> To: linux-poweredge-Lists
> Subject: PERC 4e/Si firmware / openmanage 6.4
>
> After I upgraded a PowerEdge 1850 to Openmanage 6.4 I get the warning the firmware from the perc 4e/Si is outdated :
>
> ----
> Firmware version is out of date.
>
> Firmware Version 5B2D
> Minimum Required Firmware Version 522D
> ----
>
> Looking at the Dell supportsite, the latest version is 5B2D. (And I think 522D is older ?)
>
> Any hints ?
>
> Stephan
>
This does not seem to be the only bug.
PE2850 here with PERC 4e/DC; OMSA 6.4 32bit on EL5.5 x86_64.
Apart from the wrong firmware warning and showing controller as degraded
- omreport does not show virtual disks:
# omreport storage vdisk controller=0
No virtual disks found
while in Webinterface:
Virtuelle Laufwerke
Virtual Disk 0 Bereit Ausf?hren RAID-10 273.24GB
Regards
Rainer
From tr.ml at gmx.de Tue Jan 11 07:54:33 2011
From: tr.ml at gmx.de (Rainer Traut)
Date: Tue, 11 Jan 2011 14:54:33 +0100
Subject: PERC 4e/Si firmware / openmanage 6.4
In-Reply-To: <46F6103325A0C04E99DF2ECDA75808031D54EDAA7A@BLRX7MCDC202.AMER.DELL.COM>
References:
<46F6103325A0C04E99DF2ECDA75808031D54E0FC96@BLRX7MCDC202.AMER.DELL.COM>
<4D2C58D0.2050704@gmx.de>
<46F6103325A0C04E99DF2ECDA75808031D54EDAA7A@BLRX7MCDC202.AMER.DELL.COM>
Message-ID: <4D2C6119.9060200@gmx.de>
Am 11.01.2011 14:42, schrieb Vaibhav_Kumar at Dell.com:
> Will check this behavior at my end and update soon.
>
> I hope the controller ID for PERC 4e/DC you used is correct.
Yes, it only has one controller:
# omreport storage controller
Controller PERC 4e/DC (Slot 1)
Controllers
ID : 0
Status : Non-Critical
Name : PERC 4e/DC
Slot ID : PCI Slot 1
State : Degraded
Firmware Version : 5B2D
Minimum Required Firmware Version : 522D
--SNIP--
Spin Down Unconfigured Drives : Not Applicable
Spin Down Hot Spares : Not Applicable
#
Regards
Rainer
From Dirk.Gfroerer at guh-software.de Tue Jan 11 08:18:21 2011
From: Dirk.Gfroerer at guh-software.de (Dirk Gfroerer)
Date: Tue, 11 Jan 2011 15:18:21 +0100
Subject: PERC 4e/Si firmware / openmanage 6.4
In-Reply-To: <46F6103325A0C04E99DF2ECDA75808031D54EDAA7A@BLRX7MCDC202.AMER.DELL.COM>
References: <46F6103325A0C04E99DF2ECDA75808031D54E0FC96@BLRX7MCDC202.AMER.DELL.COM> <4D2C58D0.2050704@gmx.de>
<46F6103325A0C04E99DF2ECDA75808031D54EDAA7A@BLRX7MCDC202.AMER.DELL.COM>
Message-ID: <4D2C66AD.1000007@guh-software.de>
Am 11.01.2011 14:42, schrieb Vaibhav_Kumar at dell.com:
> Will check this behavior at my end and update soon.
>
> I hope the controller ID for PERC 4e/DC you used is correct.
I can confirm the bug Rainer is seeing.
I do get information about the controller and the physical disks. But I
don't get anything for the virtual disks.
Kind Regards,
Dirk
From johan.sjoberg at deltamanagement.se Tue Jan 11 09:19:14 2011
From: johan.sjoberg at deltamanagement.se (=?iso-8859-1?Q?Johan_Sj=F6berg?=)
Date: Tue, 11 Jan 2011 16:19:14 +0100
Subject: PERC 4e/Si firmware / openmanage 6.4
In-Reply-To: <4D2C6119.9060200@gmx.de>
References:
<46F6103325A0C04E99DF2ECDA75808031D54E0FC96@BLRX7MCDC202.AMER.DELL.COM>
<4D2C58D0.2050704@gmx.de>
<46F6103325A0C04E99DF2ECDA75808031D54EDAA7A@BLRX7MCDC202.AMER.DELL.COM>
<4D2C6119.9060200@gmx.de>
Message-ID:
Hi.
There is a possibly related bug with SNMP as well. Since the upgrade to 6.4, the OID which should return the Virtual disk layout (1.3.6.1.4.1.674.10893.1.20.140.1.1.13) does not exist. The other OIDs regarding virtual disks seem to work.
The server is a 1850 with PERC 4e/Si.
/Johan
-----Original Message-----
From: linux-poweredge-bounces at dell.com [mailto:linux-poweredge-bounces at dell.com] On Behalf Of Rainer Traut
Sent: den 11 januari 2011 14:55
To: Vaibhav_Kumar at dell.com
Cc: linux-poweredge at lists.us.dell.com
Subject: Re: PERC 4e/Si firmware / openmanage 6.4
Am 11.01.2011 14:42, schrieb Vaibhav_Kumar at Dell.com:
> Will check this behavior at my end and update soon.
>
> I hope the controller ID for PERC 4e/DC you used is correct.
Yes, it only has one controller:
# omreport storage controller
Controller PERC 4e/DC (Slot 1)
Controllers
ID : 0
Status : Non-Critical
Name : PERC 4e/DC
Slot ID : PCI Slot 1
State : Degraded
Firmware Version : 5B2D
Minimum Required Firmware Version : 522D
--SNIP--
Spin Down Unconfigured Drives : Not Applicable
Spin Down Hot Spares : Not Applicable
#
Regards
Rainer
_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge at dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq
From austin.murphy at gmail.com Tue Jan 11 10:32:11 2011
From: austin.murphy at gmail.com (Austin Murphy)
Date: Tue, 11 Jan 2011 11:32:11 -0500
Subject: Minimum Required Driver Version for PERC H700 Integrated
In-Reply-To: <45C69F838C2FE44295A56904EC465EE40CCA0FE950@S2035.inmarsat.local>
References: <45C69F838C2FE44295A56904EC465EE40CCA0FE950@S2035.inmarsat.local>
Message-ID:
Hi Richard,
On Tue, Jan 11, 2011 at 1:28 AM, Richard Walraven
wrote:
> I had to download and un-extract the firmware drive to get my firmware
> upgraded because im running Fedora 13, however when do i need to due next to
> have omreport report that im using at least the minimum required driver?
> Controller? PERC H700 Integrated (Embedded)
>
> Controllers
>
> ID??????????????????????????????????????????? : 0
> Status??????????????????????????????????????? : Non-Critical
> Name????????????????????????????????????????? : PERC H700 Integrated
> Slot ID?????????????????????????????????????? : Embedded
> State???????????????????????????????????????? : Degraded
> Firmware Version????????????????????????????? : 12.10.0-0025
> Minimum Required Firmware Version???????????? : Not Applicable
> Driver Version??? ????????????????????????????: 00.00.04.12-rc1
> Minimum Required Driver Version?????????????? : 00.00.04.17
It's the OS driver that is out of date.
I think your options are:
1. Install a version of linux with a more recent driver for this PERC.
RHEL 5.5 uses 4.17.something.
2. Use the DKMS driver provided by dell.
3. ignore the problem.
Here is the same info from one of my servers running RHEL 5.5:
# omreport storage controller
Controller PERC H700 Integrated (Embedded)
Controllers
ID : 0
Status : Ok
Name : PERC H700 Integrated
Slot ID : Embedded
State : Ready
Firmware Version : 12.10.0-0025
Minimum Required Firmware Version : Not Applicable
Driver Version : 00.00.04.17-4.31.z-RH1
Minimum Required Driver Version : Not Applicable
Austin
From icicimov at gmail.com Tue Jan 11 17:28:48 2011
From: icicimov at gmail.com (Igor Cicimov)
Date: Wed, 12 Jan 2011 10:28:48 +1100
Subject: No RAID controller in omreport - Debian Lenny on PE2500
Message-ID:
Hi all,
I have couple of questions regarding the PERC RAID controller and RAID5.
1. I have installed the OMSA 5.5.0 package from sara repositories as
recommended by couple of guys on one of my previous posts. But I can't get
any info about my PERC 2/DC RAID controller:
# omreport about
sh: /bin/rpm: No such file or directory
Product name : Information Not Available.
Version : 3.5.0
Copyright : Copyright (C) Dell Inc. 1995-2008. All rights reserved.
Company : Dell Inc.
# omreport storage controller
No controllers found
# modinfo megaraid
filename: /lib/modules/2.6.26-2-686/kernel/drivers/scsi/megaraid.ko
version: 2.00.4
license: GPL
description: LSI Logic MegaRAID legacy driver
author: sju at lsil.com
srcversion: 4F5308E80636CC8DD41C548
alias: pci:v00008086d00001960sv*sd*bc*sc*i*
alias: pci:v0000101Ed00009060sv*sd*bc*sc*i*
alias: pci:v0000101Ed00009010sv*sd*bc*sc*i*
depends: scsi_mod
vermagic: 2.6.26-2-686 SMP mod_unload modversions 686
parm: max_cmd_per_lun:Maximum number of commands which can be
issued to a single LUN (default=DEF_CMD_PER_LUN=63) (uint)
parm: max_sectors_per_io:Maximum number of sectors per I/O request
(default=MAX_SECTORS_PER_IO=128) (ushort)
parm: max_mbox_busy_wait:Maximum wait for mailbox in microseconds
if busy (default=MBOX_BUSY_WAIT=10) (ushort)
# modinfo megaraid_sas
filename:
/lib/modules/2.6.26-2-686/kernel/drivers/scsi/megaraid/megaraid_sas.ko
description: LSI MegaRAID SAS Driver
author: megaraidlinux at lsi.com
version: 00.00.04.01
license: GPL
srcversion: 49BEAEC53F4BE8F4646C64A
alias: pci:v00001028d00000015sv*sd*bc*sc*i*
alias: pci:v00001000d00000413sv*sd*bc*sc*i*
alias: pci:v00001000d00000079sv*sd*bc*sc*i*
alias: pci:v00001000d00000078sv*sd*bc*sc*i*
alias: pci:v00001000d0000007Csv*sd*bc*sc*i*
alias: pci:v00001000d00000060sv*sd*bc*sc*i*
alias: pci:v00001000d00000411sv*sd*bc*sc*i*
depends: scsi_mod
vermagic: 2.6.26-2-686 SMP mod_unload modversions 686
parm: poll_mode_io:Complete cmds from IO path, (default=0) (int)
and funny thing is that I can't even see the PCI card
# lspci
00:00.0 Host bridge: Broadcom CNB20HE Host Bridge (rev 23)
00:00.1 Host bridge: Broadcom CNB20HE Host Bridge (rev 01)
00:00.2 Host bridge: Broadcom CNB20HE Host Bridge (rev 01)
00:00.3 Host bridge: Broadcom CNB20HE Host Bridge (rev 01)
00:04.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro
100 (rev 08)
00:0e.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
00:0f.0 ISA bridge: Broadcom OSB4 South Bridge (rev 50)
00:0f.1 IDE interface: Broadcom OSB4 IDE Controller
00:0f.2 USB Controller: Broadcom OSB4/CSB5 OHCI USB Controller (rev 04)
01:02.0 PCI bridge: Intel Corporation 80960RM (i960RM) Bridge (rev 02)
01:0a.0 Token ring network controller: Madge Networks Smart 100/16/4 PCI
Ringnode (rev 01)
03:06.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro
100 (rev 0c)
03:08.0 PCI bridge: Intel Corporation 80960RP (i960RP) Microprocessor/Bridge
(rev 02)
03:08.1 I2O: Intel Corporation 80960RP (i960RP) Microprocessor (rev 02)
and it should be a PCI card as per my understanding or maybe I'm wrong???
2. What is the exact procedure for disk replacement and array rebuild? I
have some knowledge about this and picked up some bits and pieces from other
posts here but just want to make sure I'm not doing anything wrong.
Thanks a lot for any reply and guidence.
Cheers,
Igor
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110111/64528c98/attachment.htm
From stroller at stellar.eclipse.co.uk Tue Jan 11 21:12:07 2011
From: stroller at stellar.eclipse.co.uk (Stroller)
Date: Wed, 12 Jan 2011 03:12:07 +0000
Subject: No RAID controller in omreport - Debian Lenny on PE2500
In-Reply-To:
References:
Message-ID: <30656EDE-CCB0-40CE-A40C-5C124842710A@stellar.eclipse.co.uk>
On 11/1/2011, at 11:28pm, Igor Cicimov wrote:
> ...
> 1. I have installed the OMSA 5.5.0 package from sara repositories as recommended by couple of guys on one of my previous posts. But I can't get any info about my PERC 2/DC RAID controller:
I can't find your previous posts, searching my email folder for "igor" or your email address.
However, the first hit on Google for "OMSA 5.5" is this page:
http://support.dell.com/support/edocs/software/svradmin/5.5/index.htm
One thing about Dell releases, the README always has a compatibility section:
http://support.dell.com/support/edocs/software/svradmin/5.5/en/README/readme_sa.txt
The 2500 is not mentioned on that page.
> ...
> and funny thing is that I can't even see the PCI card
>
> # lspci
> 00:00.0 Host bridge: Broadcom CNB20HE Host Bridge (rev 23)
> 00:00.1 Host bridge: Broadcom CNB20HE Host Bridge (rev 01)
> 00:00.2 Host bridge: Broadcom CNB20HE Host Bridge (rev 01)
> 00:00.3 Host bridge: Broadcom CNB20HE Host Bridge (rev 01)
> 00:04.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 08)
> 00:0e.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
> 00:0f.0 ISA bridge: Broadcom OSB4 South Bridge (rev 50)
> 00:0f.1 IDE interface: Broadcom OSB4 IDE Controller
> 00:0f.2 USB Controller: Broadcom OSB4/CSB5 OHCI USB Controller (rev 04)
> 01:02.0 PCI bridge: Intel Corporation 80960RM (i960RM) Bridge (rev 02)
> 01:0a.0 Token ring network controller: Madge Networks Smart 100/16/4 PCI Ringnode (rev 01)
> 03:06.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 0c)
> 03:08.0 PCI bridge: Intel Corporation 80960RP (i960RP) Microprocessor/Bridge (rev 02)
> 03:08.1 I2O: Intel Corporation 80960RP (i960RP) Microprocessor (rev 02)
>
> and it should be a PCI card as per my understanding or maybe I'm wrong???
That does seem kinda weird.
Are you booted from this system? or are is this output from a liveCD?
> 2. What is the exact procedure for disk replacement and array rebuild? I have some knowledge about this and picked up some bits and pieces from other posts here but just want to make sure I'm not doing anything wrong.
Uh, I don't think I could manage this succinctly. It's easy when you've done it a dozen times, and I would guess many people learn to manage RAID arrays by being shown the first time. Once you understand RAID concepts, using a new RAID controller from a different manufacturer is pretty intuitive. Isn't there a manual?
If I'm understanding correctly that this is a 1000mhz Pentum III Poweredge 2500, then:
This system is pretty old. If it's your own machine, mess around with it until you understand what's going on. Don't put anything important on it, don't be afraid to to break it. That's the best way to learn.
If this machine is used in a business then update it. This machine is really too old to support properly. If you're a Linux geek and you love playing then have fun, but this machine is too old to be be supportable in any dependable way. Please excuse me for making assumptions on the basis of your name, but if you're somewhere where the exchange rate would make a brand new server prohibitively expensive (or your boss is tight with money) then at least update to a secondhand machine that's less than 5 years old. You will find those are affordable now and at least somewhat easier to support - look at newer OMSAs, see what systems are supported, see what's affordable on eBay; make sure you get spares. Play with the new system until you get comfortable with it before migrating to it.
Stroller.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110111/ef77e1ff/attachment.htm
From Dell at epperson.homelinux.net Tue Jan 11 21:24:05 2011
From: Dell at epperson.homelinux.net (J. Epperson)
Date: Tue, 11 Jan 2011 22:24:05 -0500
Subject: No RAID controller in omreport - Debian Lenny on PE2500
In-Reply-To: <30656EDE-CCB0-40CE-A40C-5C124842710A@stellar.eclipse.co.uk>
References:
<30656EDE-CCB0-40CE-A40C-5C124842710A@stellar.eclipse.co.uk>
Message-ID:
On Tue, January 11, 2011 22:12, Stroller wrote:
>
> Uh, I don't think I could manage this succinctly. It's easy when you've
> done it a dozen times, and I would guess many people learn to manage RAID
> arrays by being shown the first time. Once you understand RAID concepts,
> using a new RAID controller from a different manufacturer is pretty
> intuitive. Isn't there a manual?
>
> If I'm understanding correctly that this is a 1000mhz Pentum III Poweredge
> 2500, then:
>
> This system is pretty old. If it's your own machine, mess around with it
> until you understand what's going on. Don't put anything important on it,
> don't be afraid to to break it. That's the best way to learn.
>
> If this machine is used in a business then update it. This machine is
> really too old to support properly. If you're a Linux geek and you love
> playing then have fun, but this machine is too old to be be supportable in
> any dependable way. Please excuse me for making assumptions on the basis
> of your name, but if you're somewhere where the exchange rate would make a
> brand new server prohibitively expensive (or your boss is tight with
> money) then at least update to a secondhand machine that's less than 5
> years old. You will find those are affordable now and at least somewhat
> easier to support - look at newer OMSAs, see what systems are supported,
> see what's affordable on eBay; make sure you get spares. Play with the new
> system until you get comfortable with it before migrating to it.
>
>
Also, IIRC that PERC 2 requires the legacy megaraid driver, and I don't
even know if it's still there in current distros. You can pick up a PERC
3 on eBay for $20US or less, and it will work with current drivers. If
you just want to play with the box, as Stroller says.
From derek at umiacs.umd.edu Tue Jan 11 21:37:58 2011
From: derek at umiacs.umd.edu (Derek Yarnell)
Date: Tue, 11 Jan 2011 22:37:58 -0500
Subject: R710 2.5" mixed SAS/SATA
Message-ID: <4D2D2216.40508@umiacs.umd.edu>
Hi,
We bought a PowerEdge R710 with the 8bay 2.5" configuration. As per the
docs we bought two SAS drives in bays 0,1 and we are trying to put in 6
SSD after market drives (Crucial RealSSD C300) in the rest of the slots.
The H700 sees the drives but the chassis throws "Drive Slot sensor for
Stroage, drive fault was asserted" for each of the drives.
Anyone else doing this? It seems like everything is ok other than my
drives are blinking orange which is not optimal.
Thanks,
derek
--
---
Derek T. Yarnell
University of Maryland
Institute for Advanced Computer Studies
From icicimov at gmail.com Tue Jan 11 22:06:47 2011
From: icicimov at gmail.com (Igor Cicimov)
Date: Wed, 12 Jan 2011 15:06:47 +1100
Subject: Linux-PowerEdge Digest, Vol 80, Issue 23
In-Reply-To:
References:
Message-ID:
Thanks guys. Actually I was wrong the card is being seen by the system
# omreport system summary
sh: /bin/rpm: No such file or directory
System Summary
------------------
Software Profile
------------------
Systems Management
Name : Information not available.
Version : 3.5.0
Description : Systems Management Software
Operating System
Name : Linux
Version : Kernel 2.6.26-2-686 (i686)
System Time : Wed Jan 12 13:37:51 2011
System Bootup Time : Mon Jan 10 20:28:16 2011
--------
System
--------
System
Host Name : pe2500
System Location : Please set the value
---------------------
Main System Chassis
---------------------
Chassis Information
Chassis Model : PowerEdge 2500
Chassis Service Tag : 293Z41S
Chassis Lock : Present
Chassis Asset Tag : Unknown
Processor 1
Processor Manufacturer : Intel
Processor Family : Pentium III
Processor Version : Model 8 Stepping 10
Current Speed : 1000 MHz
Maximum Speed : 1533 MHz
External Clock Speed : 133 MHz
Voltage : 2000 mV
Processor 2
Processor Manufacturer : Intel
Processor Family : Pentium III
Processor Version : Model 8 Stepping 10
Current Speed : 1000 MHz
Maximum Speed : 1533 MHz
External Clock Speed : 133 MHz
Voltage : 2000 mV
Memory
Total Installed Capacity : 2048 MB
Memory Available to the OS : 2028 MB
Total Maximum Capacity : 6144 MB
Memory Array Count : 1
Memory Array 1
Location : System Board or Motherboard
Use : System Memory
Installed Capacity : 2048 MB
Maximum Capacity : 6144 MB
Slots Available : 6
Slots Used : 4
ECC Type : Single Bit ECC
Slot PCI1
Adapter : PRO/100 S Server Adapter
Type : PCI
Data Bus Width : 64 Bits
Speed : 33 MHz
Slot Length : Long
Voltage Supply : 3.3 Volts
Slot PCI2
Adapter : PERC 2/DC
Type : PCI
Data Bus Width : 64 Bits
Speed : 33 MHz
Slot Length : Long
Voltage Supply : 3.3 Volts
Slot PCI3
Adapter : [Not Occupied]
Type : PCI
Data Bus Width : 64 Bits
Speed : 33 MHz
Slot Length : Long
Voltage Supply : 5 Volts
Slot PCI4
Adapter : Smart 100/16/4 PCI Ringnode
Type : PCI
Data Bus Width : 64 Bits
Speed : 33 MHz
Slot Length : Long
Voltage Supply : 5 Volts
Slot PCI5
Adapter : [Not Occupied]
Type : PCI
Data Bus Width : 64 Bits
Speed : 33 MHz
Slot Length : Long
Voltage Supply : 5 Volts
Slot PCI6
Adapter : [Not Occupied]
Type : PCI
Data Bus Width : 32 Bits
Speed : 33 MHz
Slot Length : Long
Voltage Supply : 5 Volts
Slot PCI7
Adapter : [Not Occupied]
Type : PCI
Data Bus Width : 32 Bits
Speed : 33 MHz
Slot Length : Long
Voltage Supply : 5 Volts
BIOS Information
Manufacturer : Dell Inc.
Version : A05
Release Date : 08/12/2002
Firmware Information
Name : ESM firmware
Version : 5.43
Firmware Information
Name : Backplane firmware
Version : 1.29
just the controller doesn't want to play. So it's in the second PCI slot as
shown above. Also 'lspci -vv' shows that the I2O card is actually the RAID
controller
03:08.1 I2O: Intel Corporation 80960RP (i960RP) Microprocessor (rev 02)
(prog-if 01)
Subsystem: Dell PowerEdge Expandable RAID Controller 2/DC
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr-
Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
SERR- wrote:
> Send Linux-PowerEdge mailing list submissions to
> linux-poweredge at dell.com
>
> To subscribe or unsubscribe via the World Wide Web, visit
> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> or, via email, send a message with subject or body 'help' to
> linux-poweredge-request at dell.com
>
> You can reach the person managing the list at
> linux-poweredge-owner at dell.com
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Linux-PowerEdge digest..."
>
>
> Today's Topics:
>
> 1. No RAID controller in omreport - Debian Lenny on PE2500
> (Igor Cicimov)
> 2. Re: No RAID controller in omreport - Debian Lenny on PE2500
> (Stroller)
> 3. Re: No RAID controller in omreport - Debian Lenny on PE2500
> (J. Epperson)
> 4. R710 2.5" mixed SAS/SATA (Derek Yarnell)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 12 Jan 2011 10:28:48 +1100
> From: Igor Cicimov
> Subject: No RAID controller in omreport - Debian Lenny on PE2500
> To: Linux-PowerEdge at dell.com
> Message-ID:
>
> >
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi all,
>
> I have couple of questions regarding the PERC RAID controller and RAID5.
>
> 1. I have installed the OMSA 5.5.0 package from sara repositories as
> recommended by couple of guys on one of my previous posts. But I can't get
> any info about my PERC 2/DC RAID controller:
>
> # omreport about
> sh: /bin/rpm: No such file or directory
>
> Product name : Information Not Available.
> Version : 3.5.0
> Copyright : Copyright (C) Dell Inc. 1995-2008. All rights reserved.
> Company : Dell Inc.
>
> # omreport storage controller
> No controllers found
>
> # modinfo megaraid
> filename: /lib/modules/2.6.26-2-686/kernel/drivers/scsi/megaraid.ko
> version: 2.00.4
> license: GPL
> description: LSI Logic MegaRAID legacy driver
> author: sju at lsil.com
> srcversion: 4F5308E80636CC8DD41C548
> alias: pci:v00008086d00001960sv*sd*bc*sc*i*
> alias: pci:v0000101Ed00009060sv*sd*bc*sc*i*
> alias: pci:v0000101Ed00009010sv*sd*bc*sc*i*
> depends: scsi_mod
> vermagic: 2.6.26-2-686 SMP mod_unload modversions 686
> parm: max_cmd_per_lun:Maximum number of commands which can be
> issued to a single LUN (default=DEF_CMD_PER_LUN=63) (uint)
> parm: max_sectors_per_io:Maximum number of sectors per I/O
> request
> (default=MAX_SECTORS_PER_IO=128) (ushort)
> parm: max_mbox_busy_wait:Maximum wait for mailbox in microseconds
> if busy (default=MBOX_BUSY_WAIT=10) (ushort)
>
> # modinfo megaraid_sas
> filename:
> /lib/modules/2.6.26-2-686/kernel/drivers/scsi/megaraid/megaraid_sas.ko
> description: LSI MegaRAID SAS Driver
> author: megaraidlinux at lsi.com
> version: 00.00.04.01
> license: GPL
> srcversion: 49BEAEC53F4BE8F4646C64A
> alias: pci:v00001028d00000015sv*sd*bc*sc*i*
> alias: pci:v00001000d00000413sv*sd*bc*sc*i*
> alias: pci:v00001000d00000079sv*sd*bc*sc*i*
> alias: pci:v00001000d00000078sv*sd*bc*sc*i*
> alias: pci:v00001000d0000007Csv*sd*bc*sc*i*
> alias: pci:v00001000d00000060sv*sd*bc*sc*i*
> alias: pci:v00001000d00000411sv*sd*bc*sc*i*
> depends: scsi_mod
> vermagic: 2.6.26-2-686 SMP mod_unload modversions 686
> parm: poll_mode_io:Complete cmds from IO path, (default=0) (int)
>
> and funny thing is that I can't even see the PCI card
>
> # lspci
> 00:00.0 Host bridge: Broadcom CNB20HE Host Bridge (rev 23)
> 00:00.1 Host bridge: Broadcom CNB20HE Host Bridge (rev 01)
> 00:00.2 Host bridge: Broadcom CNB20HE Host Bridge (rev 01)
> 00:00.3 Host bridge: Broadcom CNB20HE Host Bridge (rev 01)
> 00:04.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro
> 100 (rev 08)
> 00:0e.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
> 00:0f.0 ISA bridge: Broadcom OSB4 South Bridge (rev 50)
> 00:0f.1 IDE interface: Broadcom OSB4 IDE Controller
> 00:0f.2 USB Controller: Broadcom OSB4/CSB5 OHCI USB Controller (rev 04)
> 01:02.0 PCI bridge: Intel Corporation 80960RM (i960RM) Bridge (rev 02)
> 01:0a.0 Token ring network controller: Madge Networks Smart 100/16/4 PCI
> Ringnode (rev 01)
> 03:06.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro
> 100 (rev 0c)
> 03:08.0 PCI bridge: Intel Corporation 80960RP (i960RP)
> Microprocessor/Bridge
> (rev 02)
> 03:08.1 I2O: Intel Corporation 80960RP (i960RP) Microprocessor (rev 02)
>
> and it should be a PCI card as per my understanding or maybe I'm wrong???
>
>
> 2. What is the exact procedure for disk replacement and array rebuild? I
> have some knowledge about this and picked up some bits and pieces from
> other
> posts here but just want to make sure I'm not doing anything wrong.
>
> Thanks a lot for any reply and guidence.
>
> Cheers,
> Igor
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
> http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110111/64528c98/attachment-0001.htm
>
> ------------------------------
>
> Message: 2
> Date: Wed, 12 Jan 2011 03:12:07 +0000
> From: Stroller
> Subject: Re: No RAID controller in omreport - Debian Lenny on PE2500
> To: linux-poweredge
> Message-ID:
> <30656EDE-CCB0-40CE-A40C-5C124842710A at stellar.eclipse.co.uk>
> Content-Type: text/plain; charset="us-ascii"
>
>
> On 11/1/2011, at 11:28pm, Igor Cicimov wrote:
> > ...
> > 1. I have installed the OMSA 5.5.0 package from sara repositories as
> recommended by couple of guys on one of my previous posts. But I can't get
> any info about my PERC 2/DC RAID controller:
>
> I can't find your previous posts, searching my email folder for "igor" or
> your email address.
>
> However, the first hit on Google for "OMSA 5.5" is this page:
> http://support.dell.com/support/edocs/software/svradmin/5.5/index.htm
>
> One thing about Dell releases, the README always has a compatibility
> section:
>
> http://support.dell.com/support/edocs/software/svradmin/5.5/en/README/readme_sa.txt
>
> The 2500 is not mentioned on that page.
>
> > ...
> > and funny thing is that I can't even see the PCI card
> >
> > # lspci
> > 00:00.0 Host bridge: Broadcom CNB20HE Host Bridge (rev 23)
> > 00:00.1 Host bridge: Broadcom CNB20HE Host Bridge (rev 01)
> > 00:00.2 Host bridge: Broadcom CNB20HE Host Bridge (rev 01)
> > 00:00.3 Host bridge: Broadcom CNB20HE Host Bridge (rev 01)
> > 00:04.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro
> 100 (rev 08)
> > 00:0e.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
> > 00:0f.0 ISA bridge: Broadcom OSB4 South Bridge (rev 50)
> > 00:0f.1 IDE interface: Broadcom OSB4 IDE Controller
> > 00:0f.2 USB Controller: Broadcom OSB4/CSB5 OHCI USB Controller (rev 04)
> > 01:02.0 PCI bridge: Intel Corporation 80960RM (i960RM) Bridge (rev 02)
> > 01:0a.0 Token ring network controller: Madge Networks Smart 100/16/4 PCI
> Ringnode (rev 01)
> > 03:06.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro
> 100 (rev 0c)
> > 03:08.0 PCI bridge: Intel Corporation 80960RP (i960RP)
> Microprocessor/Bridge (rev 02)
> > 03:08.1 I2O: Intel Corporation 80960RP (i960RP) Microprocessor (rev 02)
> >
> > and it should be a PCI card as per my understanding or maybe I'm wrong???
>
> That does seem kinda weird.
>
> Are you booted from this system? or are is this output from a liveCD?
>
> > 2. What is the exact procedure for disk replacement and array rebuild? I
> have some knowledge about this and picked up some bits and pieces from other
> posts here but just want to make sure I'm not doing anything wrong.
>
> Uh, I don't think I could manage this succinctly. It's easy when you've
> done it a dozen times, and I would guess many people learn to manage RAID
> arrays by being shown the first time. Once you understand RAID concepts,
> using a new RAID controller from a different manufacturer is pretty
> intuitive. Isn't there a manual?
>
> If I'm understanding correctly that this is a 1000mhz Pentum III Poweredge
> 2500, then:
>
> This system is pretty old. If it's your own machine, mess around with it
> until you understand what's going on. Don't put anything important on it,
> don't be afraid to to break it. That's the best way to learn.
>
> If this machine is used in a business then update it. This machine is
> really too old to support properly. If you're a Linux geek and you love
> playing then have fun, but this machine is too old to be be supportable in
> any dependable way. Please excuse me for making assumptions on the basis of
> your name, but if you're somewhere where the exchange rate would make a
> brand new server prohibitively expensive (or your boss is tight with money)
> then at least update to a secondhand machine that's less than 5 years old.
> You will find those are affordable now and at least somewhat easier to
> support - look at newer OMSAs, see what systems are supported, see what's
> affordable on eBay; make sure you get spares. Play with the new system until
> you get comfortable with it before migrating to it.
>
> Stroller.
>
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL:
> http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110111/ef77e1ff/attachment-0001.htm
>
> ------------------------------
>
> Message: 3
> Date: Tue, 11 Jan 2011 22:24:05 -0500
> From: "J. Epperson"
> Subject: Re: No RAID controller in omreport - Debian Lenny on PE2500
> To: linux-poweredge at dell.com
> Message-ID:
>
> Content-Type: text/plain;charset=iso-8859-1
>
> On Tue, January 11, 2011 22:12, Stroller wrote:
> >
> > Uh, I don't think I could manage this succinctly. It's easy when you've
> > done it a dozen times, and I would guess many people learn to manage RAID
> > arrays by being shown the first time. Once you understand RAID concepts,
> > using a new RAID controller from a different manufacturer is pretty
> > intuitive. Isn't there a manual?
> >
> > If I'm understanding correctly that this is a 1000mhz Pentum III
> Poweredge
> > 2500, then:
> >
> > This system is pretty old. If it's your own machine, mess around with it
> > until you understand what's going on. Don't put anything important on it,
> > don't be afraid to to break it. That's the best way to learn.
> >
> > If this machine is used in a business then update it. This machine is
> > really too old to support properly. If you're a Linux geek and you love
> > playing then have fun, but this machine is too old to be be supportable
> in
> > any dependable way. Please excuse me for making assumptions on the basis
> > of your name, but if you're somewhere where the exchange rate would make
> a
> > brand new server prohibitively expensive (or your boss is tight with
> > money) then at least update to a secondhand machine that's less than 5
> > years old. You will find those are affordable now and at least somewhat
> > easier to support - look at newer OMSAs, see what systems are supported,
> > see what's affordable on eBay; make sure you get spares. Play with the
> new
> > system until you get comfortable with it before migrating to it.
> >
> >
>
> Also, IIRC that PERC 2 requires the legacy megaraid driver, and I don't
> even know if it's still there in current distros. You can pick up a PERC
> 3 on eBay for $20US or less, and it will work with current drivers. If
> you just want to play with the box, as Stroller says.
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110111/f3a6347e/attachment-0001.htm
From brian.omahony at curamsoftware.com Wed Jan 12 03:58:37 2011
From: brian.omahony at curamsoftware.com (Brian O'Mahony)
Date: Wed, 12 Jan 2011 09:58:37 +0000
Subject: [OT] Openmanage Nagios check timeouts
Message-ID: <86E8DA9E18BC2344BD0218BF23C88DF301435EE4891F@MAIL06.curamsoftware.com>
I know this is slightly (wildly?) off topic, but I've been seeing a strange issue with my OpenManage nagios checks on RHEL 5 boxes for the last few months. When the servers are under heavy load, the check times out. This is the only check that does.
I am seeing this mainly on a PE2950 that is doing secure copies of about 300Gb data from various VOB servers, to itself, and then backing these up via network agent & Backup Exec.
The machine goes into very heavy load, but Im not sure why only the check_openamnage times out
Has anyone else seen this behavior?
B
The information in this email is confidential and may be legally privileged.
It is intended solely for the addressee. Access to this email by anyone else
is unauthorized. If you are not the intended recipient, any disclosure,
copying, distribution or any action taken or omitted to be taken in reliance
on it, is prohibited and may be unlawful. If you are not the intended
addressee please contact the sender and dispose of this e-mail. Thank you.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110112/f7bbc065/attachment.htm
From marius.boeru at hostway.ro Wed Jan 12 03:58:44 2011
From: marius.boeru at hostway.ro (Marius Boeru)
Date: Wed, 12 Jan 2011 11:58:44 +0200
Subject: PE 2850 CPU upgrade questions
Message-ID: <3EBE80A7-8E7F-4159-9754-819529000F32@hostway.ro>
Hello,
We Have a Dell Poweredge 2850 with 2 x Intel(R) Xeon(TM) CPU 3.20GHz and we want to upgrade to faster processors.
I found out that the Socket of the proc is 604, and after googling a little found these:
http://www.amazon.com/Processor-Intel-Xeon-3-6-Socket/dp/B000GM7MP8
http://www.amazon.com/gp/product/B0002HM98W
lshw shows the following info:
id: core
description: Motherboard
product: 0NJ023
vendor: Dell Computer Corporation
physical id: 0
version: A01
id: cpu:0
description: CPU
product: Intel(R) Xeon(TM) CPU 3.20GHz
vendor: Intel Corp.
physical id: 400
bus info: cpu at 0
version: 15.4.10
serial: 0000-0F4A-0000-0000-0000-0000
slot: PROC_1
size: 3200MHz
capacity: 3600MHz
width: 64 bits
clock: 800MHz
capabilities: boot fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx x86-64 constant_tsc pebs bts pni monitor ds_cpl cid cx16 xtpr lahf_lm
configuration:
id = 1
I just want to check and see if the information i got is correct and that proc is the best one i can find for this type of server. Or maybe buy proccesor that supports hyperthreading?
Please let me know.
Thanks,
Marius
From Dell at epperson.homelinux.net Wed Jan 12 06:15:23 2011
From: Dell at epperson.homelinux.net (J. Epperson)
Date: Wed, 12 Jan 2011 07:15:23 -0500
Subject: No RAID controller in omreport - Debian Lenny on PE2500
In-Reply-To:
References:
Message-ID: <36467aaf59b7404b31a7b77ad345e26d.squirrel@epperson.homelinux.net>
On Tue, January 11, 2011 23:06, Igor Cicimov wrote:
> Thanks guys. Actually I was wrong the card is being seen by the system
>
> # omreport system summary
> sh: /bin/rpm: No such file or directory
> System Summary
> ------------------
> Software Profile
> ------------------
> Systems Management
> Name : Information not available.
> Version : 3.5.0
> Description : Systems Management Software
> Operating System
> Name : Linux
> Version : Kernel 2.6.26-2-686 (i686)
> System Time : Wed Jan 12 13:37:51 2011
> System Bootup Time : Mon Jan 10 20:28:16 2011
> --------
> System
> --------
> System
> Host Name : pe2500
> System Location : Please set the value
> ---------------------
> Main System Chassis
> ---------------------
> Chassis Information
> Chassis Model : PowerEdge 2500
> Chassis Service Tag : 293Z41S
> Chassis Lock : Present
> Chassis Asset Tag : Unknown
> Processor 1
> Processor Manufacturer : Intel
> Processor Family : Pentium III
> Processor Version : Model 8 Stepping 10
> Current Speed : 1000 MHz
> Maximum Speed : 1533 MHz
> External Clock Speed : 133 MHz
> Voltage : 2000 mV
> Processor 2
> Processor Manufacturer : Intel
> Processor Family : Pentium III
> Processor Version : Model 8 Stepping 10
> Current Speed : 1000 MHz
> Maximum Speed : 1533 MHz
> External Clock Speed : 133 MHz
> Voltage : 2000 mV
> Memory
> Total Installed Capacity : 2048 MB
> Memory Available to the OS : 2028 MB
> Total Maximum Capacity : 6144 MB
> Memory Array Count : 1
> Memory Array 1
> Location : System Board or Motherboard
> Use : System Memory
> Installed Capacity : 2048 MB
> Maximum Capacity : 6144 MB
> Slots Available : 6
> Slots Used : 4
> ECC Type : Single Bit ECC
> Slot PCI1
> Adapter : PRO/100 S Server Adapter
> Type : PCI
> Data Bus Width : 64 Bits
> Speed : 33 MHz
> Slot Length : Long
> Voltage Supply : 3.3 Volts
> Slot PCI2
> Adapter : PERC 2/DC
> Type : PCI
> Data Bus Width : 64 Bits
> Speed : 33 MHz
> Slot Length : Long
> Voltage Supply : 3.3 Volts
> Slot PCI3
> Adapter : [Not Occupied]
> Type : PCI
> Data Bus Width : 64 Bits
> Speed : 33 MHz
> Slot Length : Long
> Voltage Supply : 5 Volts
> Slot PCI4
> Adapter : Smart 100/16/4 PCI Ringnode
> Type : PCI
> Data Bus Width : 64 Bits
> Speed : 33 MHz
> Slot Length : Long
> Voltage Supply : 5 Volts
> Slot PCI5
> Adapter : [Not Occupied]
> Type : PCI
> Data Bus Width : 64 Bits
> Speed : 33 MHz
> Slot Length : Long
> Voltage Supply : 5 Volts
> Slot PCI6
> Adapter : [Not Occupied]
> Type : PCI
> Data Bus Width : 32 Bits
> Speed : 33 MHz
> Slot Length : Long
> Voltage Supply : 5 Volts
> Slot PCI7
> Adapter : [Not Occupied]
> Type : PCI
> Data Bus Width : 32 Bits
> Speed : 33 MHz
> Slot Length : Long
> Voltage Supply : 5 Volts
> BIOS Information
> Manufacturer : Dell Inc.
> Version : A05
> Release Date : 08/12/2002
> Firmware Information
> Name : ESM firmware
> Version : 5.43
> Firmware Information
> Name : Backplane firmware
> Version : 1.29
>
> just the controller doesn't want to play. So it's in the second PCI slot
> as
> shown above. Also 'lspci -vv' shows that the I2O card is actually the RAID
> controller
>
> 03:08.1 I2O: Intel Corporation 80960RP (i960RP) Microprocessor (rev 02)
> (prog-if 01)
> Subsystem: Dell PowerEdge Expandable RAID Controller 2/DC
> Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop-
> ParErr-
> Stepping- SERR+ FastB2B- DisINTx-
> Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
> SERR- Latency: 64, Cache Line Size: 32 bytes
> Interrupt: pin A routed to IRQ 17
> Region 0: Memory at fe400000 (32-bit, prefetchable) [size=4M]
> [virtual] Expansion ROM at 80020000 [disabled] [size=32K]
> Capabilities: [80] Power Management version 2
> Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
> PME(D0-,D1-,D2-,D3hot-,D3cold-)
> Status: D0 PME-Enable- DSel=0 DScale=0 PME-
> Kernel driver in use: megaraid_legacy
> Kernel modules: megaraid, i2o_core
> The driver is obviously the megaraid_legacy Debian kernel driver and is
> managed by the megaraid and i2o_core modules.
>
> Maybe fimware upgrade od driver upgrade will help?
>
>
If you have "heaps of stuff" on a 2500, be prepared for a "steaming heap".
When I took my last one out of service ~2 years ago, there were 7
cannibalized ones in the rack with it that had contributed vital parts to
its longevity.
I have not touched a PERC 2 in a couple of years, but I seem to recall
that I20 mode was bad news for some reason(s). You may want to try
toggling out of that in the card bios, don't remember exactly where it is,
probably Objects-->Adapter.
From t.h.amundsen at usit.uio.no Wed Jan 12 06:23:31 2011
From: t.h.amundsen at usit.uio.no (Trond Hasle Amundsen)
Date: Wed, 12 Jan 2011 13:23:31 +0100
Subject: [OT] Openmanage Nagios check timeouts
In-Reply-To: <86E8DA9E18BC2344BD0218BF23C88DF301435EE4891F@MAIL06.curamsoftware.com>
(Brian O'Mahony's message of "Wed, 12 Jan 2011 09:58:37 +0000")
References: <86E8DA9E18BC2344BD0218BF23C88DF301435EE4891F@MAIL06.curamsoftware.com>
Message-ID: <15ty66q8g58.fsf@tux.uio.no>
"Brian O'Mahony" writes:
> I know this is slightly (wildly?) off topic, but I?ve been seeing a strange issue with my
> OpenManage nagios checks on RHEL 5 boxes for the last few months. When the servers are under
> heavy load, the check times out. This is the only check that does.
>
> I am seeing this mainly on a PE2950 that is doing secure copies of about 300Gb data from
> various VOB servers, to itself, and then backing these up via network agent & Backup Exec.
>
> The machine goes into very heavy load, but Im not sure why only the check_openamnage times out
>
> Has anyone else seen this behavior?
Hi Brian,
If you're running check_openmanage in local mode, it will run various
omreport commands to determine component statuses. Under heavy load,
these commands will suffer and take more time, which is normal behaviour
in any OS. Besides reducing the load on the monitored server, you have
two options:
1. Increase the plugin timeout with the '-t' option. You may also need
to increase the NRPE timeout accordingly.
2. Switch to checking via SNMP, which is more lightweight and doesn't
need as much CPU time on the monitored server.
Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
From Ian.Anderson at NRCan-RNCan.gc.ca Wed Jan 12 08:28:39 2011
From: Ian.Anderson at NRCan-RNCan.gc.ca (Anderson, Ian)
Date: Wed, 12 Jan 2011 09:28:39 -0500
Subject: Bug in OMSA script on a minimal install RHEL/CentOS
Message-ID:
There is a small bug/omission in the following OMSA bootstrap.cgi
script:
http://linux.dell.com/repo/hardware/OMSA_6.4/bootstrap.cgi
On line 100:
email=$(gpg -v ${GPG_FN} 2>/dev/null | grep 1024D | perl -p -i -e
's/.*<(.*)>/\1/')
gpg is used for extracting the email address from the pgp key.
The problem is that a RedHat/CentOS minimal install does not install
gnupg by default. This results in the email line returning nothing,
which in turn causes line 104 in the bootstrap.cgi script to return
success for any pgp key that may be installed, which finally causes the
dell pgp keys not to be installed, because the script thinks the already
are.
This then causes the script to fail when it gets to the point of trying
to verify the signature of the dell-omsa-repository-2-5.noarch.rpm
I'm not sure what is easier, to add a check for the presence of gnupg at
the beginning of the script or to modify the script not to use gpg.
Ian
From daniele-ml at libertyline.it Wed Jan 12 09:07:01 2011
From: daniele-ml at libertyline.it (Daniele Paoni)
Date: Wed, 12 Jan 2011 16:07:01 +0100
Subject: R710 SAS 6/iR Integrated: omreport says the driver is old
Message-ID: <4D2DC395.5050600@libertyline.it>
Hello I have a problem with a Dell R710 with SAS 6/iR Integrated controller.
The system is a CentOS 5.5 with the latest updates and OMSA 6.4
installed from the repository,
I have installed the latest driver found on the site (
mptlinux-4.00.38.02) but omreport still says that the SAS 6/iR is Degraded.
[root at web ~]# omreport storage controller
Controller SAS 6/iR Integrated (Embedded)
Controllers
ID : 0
Status : Non-Critical
Name : SAS 6/iR Integrated
Slot ID : Embedded
State : Degraded
Firmware Version : 00.25.47.00.06.22.03.00
Minimum Required Firmware Version : Not Applicable
Driver Version : 3.04.13rh
Minimum Required Driver Version : 3.12.29.00
Storport Driver Version : Not Applicable
Minimum Required Storport Driver Version : Not Applicable
Number of Connectors : 2
[root at web ~]# /sbin/modinfo mptsas
filename: /lib/modules/2.6.18-194.32.1.el5/extra/mptsas.ko
version: 4.00.38.02
license: GPL
description: Fusion MPT SAS Host driver
author: LSI Corporation
srcversion: E51EE0D539AEBDC23494300
alias: pci:v00001000d00000062sv*sd*bc*sc*i*
alias: pci:v00001000d00000058sv*sd*bc*sc*i*
alias: pci:v00001000d00000056sv*sd*bc*sc*i*
alias: pci:v00001000d00000054sv*sd*bc*sc*i*
alias: pci:v00001000d00000050sv*sd*bc*sc*i*
depends:
mptscsih,scsi_mod,mptbase,scsi_transport_sas,mptbase,mptscsih
vermagic: 2.6.18-194.32.1.el5 SMP mod_unload gcc-4.1
parm: mpt_pt_clear: Clear persistency table: enable=1
(default=MPTSCSIH_PT_CLEAR=0) (int)
parm: mpt_cmd_retry_count: Device discovery TUR command retry
count: default=144 (int)
parm: mpt_disable_hotplug_remove: Disable hotpug remove
events: default=0 (int)
parm: mpt_sdev_queue_depth: Max Device Queue Depth (default=64)
The driver seems ok but omreport reports a different version (3.04.13rh
instead of 4.00.38.02)
Am I checking the wrong driver?
Regards
Daniele
From robin-lists at robinbowes.com Wed Jan 12 13:55:38 2011
From: robin-lists at robinbowes.com (Robin Bowes)
Date: Wed, 12 Jan 2011 19:55:38 +0000
Subject: R710 2.5" mixed SAS/SATA
In-Reply-To: <4D2D2216.40508@umiacs.umd.edu>
References: <4D2D2216.40508@umiacs.umd.edu>
Message-ID: <4D2E073A.3010205@robinbowes.com>
On 12/01/11 03:37, Derek Yarnell wrote:
> Hi,
>
> We bought a PowerEdge R710 with the 8bay 2.5" configuration. As per the
> docs we bought two SAS drives in bays 0,1 and we are trying to put in 6
> SSD after market drives (Crucial RealSSD C300) in the rest of the slots.
> The H700 sees the drives but the chassis throws "Drive Slot sensor for
> Stroage, drive fault was asserted" for each of the drives.
>
> Anyone else doing this? It seems like everything is ok other than my
> drives are blinking orange which is not optimal.
Just a hunch, but make sure you update the H700 firmware. Specifically,
make sure you have the one that allows you to use non-Dell drives.
R.
--
"Feed that ego and you starve the soul" - Colonel J.D. Wilkes
http://www.theshackshakers.com/
From tcooper at ucsd.edu Wed Jan 12 14:26:24 2011
From: tcooper at ucsd.edu (Trevor Cooper)
Date: Wed, 12 Jan 2011 12:26:24 -0800
Subject: R710 SAS 6/iR Integrated: omreport says the driver is old
In-Reply-To: <4D2DC395.5050600@libertyline.it>
References: <4D2DC395.5050600@libertyline.it>
Message-ID: <4D2E0E70.5010300@ucsd.edu>
On 01/12/11 07:07, Daniele Paoni wrote:
> Hello I have a problem with a Dell R710 with SAS 6/iR Integrated controller.
>
> The system is a CentOS 5.5 with the latest updates and OMSA 6.4
> installed from the repository,
>
> I have installed the latest driver found on the site (
> mptlinux-4.00.38.02) but omreport still says that the SAS 6/iR is Degraded.
>
> ...
>
> The driver seems ok but omreport reports a different version (3.04.13rh
> instead of 4.00.38.02)
>
> Am I checking the wrong driver?
>
> Regards
> Daniele
I think you're likely doing everything correctly but are running into an
unreported error during the build/install.
I had the same problem.
Building/installing the driver manually and rebooting seemed to make
everything 'appear' correctly...
[admin at XXXX ~]# sudo rpm -ivh mptlinux-4.00.38.02-3dkms.noarch.rpm
[admin at XXXX ~]# sudo shutdown -r now
After reboot versions did NOT match. So...
[root at XXXX ~]# dkms build -m mptlinux -v 4.00.38.02
[root at XXXX ~]# dkms install -m mptlinux -v 4.00.38.02
[root at XXXX ~]# shutdown -r now
After reboot...
[root at XXXX ~]# modinfo mptsas
filename: /lib/modules/2.6.18-194.26.1.el5/extra/mptsas.ko
version: 4.00.38.02
license: GPL
description: Fusion MPT SAS Host driver
author: LSI Corporation
srcversion: E51EE0D539AEBDC23494300
alias: pci:v00001000d00000062sv*sd*bc*sc*i*
alias: pci:v00001000d00000058sv*sd*bc*sc*i*
alias: pci:v00001000d00000056sv*sd*bc*sc*i*
alias: pci:v00001000d00000054sv*sd*bc*sc*i*
alias: pci:v00001000d00000050sv*sd*bc*sc*i*
depends:
mptscsih,scsi_mod,mptbase,scsi_transport_sas,mptbase,mptscsih
vermagic: 2.6.18-194.26.1.el5 SMP mod_unload gcc-4.1
parm: mpt_pt_clear: Clear persistency table: enable=1
(default=MPTSCSIH_PT_CLEAR=0) (int)
parm: mpt_cmd_retry_count: Device discovery TUR command retry
count: default=144 (int)
parm: mpt_disable_hotplug_remove: Disable hotpug remove
events: default=0 (int)
parm: mpt_sdev_queue_depth: Max Device Queue Depth (default=64)
parm: max_lun: max lun, default=16895 (int)
[root at XXXX ~]# dkms status
mptlinux, 4.00.38.02, 2.6.18-194.26.1.el5, x86_64: installed
(original_module exists)
mptlinux, 4.00.38.02, 2.6.18-194.26.1.el5, x86_64: installed-weak from
2.6.18-194.17.4.el5
[admin at XXXX ~]$ omreport storage controller controller=2
Controller SAS 6/iR Integrated (Embedded)
Controllers
ID : 2
Status : Ok
Name : SAS 6/iR Integrated
Slot ID : Embedded
State : Ready
Firmware Version : 00.25.47.00.06.22.03.00
Driver Version : 4.00.38.02
Versions match and OMSA no longer reports a degraded state.
Good luck,
Trevor
--
Trevor Cooper, M.Sc.
Data Systems Programmer / System Administrator
University of California, San Diego
Multimodal Imaging Laboratory
8950 Villa La Jolla Dr., Suite C101
La Jolla, CA 92037
Phone: (858) 822-4330
Fax: (858) 534-1078
From icicimov at gmail.com Wed Jan 12 16:43:37 2011
From: icicimov at gmail.com (Igor Cicimov)
Date: Thu, 13 Jan 2011 09:43:37 +1100
Subject: Linux-PowerEdge Digest, Vol 80, Issue 25
In-Reply-To:
References:
Message-ID:
Thanks for the tip I appreciate your help. Regarding replacing the PERC 2/DC
card with PERC 3/DC lets say ... do I need new cables or the same cables
from the PERC2 card are good for PERC3 too (guess the connectors are
standard but just to make sure)? Also when I connect the new PERC3 card will
it recognize the RAID5 and virtual disk structure and simply take over and
configure it self or I have to do it myself?
Sorry for the format of the reply but I get the emails in the digest form so
can't reply to a particular message.
>
> Message: 3
> Date: Wed, 12 Jan 2011 07:15:23 -0500
> From: "J. Epperson"
> Subject: Re: No RAID controller in omreport - Debian Lenny on PE2500
> To: linux-poweredge at dell.com
> Message-ID:
> <36467aaf59b7404b31a7b77ad345e26d.squirrel at epperson.homelinux.net>
> Content-Type: text/plain;charset=iso-8859-1
>
> On Tue, January 11, 2011 23:06, Igor Cicimov wrote:
> > Thanks guys. Actually I was wrong the card is being seen by the system
> >
> > # omreport system summary
> > sh: /bin/rpm: No such file or directory
> > System Summary
> > ------------------
> > Software Profile
> > ------------------
> > Systems Management
> > Name : Information not available.
> > Version : 3.5.0
> > Description : Systems Management Software
> > Operating System
> > Name : Linux
> > Version : Kernel 2.6.26-2-686 (i686)
> > System Time : Wed Jan 12 13:37:51 2011
> > System Bootup Time : Mon Jan 10 20:28:16 2011
> > --------
> > System
> > --------
> > System
> > Host Name : pe2500
> > System Location : Please set the value
> > ---------------------
> > Main System Chassis
> > ---------------------
> > Chassis Information
> > Chassis Model : PowerEdge 2500
> > Chassis Service Tag : 293Z41S
> > Chassis Lock : Present
> > Chassis Asset Tag : Unknown
> > Processor 1
> > Processor Manufacturer : Intel
> > Processor Family : Pentium III
> > Processor Version : Model 8 Stepping 10
> > Current Speed : 1000 MHz
> > Maximum Speed : 1533 MHz
> > External Clock Speed : 133 MHz
> > Voltage : 2000 mV
> > Processor 2
> > Processor Manufacturer : Intel
> > Processor Family : Pentium III
> > Processor Version : Model 8 Stepping 10
> > Current Speed : 1000 MHz
> > Maximum Speed : 1533 MHz
> > External Clock Speed : 133 MHz
> > Voltage : 2000 mV
> > Memory
> > Total Installed Capacity : 2048 MB
> > Memory Available to the OS : 2028 MB
> > Total Maximum Capacity : 6144 MB
> > Memory Array Count : 1
> > Memory Array 1
> > Location : System Board or Motherboard
> > Use : System Memory
> > Installed Capacity : 2048 MB
> > Maximum Capacity : 6144 MB
> > Slots Available : 6
> > Slots Used : 4
> > ECC Type : Single Bit ECC
> > Slot PCI1
> > Adapter : PRO/100 S Server Adapter
> > Type : PCI
> > Data Bus Width : 64 Bits
> > Speed : 33 MHz
> > Slot Length : Long
> > Voltage Supply : 3.3 Volts
> > Slot PCI2
> > Adapter : PERC 2/DC
> > Type : PCI
> > Data Bus Width : 64 Bits
> > Speed : 33 MHz
> > Slot Length : Long
> > Voltage Supply : 3.3 Volts
> > Slot PCI3
> > Adapter : [Not Occupied]
> > Type : PCI
> > Data Bus Width : 64 Bits
> > Speed : 33 MHz
> > Slot Length : Long
> > Voltage Supply : 5 Volts
> > Slot PCI4
> > Adapter : Smart 100/16/4 PCI Ringnode
> > Type : PCI
> > Data Bus Width : 64 Bits
> > Speed : 33 MHz
> > Slot Length : Long
> > Voltage Supply : 5 Volts
> > Slot PCI5
> > Adapter : [Not Occupied]
> > Type : PCI
> > Data Bus Width : 64 Bits
> > Speed : 33 MHz
> > Slot Length : Long
> > Voltage Supply : 5 Volts
> > Slot PCI6
> > Adapter : [Not Occupied]
> > Type : PCI
> > Data Bus Width : 32 Bits
> > Speed : 33 MHz
> > Slot Length : Long
> > Voltage Supply : 5 Volts
> > Slot PCI7
> > Adapter : [Not Occupied]
> > Type : PCI
> > Data Bus Width : 32 Bits
> > Speed : 33 MHz
> > Slot Length : Long
> > Voltage Supply : 5 Volts
> > BIOS Information
> > Manufacturer : Dell Inc.
> > Version : A05
> > Release Date : 08/12/2002
> > Firmware Information
> > Name : ESM firmware
> > Version : 5.43
> > Firmware Information
> > Name : Backplane firmware
> > Version : 1.29
> >
> > just the controller doesn't want to play. So it's in the second PCI slot
> > as
> > shown above. Also 'lspci -vv' shows that the I2O card is actually the
> RAID
> > controller
> >
> > 03:08.1 I2O: Intel Corporation 80960RP (i960RP) Microprocessor (rev 02)
> > (prog-if 01)
> > Subsystem: Dell PowerEdge Expandable RAID Controller 2/DC
> > Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop-
> > ParErr-
> > Stepping- SERR+ FastB2B- DisINTx-
> > Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
> > SERR- > Latency: 64, Cache Line Size: 32 bytes
> > Interrupt: pin A routed to IRQ 17
> > Region 0: Memory at fe400000 (32-bit, prefetchable) [size=4M]
> > [virtual] Expansion ROM at 80020000 [disabled] [size=32K]
> > Capabilities: [80] Power Management version 2
> > Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
> > PME(D0-,D1-,D2-,D3hot-,D3cold-)
> > Status: D0 PME-Enable- DSel=0 DScale=0 PME-
> > Kernel driver in use: megaraid_legacy
> > Kernel modules: megaraid, i2o_core
> > The driver is obviously the megaraid_legacy Debian kernel driver and is
> > managed by the megaraid and i2o_core modules.
> >
> > Maybe fimware upgrade od driver upgrade will help?
> >
> >
>
> If you have "heaps of stuff" on a 2500, be prepared for a "steaming heap".
> When I took my last one out of service ~2 years ago, there were 7
> cannibalized ones in the rack with it that had contributed vital parts to
> its longevity.
>
> I have not touched a PERC 2 in a couple of years, but I seem to recall
> that I20 mode was bad news for some reason(s). You may want to try
> toggling out of that in the card bios, don't remember exactly where it is,
> probably Objects-->Adapter.
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110112/7130aaaa/attachment-0001.htm
From Richard.Nadeau at zionsbancorp.com Wed Jan 12 17:16:44 2011
From: Richard.Nadeau at zionsbancorp.com (Richard Nadeau)
Date: Wed, 12 Jan 2011 16:16:44 -0700
Subject: PowerEdge Temperature readings
Message-ID: <451A0E3E1CF570489DE040C2661F67A41277B88FDC@UTEXVS02.zbc.internal>
Hi,
We're currently monitoring our systems using IPMI and have noticed that several systems (R610, R710, T710) have temperature readings on one Memory sensor above the upper critical range:
Sensor ID : Temp (0xc)
Entity ID : 8.1 (Memory Module)
Sensor Type (Analog) : Temperature
Sensor Reading : 49 (+/- 1) degrees C
Status : Upper Critical
Nominal Reading : 23.000
Normal Minimum : 11.000
Normal Maximum : 69.000
Upper critical : 47.000
Upper non-critical : 42.000
Question: Should this concern us?
I am otherwise just going to adjust the upper thresholds, considering that the "Normal Maximum" reading is 69, but wanted to get some feedback from the list first. :)
Thanks,
Rick
=======================================================
THIS ELECTRONIC MESSAGE, INCLUDING ANY ACCOMPANYING DOCUMENTS, IS CONFIDENTIAL and may contain information that is privileged and exempt from disclosure under applicable law. If you are neither the intended recipient nor responsible for delivering the message to the intended recipient, please note that any dissemination, distribution, copying or the taking of any action in reliance upon the message is strictly prohibited. If you have received this communication in error, please notify the sender immediately. Thank you.
From Dell at epperson.homelinux.net Wed Jan 12 17:29:04 2011
From: Dell at epperson.homelinux.net (J. Epperson)
Date: Wed, 12 Jan 2011 18:29:04 -0500
Subject: Linux-PowerEdge Digest, Vol 80, Issue 25
In-Reply-To:
References:
Message-ID: <7540d19fde5fff4134dae92957959776.squirrel@epperson.homelinux.net>
On Wed, January 12, 2011 17:43, Igor Cicimov wrote:
> Thanks for the tip I appreciate your help. Regarding replacing the PERC
> 2/DC card with PERC 3/DC lets say ... do I need new cables or the same
> cables from the PERC2 card are good for PERC3 too (guess the connectors
> are standard but just to make sure)? Also when I connect the new PERC3
> card will it recognize the RAID5 and virtual disk structure and simply
> take over and configure it self or I have to do it myself?
>
Same cables/connectors will work. Simplest way to ensure import of the
Configuration On Disk (COD) is to install the card and boot into the PERC
bios (CTRL-M) without the drives attached. Clear the NVRAM configuration.
Shut down and attach the drives. Boot. If you get a prompt about
importing the config, say yes. It should boot on up and your volumes
should be there.
I've done this, but I don't remember if I had to put an alias for
megaraid_mbox into /etc/modprobe.conf (or /etc/modprobe.d/local.conf) and
remake the initrd with mkinitrd. The alias line would look like:
alias scsi_hostadapter1 megaraid_mbox
Of course, you never want to do anything this major without a backup.
From Prudhvi_Tella at Dell.com Thu Jan 13 14:59:46 2011
From: Prudhvi_Tella at Dell.com (Prudhvi_Tella at Dell.com)
Date: Thu, 13 Jan 2011 14:59:46 -0600
Subject: OpenManage 6.4 for Ubuntu/Debian posted
Message-ID: <478B365948C8D1479951067B1C47DD919B7D021695@AUSX7MCPC101.AMER.DELL.COM>
OpenManage 6.4 for Ubuntu/Debian is now available for download. At the moment, only 64-bit packages are available.
We are still working on 32-bit packages and will update once they are posted.
http://linux.dell.com/repo/community/deb/latest/
http://en.community.dell.com/dell-blogs/enterprise/b/tech-center/archive/2011/01/13/dell-openmanage-server-administrator-6-4-for-ubuntu.aspx
The above blog post goes into detail about this release.
-Prudhvi Tella
From stevej at cheatcodes.com Thu Jan 13 15:14:02 2011
From: stevej at cheatcodes.com (Steve Jenkins)
Date: Thu, 13 Jan 2011 16:14:02 -0500
Subject: OpenManage 6.4 for Ubuntu/Debian posted
In-Reply-To: <478B365948C8D1479951067B1C47DD919B7D021695@AUSX7MCPC101.AMER.DELL.COM>
References: <478B365948C8D1479951067B1C47DD919B7D021695@AUSX7MCPC101.AMER.DELL.COM>
Message-ID: <324A11FB6AE3924083DACAAE042340C605E4AFFD@BE19.exg3.exghost.com>
Any update on a patch for OM 6.4 on RedHat that will stop my PE1850 from
complaining that "Controller 0 [PERC 4e/Si]: Firmware '5B2D' is out of
date"?
It's more than just a minor nuisance. We use nagios to check OMSA status
in order tell us if a number of things go wrong, and since check_omsa is
now in a "warning state" about the firmware, it won't tell us about
anything else that OMSA might find that IS actually broken.
Thanks,
SteveJ
-----Original Message-----
From: linux-poweredge-bounces at dell.com
[mailto:linux-poweredge-bounces at dell.com] On Behalf Of
Prudhvi_Tella at dell.com
Sent: Thursday, January 13, 2011 1:00 PM
To: linux-poweredge at lists.us.dell.com
Subject: OpenManage 6.4 for Ubuntu/Debian posted
OpenManage 6.4 for Ubuntu/Debian is now available for download. At the
moment, only 64-bit packages are available.
We are still working on 32-bit packages and will update once they are
posted.
http://linux.dell.com/repo/community/deb/latest/
http://en.community.dell.com/dell-blogs/enterprise/b/tech-center/archive
/2011/01/13/dell-openmanage-server-administrator-6-4-for-ubuntu.aspx
The above blog post goes into detail about this release.
-Prudhvi Tella
_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge at dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq
From ksuehring at web.de Thu Jan 13 16:40:26 2011
From: ksuehring at web.de (Karsten Suehring)
Date: Thu, 13 Jan 2011 23:40:26 +0100
Subject: OpenManage 6.4 for Ubuntu/Debian posted
In-Reply-To: <478B365948C8D1479951067B1C47DD919B7D021695@AUSX7MCPC101.AMER.DELL.COM>
References: <478B365948C8D1479951067B1C47DD919B7D021695@AUSX7MCPC101.AMER.DELL.COM>
Message-ID:
(from the blog post:)
> Another noteworthy improvement is a fix for a memory leak issue with
> dsm_sa_snmpd that occurs when you use the Nagios check_openmanage plugin.
Wow, this sounds great. Can't wait to try SNMP monitoring again.
Thanks for making these packages available.
Best regards,
Karsten
On Thu, Jan 13, 2011 at 9:59 PM, wrote:
> OpenManage 6.4 for Ubuntu/Debian is now available for download. At the moment, only 64-bit packages are available.
> We are still working on 32-bit packages and will update once they are posted.
>
> http://linux.dell.com/repo/community/deb/latest/
> http://en.community.dell.com/dell-blogs/enterprise/b/tech-center/archive/2011/01/13/dell-openmanage-server-administrator-6-4-for-ubuntu.aspx
>
> The above blog post goes into detail about this release.
>
> -Prudhvi Tella
>
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
>
From icicimov at gmail.com Thu Jan 13 17:44:40 2011
From: icicimov at gmail.com (Igor Cicimov)
Date: Fri, 14 Jan 2011 10:44:40 +1100
Subject: Linux-PowerEdge Digest, Vol 80, Issue 28
In-Reply-To:
References:
Message-ID:
Thanks for the instructions mate. Couple of more questions though. All the
PERC 3/DC cards I can find are PCI-X cards. I have 2xPCI 64-bit 66MHz slots
on PE2500 and no PCI-X so wonder if the card will work ok in one of them
(with lower bandwith of course)?
>From the PERC BIOS manual I can see that I can disable the I2O mode on the
PERC 2/DC card but it says in that case the card will use the Dell driver
instead the OS (Debian) one. How can I get around this? Do I have to have
the driver on a floppy during the configuration?
Thanks for all your help so far.
Message: 2
> Date: Wed, 12 Jan 2011 18:29:04 -0500
> From: "J. Epperson"
> Subject: Re: Linux-PowerEdge Digest, Vol 80, Issue 25
> To: linux-poweredge at dell.com
> Message-ID:
> <7540d19fde5fff4134dae92957959776.squirrel at epperson.homelinux.net>
> Content-Type: text/plain;charset=iso-8859-1
>
> On Wed, January 12, 2011 17:43, Igor Cicimov wrote:
> > Thanks for the tip I appreciate your help. Regarding replacing the PERC
> > 2/DC card with PERC 3/DC lets say ... do I need new cables or the same
> > cables from the PERC2 card are good for PERC3 too (guess the connectors
> > are standard but just to make sure)? Also when I connect the new PERC3
> > card will it recognize the RAID5 and virtual disk structure and simply
> > take over and configure it self or I have to do it myself?
> >
>
> Same cables/connectors will work. Simplest way to ensure import of the
> Configuration On Disk (COD) is to install the card and boot into the PERC
> bios (CTRL-M) without the drives attached. Clear the NVRAM configuration.
> Shut down and attach the drives. Boot. If you get a prompt about
> importing the config, say yes. It should boot on up and your volumes
> should be there.
>
> I've done this, but I don't remember if I had to put an alias for
> megaraid_mbox into /etc/modprobe.conf (or /etc/modprobe.d/local.conf) and
> remake the initrd with mkinitrd. The alias line would look like:
> alias scsi_hostadapter1 megaraid_mbox
>
> Of course, you never want to do anything this major without a backup.
>
>
>
>
> ------------------------------
>
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
>
> End of Linux-PowerEdge Digest, Vol 80, Issue 28
> ***********************************************
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110113/77679325/attachment.htm
From Dell at epperson.homelinux.net Thu Jan 13 18:20:27 2011
From: Dell at epperson.homelinux.net (J. Epperson)
Date: Thu, 13 Jan 2011 19:20:27 -0500
Subject: I20 discussion with Igor [Re: Linux-PowerEdge Digest, Vol 80, Issue
28]
In-Reply-To:
References:
Message-ID: <9b7d438afe28de6e2571ea65f2ccbf34.squirrel@epperson.homelinux.net>
On Thu, January 13, 2011 18:44, Igor Cicimov wrote:
> Thanks for the instructions mate. Couple of more questions though. All
> the PERC 3/DC cards I can find are PCI-X cards. I have 2xPCI 64-bit 66MHz
> slots on PE2500 and no PCI-X so wonder if the card will work ok in one
> of them (with lower bandwith of course)?
>
> From the PERC BIOS manual I can see that I can disable the I2O mode on
> the PERC 2/DC card but it says in that case the card will use the Dell
> driver instead the OS (Debian) one. How can I get around this? Do I have
> to have the driver on a floppy during the configuration?
>
The 9M912 PERC 3/DC is a PCI card. Searching eBay for 9M912, I see one
mislabeled as PCI-X on the first page of results, but several of the
others clearly state PCI. They were of the 2650 generation, and those
didn't have PCI-X risers, AFAIK
Can't say for sure about the I20 driver issue. My recollection was that
it used the legacy megaraid driver in non-I20 mode. Googling "PERC I20"
turns up a 2007 discussion of Ubuntu users having problems until they
toggled it to Mass Storage, after which it booted right up. I'm pretty
sure I never had to add any drivers to run PERC 2/DCs that way under
RHEL3.
BTW, make sure your PERC 2 is the LSI one if you're going to get the 3/DC.
There was an Adaptec PERC 2 (that also had an I20 mode). You can't port
an existing RAID from an Adaptec to an LSI controller, would have to build
new and restore backup.
From basv at sara.nl Fri Jan 14 01:45:45 2011
From: basv at sara.nl (Bas van der Vlies)
Date: Fri, 14 Jan 2011 08:45:45 +0100
Subject: OpenManage 6.4 for Ubuntu/Debian posted
In-Reply-To: <478B365948C8D1479951067B1C47DD919B7D021695@AUSX7MCPC101.AMER.DELL.COM>
References: <478B365948C8D1479951067B1C47DD919B7D021695@AUSX7MCPC101.AMER.DELL.COM>
Message-ID: <4D2FFF29.60204@sara.nl>
On 13-01-11 21:59, Prudhvi_Tella at dell.com wrote:
> OpenManage 6.4 for Ubuntu/Debian is now available for download. At the moment, only 64-bit packages are available.
> We are still working on 32-bit packages and will update once they are posted.
>
> http://linux.dell.com/repo/community/deb/latest/
> http://en.community.dell.com/dell-blogs/enterprise/b/tech-center/archive/2011/01/13/dell-openmanage-server-administrator-6-4-for-ubuntu.aspx
>
> The above blog post goes into detail about this release.
>
That is good news ;-)
--
********************************************************************
* Bas van der Vlies e-mail: basv at sara.nl *
* SARA - Academic Computing Services Amsterdam, The Netherlands *
********************************************************************
From J.H.Hodrien at leeds.ac.uk Fri Jan 14 03:30:58 2011
From: J.H.Hodrien at leeds.ac.uk (John Hodrien)
Date: Fri, 14 Jan 2011 09:30:58 +0000 (GMT)
Subject: OpenManage 6.4 for Ubuntu/Debian posted
In-Reply-To: <324A11FB6AE3924083DACAAE042340C605E4AFFD@BE19.exg3.exghost.com>
References: <478B365948C8D1479951067B1C47DD919B7D021695@AUSX7MCPC101.AMER.DELL.COM>
<324A11FB6AE3924083DACAAE042340C605E4AFFD@BE19.exg3.exghost.com>
Message-ID:
On Thu, 13 Jan 2011, Steve Jenkins wrote:
> Any update on a patch for OM 6.4 on RedHat that will stop my PE1850 from
> complaining that "Controller 0 [PERC 4e/Si]: Firmware '5B2D' is out of
> date"?
>
> It's more than just a minor nuisance. We use nagios to check OMSA status
> in order tell us if a number of things go wrong, and since check_omsa is
> now in a "warning state" about the firmware, it won't tell us about
> anything else that OMSA might find that IS actually broken.
Have you thought of using check_openmanage instead? You can dial out checks
you don't want to perform, which is useful for cases like this.
jh
From brian.omahony at curamsoftware.com Fri Jan 14 03:48:12 2011
From: brian.omahony at curamsoftware.com (Brian O'Mahony)
Date: Fri, 14 Jan 2011 09:48:12 +0000
Subject: [OT] Openmanage Nagios check timeouts
In-Reply-To: <5183E900B0A14242B2D0EB1C8FBA7074@davinci>
References: <86E8DA9E18BC2344BD0218BF23C88DF301435EE4891F@MAIL06.curamsoftware.com>
<5183E900B0A14242B2D0EB1C8FBA7074@davinci>
Message-ID: <86E8DA9E18BC2344BD0218BF23C88DF301435EE4893C@MAIL06.curamsoftware.com>
Worked a treat setting it to 60sec in Centreon.
Thx
B
-----Original Message-----
From: Steve Baroti [mailto:steve.baroti at gmail.com]
Sent: Thursday, January 13, 2011 5:07 AM
To: Brian O'Mahony
Subject: Re: [OT] Openmanage Nagios check timeouts
Yes, we noticed a similar behaviour while under heavy load for: PE 1950 (RHEL 5), R710 (RHEL 5), PE 1955 (Win2k3) Did you try to increase the open_manage check time-out value, from the default 30 seconds to 45 seconds, or even 60 seconds? I did it on our systems few moments ago, and I will let you know if still happening in the next 24 hours.
:-) Cheers, Steve.
----- Original Message -----
From: Brian O'Mahony
To: linux-poweredge at dell.com
Sent: Wednesday, January 12, 2011 04:58
Subject: [OT] Openmanage Nagios check timeouts
I know this is slightly (wildly?) off topic, but I've been seeing a strange
issue with my OpenManage nagios checks on RHEL 5 boxes for the last few
months. When the servers are under heavy load, the check times out. This is
the only check that does.
I am seeing this mainly on a PE2950 that is doing secure copies of about
300Gb data from various VOB servers, to itself, and then backing these up
via network agent & Backup Exec.
The machine goes into very heavy load, but Im not sure why only the
check_openamnage times out
Has anyone else seen this behavior?
B
The information in this email is confidential and may be legally privileged.
It is intended solely for the addressee. Access to this email by anyone else
is unauthorized. If you are not the intended recipient, any disclosure,
copying, distribution or any action taken or omitted to be taken in reliance
on it, is prohibited and may be unlawful. If you are not the intended
addressee please contact the sender and dispose of this e-mail. Thank you.
_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge at dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq
The information in this email is confidential and may be legally privileged.
It is intended solely for the addressee. Access to this email by anyone else
is unauthorized. If you are not the intended recipient, any disclosure,
copying, distribution or any action taken or omitted to be taken in reliance
on it, is prohibited and may be unlawful. If you are not the intended
addressee please contact the sender and dispose of this e-mail. Thank you.
From daniele-ml at libertyline.it Fri Jan 14 09:06:58 2011
From: daniele-ml at libertyline.it (Daniele Paoni)
Date: Fri, 14 Jan 2011 16:06:58 +0100
Subject: R710 SAS 6/iR Integrated: omreport says the driver is old
In-Reply-To: <4D2E0E70.5010300@ucsd.edu>
References: <4D2DC395.5050600@libertyline.it> <4D2E0E70.5010300@ucsd.edu>
Message-ID: <4D306692.3040906@libertyline.it>
Il 12/01/2011 21:26, Trevor Cooper ha scritto:
> On 01/12/11 07:07, Daniele Paoni wrote:
>>> I think you're likely doing everything correctly but are running into an
> unreported error during the build/install.
>
> I had the same problem.
>
> Building/installing the driver manually and rebooting seemed to make
> everything 'appear' correctly...
> [...]
> > Versions match and OMSA no longer reports a degraded state.
>
I will try it during the weekend when I can reboot the server and I'll
tell you if it worked.
Daniele
--
Daniele Paoni - Developer and System Administrator
Liberty Line srl - Via Macaggi 17/14 - 16121 Genova
From sascha.bendix at 360t.com Fri Jan 14 09:51:46 2011
From: sascha.bendix at 360t.com (Sascha Bendix)
Date: Fri, 14 Jan 2011 16:51:46 +0100
Subject: OMSA on mosty plain debian
Message-ID: <3561FAED2AB7294783F9ECEFE0836BED2982C3944E@talent.360t.com>
Hi,
I just wanted to ask: Is there a proper way to setup Dell OMSA on a mostly unchanged debian lenny?
I normally stick with RHEL and there the support is fine, but now I got some appliance which is based on debian lenny and didn't find a proper way to install it. The ubuntu repository doesn't work because of the old libstdc++ and I only found some weird hack like installing it in a chroot.
Does anybody use it on a mostly plain debian lenny? Or is there another proper way to get at least the monitoring functionalities working?
Regards,
Sascha Bendix
--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.
From davide.ferrari at atrapalo.com Fri Jan 14 09:59:53 2011
From: davide.ferrari at atrapalo.com (Davide Ferrari)
Date: Fri, 14 Jan 2011 16:59:53 +0100
Subject: OMSA on mosty plain debian
In-Reply-To: <3561FAED2AB7294783F9ECEFE0836BED2982C3944E@talent.360t.com>
References: <3561FAED2AB7294783F9ECEFE0836BED2982C3944E@talent.360t.com>
Message-ID: <1295020793.2938.2.camel@pc-0707-007>
On Fri, 2011-01-14 at 16:51 +0100, Sascha Bendix wrote:
> Does anybody use it on a mostly plain debian lenny? Or is there
> another proper way to get at least the monitoring functionalities
> working?
The Sara.nl repository:
deb ftp://ftp.sara.nl/pub/sara-omsa dell sara
(thanks again to Sara people for their awesome work)
--
Davide Ferrari
System Administrator
Atrapalo S.L.
From philip at naoj.org Fri Jan 14 12:50:45 2011
From: philip at naoj.org (Philip Tait)
Date: Fri, 14 Jan 2011 08:50:45 -1000
Subject: OMSA on mosty plain debian
In-Reply-To: <3561FAED2AB7294783F9ECEFE0836BED2982C3944E@talent.360t.com>
References: <3561FAED2AB7294783F9ECEFE0836BED2982C3944E@talent.360t.com>
Message-ID:
On Fri, Jan 14, 2011 at 05:51, Sascha Bendix wrote:
> Hi,
>
> I just wanted to ask: Is there a proper way to setup Dell OMSA on a mostly unchanged debian lenny?
>
> I normally stick with RHEL and there the support is fine, but now I got some appliance which is based on debian lenny and didn't find a proper way to install it. The ubuntu repository doesn't work because of the old libstdc++ and I only found some weird hack like installing it in a chroot.
>
> Does anybody use it on a mostly plain debian lenny? Or is there another proper way to get at least the monitoring functionalities working?
I use the 'sara.nl' version:
deb ftp://ftp.sara.nl/pub/sara-omsa dell6 sara
Seems to work OK.
--
Philip J. Tait, Software Engineer (FMOS,HSC)
http://subarutelescope.org
From ray at rainstormconsulting.com Fri Jan 14 13:36:08 2011
From: ray at rainstormconsulting.com (Ray Kolbe)
Date: Fri, 14 Jan 2011 14:36:08 -0500
Subject: PE1750 DRACs not responsive
Message-ID: <4D30A5A8.7030402@rainstormconsulting.com>
Hi all,
I have two PE1750 servers with DRACs in them. About a month or two ago
both DRACs stopped working. I can ping them but I am unable to connect
via racadm or the web interface.
How can I go about regaining access to my DRACs and why would something
like this happen (I did not change any settings or upgrade firmware
recently).
Thanks,
Ray
From joepgottlieb at i3D.net Sat Jan 15 05:38:53 2011
From: joepgottlieb at i3D.net (i3D.net - Joep Gottlieb)
Date: Sat, 15 Jan 2011 12:38:53 +0100
Subject: PE1750 DRACs not responsive
In-Reply-To: <4D30A5A8.7030402@rainstormconsulting.com>
References: <4D30A5A8.7030402@rainstormconsulting.com>
Message-ID: <4D31874D.2040503@i3D.net>
Hi Ray,
We have a lot of PE860's on which after a very long period, the BMC also
becomes unavailable. (seems random, some don't suffer the problem,
others do)
The only solution so far we have found, is turn of the server and make
it completely powerless for 30 seconds.
Boot it up again, reset the DRAC / BMC, save. Turn off the server again
and make it powerless for 30 seconds.
Turn the server on again and setup the DRAC / BMC again, and it will
respond again to everything.
If you have the chance to try this, I suggest you do, this fixes the
problem for us in 99% of all cases, if that does not help, you can
safely assume the DRAC / BMC is broken down. (wouldn't expect that in
your case though, as it still pings)
Best regards,
Joep Gottlieb
Contact:
E-mail Personal: joepgottlieb at i3D.net
E-mail Support: info at i3D.net
E-mail NOC: noc at i3D.net
Website: http://www.i3D.net
Office:
Interactive 3D B.V.
Meent 93b
3011 JG Rotterdam
The Netherlands
Visit www.smartDC.net -- SmartDC is our in-house 36,000 sq. ft. datacenter in Rotterdam, The Netherlands. High density hosting -- multiple fiber carriers in-house -- Level3 PoP.
Interactive 3D (i3D.net) is a company registered in The Netherlands at Meent 93b, Rotterdam. Registration #: 14074337 - VAT # NL 8202.63.886.B01. Interactive 3D (i3D.net) is CDSA certified on content protection and security. We are ranked in the Deloitte Technology Fast 50 as one of the fastest growing technology companies.
Op 14-1-2011 20:36, Ray Kolbe schreef:
> Hi all,
>
> I have two PE1750 servers with DRACs in them. About a month or two ago
> both DRACs stopped working. I can ping them but I am unable to connect
> via racadm or the web interface.
>
> How can I go about regaining access to my DRACs and why would something
> like this happen (I did not change any settings or upgrade firmware
> recently).
>
> Thanks,
> Ray
>
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110115/30199a38/attachment.htm
From stevej at cheatcodes.com Sun Jan 16 00:14:53 2011
From: stevej at cheatcodes.com (Steve Jenkins)
Date: Sun, 16 Jan 2011 01:14:53 -0500
Subject: OpenManage 6.4 for Ubuntu/Debian posted
In-Reply-To:
References: <478B365948C8D1479951067B1C47DD919B7D021695@AUSX7MCPC101.AMER.DELL.COM><324A11FB6AE3924083DACAAE042340C605E4AFFD@BE19.exg3.exghost.com>
Message-ID: <324A11FB6AE3924083DACAAE042340C605E4B1C8@BE19.exg3.exghost.com>
>Have you thought of using check_openmanage instead? You can dial out
checks
>you don't want to perform, which is useful for cases like this.
Thanks John - we were running check_openmanage, but without any options.
We now run it as:
./check_openmanage -b ctrl_fw=0
And the message is silenced. :)
Thanks,
Steve
From phyre at rogers.com Sat Jan 15 18:41:46 2011
From: phyre at rogers.com (MK)
Date: Sat, 15 Jan 2011 19:41:46 -0500
Subject: Lifecycle controller for OS install via Remote File Share (R710)
Message-ID: <4D323ECA.8010909@rogers.com>
iDRAC6 is firmware 1.54 running in shared NIC mode. latest 6.x.x
Broadcom firmware.
Using a iDRAC6 with a Debain ISO mounted via a local NFS share using the
'Remote File Share' feature.
Everything goes great, boots off the ISO, install starts, copies all the
files it needs and so on. As soon as it loads the bnx2 drivers, I
receive media errors from the CD and the install doesn't move from there.
It's loading the standard linux kernel bnx2 drivers. I'm assuming
there's some bug where the OS mucks with the IDRAC6's connection. The
iDRAC6 is still accessible via the Web and for my video/keyboard
console, however from that point forward, the keyboard is laggy (I'll
push a button and it'll repeat it multiple times), as though it's having
network issues detecting the current state of the keyboard.
In any case, the above makes it completely unusable to install an OS,
which is probably the key purpose of it.
Ideas?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110116/4f54cab6/attachment.htm
From frederik at kriewitz.eu Mon Jan 17 00:24:31 2011
From: frederik at kriewitz.eu (Frederik Kriewitz)
Date: Mon, 17 Jan 2011 07:24:31 +0100
Subject: Lifecycle controller for OS install via Remote File Share (R710)
In-Reply-To: <4D323ECA.8010909@rogers.com>
References: <4D323ECA.8010909@rogers.com>
Message-ID:
On Sun, Jan 16, 2011 at 1:41 AM, MK wrote:
> Everything goes great, boots off the ISO, install starts, copies all the
> files it needs and so on.? As soon as it loads the bnx2 drivers, I receive
> media errors from the CD and the install doesn't move from there.
>
> It's loading the standard linux kernel bnx2 drivers.? I'm assuming there's
> some bug where the OS mucks with the IDRAC6's connection.? The iDRAC6 is
> still accessible via the Web and for my video/keyboard console, however from
> that point forward, the keyboard is laggy (I'll push a button and it'll
> repeat it multiple times), as though it's having network issues detecting
> the current state of the keyboard.
Are you sure it's a "read error"? I think it's a problem caused by
debians 'don't include proprietary firmwares' policy. The bnx2
firmware falls into this category and is missing from the isos.
Using the official netinstall CD you'll have to create another image
containing the firmware and mount it using the iDRAC.
A few day's ago I found these unofficial netisntall isos including the
non-free firmwares:
http://cdimage.debian.org/cdimage/unofficial/non-free/cd-including-firmware/
Would be great if you could try them and let us know if they're working.
From nick.lunt at patech-solutions.com Mon Jan 17 03:53:57 2011
From: nick.lunt at patech-solutions.com (Nick Lunt)
Date: Mon, 17 Jan 2011 09:53:57 -0000
Subject: Updating PERC6E firmware
Message-ID: <47073A5E92271A409F44D18958C1AAA60153A086@server13.PatechSolutions.local>
Hi folks
OS: Redhat 5.2 kernel x86-64 2.6.18-92.el5
PE2950
The battery on the PERC6e has failed, Dell replaced the PERC6e and the
battery but the battery is still failed.
Dell recommend I update the firmware to the latest version, so I tried
to update from the running 6.2.0-0013 to 6.3.0-0001.
This failed as shown here:
# sh RAID_FRMW_LX_R278430.BIN
Collecting inventory...
....
Running validation...
This Update Package is not compatible with your system configuration.
So I redownloaded the update and tried again, but got the same error.
I also thought I download the firmware update for the PERC6i as the
server has one of those as well, but that update failed with the exact
same error!
Anyone had any luck updating the PERC firmware to 6.3.0-0001 on a PE2950
running 64 bit RH 5.2?
Or does anyone have any idea what the problem might be?
Cheers
Nick .
__________ Information from ESET NOD32 Antivirus, version of virus
signature database 5792 (20110116) __________
The message was checked by ESET NOD32 Antivirus.
http://www.eset.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110117/03174cea/attachment.htm
From antony at cantoute.com Mon Jan 17 04:21:14 2011
From: antony at cantoute.com (Antony GIBBS)
Date: Mon, 17 Jan 2011 11:21:14 +0100
Subject: Monitoring SAS 6iR RAID status
In-Reply-To:
References:
Message-ID:
Le 20 juil. 2010 ? 16:11, Nicholas Lozo a ?crit :
> We use mpt-status for this. Ours is wrapped in a Nagios plugin, but simpler implementation could just be called from shell script.
>
> Nick
> --
>
> On Jul 20, 2010, at 8:44 AM, Rodrigo Trevisaneli wrote:
>
>> Hi,
>>
>> This morning a database running on a PE T105 with a SAS 6iR
>> was down, and after reboot I found the filesystem demaged.
>>
>> Running fsck fix this, and the /var/log/messages has a lot
>> of disk-related messages starting at 10:00 pm. The SAS6iR
>> utility show me RAID status degradaded and a fail status on disk 0.
>>
>> Disk 0 was removed and will be replaced. The server is running
>> with only disk 1, and my question is, is there a way to be
>> warned when a disk starting failing before the filesystem become
>> demaged? Any utility, that can read data from SAS 6iR and put
>> messages on console/syslog/email?
>>
>> Thanks,
>> -Rodrigo
>>
I have a h200 controller but had same problem.... and really better to do then playing around with dell's junk
So I looked for an easy solution and I solved it by using sas2ircu (mpt-status would probably do the job).
Simply I put in a text file the output of the raid info in good status and i then compare it using a cron
so running once
# /usr/local/sbin/sas2ircu 0 DISPLAY > /root/sas2ircu0DISPLAY.ok
and adding this line to root's crontab would check raid status every hour
0 * * * * /usr/local/sbin/sas2ircu 0 DISPLAY | diff - /root/sas2ircu0DISPLAY.ok
If anything changes in the raid status, I get it by mail (perhaps you'll have to set a MAILTO=you at domaine.com in the crontab)
Hope this helps
Antony
From t.h.amundsen at usit.uio.no Mon Jan 17 04:23:04 2011
From: t.h.amundsen at usit.uio.no (Trond Hasle Amundsen)
Date: Mon, 17 Jan 2011 11:23:04 +0100
Subject: Updating PERC6E firmware
In-Reply-To: <47073A5E92271A409F44D18958C1AAA60153A086@server13.PatechSolutions.local>
(Nick Lunt's message of "Mon, 17 Jan 2011 09:53:57 -0000")
References: <47073A5E92271A409F44D18958C1AAA60153A086@server13.PatechSolutions.local>
Message-ID: <15toc7fn81j.fsf@tux.uio.no>
"Nick Lunt" writes:
> Hi folks
>
> OS: Redhat 5.2 kernel x86-64 2.6.18-92.el5
>
> PE2950
>
> The battery on the PERC6e has failed, Dell replaced the PERC6e and the battery
> but the battery is still failed.
>
> Dell recommend I update the firmware to the latest version, so I tried to
> update from the running 6.2.0-0013 to 6.3.0-0001.
>
> This failed as shown here:
>
> # sh RAID_FRMW_LX_R278430.BIN
>
> Collecting inventory...
>
> ....
>
> Running validation...
>
> This Update Package is not compatible with your system configuration.
>
> So I redownloaded the update and tried again, but got the same error.
>
> I also thought I download the firmware update for the PERC6i as the server has
> one of those as well, but that update failed with the exact same error!
>
> Anyone had any luck updating the PERC firmware to 6.3.0-0001 on a PE2950
> running 64 bit RH 5.2?
>
> Or does anyone have any idea what the problem might be?
Hi Nick,
I had the same problem. It was resolved by upgrading OMSA from 6.2 to
6.4, after which the firmware update ran fine.
Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
From nick.lunt at patech-solutions.com Mon Jan 17 04:19:42 2011
From: nick.lunt at patech-solutions.com (Nick Lunt)
Date: Mon, 17 Jan 2011 10:19:42 -0000
Subject: Updating PERC6E firmware
References: <47073A5E92271A409F44D18958C1AAA60153A086@server13.PatechSolutions.local>
<15toc7fn81j.fsf@tux.uio.no>
Message-ID: <47073A5E92271A409F44D18958C1AAA60153A0A8@server13.PatechSolutions.local>
> -----Original Message-----
> From: Trond Hasle Amundsen [mailto:t.h.amundsen at usit.uio.no]
> Sent: 17 January 2011 10:23
> To: Nick Lunt
> Cc: linux-poweredge at lists.us.dell.com
> Subject: Re: Updating PERC6E firmware
>
> "Nick Lunt" writes:
>
> > Hi folks
> >
> > OS: Redhat 5.2 kernel x86-64 2.6.18-92.el5
> >
> > PE2950
> >
> > The battery on the PERC6e has failed, Dell replaced the PERC6e and
the
> battery
> > but the battery is still failed.
> >
> > Dell recommend I update the firmware to the latest version, so I
tried
> to
> > update from the running 6.2.0-0013 to 6.3.0-0001.
> >
> > This failed as shown here:
> >
> > # sh RAID_FRMW_LX_R278430.BIN
> >
> > Collecting inventory...
> >
> > ....
> >
> > Running validation...
> >
> > This Update Package is not compatible with your system
configuration.
> >
> > So I redownloaded the update and tried again, but got the same
error.
> >
> > I also thought I download the firmware update for the PERC6i as the
> server has
> > one of those as well, but that update failed with the exact same
error!
> >
> > Anyone had any luck updating the PERC firmware to 6.3.0-0001 on a
PE2950
> > running 64 bit RH 5.2?
> >
> > Or does anyone have any idea what the problem might be?
>
> Hi Nick,
>
> I had the same problem. It was resolved by upgrading OMSA from 6.2 to
> 6.4, after which the firmware update ran fine.
>
> Cheers,
Hi Trond,
thanks for that, I'll give it a go when the client gives me downtime on
the server and report back.
Cheers
Nick .
__________ Information from ESET NOD32 Antivirus, version of virus
signature database 5793 (20110117) __________
The message was checked by ESET NOD32 Antivirus.
http://www.eset.com
From frederik at kriewitz.eu Mon Jan 17 05:12:53 2011
From: frederik at kriewitz.eu (Frederik Kriewitz)
Date: Mon, 17 Jan 2011 12:12:53 +0100
Subject: OpenManage 6.4 for Ubuntu/Debian posted
In-Reply-To: <478B365948C8D1479951067B1C47DD919B7D021695@AUSX7MCPC101.AMER.DELL.COM>
References: <478B365948C8D1479951067B1C47DD919B7D021695@AUSX7MCPC101.AMER.DELL.COM>
Message-ID:
On Thu, Jan 13, 2011 at 9:59 PM, wrote:
> OpenManage 6.4 for Ubuntu/Debian is now available for download. At the moment, only 64-bit packages are available.
> We are still working on 32-bit packages and will update once they are posted.
>
> http://linux.dell.com/repo/community/deb/latest/
Is there any way to verify the packages? I couldn't find any GPG key for them.
Best Regards,
Frederik Kriewitz
From hjcochofel at pst.com.br Mon Jan 17 06:00:54 2011
From: hjcochofel at pst.com.br (=?ISO-8859-1?Q?H=E9lder_J_Cochofel?=)
Date: Mon, 17 Jan 2011 10:00:54 -0200
Subject: =?ISO-8859-1?Q?AUTO=3A_H=E9lder_J_Cochofel=2FPst_est=E1_fora_do?=
=?ISO-8859-1?Q?_escrit=F3rio=2E_=28returning_Sun_01=2F23=2F2011=29?=
Message-ID:
I am out of the office from Mon 01/17/2011 until Sun 01/23/2011.
Irei responder sua mensagem quando retornar.
Note: This is an automated response to your message "Linux-PowerEdge
Digest, Vol 80, Issue 33" sent on 17/01/2011 08:25:21.
This is the only notification you will receive while this person is away.
From phyre at rogers.com Mon Jan 17 06:33:14 2011
From: phyre at rogers.com (Michael Krieger)
Date: Mon, 17 Jan 2011 12:33:14 +0000
Subject: Lifecycle controller for OS install via Remote File Share (R710)
In-Reply-To:
References: <4D323ECA.8010909@rogers.com>
Message-ID: <721704863-1295267589-cardhu_decombobulator_blackberry.rim.net-1111509650-@bda2202.bisx.prod.on.blackberry>
I had properly done the firmware load and also used the firmware included image, with the same troubles. As soon as network loads (and works at that point), network gives medium errors.
I saw a similar post with VMware ESX's site about a possibly similar issue
-----Original Message-----
From: Frederik Kriewitz
Sender: freddy36 at gmail.com
Date: Mon, 17 Jan 2011 07:24:31
To:
Reply-To: frederik at kriewitz.eu
Cc:
Subject: Re: Lifecycle controller for OS install via Remote File Share (R710)
On Sun, Jan 16, 2011 at 1:41 AM, MK wrote:
> Everything goes great, boots off the ISO, install starts, copies all the
> files it needs and so on.? As soon as it loads the bnx2 drivers, I receive
> media errors from the CD and the install doesn't move from there.
>
> It's loading the standard linux kernel bnx2 drivers.? I'm assuming there's
> some bug where the OS mucks with the IDRAC6's connection.? The iDRAC6 is
> still accessible via the Web and for my video/keyboard console, however from
> that point forward, the keyboard is laggy (I'll push a button and it'll
> repeat it multiple times), as though it's having network issues detecting
> the current state of the keyboard.
Are you sure it's a "read error"? I think it's a problem caused by
debians 'don't include proprietary firmwares' policy. The bnx2
firmware falls into this category and is missing from the isos.
Using the official netinstall CD you'll have to create another image
containing the firmware and mount it using the iDRAC.
A few day's ago I found these unofficial netisntall isos including the
non-free firmwares:
http://cdimage.debian.org/cdimage/unofficial/non-free/cd-including-firmware/
Would be great if you could try them and let us know if they're working.
From Prudhvi_Tella at Dell.com Mon Jan 17 18:16:36 2011
From: Prudhvi_Tella at Dell.com (Prudhvi_Tella at Dell.com)
Date: Mon, 17 Jan 2011 18:16:36 -0600
Subject: OpenManage 6.4 for Ubuntu/Debian posted
In-Reply-To:
References: <478B365948C8D1479951067B1C47DD919B7D021695@AUSX7MCPC101.AMER.DELL.COM>
Message-ID: <478B365948C8D1479951067B1C47DD919B7D8FD5D3@AUSX7MCPC101.AMER.DELL.COM>
They are not signed.
-----Original Message-----
From: freddy36 at gmail.com [mailto:freddy36 at gmail.com] On Behalf Of Frederik Kriewitz
Sent: Monday, January 17, 2011 5:13 AM
To: Tella, Prudhvi
Cc: linux-poweredge-Lists
Subject: Re: OpenManage 6.4 for Ubuntu/Debian posted
On Thu, Jan 13, 2011 at 9:59 PM, wrote:
> OpenManage 6.4 for Ubuntu/Debian is now available for download. At the moment, only 64-bit packages are available.
> We are still working on 32-bit packages and will update once they are posted.
>
> http://linux.dell.com/repo/community/deb/latest/
Is there any way to verify the packages? I couldn't find any GPG key for them.
Best Regards,
Frederik Kriewitz
From siavoush11 at yahoo.com Tue Jan 18 03:02:30 2011
From: siavoush11 at yahoo.com (Siavoush Dastmalchi)
Date: Tue, 18 Jan 2011 01:02:30 -0800 (PST)
Subject: Dell PowerEdge M710 with Intel Xeon 5667 processor
Message-ID: <787081.56761.qm@web31604.mail.mud.yahoo.com>
Dear list,
I will appreciate it if I can get your expert opinion on doing parallel
computation (I will use GROMACS and AMBER molecular mechanics packages and some
other programs like CYANA, ARIA and CNS to do structure calculations based on
NMR experimental data) using a cluster based on Dell PowerEdge M710 with
Intel Xeon 5667 processor architecture which apparently each blade has two
quad-core cpus. I was wondering if I can get some information about LINUX
compatibility and parallel computation on this system.
Best regards,
Siavoush
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110118/2387acd1/attachment.htm
From 999cgm at gmail.com Tue Jan 18 03:23:44 2011
From: 999cgm at gmail.com (cgm)
Date: Tue, 18 Jan 2011 11:23:44 +0200
Subject: debian 2.6.26-2-amd64 R710 locks just before reboot
Message-ID:
This happens random on some servers(and not on same servers ) , even R610 .
Bios version is 2.1.15.
When this happens VGA shows a black screen and I have to power cycle it to
complete the reboot, so it sounds
like a bug somewhere .
If there is a fix for this please share it
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110118/d97e0598/attachment.htm
From sean at duke.edu Tue Jan 18 06:31:15 2011
From: sean at duke.edu (Sean Dilda)
Date: Tue, 18 Jan 2011 07:31:15 -0500
Subject: Dell PowerEdge M710 with Intel Xeon 5667 processor
In-Reply-To: <787081.56761.qm@web31604.mail.mud.yahoo.com>
References: <787081.56761.qm@web31604.mail.mud.yahoo.com>
Message-ID: <4D358813.2060906@duke.edu>
On 1/18/11 4:02 AM, Siavoush Dastmalchi wrote:
> Dear list,
>
> I will appreciate it if I can get your expert opinion on doing parallel
> computation (I will use GROMACS and AMBER molecular mechanics packages
> and some other programs like CYANA, ARIA and CNS to do structure
> calculations based on NMR experimental data) using a cluster based on
> Dell PowerEdge M710 with Intel Xeon 5667 processor architecture which
> apparently each blade has two quad-core cpus. I was wondering if I can
> get some information about LINUX compatibility and parallel computation
> on this system.
There shouldn't be any linux compatibility issues with any PowerEdge
system. At Duke we have a large compute cluster using a variety of
PowerEdge blades (including M710's) all running on linux.
What interconnect are you using? And are your jobs memory bound, cpu
bound, disk bound, or network bound?
If your computation is more dependent on the interlink and communication
between the nodes, its more important to worry about your interconnect.
If Inter-node communication is highly important, you may also want to
consider something like the M910. The M910 can be configured with 4
8-core CPUs, thus giving you 32 NUMA-connected cores. Or 64 logical
processors if your job is one that can benefit from HT. Note that when
going with more cores-per chip, your max clockrate tends to be lower.
As such, its really important to know how your jobs are bound so that
you can order a cluster configuration that'll be best for that job.
From alex.dupuy at mac.com Tue Jan 18 12:29:12 2011
From: alex.dupuy at mac.com (Alexander Dupuy)
Date: Tue, 18 Jan 2011 13:29:12 -0500
Subject: Broadcom network firmware for PE servers
Message-ID: <4D35DBF8.5000004@mac.com>
I tried downloading the latest "Broadcom Firmware update package, family
version 6.0.1." (from January 5) but although I can get the DUP, the
matching GPG signature file NETW_FRMW_LX_R292659.BIN.sign
is missing. Going to ftp://ftp.us.dell.com/network/ I can see that this
is not just some link typo - the file is truly missing. Adding to the
mystery, that ftp directory has a newer(?) version
(NETW_FRMW_LX_R294342.BIN
) from this
morning, which does have a signature file (but there is nothing on the
support site describing it).
Can anybody provide some insight into this?
@alex
--
mailto:alex.dupuy at mac.com
From sid.young at gmail.com Tue Jan 18 17:09:46 2011
From: sid.young at gmail.com (Sid Young)
Date: Wed, 19 Jan 2011 09:09:46 +1000
Subject: iSCSI support under Centos on M710
Message-ID:
G'Day all,
I am about to install Centos 5.5 on a group of M710/M610 blades and need
information on getting the iSCSI to work using the built in TOE ports. is
there a document from Dell (or anyone) that describes how to get centos to
see the SAN at the other end of the built in NIC's
Thanks
Sid
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110118/f493d626/attachment.htm
From mrolen at opubco.com Tue Jan 18 17:59:11 2011
From: mrolen at opubco.com (Mark Rolen)
Date: Tue, 18 Jan 2011 17:59:11 -0600
Subject: iSCSI support under Centos on M710
In-Reply-To:
References:
Message-ID: <1295395151.14120.31.camel@localhost.localdomain>
On Wed, 2011-01-19 at 09:09 +1000, Sid Young wrote:
> G'Day all,
>
> is there a document from Dell (or anyone) that describes how to get
> centos to see the SAN at the other end of the built in NIC's
>
> Thanks
>
> Sid
Very quick-n-dirty iSCSI "getting connected" (you'll of course want
redundant paths and so on). These steps are done on one CentOS box
connecting to another CentOS box that's running the scsi-target-utils to
present the iSCSI LUN. See the important bits at the bottom for getting
rid of unwanted targets as well.
Install the initiator:
[root at dbtest2 ~]# yum -y install iscsi-initiator-utils
Change your initiator name to whatever you like for your environment:
[root at dbtest2 ~]# vi /etc/iscsi/initiatorname.iscsi
Edit iscsid.conf to enable CHAP and set a username and pass, if that's
how you'd like to auth your iSCSI sessions (you can probably also choose
to do so based on initiatorname or IP address, your choice). I'll use
CHAP here since the linux iSCSI target is a little limited, so here are
the only three lines I have to change:
[root at dbtest2 ~]# vi /etc/iscsi/iscsid.conf
node.session.auth.authmethod = CHAP
node.session.auth.username = someuser
node.session.auth.password = b4dpass
Start the iscsi daemon:
[root at dbtest2 etc]# /etc/init.d/iscsi start
Starting iSCSI daemon: [ OK ]
[ OK ]
Setting up iSCSI targets: iscsiadm: No records found!
[ OK ]
[root at dbtest2 etc]#
Do a discovery to get my target:
[root at dbtest2 etc]# iscsiadm -m discovery -t sendtargets -p 11.11.11.5
11.11.11.5:3260,1 iqn.2008-09.com.itest:dbtest2_mysql1
Log into the target:
[root at dbtest2 etc]# iscsiadm -m node --target \ # this line wrapped
iqn.2008-09.com.itest:dbtest2_mysql1 --login
Logging in to [iface: default, target:
iqn.2008-09.com.itest:dbtest2_mysql1, portal: 11.11.11.5,3260]
Login to [iface: default, target: iqn.2008-09.com.itest:dbtest2_mysql1,
portal: 11.11.11.5,3260]: successful
Check dmesg for the new disk:
[root at dbtest2 etc]# dmesg | tail -20
scsi12 : Broadcom Offload iSCSI Initiator
iscsi: registered transport (iser)
iscsi: registered transport (be2iscsi)
scsi13 : iSCSI Initiator over TCP/IP
Vendor: IET Model: Controller Rev: 0001
Type: RAID ANSI SCSI revision: 05
scsi 13:0:0:0: Attached scsi generic sg2 type 12
Vendor: IET Model: VIRTUAL-DISK Rev: 0001
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sdc: 125829120 512-byte hdwr sectors (64425 MB)
sdc: Write Protect is off
sdc: Mode Sense: 79 00 00 08
SCSI device sdc: drive cache: write back
SCSI device sdc: 125829120 512-byte hdwr sectors (64425 MB)
sdc: Write Protect is off
sdc: Mode Sense: 79 00 00 08
SCSI device sdc: drive cache: write back
sdc: sdc1
sd 13:0:0:1: Attached scsi disk sdc
sd 13:0:0:1: Attached scsi generic sg3 type 0
Good to go!
One big thing to note, the iscsi initiator in this configuration is
going to try to log into every target it saw during discovery when you
reboot. If your gear allows you to only present the target(s) that you
want it to mount, then no-harm-no-foul. But if it sees targets during
discovery that you don't want it to mount, you need to remove them from
the discovery database. In my case, the following is a LUN for a
different test server, so get rid of it:
[root at dbtest2 /]# iscsiadm -m node
11.11.11.5:3260,1 iqn.2008-09.com.itest:dbtest1_mysql1 <-to delete this
11.11.11.5:3260,1 iqn.2008-09.com.itest:dbtest2_mysql1
[root at dbteste2 /]# iscsiadm -m node -T \ # wrapped
iqn.2008-09.com.itest:dbtest1_mysql1 -o delete
[root at dbtest2 /]# iscsiadm -m node
11.11.11.5:3260,1 iqn.2008-09.com.itest:dbtest2_mysql1
Now dbtest2 only has the single target I want it to, and it will
reconnect on reboot (double-check your 'chkconfig iscsi --list', should
default to on though).
From craig.mcelroy at contegix.com Tue Jan 18 18:09:34 2011
From: craig.mcelroy at contegix.com (Craig McElroy)
Date: Tue, 18 Jan 2011 18:09:34 -0600
Subject: 6.4 OMSA for 32 Bit RHEL 6
Message-ID: <67A9287E-BFA4-4A3C-8500-4FB8A2B93E89@contegix.com>
First, thanks for releasing 6.4 OMSA with support for the scripted install for RHEL 6. However, we have noticed for cases where a 32 bit OS is installed that there aren't packages available. We have worked around this problem for the time being by forcing the RHEL version "$releasever" tag to 5 in /etc/yum.repos.d/dell-omsa-repository.repo since 32 bit RHEL 5 packages are still there. Are there plans to release these for 32 Bit RHEL 6?
http://linux.dell.com/repo/hardware/OMSA_6.4/per710/
Cheers,
-craig
From 999cgm at gmail.com Wed Jan 19 05:37:53 2011
From: 999cgm at gmail.com (cgm)
Date: Wed, 19 Jan 2011 13:37:53 +0200
Subject: debian 2.6.26-2-amd64 R710 locks just before reboot
In-Reply-To:
References:
Message-ID:
Is any1 else seeing this problem? Our Debian install is not having anything
special. ACPI is enabled .
On Tue, Jan 18, 2011 at 11:23 AM, cgm <999cgm at gmail.com> wrote:
> This happens random on some servers(and not on same servers ) , even R610
> . Bios version is 2.1.15.
> When this happens VGA shows a black screen and I have to power cycle it to
> complete the reboot, so it sounds
> like a bug somewhere .
> If there is a fix for this please share it
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110119/f303926b/attachment.htm
From Narendra_K at Dell.com Wed Jan 19 11:04:43 2011
From: Narendra_K at Dell.com (Narendra_K at Dell.com)
Date: Wed, 19 Jan 2011 22:34:43 +0530
Subject: Fedora Rawhide Test Day for Network Device Naming on January 27th 2011
Message-ID:
Hello,
We are conducting a Fedora Rawhide Test Day on "Network Device Naming" on Thursday, January 27th 2011.
The objective is to test the new naming scheme for onboard and PCI add-in network interfaces as suggested by 'biosdevname' utility. It would be great if you could participate and provide your feedback which would help us flush bugs.
Please join us on IRC for the event on freenode in #fedora-test-day all day on January 27th 2011.
https://fedorahosted.org/fedora-qa/ticket/159
https://fedoraproject.org/wiki/Test_Day:2011-01-27_Network_Device_Naming_With_Biosdevname
With regards,
Narendra K
From stevej at cheatcodes.com Wed Jan 19 18:31:07 2011
From: stevej at cheatcodes.com (Steve Jenkins)
Date: Wed, 19 Jan 2011 19:31:07 -0500
Subject: OpenManage 6.4 yum repository posted
In-Reply-To:
References: <75F7F7632819D94BA80703D8B1F10B6D1D101CAA70@BLRX7MCDC201.AMER.DELL.COM><2779CAD4A7612E4C99DFA78D744D811407C1E13C@exchange.sedc.sedata.com><46F6103325A0C04E99DF2ECDA75808031D54E0FCF8@BLRX7MCDC202.AMER.DELL.COM><46F6103325A0C04E99DF2ECDA75808031D54E0FD6A@BLRX7MCDC202.AMER.DELL.COM>
Message-ID: <324A11FB6AE3924083DACAAE042340C605E4B60A@BE19.exg3.exghost.com>
FYI - I followed the instructions below to migrate from OMSA 6.3 to 6.4
on a 64-bit CentOS 5.5 box. The only additional step I needed was:
rm -rf /etc/openwsman
Otherwise, after the yum install srvadmin-all step, I got complaints
about incompatibilities between the old and new openwsman. Nuking the
/etc dir with the openwsman.conf file did the trick.
I also had a little trouble getting OMSA to fire up, but took a couple
shots at it and everything is now running fine.
Thanks,
SteveJ
-----Original Message-----
From: linux-poweredge-bounces at dell.com
[mailto:linux-poweredge-bounces at dell.com] On Behalf Of Stephan van
Hienen
Sent: Wednesday, December 29, 2010 4:02 AM
To: Vaibhav_Kumar at Dell.com; linux-poweredge at lists.us.dell.com
Subject: RE: OpenManage 6.4 yum repository posted
>-----Original Message-----
>From: Vaibhav_Kumar at Dell.com [mailto:Vaibhav_Kumar at Dell.com]
>Sent: woensdag 29 december 2010 11:22
>To: Stephan van Hienen; tcooper at ucsd.edu;
linux-poweredge at lists.us.dell.com
>Subject: RE: OpenManage 6.4 yum repository posted
>
>Is following is the upgrade scenario ?
>
>OM 6.3 (32-bit RPMs) --> OM 6.4
>
>As per the link below,
>
>http://support.dell.com/support/edocs/software/smsom/6.4/en/omsa_ig/htm
l/instlx.htm#wp1054425
>
>For migrating to OM 6.4 (64-bit RPMs), 6.3 (32-bit RPMs) needs to get
uninstalled.
>
>Same link also says that upgrading to OM 6.4 (32-bit RPMs) is supported
from OM 6.2 (32-bit RPMs) and OM 6.1 (32-bit
>RPMs)
Vaibhav,
You are correct, the servers had the 32bits 6.3 openmange installed.
The correct fix was :
yum remove dell* firmware* libcmpiCppImpl0 libsmbios libsmbios*
libwsman* openwsman-* python-smbios smbios-utils-* srvadmin-*
rm -Rf /opt/dell
wget -q -O - http://linux.dell.com/repo/hardware/latest/bootstrap.cgi |
bash
yum install srvadmin-all
Stephan
_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge at dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq
From Steve.Tempest at gts.apn.com.au Wed Jan 19 18:37:48 2011
From: Steve.Tempest at gts.apn.com.au (Steve Tempest)
Date: Thu, 20 Jan 2011 10:37:48 +1000
Subject: OpenManage 6.4 yum repository posted
In-Reply-To: <324A11FB6AE3924083DACAAE042340C605E4B60A@BE19.exg3.exghost.com>
References: <75F7F7632819D94BA80703D8B1F10B6D1D101CAA70@BLRX7MCDC201.AMER.DELL.COM><2779CAD4A7612E4C99DFA78D744D811407C1E13C@exchange.sedc.sedata.com><46F6103325A0C04E99DF2ECDA75808031D54E0FCF8@BLRX7MCDC202.AMER.DELL.COM><46F6103325A0C04E99DF2ECDA75808031D54E0FD6A@BLRX7MCDC202.AMER.DELL.COM>
<324A11FB6AE3924083DACAAE042340C605E4B60A@BE19.exg3.exghost.com>
Message-ID: <20113_1295483870_4D3783DE_20113_212048_1_E9442BA9A8E08643AFEADC10FAE5722B3A3029@APNITSYANEXC001.apn.au>
Hi,
I've done this recently on about 10 servers ... all centos 5.5 from 6.3
to latest with only one problem cause by an old i386 rpm i still had
installed... once that was erased it worked as expected.
Steve Tempest
-----Original Message-----
From: linux-poweredge-bounces at dell.com
[mailto:linux-poweredge-bounces at dell.com] On Behalf Of Steve Jenkins
Sent: Thursday, 20 January 2011 10:31 AM
To: Stephan van Hienen; Vaibhav_Kumar at Dell.com;
linux-poweredge at lists.us.dell.com
Subject: RE: OpenManage 6.4 yum repository posted
FYI - I followed the instructions below to migrate from OMSA 6.3 to 6.4
on a 64-bit CentOS 5.5 box. The only additional step I needed was:
rm -rf /etc/openwsman
Otherwise, after the yum install srvadmin-all step, I got complaints
about incompatibilities between the old and new openwsman. Nuking the
/etc dir with the openwsman.conf file did the trick.
I also had a little trouble getting OMSA to fire up, but took a couple
shots at it and everything is now running fine.
Thanks,
SteveJ
-----Original Message-----
From: linux-poweredge-bounces at dell.com
[mailto:linux-poweredge-bounces at dell.com] On Behalf Of Stephan van
Hienen
Sent: Wednesday, December 29, 2010 4:02 AM
To: Vaibhav_Kumar at Dell.com; linux-poweredge at lists.us.dell.com
Subject: RE: OpenManage 6.4 yum repository posted
>-----Original Message-----
>From: Vaibhav_Kumar at Dell.com [mailto:Vaibhav_Kumar at Dell.com]
>Sent: woensdag 29 december 2010 11:22
>To: Stephan van Hienen; tcooper at ucsd.edu;
linux-poweredge at lists.us.dell.com
>Subject: RE: OpenManage 6.4 yum repository posted
>
>Is following is the upgrade scenario ?
>
>OM 6.3 (32-bit RPMs) --> OM 6.4
>
>As per the link below,
>
>http://support.dell.com/support/edocs/software/smsom/6.4/en/omsa_ig/htm
l/instlx.htm#wp1054425
>
>For migrating to OM 6.4 (64-bit RPMs), 6.3 (32-bit RPMs) needs to get
uninstalled.
>
>Same link also says that upgrading to OM 6.4 (32-bit RPMs) is supported
from OM 6.2 (32-bit RPMs) and OM 6.1 (32-bit
>RPMs)
Vaibhav,
You are correct, the servers had the 32bits 6.3 openmange installed.
The correct fix was :
yum remove dell* firmware* libcmpiCppImpl0 libsmbios libsmbios*
libwsman* openwsman-* python-smbios smbios-utils-* srvadmin-*
rm -Rf /opt/dell
wget -q -O - http://linux.dell.com/repo/hardware/latest/bootstrap.cgi |
bash
yum install srvadmin-all
Stephan
_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge at dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq
_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge at dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq
Notice
This email and any attachments are strictly confidential and subject to copyright. They may
contain privileged information. If you are not the intended recipient please delete the message
and notify the sender. You should not read, copy, use, change, alter or disclose this email or
its attachments without authorisation. The company and any related or associated companies do
not accept any liability in connection with this email and any attachments including in connection
with computer viruses, data corruption, delay, interruption, unauthorised access or unauthorised
amendment. Any views expressed in this email and any attachments do not necessarily reflect the
views of the company or the views of any of our related or associated companies.
From stevejenkins at gmail.com Wed Jan 19 18:45:37 2011
From: stevejenkins at gmail.com (Steve Jenkins)
Date: Wed, 19 Jan 2011 16:45:37 -0800
Subject: OpenManage 6.4 yum repository posted
In-Reply-To: <20113_1295483870_4D3783DE_20113_212048_1_E9442BA9A8E08643AFEADC10FAE5722B3A3029@APNITSYANEXC001.apn.au>
References: <75F7F7632819D94BA80703D8B1F10B6D1D101CAA70@BLRX7MCDC201.AMER.DELL.COM>
<2779CAD4A7612E4C99DFA78D744D811407C1E13C@exchange.sedc.sedata.com>
<46F6103325A0C04E99DF2ECDA75808031D54E0FCF8@BLRX7MCDC202.AMER.DELL.COM>
<46F6103325A0C04E99DF2ECDA75808031D54E0FD6A@BLRX7MCDC202.AMER.DELL.COM>
<324A11FB6AE3924083DACAAE042340C605E4B60A@BE19.exg3.exghost.com>
<20113_1295483870_4D3783DE_20113_212048_1_E9442BA9A8E08643AFEADC10FAE5722B3A3029@APNITSYANEXC001.apn.au>
Message-ID:
On Wed, Jan 19, 2011 at 4:37 PM, Steve Tempest
wrote:
> Hi,
>
> I've done this recently on about 10 servers ... all centos 5.5 from 6.3
> to latest with only one problem cause by an old i386 rpm i still had
> installed... once that was erased it worked as expected.
>
>
> Steve Tempest
G'day, Steve. That's great to hear. I've done only 2 of ours so far:
PE1850 CentOS 5.5 i386 upgrade was flawless and the PE2950 CentOS 5.5
x64_86 had that one minor hiccup I mentioned below.
Got about 10 more to go myself, so thanks for posting that procedure! :)
Steve Jenkins (fellow Aussie) :)
From rspchan at starhub.net.sg Wed Jan 19 23:20:47 2011
From: rspchan at starhub.net.sg (Richard Chan)
Date: Thu, 20 Jan 2011 13:20:47 +0800
Subject: Ubuntu OMSA 6.4: install hanging at srvadmin-omacore:
register-omacore.sh
Message-ID:
Hi,
I am trying to install OMSA 6.4 on Ubuntu 10.10 but dpkg seems to be hanging
at srvadmin-omacore.postinst configure.
Possibly it is stuck at register-omacore.sh and there is this child sed
process too.
root 30489 30468 0 13:09 pts/9 00:00:00 /bin/sh
/var/lib/dpkg/info/srvadmin-omacore.postinst configure
root 30494 30489 0 13:09 pts/9 00:00:00 bash
/opt/dell/srvadmin/lib/srvadmin-omacore/register-omacore.sh
root 30501 30494 0 13:09 pts/9 00:00:00 bash
/opt/dell/srvadmin/lib/srvadmin-omacore/register-omacore.sh
root 30502 30501 0 13:09 pts/9 00:00:00 bash
/opt/dell/srvadmin/lib/srvadmin-omacore/register-omacore.sh
root 30503 30502 2 13:09 pts/9 00:00:00 [smbios-sys-info]
root 30504 30502 0 13:09 pts/9 00:00:00 grep System ID
root 30505 30502 0 13:09 pts/9 00:00:00 sed s#^.*0x##;
s#[[:space:]].*$##
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110119/2c8e1613/attachment.htm
From stevejenkins at gmail.com Wed Jan 19 23:48:41 2011
From: stevejenkins at gmail.com (Steve Jenkins)
Date: Wed, 19 Jan 2011 21:48:41 -0800
Subject: PERC4 "No contollers found" $basearch workaround doesn't work on OMSA
6.4
Message-ID:
As I'm upgrading many of our systems from OMSA 6.3 -> 6.4, the
workaround that I blogged about
(http://stevejenkins.com/blog/2010/10/no-controllers-found-fix-set-up-dell-omsa-6-3-32-bit-on-rhel-centos-5-5-64-bit/)
of installing the 32-bit version of OMSA 6.3 on a 64-bit OS so that
the PERC4 controller info could be displayed does NOT seen to work
with OMSA 6.4.
Changing the $basearch variable in the repo file does install the
32-bit version of OMSA 6.4 installs as expected, but when I fire it up
I still get:
# omreport storage controller
No controllers found
Anyone hack a different workaround yet?
Thanks,
SteveJ
From rspchan at starhub.net.sg Wed Jan 19 23:54:26 2011
From: rspchan at starhub.net.sg (Richard Chan)
Date: Thu, 20 Jan 2011 13:54:26 +0800
Subject: Ubuntu OMSA 6.4: install hanging at srvadmin-omacore:
register-omacore.sh
In-Reply-To:
References:
Message-ID:
I tracked it down further to a hanging smbios-sys-info; it shows partial
information and hangs.
root at dell6950:~# smbios-sys-info
Libsmbios version: 2.2.26
Product Name: PowerEdge 6950
Vendor: Dell Inc.
BIOS Version: 1.4.6
System ID: 0x01EA
Service Tag: 5RTZN1S
Express Service Code: 12566870128
Asset Tag:
I cannot kill the process; the rest of the system seems fine.
On Thu, Jan 20, 2011 at 1:20 PM, Richard Chan wrote:
> Hi,
>
> I am trying to install OMSA 6.4 on Ubuntu 10.10 but dpkg seems to be
> hanging at srvadmin-omacore.postinst configure.
> Possibly it is stuck at register-omacore.sh and there is this child sed
> process too.
>
>
> root 30489 30468 0 13:09 pts/9 00:00:00 /bin/sh
> /var/lib/dpkg/info/srvadmin-omacore.postinst configure
> root 30494 30489 0 13:09 pts/9 00:00:00 bash
> /opt/dell/srvadmin/lib/srvadmin-omacore/register-omacore.sh
> root 30501 30494 0 13:09 pts/9 00:00:00 bash
> /opt/dell/srvadmin/lib/srvadmin-omacore/register-omacore.sh
> root 30502 30501 0 13:09 pts/9 00:00:00 bash
> /opt/dell/srvadmin/lib/srvadmin-omacore/register-omacore.sh
> root 30503 30502 2 13:09 pts/9 00:00:00 [smbios-sys-info]
> root 30504 30502 0 13:09 pts/9 00:00:00 grep System ID
> root 30505 30502 0 13:09 pts/9 00:00:00 sed s#^.*0x##;
> s#[[:space:]].*$##
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110119/c92bcb04/attachment.htm
From stevejenkins at gmail.com Wed Jan 19 23:58:48 2011
From: stevejenkins at gmail.com (Steve Jenkins)
Date: Wed, 19 Jan 2011 21:58:48 -0800
Subject: PERC4 "No contollers found" $basearch workaround doesn't work on
OMSA 6.4
In-Reply-To:
References:
Message-ID:
On Wed, Jan 19, 2011 at 9:48 PM, Steve Jenkins wrote:
> As I'm upgrading many of our systems from OMSA 6.3 -> 6.4, the
> workaround that I blogged about
> (http://stevejenkins.com/blog/2010/10/no-controllers-found-fix-set-up-dell-omsa-6-3-32-bit-on-rhel-centos-5-5-64-bit/)
> of installing the 32-bit version of OMSA 6.3 on a 64-bit OS so that
> the PERC4 controller info could be displayed does NOT seen to work
> with OMSA 6.4.
>
> Changing the $basearch variable in the repo file does install the
> 32-bit version of OMSA 6.4 installs as expected, but when I fire it up
> I still get:
>
> # omreport storage controller
> No controllers found
>
> Anyone hack a different workaround yet?
>
In addition to the controller issue, I also just noticed the following
error messages when I run these commands (well, Nagios noticed them
with check_openmanage, and then I confirmed them all individually):
omreport chassis memory: Error: Memory object not found
omreport chassis fans: Error! No fan probes found on this system.
omreport chassis temps: Error! No temperature probes found on this system.
omreport chassis volts: Error! No voltage probes found on this system.
However, in the web GUI, I can see the memory, and fans, and temps,
and volts - and everything looks normal.
Any ideas?
SteveJ
From rspchan at starhub.net.sg Thu Jan 20 00:24:50 2011
From: rspchan at starhub.net.sg (Richard Chan)
Date: Thu, 20 Jan 2011 14:24:50 +0800
Subject: smbios-sys-info hanging [Was: Ubuntu OMSA 6.4: install hanging at
srvadmin-omacore: register-omacore.sh]
Message-ID:
Further investigation shows that both smbios-sys-info and
smbios-sys-info-lite are not returning.
They print out partial/reasonable information and hang.
The post-installation scripts for srvadmin-omacore and srvadmin-storage are
then stuck.
If I replace smbios-sys-info and smbios-sys-info-lite with a simple shell
script that merely echoes the information below,
installation succeeds and OMSA seems to run.
root at dell6950:~# smbios-sys-info
Libsmbios version: 2.2.26
Product Name: PowerEdge 6950
Vendor: Dell Inc.
BIOS Version: 1.4.6
System ID: 0x01EA
Has anyone else observed the working-but-not-returning smbios?
BTW when I boot and run Centos 5.5 on the same machine, OMSA/smbios-sys-info
work.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110120/e2b05d35/attachment.htm
From Narendra_K at Dell.com Thu Jan 20 01:11:23 2011
From: Narendra_K at Dell.com (Narendra_K at Dell.com)
Date: Wed, 19 Jan 2011 23:11:23 -0800
Subject: smbios-sys-info hanging [Was: Ubuntu OMSA 6.4: install hanging
at srvadmin-omacore: register-omacore.sh]
In-Reply-To:
References:
Message-ID:
Richard,
Could you try running smbios-sys-info after unloading dcdbas driver ? After that if the utility does not hang, then this behavior is known. We saw the same utility hang on a PowerEdge 2970. The reason it happening is that
1. smbios-sys-info writes to and reads from sysfs files exposed by dcdbas module to raise SMIs to retrieve Service Tag and other details as the first method. (If SMI fails to return the data, then it falls back onto other available methods)
2.The chipset is delaying IO writes issued to raise the SMI and by the time SMI actually happens, the SMI handler could be writing to wrong locations causing hang. On the latest upstream kernel we observed a panic.
3. My guess is the system RAM in your system is <=4G ?
If you could confirm that after unloading the dcdbas, issue is not seen, it would be great.
With regards,
Narendra K
From: linux-poweredge-bounces-Lists On Behalf Of Richard Chan
Sent: Thursday, January 20, 2011 11:55 AM
To: linux-poweredge-Lists
Subject: smbios-sys-info hanging [Was: Ubuntu OMSA 6.4: install hanging at srvadmin-omacore: register-omacore.sh]
Further investigation shows that both smbios-sys-info and smbios-sys-info-lite are not returning.
They print out partial/reasonable information and hang.
The post-installation scripts for srvadmin-omacore and srvadmin-storage are then stuck.
If I replace smbios-sys-info and smbios-sys-info-lite with a simple shell script that merely echoes the information below,
installation succeeds and OMSA seems to run.
root at dell6950:~# smbios-sys-info
Libsmbios version: 2.2.26
Product Name: PowerEdge 6950
Vendor: Dell Inc.
BIOS Version: 1.4.6
System ID: 0x01EA
Has anyone else observed the working-but-not-returning smbios?
BTW when I boot and run Centos 5.5 on the same machine, OMSA/smbios-sys-info work.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110120/f6c48be3/attachment.htm
From rspchan at starhub.net.sg Thu Jan 20 01:27:41 2011
From: rspchan at starhub.net.sg (Richard Chan)
Date: Thu, 20 Jan 2011 15:27:41 +0800
Subject: smbios-sys-info hanging [Was: Ubuntu OMSA 6.4: install hanging at
srvadmin-omacore: register-omacore.sh]
In-Reply-To:
References:
Message-ID:
Hi Narendra
That is the situation exactly!
My PowerEdge 6950 has 4GB RAM.
When I remove dcdbas, smbios-sys-info smbios-sys-info-lite function
normally.
Is there a recommended workaround?
Tks
Richard
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110120/bc57f210/attachment.htm
From brian.omahony at curamsoftware.com Thu Jan 20 04:44:06 2011
From: brian.omahony at curamsoftware.com (Brian O'Mahony)
Date: Thu, 20 Jan 2011 10:44:06 +0000
Subject: Issue with sdb flooding messages on PE2850 [RHEL5.4]
Message-ID: <86E8DA9E18BC2344BD0218BF23C88DF301435EE48952@MAIL06.curamsoftware.com>
[Apologies in advance for the size of the logs]
I have an issue where the logs have been flooded with errors since Jan 02:
sdb : READ CAPACITY failed.
sdb : status=0, message=00, host=4, driver=00
sdb : sense not available.
sdb: Write Protect is off
sdb: Mode Sense: 00 00 00 00
sdb: asking for cache data failed
sdb: assuming drive cache: write through
sdb : READ CAPACITY failed.
sdb : status=0, message=00, host=4, driver=00
sdb : sense not available.
sdb: Write Protect is off
sdb: Mode Sense: 00 00 00 00
sdb: asking for cache data failed
sdb: assuming drive cache: write through
sdb:Dev sdb: unable to read RDB block 0
unable to read partition table
>From Jan 02:
Jan 2 13:23:53 ccvobblrpr kernel: usb 2-1: USB disconnect, address 2
Jan 2 13:23:56 ccvobblrpr kernel: ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jan 2 13:23:56 ccvobblrpr kernel: ata1.01: cmd a0/00:00:00:00:00/00:00:00:00:00/b0 tag 0
Jan 2 13:23:56 ccvobblrpr kernel: cdb 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Jan 2 13:23:56 ccvobblrpr kernel: res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
Jan 2 13:23:56 ccvobblrpr kernel: ata1.01: status: { DRDY }
Jan 2 13:23:56 ccvobblrpr kernel: ata1: soft resetting link
Jan 2 13:23:56 ccvobblrpr kernel: ata1.01: revalidation failed (errno=-2)
Jan 2 13:23:56 ccvobblrpr kernel: ata1: failed to recover some devices, retrying in 5 secs
Jan 2 13:24:01 ccvobblrpr kernel: ata1: soft resetting link
Jan 2 13:24:01 ccvobblrpr kernel: ata1.01: revalidation failed (errno=-2)
Jan 2 13:24:01 ccvobblrpr kernel: ata1: failed to recover some devices, retrying in 5 secs
Jan 2 13:24:06 ccvobblrpr kernel: ata1: soft resetting link
Jan 2 13:24:07 ccvobblrpr kernel: ata1.00: configured for PIO3
Jan 2 13:24:07 ccvobblrpr kernel: ata1.01: configured for PIO3
Jan 2 13:24:07 ccvobblrpr kernel: ata1: EH complete
Jan 2 13:24:39 ccvobblrpr kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jan 2 13:24:39 ccvobblrpr kernel: ata1.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
Jan 2 13:24:39 ccvobblrpr kernel: cdb 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Jan 2 13:24:39 ccvobblrpr kernel: res 40/00:03:00:00:00/00:00:00:00:00/b0 Emask 0x4 (timeout)
Jan 2 13:24:39 ccvobblrpr kernel: ata1.00: status: { DRDY }
Jan 2 13:24:44 ccvobblrpr kernel: ata1: link is slow to respond, please be patient (ready=0)
Jan 2 13:24:45 ccvobblrpr nmbd[4128]: [2011/01/02 13:24:45, 0] nmbd/nmbd_namequery.c:query_name_response(109)
Jan 2 13:24:45 ccvobblrpr nmbd[4128]: query_name_response: Multiple (2) responses received for a query on subnet 10.10.20.33 for
name ITDESIGN2<1d>.
Jan 2 13:24:45 ccvobblrpr nmbd[4128]: This response was from IP 10.10.20.43, reporting an IP address of 10.10.20.43.
Jan 2 13:24:49 ccvobblrpr kernel: ata1: device not ready (errno=-16), forcing hardreset
Jan 2 13:24:49 ccvobblrpr kernel: ata1: soft resetting link
Jan 2 13:25:03 ccvobblrpr kernel: ata1.01: revalidation failed (errno=-2)
Jan 2 13:25:03 ccvobblrpr kernel: ata1: failed to recover some devices, retrying in 5 secs
Jan 2 13:25:08 ccvobblrpr kernel: ata1: soft resetting link
Jan 2 13:25:09 ccvobblrpr kernel: ata1.01: revalidation failed (errno=-2)
Jan 2 13:25:09 ccvobblrpr kernel: ata1: failed to recover some devices, retrying in 5 secs
Jan 2 13:25:14 ccvobblrpr kernel: ata1: soft resetting link
Jan 2 13:25:14 ccvobblrpr kernel: ata1.01: revalidation failed (errno=-2)
Jan 2 13:25:14 ccvobblrpr kernel: ata1.01: disabled
Jan 2 13:25:14 ccvobblrpr kernel: ata1.00: failed to IDENTIFY (I/O error, err_mask=0x40)
Jan 2 13:25:14 ccvobblrpr kernel: ata1.00: revalidation failed (errno=-5)
Jan 2 13:25:14 ccvobblrpr kernel: ata1: failed to recover some devices, retrying in 5 secs
Im pretty sure this is the DRAC . From a machine built pretty much the same, dmesg includes:
Vendor: Dell Model: Virtual CDROM Rev: 123
Type: CD-ROM ANSI SCSI revision: 00
usb-storage: device scan complete
Vendor: Dell Model: Virtual Floppy Rev: 123
Type: Direct-Access ANSI SCSI revision: 00
sd 2:0:0:0: Attached scsi removable disk sdb
usb-storage: device scan complete
But racadm works, as does a remote connection to the DRAC and console. The server has been up for 60 days, so this is just a recent issue.
Any ideas what is wrong, and how to fix it? I reckon a reboot probably would, or possibly a racreset. But I probably wont get to do these till late April.
B
The information in this email is confidential and may be legally privileged.
It is intended solely for the addressee. Access to this email by anyone else
is unauthorized. If you are not the intended recipient, any disclosure,
copying, distribution or any action taken or omitted to be taken in reliance
on it, is prohibited and may be unlawful. If you are not the intended
addressee please contact the sender and dispose of this e-mail. Thank you.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110120/fc9f540d/attachment-0001.htm
From owe at spacemetric.com Thu Jan 20 05:10:21 2011
From: owe at spacemetric.com (Ola Westin)
Date: Thu, 20 Jan 2011 12:10:21 +0100
Subject: PERC4 "No contollers found" $basearch workaround doesn't work on
OMSA 6.4
In-Reply-To:
References:
Message-ID:
On 20 January 2011 06:48, Steve Jenkins wrote:
> As I'm upgrading many of our systems from OMSA 6.3 -> 6.4, the
> workaround that I blogged about
> (http://stevejenkins.com/blog/2010/10/no-controllers-found-fix-set-up-dell-omsa-6-3-32-bit-on-rhel-centos-5-5-64-bit/)
> of installing the 32-bit version of OMSA 6.3 on a 64-bit OS so that
> the PERC4 controller info could be displayed does NOT seen to work
> with OMSA 6.4.
>
> Changing the $basearch variable in the repo file does install the
> 32-bit version of OMSA 6.4 installs as expected, but when I fire it up
> I still get:
>
> # omreport storage controller
> No controllers found
>
> Anyone hack a different workaround yet?
I don't know how much it helps you but I have a measurement point from
another similar system.
We have an old PE1850 with Centos 5.5 that also required 32-bit OMSA
6.3 to detect its PERC4 cards. I updated it to 32-bit OMSA 6.4 using a
simple "yum update" and it worked without me having to do anything
special. The only thing that is weird is that OMSA complains about the
firmware version even though it is higher than the required one.
Copied from the web interface:
Firmware/Driver Information for Controller PERC 4e/Si
Caution
Firmware version is out of date.
Firmware Version 5B2D
Minimum Required Firmware Version 522D
Driver Version Not Applicable
/Ola
From brian.omahony at curamsoftware.com Thu Jan 20 06:10:41 2011
From: brian.omahony at curamsoftware.com (Brian O'Mahony)
Date: Thu, 20 Jan 2011 12:10:41 +0000
Subject: R510 - RAID Card query
Message-ID: <86E8DA9E18BC2344BD0218BF23C88DF301435EE4895A@MAIL06.curamsoftware.com>
Im looking into specing a new server in the next few days, and was wondering if anyone has any advice on the RAID cards in the R510. The opions are
* H200
* H700
* Perc 6i
The server will be used as a standby server for data syncs, but nightly will be using tar-gz to archive about 400Gb of data locally, and then send this via backup exec agent over the network to a backup server. Has anyone seen any performance issues with any of the above cards in this kind of set up?
Thanks
B
The information in this email is confidential and may be legally privileged.
It is intended solely for the addressee. Access to this email by anyone else
is unauthorized. If you are not the intended recipient, any disclosure,
copying, distribution or any action taken or omitted to be taken in reliance
on it, is prohibited and may be unlawful. If you are not the intended
addressee please contact the sender and dispose of this e-mail. Thank you.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110120/7f5ae089/attachment.htm
From dellpoweredge at semantico.com Thu Jan 20 07:54:07 2011
From: dellpoweredge at semantico.com (dellpoweredge at semantico.com)
Date: Thu, 20 Jan 2011 13:54:07 +0000
Subject: PE2950_BIOS_LX_2.7.0.BIN install on Ubuntu
Message-ID: <4D383E7F.6020601@semantico.com>
Hi all,
I'm having some problems installing this bios update on a pe2950. When I
ran the file under Ubuntu I got an error about the temp file not being
in gz format. I thought it might be a /bin/dash problem so I modified
the file to use /bin/bash and still got the error.
As a workaround I tried adding the file to a repository that had an
older pe2950 package in it.
I updated catalog.xml file and PE2950-LIN-R273704.XML
to contain :-
yet when I scan the repository on a usb stick from a update dvd it
doesn't appear in the list. I'm probably missing something completely
obvious but at the moment I'm quite stuck so would appreciate any help,
even just a pointer. This bios update wasn't contained in the most
recent quarterly repository.
g.
From pjwelsh at gmail.com Thu Jan 20 08:25:12 2011
From: pjwelsh at gmail.com (pjwelsh)
Date: Thu, 20 Jan 2011 08:25:12 -0600
Subject: OpenManage 6.4 yum repository posted
In-Reply-To:
References:
Message-ID: <4D3845C8.8020104@gmail.com>
On 01/19/2011 11:58 PM, linux-poweredge-request at dell.com wrote:
> RE: OpenManage 6.4 yum repository posted
> yum remove dell* firmware* libcmpiCppImpl0 libsmbios libsmbios*
> libwsman* openwsman-* python-smbios smbios-utils-* srvadmin-*
> rm -Rf /opt/dell
> wget -q -O - http://linux.dell.com/repo/hardware/latest/bootstrap.cgi |
> bash
> yum install srvadmin-all
>
Please do not suggest or document the "rm -Rf /opt/dell"! There are other
items that can get installed into that location this is NOT part of OMSA like:
SMruntime-03.35.A6.00-1.i586
SMfwupgrade-03.35.G6.24-1.noarch
SMfirmware-03.35.G6.04-1.noarch
delldset-2.0.0-119.i386
and likely many more...
pjwelsh
From raq at cttc.upc.edu Thu Jan 20 10:15:24 2011
From: raq at cttc.upc.edu (Ramiro Alba)
Date: Thu, 20 Jan 2011 17:15:24 +0100
Subject: multipath and error logging
Message-ID: <1295540124.2310.207.camel@mundo.cttc.org>
Hi everybody,
I am using Ubuntu 10.04 LTS with a SLES11 SP1 kernel (2.6.32.19-0.2.1)
on a PE 2970 server with 2 SAS 5/E controllers accessing to a MD3000
unit. I've installed multipath-tools package o have a cotroller
redundancy and it works fine:
multipath -ll
mdtvd (36001ec9000d4ccc100000c7e4d3576da) dm-1 DELL ,MD3000
[size=2.9T][features=0][hwhandler=0]
\_ round-robin 0 [prio=3][active]
\_ 8:0:0:0 sdc 8:32 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 7:0:0:0 sdb 8:16 [active][ghost]
The only problem is that the kernel log is full of the following useless
messages:
Jan 20 17:05:51 jffmds kernel: [ 718.596622] sd 7:0:0:0: [sdb] Result:
hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jan 20 17:05:51 jffmds kernel: [ 718.596627] sd 7:0:0:0: [sdb] Sense
Key : Illegal Request [current]
Jan 20 17:05:51 jffmds kernel: [ 718.596632] sd 7:0:0:0: [sdb]
<> ASC=0x94 ASCQ=0x1ASC=0x94 ASCQ=0x1
Jan 20 17:05:51 jffmds kernel: [ 718.596641] sd 7:0:0:0: [sdb] CDB:
Read(10): 28 00 00 00 00 80 00 00 08 00
Jan 20 17:05:51 jffmds kernel: [ 718.596648] end_request: I/O error,
dev sdb, sector 128
Jan 20 17:05:51 jffmds kernel: [ 718.596763] Buffer I/O error on device
sdb, logical block 16
Has anyone an idea of how to get riding of those messages?
Any comment suggestion will be welcomed!!!
Best Regards
Cheers
--
Ramiro Alba
Centre Tecnol??gic de Tranfer??ncia de Calor
http://www.cttc.upc.edu
Escola T??cnica Superior d'Enginyeries
Industrial i Aeron??utica de Terrassa
Colom 11, E-08222, Terrassa, Barcelona, Spain
Tel: (+34) 93 739 86 46
--
Aquest missatge ha estat analitzat per MailScanner
a la cerca de virus i d'altres continguts perillosos,
i es considera que est? net.
From Jens_Heinz at Dell.com Thu Jan 20 10:25:07 2011
From: Jens_Heinz at Dell.com (Jens_Heinz at Dell.com)
Date: Thu, 20 Jan 2011 17:25:07 +0100
Subject: multipath and error logging
In-Reply-To: <1295540124.2310.207.camel@mundo.cttc.org>
References: <1295540124.2310.207.camel@mundo.cttc.org>
Message-ID: <399212640037934A8F25742DA026AFD7022A88B0@LEJX7ADC103.EMEA.DELL.COM>
You should use 'rdac' as HW handler. Then the errors should disappear after a while.
Jens.
-----Original Message-----
From: linux-poweredge-bounces-Lists On Behalf Of Ramiro Alba
Sent: 20 January 2011 17:15
To: linux-poweredge-Lists
Subject: multipath and error logging
Hi everybody,
I am using Ubuntu 10.04 LTS with a SLES11 SP1 kernel (2.6.32.19-0.2.1) on a PE 2970 server with 2 SAS 5/E controllers accessing to a MD3000 unit. I've installed multipath-tools package o have a cotroller redundancy and it works fine:
multipath -ll
mdtvd (36001ec9000d4ccc100000c7e4d3576da) dm-1 DELL ,MD3000
[size=2.9T][features=0][hwhandler=0]
\_ round-robin 0 [prio=3][active]
\_ 8:0:0:0 sdc 8:32 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 7:0:0:0 sdb 8:16 [active][ghost]
The only problem is that the kernel log is full of the following useless
messages:
Jan 20 17:05:51 jffmds kernel: [ 718.596622] sd 7:0:0:0: [sdb] Result:
hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jan 20 17:05:51 jffmds kernel: [ 718.596627] sd 7:0:0:0: [sdb] Sense Key : Illegal Request [current] Jan 20 17:05:51 jffmds kernel: [ 718.596632] sd 7:0:0:0: [sdb] <> ASC=0x94 ASCQ=0x1ASC=0x94 ASCQ=0x1 Jan 20 17:05:51 jffmds kernel: [ 718.596641] sd 7:0:0:0: [sdb] CDB:
Read(10): 28 00 00 00 00 80 00 00 08 00
Jan 20 17:05:51 jffmds kernel: [ 718.596648] end_request: I/O error, dev sdb, sector 128 Jan 20 17:05:51 jffmds kernel: [ 718.596763] Buffer I/O error on device sdb, logical block 16
Has anyone an idea of how to get riding of those messages?
Any comment suggestion will be welcomed!!!
Best Regards
Cheers
--
Ramiro Alba
Centre Tecnol?gic de Tranfer?ncia de Calor http://www.cttc.upc.edu
Escola T?cnica Superior d'Enginyeries
Industrial i Aeron?utica de Terrassa
Colom 11, E-08222, Terrassa, Barcelona, Spain
Tel: (+34) 93 739 86 46
--
Aquest missatge ha estat analitzat per MailScanner a la cerca de virus i d'altres continguts perillosos, i es considera que est net.
From raq at cttc.upc.edu Thu Jan 20 10:53:31 2011
From: raq at cttc.upc.edu (Ramiro Alba)
Date: Thu, 20 Jan 2011 17:53:31 +0100
Subject: multipath and error logging
In-Reply-To: <399212640037934A8F25742DA026AFD7022A88B0@LEJX7ADC103.EMEA.DELL.COM>
References: <1295540124.2310.207.camel@mundo.cttc.org>
<399212640037934A8F25742DA026AFD7022A88B0@LEJX7ADC103.EMEA.DELL.COM>
Message-ID: <1295542411.2310.214.camel@mundo.cttc.org>
Jens,
Thanks a million. It works. It is not documented on 'man 5
multipath.conf', though.
Thanks again.
Cheers
On Thu, 2011-01-20 at 17:25 +0100, Jens_Heinz at Dell.com wrote:
> You should use 'rdac' as HW handler. Then the errors should disappear after a while.
>
> Jens.
>
> -----Original Message-----
> From: linux-poweredge-bounces-Lists On Behalf Of Ramiro Alba
> Sent: 20 January 2011 17:15
> To: linux-poweredge-Lists
> Subject: multipath and error logging
>
> Hi everybody,
>
> I am using Ubuntu 10.04 LTS with a SLES11 SP1 kernel (2.6.32.19-0.2.1) on a PE 2970 server with 2 SAS 5/E controllers accessing to a MD3000 unit. I've installed multipath-tools package o have a cotroller redundancy and it works fine:
>
> multipath -ll
>
> mdtvd (36001ec9000d4ccc100000c7e4d3576da) dm-1 DELL ,MD3000
> [size=2.9T][features=0][hwhandler=0]
> \_ round-robin 0 [prio=3][active]
> \_ 8:0:0:0 sdc 8:32 [active][ready]
> \_ round-robin 0 [prio=0][enabled]
> \_ 7:0:0:0 sdb 8:16 [active][ghost]
>
> The only problem is that the kernel log is full of the following useless
> messages:
>
> Jan 20 17:05:51 jffmds kernel: [ 718.596622] sd 7:0:0:0: [sdb] Result:
> hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Jan 20 17:05:51 jffmds kernel: [ 718.596627] sd 7:0:0:0: [sdb] Sense Key : Illegal Request [current] Jan 20 17:05:51 jffmds kernel: [ 718.596632] sd 7:0:0:0: [sdb] <> ASC=0x94 ASCQ=0x1ASC=0x94 ASCQ=0x1 Jan 20 17:05:51 jffmds kernel: [ 718.596641] sd 7:0:0:0: [sdb] CDB:
> Read(10): 28 00 00 00 00 80 00 00 08 00
> Jan 20 17:05:51 jffmds kernel: [ 718.596648] end_request: I/O error, dev sdb, sector 128 Jan 20 17:05:51 jffmds kernel: [ 718.596763] Buffer I/O error on device sdb, logical block 16
>
> Has anyone an idea of how to get riding of those messages?
> Any comment suggestion will be welcomed!!!
>
> Best Regards
> Cheers
>
> --
> Ramiro Alba
>
> Centre Tecnol??gic de Tranfer??ncia de Calor http://www.cttc.upc.edu
>
>
> Escola T??cnica Superior d'Enginyeries
> Industrial i Aeron??utica de Terrassa
> Colom 11, E-08222, Terrassa, Barcelona, Spain
> Tel: (+34) 93 739 86 46
>
>
>
> --
> Aquest missatge ha estat analitzat per MailScanner a la cerca de virus i d'altres continguts perillosos, i es considera que est net.
>
>
--
Ramiro Alba
Centre Tecnol??gic de Tranfer??ncia de Calor
http://www.cttc.upc.edu
Escola T??cnica Superior d'Enginyeries
Industrial i Aeron??utica de Terrassa
Colom 11, E-08222, Terrassa, Barcelona, Spain
Tel: (+34) 93 739 86 46
--
Aquest missatge ha estat analitzat per MailScanner
a la cerca de virus i d'altres continguts perillosos,
i es considera que est? net.
From cupertino at gmx.net Thu Jan 20 12:49:40 2011
From: cupertino at gmx.net (Cupertino)
Date: Thu, 20 Jan 2011 19:49:40 +0100
Subject: *** GMX Spamverdacht *** R510 - RAID Card query
In-Reply-To: <86E8DA9E18BC2344BD0218BF23C88DF301435EE4895A@MAIL06.curamsoftware.com>
References: <86E8DA9E18BC2344BD0218BF23C88DF301435EE4895A@MAIL06.curamsoftware.com>
Message-ID: <1295549380.7525.4.camel@E6410.jshweb.info>
H200 has no cache memory, the other 2x are 'real' RAID controllers.
while Perc6/i is only 3Gbps SAS and H700 its successor with 6Gbps
support aswell as other new features like SSD and SED support.
I would go for one of the last 2x
On Thu, 2011-01-20 at 12:10 +0000, Brian O'Mahony wrote:
> Im looking into specing a new server in the next few days, and was
> wondering if anyone has any advice on the RAID cards in the R510. The
> opions are
>
>
>
> ? H200
>
> ? H700
>
> ? Perc 6i
>
>
>
> The server will be used as a standby server for data syncs, but
> nightly will be using tar-gz to archive about 400Gb of data locally,
> and then send this via backup exec agent over the network to a backup
> server. Has anyone seen any performance issues with any of the above
> cards in this kind of set up?
>
>
>
> Thanks
>
>
>
> B
>
>
>
>
> The information in this email is confidential and may be legally
> privileged.
> It is intended solely for the addressee. Access to this email by
> anyone else
> is unauthorized. If you are not the intended recipient, any
> disclosure,
> copying, distribution or any action taken or omitted to be taken in
> reliance
> on it, is prohibited and may be unlawful. If you are not the intended
> addressee please contact the sender and dispose of this e-mail. Thank
> you.
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
From isaiah at urfix.com Thu Jan 20 13:47:48 2011
From: isaiah at urfix.com (Isaiah Irizarry)
Date: Thu, 20 Jan 2011 15:47:48 -0400
Subject: Linux-PowerEdge Digest, Vol 80, Issue 39
In-Reply-To:
References:
Message-ID:
http://blog.urfix.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110120/3cf4fb12/attachment.htm
From matthew at acfr.usyd.edu.au Thu Jan 20 16:15:14 2011
From: matthew at acfr.usyd.edu.au (Matthew Geier)
Date: Fri, 21 Jan 2011 09:15:14 +1100
Subject: multipath and error logging
In-Reply-To: <1295540124.2310.207.camel@mundo.cttc.org>
References: <1295540124.2310.207.camel@mundo.cttc.org>
Message-ID: <4D38B3F2.6000900@acfr.usyd.edu.au>
On 21/01/11 03:15, Ramiro Alba wrote:
> The only problem is that the kernel log is full of the following useless
> messages:
>
> Jan 20 17:05:51 jffmds kernel: [ 718.596622] sd 7:0:0:0: [sdb] Result:
> hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Jan 20 17:05:51 jffmds kernel: [ 718.596627] sd 7:0:0:0: [sdb] Sense
> Key : Illegal Request [current]
I have similar issues with my MD3000i - the boot log is full of such
errors until multipathd and it's 'RDAC' driver takes over path management.
I've never been able figure out how to stop pages of SCSI errors during
boot.
From libor.klepac at bcom.cz Thu Jan 20 16:53:39 2011
From: libor.klepac at bcom.cz (Libor =?utf-8?q?Klep=C3=A1=C4=8D?=)
Date: Thu, 20 Jan 2011 23:53:39 +0100
Subject: mptlinux for newer kernel
In-Reply-To: <4D2261C3.3010409@web.de>
References: <4CC1CCA2.5030006@web.de> <4D22186E.3060102@seoss.co.uk>
<4D2261C3.3010409@web.de>
Message-ID: <201101202353.43339.libor.klepac@bcom.cz>
Hi,
i have just (re)installed omsa on our debian server and i see it complaining
about mptsas version.
I was also wondering, why is kernel version so outdated.
Do you know, wheter lsi version is based on same code as inkernel version. Or
is it completly different?
After yesterdays server crash (unrelated to mptsas) I think, there will be
opportunity to prepare LSI drivers
My question is - is it worth it? Should I expect some problems? (connected
MD3000 is running fine with inkernel mptsas driver)
With regards
Libor
Dne ?ter? 04 Leden 2011 00:54:43 Karsten Suehring napsal(a):
> Yes, I was also hoping LSI would push a newer version into the mainline
> kernel. But I was unable to find any contact/forum/mailing list on the
> LSI web site which I could ask for the reasons.
>
> I think at least the Dell people should be interested in that push since
> OMSA is complaining for several versions now that 3.04 is to old. I also
> don't know if the Dell version of the driver has any modification except
> changing the strings to the Dell controller names.
>
> Some weeks ago I had some trouble with a machine which had an internal
> SAS RAID card. I was unable to generate proper error reports with that
> controller. The support case ended after I installed Windows 2008 Server
> (for which the driver constantly logged errors) and Dell had replaced
> half of the machine. My guess is that the problem was caused by the
> backplane. After that we have chosen PERC controller for our newer
> machines because they have their own error logs.
>
> Unfortunately so far nobody at Dell was able to suggest an alternative
> controller for connecting a PowerVault MD3000.
>
> Best regards,
> Karsten
>
> On 03.01.2011 19:41, Tim Small wrote:
> > On 03/01/11 16:33, Ernst Pijper wrote:
> >> I can't install the 4.24 version on Centos 5.5 because of a LZMA issue:
> > It's worth noting that the version of the mpt fusion driver in the
> > official kernel.org tree is v3.04.17. The kernel.org maintainers don't
> > promote the maintenance of out-of-tree code. If this 4.x driver is so
> > great, then why don't LSI have it in the kernel.org tree, instead of the
> > 3.x versions which they seem to sort-of-maintain.
> >
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f
> > =drivers/message/fusion/mptbase.h;h=f71f2294847780472851bd17835937aaa8fce
> > 6f1;hb=HEAD
> >
> > This is a bit of an odd situation, but it's worth noting that this 4.x
> > branch does still have this out-of-tree status, and the kernel.org
> > maintainers generally don't keep code out of the kernel unless there's a
> > good reason. So the possibilities seem to be that either LSI don't want
> > the code in the main kernel for some weird reason, or it isn't good
> > enough to go in.
> >
> > Same with the Redhat maintainers, so even if there was some sort of
> > political trouble between LSI and the kernel.org folks - if Redhat
> > thought that this 4.x branch was the best driver, they'd probably be
> > shipping it in their enterprise kernels.
> >
> > AFAIK Redhat don't advise it's use either.
> >
> > I would be wary of running this 4.x code for that reason...
> >
> > In fact, that whole weird-attitude by LSI, together with various
> > reliability problems I've seen with LSI mpt chips means that I normally
> > advise avoiding them entirely if at all possible. Intel ICHx AHCI
> > controllers seem to be better engineered, better supported, and are
> > certainly many-many times better tested just because there are probably
> > 1000x more of them in circulation.
> >
> > Now when is Dell going to start supporting Intel ICHx for hotplug too,
> > like the other vendors do?
> >
> > Tim.
>
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 836 bytes
Desc: This is a digitally signed message part.
Url : http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110120/9ee509de/attachment.sig
From rspchan at starhub.net.sg Fri Jan 21 00:21:08 2011
From: rspchan at starhub.net.sg (Richard Chan)
Date: Fri, 21 Jan 2011 14:21:08 +0800
Subject: bnx2 vpd r/w failed. This is likely a firmware bug on this device
Message-ID:
My server Dell 6950 was power recycled and and came up with the messages
below.
bnx2 vpd r/w failed. This is likely a firmware bug on this device
What does this mean ?(Well it's pretty obvious from the message) but the NIC
/ server
had been functioning correctly)
Does this indicate a Embedded NIC hardware/motherboard problem?
PowerEdge 6950, Ubuntu 10.10, no iSCSI attached to eth0 BCM5708.
Both LEDs are blinking in unison: (3 flashes , pause) repeat indefinitely.
The other NIC is functioning.
I have tried upgrading to the January 18, v6.01 firmware but that didn't
help.
When the server boots, there is a brief message about "Problem dissociating
from firmware. TOE not enabled".
Thanks for any help.
Richard
[ 1564.031576] bnx2 0000:05:00.0: vpd r/w failed. This is likely a firmware
bug on this device. Contact the card vendor for a firmware update.
[ 1574.711573] bnx2 0000:05:00.0: vpd r/w failed. This is likely a firmware
bug on this device. Contact the card vendor for a firmware update.
[ 1574.791895] bnx2 0000:05:00.0: vpd r/w failed. This is likely a firmware
bug on this device. Contact the card vendor for a firmware update.
[ 1574.871900] bnx2 0000:05:00.0: vpd r/w failed. This is likely a firmware
bug on this device. Contact the card vendor for a firmware update.
[ 1574.951887] bnx2 0000:05:00.0: vpd r/w failed. This is likely a firmware
bug on this device. Contact the card vendor for a firmware update.
[ 1575.031893] bnx2 0000:05:00.0: vpd r/w failed. This is likely a firmware
bug on this device. Contact the card vendor for a firmware update.
[ 1575.111886] bnx2 0000:05:00.0: vpd r/w failed. This is likely a firmware
bug on this device. Contact the card vendor for a firmware update.
[ 1575.190324] bnx2 0000:05:00.0: vpd r/w failed. This is likely a firmware
bug on this device. Contact the card vendor for a firmware update.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20110121/410adc14/attachment.htm
From Patrick_Fischer at Dell.com Fri Jan 21 02:24:01 2011
From: Patrick_Fischer at Dell.com (Patrick_Fischer at Dell.com)
Date: Fri, 21 Jan 2011 09:24:01 +0100
Subject: PERC4 "No contollers found" $basearch workaround doesn't work
on OMSA 6.4
In-Reply-To:
References: