Linux-PowerEdge Digest, Vol 88, Issue 5

Brent Kimberley Brent.Kimberley at durham.ca
Tue Sep 6 12:48:12 CDT 2011


Not normal.  Here's a similar machine (although with a PERC):

[root@utility etc]# dmidecode|grep -i poweredge;lspci|grep RAID;dmesg|grep -i model;fdisk -l;dd bs=4MB if=/dev/sda3 of=/dev/null count=1000
        Product Name: PowerEdge 2850
02:0e.0 RAID bus controller: Dell PowerEdge Expandable RAID controller 4 (rev 06)
  Vendor: PE/PV     Model: 1x6 SCSI BP       Rev: 1.0
  Vendor: MegaRAID  Model: LD 0 RAID5  419G  Rev: 521X

Disk /dev/sda: 440.0 GB, 440087347200 bytes
255 heads, 63 sectors/track, 53504 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1           4       32098+  de  Dell Utility
/dev/sda2   *           5          69      522112+  83  Linux
/dev/sda3              70        1101     8289540   82  Linux swap / Solaris
/dev/sda4            1102       53504   420927097+   f  W95 Ext'd (LBA)
/dev/sda5            1102        8908    62709696   8e  Linux LVM
/dev/sda6            8909       53504   358217338+  83  Linux
1000+0 records in
1000+0 records out
4000000000 bytes (4.0 GB) copied, 30.6845 seconds, 130 MB/s



---------------------------------------------------
Hi J. et al.:

Thanks for your help.  The numbers shown above are in the expected range.*

I have been led to believe that a subsystem failure or predicted failure may have knocked the PE2850 system into a fail-safe / SCSI-2 mode.  A data package has been collected for offline analysis.  I'm now trying to figure out when the machine went into SCSI-2 mode.

Background / Context
I was given the machine on joining the company.  Performance was abysmal from the get-go.  For performance & integrity reasons, given the state of the box, backups were performed offline on an event-driven basis, using a trusted config.  Backups would take at the very least 3 days for the D2D (disk-to-disk) portion.  (The only limit was the RAID backplane, typically far below 4 MB/sec.)  Maximum outage window (from the command to go offline to full system synchronization) is/was four days.  Restoration time is/was proportional to outage duration, typically at least 3 more days to re-synchronize.
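
As a rough sanity check on the backup figures above, here is a minimal Python sketch relating transfer time to sustained throughput.  It assumes, purely for illustration, that a backup moves the whole 440087347200-byte array reported by fdisk; the real backup set size is not stated.  The function names are made up for this example.

```python
# Back-of-envelope backup timing.  ARRAY_BYTES is the /dev/sda size from the
# fdisk output; treating it as the backup size is an assumption.
ARRAY_BYTES = 440_087_347_200
SECONDS_PER_DAY = 86_400


def transfer_days(total_bytes, mb_per_sec):
    """Days needed to move total_bytes at a sustained rate in MB/s (1 MB = 10**6 bytes)."""
    return total_bytes / (mb_per_sec * 1_000_000) / SECONDS_PER_DAY


def implied_rate_mb_s(total_bytes, days):
    """Sustained MB/s implied by moving total_bytes in the given number of days."""
    return total_bytes / (days * SECONDS_PER_DAY) / 1_000_000


if __name__ == "__main__":
    for rate in (4, 130):
        print(f"{rate:>4} MB/s -> {transfer_days(ARRAY_BYTES, rate):.2f} days")
    print(f"3-day backup -> {implied_rate_mb_s(ARRAY_BYTES, 3):.2f} MB/s")
```

Under that assumption, a three-day D2D pass corresponds to roughly 1.7 MB/sec sustained, and even a full 4 MB/sec would still need about 1.3 days.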

The last cold backup took one of the four stripes offline - permanently.  Now, the system is in degraded mode.

The replacement has been in the works for two+ years.  Issues:
  - the replacement machine is not 100% identical
  - it's a customized / continuous query / continuous append / business critical / regulatory / process / performance / surveillance / compliance / 'smart <insert your sector here>' machine
  - there are over 10 years of one-of-a-kind data
  - the load on the RAID subsystem is increasing in a continuous, ongoing, organic, super-linear manner
  - continuous availability has been strongly encouraged

Have a good day!

Brent


* The subsystem should be able to supply 40 to 80 MB/sec per drive.  For a four-drive RAID-5 set on U320, one drive's worth of each stripe is parity, so sequential reads should work out to 120 to 240 MB/sec (i.e. 3*40 to 3*80).
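
The footnote arithmetic can be sketched in a few lines of Python.  This is only an estimate under the assumptions already stated (four drives, 40 to 80 MB/sec each, one drive's worth of parity); the function name is hypothetical, and the measured figure is recomputed from the dd byte count and elapsed time quoted above.

```python
# Rough RAID-5 sequential-read estimate; the per-drive rates are the
# footnote's assumptions, not measured values.
def raid5_read_range_mb_s(drives, per_drive_min, per_drive_max):
    """Return (low, high) MB/s, counting only the data-bearing share of each stripe."""
    data_drives = drives - 1  # one drive's worth of every stripe holds parity
    return data_drives * per_drive_min, data_drives * per_drive_max


low, high = raid5_read_range_mb_s(drives=4, per_drive_min=40, per_drive_max=80)

# dd reported 4000000000 bytes copied in 30.6845 seconds.
measured = 4_000_000_000 / 30.6845 / 1_000_000

if __name__ == "__main__":
    print(f"expected {low} to {high} MB/s, measured {measured:.0f} MB/s")
    print("within expected range:", low <= measured <= high)
```

The dd result of about 130 MB/sec sits near the low end of the 120 to 240 MB/sec estimate.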





