Disk problems?

Andy Stubbs andrew.stubbs at activehotels.com
Thu Jan 23 06:49:00 CST 2003


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Hi,

I've been running a database on one of our machines - a PE2500 with a 
Perc3/Di RAID controller.

I have 2 containers - one mirrored container (0:0:0 and 0:1:0) and one RAID
5 container (1:3:0, 1:4:0 and 1:5:0) for the DB data.

Yesterday afternoon, the database performance plummeted and load on the
machine shot up. This was only a couple of minutes after a network problem 
with our supplier was resolved, so initially this was seen as a possible - 
although inexplicable - cause.

However, there is an entry in the system log for the day before:
Jan 20 03:49:10 db1 kernel: aacraid:Enclosure 0 - Temperature 168, over 
threshold 120

I googled this and found that people don't seem to consider this error 
significant. I checked OMSA and it certainly wasn't reporting any unusual 
temperatures. I moved the database to another machine and the load 
returned to normal for a machine not doing much. I ran the OMSA storage 
diagnostics on the RAID Controller and although it all passed just fine, 
the following messages appeared in the system logs:

Jan 22 16:59:44 db1 kernel: aacraid:ID(1:03:0); Error Event [command:0x37]
Jan 22 16:59:44 db1 kernel: aacraid:ID(1:03:0); Recovered Error [k:0x1,c:0x1c,q:0x2]
Jan 22 16:59:44 db1 kernel: aacraid:ID(1:03:0); Grown Defect List Not Found
Jan 22 16:59:46 db1 kernel: aacraid:ID(1:04:0); Error Event [command:0x37]
Jan 22 16:59:46 db1 kernel: aacraid:ID(1:04:0); Recovered Error [k:0x1,c:0x1c,q:0x2]
Jan 22 16:59:46 db1 kernel: aacraid:ID(1:04:0); Grown Defect List Not Found
Jan 22 16:59:48 db1 kernel: aacraid:ID(1:05:0); Error Event [command:0x37]
Jan 22 16:59:48 db1 kernel: aacraid:ID(1:05:0); Recovered Error [k:0x1,c:0x1c,q:0x2]
Jan 22 16:59:48 db1 kernel: aacraid:ID(1:05:0); Grown Defect List Not Found
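
For anyone decoding the messages above, here is a minimal sketch. It assumes
the k, c and q fields are the SCSI sense key, additional sense code (ASC) and
qualifier (ASCQ); the driver output does not say so explicitly. Under that
assumption, key 0x1 with ASC/ASCQ 0x1c/0x02 is the standard "recovered error,
grown defect list not found" status, and opcode 0x37 is READ DEFECT DATA (10),
so each drive may simply be saying it has no grown defect list to return. The
function name and the lookup tables are illustrative and deliberately
incomplete.

```python
import re

# Matches the aacraid "Recovered Error" lines quoted above.
# Assumption: k = sense key, c = ASC, q = ASCQ.
LINE_RE = re.compile(
    r"aacraid:ID\((?P<bus>\d+):(?P<id>\d+):(?P<lun>\d+)\); "
    r"Recovered Error \[k:(?P<key>0x[0-9a-fA-F]+),"
    r"c:(?P<asc>0x[0-9a-fA-F]+),q:(?P<ascq>0x[0-9a-fA-F]+)\]"
)

# Small excerpts of the standard SCSI tables, not the full lists.
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
}
ASC_ASCQ = {
    (0x1C, 0x00): "DEFECT LIST NOT FOUND",
    (0x1C, 0x01): "PRIMARY DEFECT LIST NOT FOUND",
    (0x1C, 0x02): "GROWN DEFECT LIST NOT FOUND",
}


def decode_recovered_error(line):
    """Return a dict describing one 'Recovered Error' line, or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    key = int(m["key"], 16)
    asc = int(m["asc"], 16)
    ascq = int(m["ascq"], 16)
    return {
        "device": f"{m['bus']}:{m['id']}:{m['lun']}",
        "sense_key": SENSE_KEYS.get(key, "UNKNOWN"),
        "additional_sense": ASC_ASCQ.get((asc, ascq), "UNKNOWN"),
    }


if __name__ == "__main__":
    sample = ("Jan 22 16:59:44 db1 kernel: aacraid:ID(1:03:0); "
              "Recovered Error [k:0x1,c:0x1c,q:0x2]")
    print(decode_recovered_error(sample))
```

Feeding the whole syslog through this function line by line shows whether
every member of the RAID 5 container reports the same status.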

This looks like Bad News[tm].

Output from afacli:
AFA0> container list
Executing: container list
Num          Total  Oth Chunk          Scsi   Partition
Label Type   Size   Ctr Size   Usage   B:ID:L Offset:Size
- ----- ------ ------ --- ------ ------- ------ -------------
 0    Mirror 16.9GB            Open    0:00:0 64.0KB:16.9GB 
 /dev/sda             db1_os           0:01:0 64.0KB:16.9GB 

 1    RAID-5 67.7GB        8KB Open    1:03:0 64.0KB:33.8GB 
 /dev/sdb             db1_db           1:04:0 64.0KB:33.8GB 
                                       1:05:0 64.0KB:33.8GB 
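
As a sanity check on the listing above, here is a minimal sketch of the usual
RAID 5 capacity rule: usable space is (number of members - 1) times the
smallest member. The function name is hypothetical and the sizes are the
rounded figures afacli prints, so a small difference from the reported 67.7GB
is expected.

```python
def raid5_usable_gb(member_sizes_gb):
    """Usable capacity of a RAID 5 set: one member's worth goes to parity."""
    if len(member_sizes_gb) < 3:
        raise ValueError("RAID 5 needs at least three members")
    return (len(member_sizes_gb) - 1) * min(member_sizes_gb)


if __name__ == "__main__":
    # Three 33.8GB partitions, as listed for container 1.
    print(f"{raid5_usable_gb([33.8, 33.8, 33.8]):.1f} GB")
```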


Can anybody shed some light on this? Or at least give me a couple of 
pointers as to how I can better diagnose the problem?

Regards,

Andy

- -- 
Andy Stubbs, B.A., Ph.D.
Network Manager, Active Hotels Ltd.
+44 1223 578106
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE+L+VyQsxCoxPcZ3QRAuUoAJ0VNlDWYuTsElAbmLV8Pl4GBTsC6ACfYO3q
uTVd00L57YT2ypEguoJBa6Y=
=dfBL
-----END PGP SIGNATURE-----




More information about the Linux-PowerEdge mailing list