Advice please for an aacraid issue

John Logsdon j.logsdon at quantex-research.com
Sat Jul 23 05:01:15 CDT 2005


Eberhard and list

Thanks for sharing your experience.

I note that quite a lot of people have seen the same thing, but in all
those cases the messages in /var/log/messages go on with diagnostic
information.  In my case, there was just one message then nothing.  And
looking at /boot, some part of every available kernel has been fried, as
well as grub.conf!!!  Now I am getting a lot of console messages from ext2
- the boot partition is ext2 - when I try to look at files.

I am using a 2.6.11.12 bespoke kernel with CentOS4.  I think this has a
newer aacraid driver but I also notice that there is a 2.8.0 v6095 A10
build available now, dated 9 June 2005.  I was using 6092 with the latest
A20 BIOS on motherboard hardware version A03.  (Why does Dell start all
these with A?  It is very confusing!)  I don't know whether this latest
firmware is a production version of the beta driver announced recently
that claims to have fixed the problem :-)))  It would be nice to know that
it won't happen again.

Whichever way, rebooting will either succeed or fail.  If it fails, I have
the CentOS4 disks to re-install and I may take the opportunity to change
away from Reiser3, which is on all partitions in this container - the
other container is xfs.  If it succeeds then I will carry on but add a few
things to my snapshot backups, plus automatically take a copy of the last
boot messages file - which I haven't got at the moment, so I can't confirm
things.  Unfortunately /var/log/dmesg is one of those things that was
fried and was not backed up...
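
For the automatic copy of the boot messages file, something along these
lines might do.  It is only a rough sketch, not anything tested on this
box: the function name snapshot_boot_log and the destination directory in
the usage note are made up for illustration, and it assumes Python is
available and that the backup location sits on a different device from the
one being protected.

```python
import shutil
import time
from pathlib import Path


def snapshot_boot_log(src, dest_dir, stamp=None):
    """Copy src into dest_dir under a timestamped name.

    Returns the path of the new copy, or None if src does not exist.
    The name and layout are illustrative assumptions, not an existing tool.
    """
    src = Path(src)
    if not src.is_file():
        # Nothing to copy, e.g. the log has already been lost.
        return None

    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    # A timestamp suffix keeps earlier copies instead of overwriting them.
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    target = dest_dir / f"{src.name}.{stamp}"

    # copy2 also preserves the modification time where it can.
    shutil.copy2(src, target)
    return target
```

Called at the end of the boot sequence as, say,
snapshot_boot_log("/var/log/dmesg", "/mnt/usb/bootlogs"), with the second
path being a hypothetical mount point for the external USB drive, it would
leave one dated copy per boot alongside the snapshots.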

So rebooting will be pretty nervy although there is nothing to be lost.  
I will file a Dell report although last time I had a motherboard failure
it took some time to go through the question and answer routine over the
phone.  I did get a new motherboard eventually.

Any other tips welcome!

Best wishes

John

John Logsdon                               "Try to make things as simple
Quantex Research Ltd, Manchester UK         as possible but not simpler"
j.logsdon at quantex-research.com              a.einstein at relativity.org
+44(0)161 445 4951/G:+44(0)7717758675       www.quantex-research.com


On Fri, 22 Jul 2005, Eberhard Moenkeberg wrote:

> Hi,
> 
> On Fri, 22 Jul 2005, John Logsdon wrote:
> 
> > PE 2650, Perc3/Di, Container A is RAID1 and container B is RAID5.
> >
> > last message in the logs:
> >
> > ===========================================================
> > Jul 21 02:31:40 unix kernel: aacraid: Host adapter reset request. SCSI
> > hang ?
> > ===========================================================
> >
> > after that, while the system can do some things, there are lots of files
> > that are not available giving
> >
> > ?---------- ? ?
> >
> > for ls -l.
> >
> > Things like / are noted as read-only filesystem, the permissions for /root
> > have been changed to d-wx------ and other nasties.
> >
> > The partitions involved include the following directories, which are all
> > in container A:
> >
> > /boot
> > /usr
> > /usr/local
> > /var
> >
> > So even rebooting is dangerous.
> >
> > Container B seems OK and I have an external USB drive that is working fine
> > - it contains the snapshots so I have a backup of all the important stuff.
> >
> > What is the best way to return this to a working and reliable box?  I have
> > seen mention of replacing the raid card with a megaraid card.  Will Dell
> > do this sort of thing under warranty?  I have seen many reports of
> > problems with aacraid but so far it has been fine.
> >
> > Is there anything I can do without rebooting?
> 
> You can do nothing but rebooting.
> But your chances are best that it will work after booting just like 
> nothing has happened before...
> 
> I guess it is just the old old aacraid driver bug which is triggering the 
> old old firmware bug which did not get fixed, not get fixed again and 
> again.
> 
> > Advice please?
> 
> You will grow up and get old with this bug, I guess.
> 
> Cheers -e
> -- 
> Eberhard Moenkeberg (emoenke at gwdg.de, em at kki.org)
> 
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> http://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
> 


