RE: PERC3/Di crashed again
mp at webfactory.de
Mon Oct 25 04:14:01 CDT 2004
I called German DELL tech support today and, as I expected, the technician was friendly and eager to help, but could not really help me.
The issue seems to be unknown to the German support team. Possible causes would be a hardware problem; wrong, incompatible, or outdated drivers; or maybe a broken memory module on the controller.
Two suggestions were to install BIOS A19 (probably unnecessary, though) and/or the DELL-provided aacraid-1.1.4-2302 driver (provided as a RedHat image) instead of Adaptec's 1.1-5.
They said only RedHat is officially supported; packages are carefully tested before release, so the problem would probably not occur there. If possible, we should try to set up RedHat on a spare disk, boot from that disk, and see whether the system runs stably then. As long as we're running Debian, they would hardly be able to perform any further diagnostics (they'd need the Server Administrator to be installed for that).
So, no real help. I think I'll try disabling hyperthreading as the next step; maybe that can work around any race conditions in the driver?
Can anyone provide links to mailing list posts or similar reports showing that RedHat is also affected? I'd like to forward them to German Dell support so they can escalate it...
Can anyone confirm that logging through /dev/log will break if the system remounts the filesystem read-only due to controller problems, and that we therefore need to capture /proc/kmsg as well?
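
To make the /proc/kmsg idea concrete, here is a minimal sketch, not a tested setup: it reads kernel messages from /proc/kmsg and sends each one as a UDP syslog datagram to another machine, so nothing has to be written to the local disk. The host name loghost.example, port 514, and the function names are placeholders of mine. It assumes the messages carry the usual "<N>" priority prefix, that the script runs as root, and that no other reader (such as klogd) is consuming /proc/kmsg at the same time, since each message is delivered to only one reader.

```python
import socket

KERN_FACILITY = 0  # syslog facility number for kernel messages


def to_syslog_packet(line: str, default_severity: int = 6) -> bytes:
    """Turn one /proc/kmsg line such as '<4>some text' into a syslog datagram."""
    line = line.rstrip("\n")
    severity = default_severity
    # Kernel lines normally start with '<N>', where N is the severity.
    if line.startswith("<") and ">" in line[:5]:
        end = line.index(">")
        digits = line[1:end]
        if digits.isdigit():
            severity = int(digits) & 7
            line = line[end + 1:]
    pri = KERN_FACILITY * 8 + severity
    return f"<{pri}>kernel: {line}".encode("utf-8", "replace")


def forward_kmsg(host: str, port: int = 514, path: str = "/proc/kmsg") -> None:
    """Block on /proc/kmsg and forward every line to host:port over UDP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    with open(path, "r", errors="replace") as kmsg:
        for line in kmsg:
            sock.sendto(to_syslog_packet(line), (host, port))


# Hypothetical usage (placeholder host name):
# forward_kmsg("loghost.example")
```

UDP is used here only because it needs no local state; datagrams can be lost, so this would complement the normal syslog setup rather than replace it.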
> -----Original Message-----
> From: linux-poweredge-admin at dell.com
> [mailto:linux-poweredge-admin at dell.com] On Behalf Of
> Matthias Pigulla
> Sent: Thursday, October 21, 2004 09:27
> To: linux-poweredge at dell.com
> Subject: PERC3/Di crashed again
> As Matt requested in
> .html, I'll give the German DELL tech support a call on
> Friday and will try to feed this through the "official
> channels"; I'll report on this list how things proceed.