My experiences with six new Dell 2650's and RedHat Linux 7.2
Ben.Russo at tnsi.com
Tue Aug 6 15:10:00 CDT 2002
I got six Dell 2650s a few weeks ago
and have been setting them up and
playing with them before they become
I used the ROMB to set up a single RAID group
and hot spare on each server, then installed
RedHat 7.2 on all of them using a ks.cfg on a
floppy disk for an unattended install.
Just don't run the X configuration during the install: it will
hang the display and you will have to restart the install.
After 7.2 is installed you can run Xconfigurator and it
works just fine, but don't do it during the install process!
The only special thing I had to do was the
"noprobe aacraid_pciid=0x1028......." thing that was in
the Dell release notes to get the install procedure to recognize
the RAID CONTAINER as /dev/sda
However, when I run up2date there are newer RedHat kernels available
that supposedly fix some rare problems. I would like to avoid ever
encountering those problems, so I thought I would upgrade to 2.4.9-34.
But the 2.4.9-34 kernel RPM from RedHat updates has an aacraid
module that does not like parm_aacraid_pciid. I also found that
there is no bcm5700 module in the modules dir for the standard
RedHat kernel.
Is there a way to take the standard RedHat kernel source and apply
the patches just for aacraid and bcm5700? If so, where would I get those
patches? Or, better yet, does Dell have the newer stable kernels in
binary RPM form somewhere?
Also, just like everybody else, I had problems with the Broadcom
5701 NICs on my RedHat 7.2 boxes. They would occasionally report
that the link was down and would then require a reboot to get
working again. I tried changing the auto-negotiate, speed and
duplex settings on the switch and in /etc/modules.conf, to no avail.
I found that I could reproduce the problem on all 6 servers by doing
"ifconfig eth0 down" followed by "ifconfig eth0 up"
a few times.
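
The down/up reproduction can be scripted so it is easy to repeat on each
box. This is only a minimal sketch: the function name bounce_interface,
the cycle count and the delay are made up for illustration, and it
assumes Python is installed and that the real run is done as root. The
run parameter is there so the loop can be dry-run without touching the
network.

```python
import subprocess
import time


def bounce_interface(iface="eth0", cycles=5, delay=2.0, run=None):
    """Take iface down and up `cycles` times; return the commands issued."""
    if run is None:
        # Default runner really executes the command (needs root).
        run = subprocess.call
    issued = []
    for _ in range(cycles):
        for state in ("down", "up"):
            cmd = ["ifconfig", iface, state]
            run(cmd)
            issued.append(cmd)
            time.sleep(delay)
    return issued


if __name__ == "__main__":
    # Dry run: the runner does nothing, so the commands are only printed.
    # Drop the run= argument to execute them for real.
    for cmd in bounce_interface(cycles=2, delay=0, run=lambda c: 0):
        print(" ".join(cmd))
```

Whether the link actually came back still has to be checked by hand (or
from the driver messages in the system log) after each cycle; the sketch
only automates the toggling.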
I ended up disabling the Broadcom adapters in the BIOS and putting
in a PCI 3Com card that works with no problems. But it is a shame to
waste those two onboard NICs and have to use a PCI slot for no
good reason. Then I had a problem with the PCI NIC,
but only on the 2650s with dual processors.
I checked /proc/interrupts and found that aacraid and eth0
were both using the same interrupt. So I made sure that kudzu was
set to on with chkconfig, then went into the BIOS,
disabled serial port 2, USB and the Broadcom NICs, and rebooted.
When it came up, kudzu reassigned the plug-n-pray IRQs
for the PCI NICs.
Now all is well, but I have really nice servers where I have to
disable many features and use other NICs... sigh.
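
For anyone wanting to repeat the /proc/interrupts check on several
boxes, here is a rough sketch of a parser that lists IRQs claimed by
more than one device. The function name shared_irqs is invented, and
the SAMPLE text is a made-up illustration of the 2.4-style layout (one
count column per CPU, then the controller type, then a comma-separated
device list), not output captured from my servers.

```python
# Hypothetical sample in the 2.4-era /proc/interrupts layout.
SAMPLE = """\
           CPU0       CPU1
  0:     512340     498121    IO-APIC-edge  timer
  1:          4          2    IO-APIC-edge  keyboard
 17:      80211      79954   IO-APIC-level  aacraid, eth0
NMI:          0          0
"""


def shared_irqs(text):
    """Return {irq: [devices]} for numeric IRQs with more than one device."""
    lines = text.strip("\n").splitlines()
    ncpu = len(lines[0].split())          # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():           # skip NMI:, ERR: and similar rows
            continue
        fields = rest.split()
        after_counts = fields[ncpu:]      # [controller, device, device, ...]
        if len(after_counts) < 2:
            continue
        names = " ".join(after_counts[1:]).split(",")
        devices = [n.strip() for n in names if n.strip()]
        if len(devices) > 1:
            shared[int(label)] = devices
    return shared


if __name__ == "__main__":
    print(shared_irqs(SAMPLE))
```

On a real box you would pass open("/proc/interrupts").read() instead of
SAMPLE. Newer kernels format the controller column differently, so treat
the column handling as an assumption tied to the layout shown above.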
I got the RAC working just fine through a web browser, but only with MS-IE
v6 using the MS-VM (if I used the Sun JRE it was very slow and crashed the
browser every time). It would be nice if Dell put SSH2 and TightVNC on
these cards and forgot about the web interface.
The Serial Console redirection worked OK with TeraTerm especially if I told
TeraTerm to use precisely 80x24 VT100 9600,8n1 no flow control with full
I even edited /boot/grub/grub.conf to get rid of the splash image
and told the kernel to use a serial console. After that it worked all the
way through boot, except that curses apps don't work well, and when the
boot process gets to the section that shows all the SysV init scripts and
their status, the terminal stops working until the login prompt comes up.
But that isn't that bad.
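
The grub.conf edit can be sketched as a small text filter: comment out
the splashimage line and append console arguments to each kernel line.
Everything specific here is an assumption for illustration: the
function name add_serial_console, the SAMPLE_CONF contents (including
the placeholder kernel path), and the choice of ttyS0 at 9600n8, which
simply matches the 9600,8n1 terminal settings mentioned above. Use
whichever port your console redirection is actually on.

```python
# Hypothetical grub.conf; the kernel path is a placeholder, not a real file.
SAMPLE_CONF = (
    "default=0\n"
    "splashimage=(hd0,0)/grub/splash.xpm.gz\n"
    "title Red Hat Linux\n"
    "\tkernel /vmlinuz-EXAMPLE ro root=/dev/sda2\n"
)


def add_serial_console(text, console="ttyS0,9600n8"):
    """Comment out splashimage and add console= args to kernel lines."""
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("splashimage"):
            # The graphical splash cannot be shown on a serial terminal.
            out.append("#" + line)
        elif stripped.startswith("kernel ") and "console=" not in stripped:
            # Keep the VGA console and add the serial one; the last
            # console= listed is the one that becomes /dev/console.
            out.append(line.rstrip() + " console=tty0 console=" + console)
        else:
            out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(add_serial_console(SAMPLE_CONF), end="")
```

The filter leaves lines that already carry a console= argument alone, so
running it twice gives the same result. Write the output to a copy and
compare it against the original before replacing the real file.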
The only inconsistent problem I found was when playing with afacli
while the OS was running. I slapped a new hard disk into one of the boxes
and issued a "controller rescan" command. The box then locked up, the
disks had flashing yellow lights, and the LCD panel said that the ROMB
was having an error. However, I removed power from the box for a few
minutes, plugged it back in, and everything was OK
(hurray for journalling file systems).
I would say, though, that you should always configure RAID boxes with hot
spares that have auto-rebuild configured and, if possible, always take an
outage window to swap disks while the server is down. I have seen
EMC, NetApp, HP, DG, and other RAID systems that have had similar problems
with hot swapping (always very, very rare problems, but still problems).