Dell PE2950 - 16GB RAM - RHEL4 ES U4 - 2.6.9-5.ELsmp system freeze
Jeff Burke
jburke at redhat.com
Thu Sep 6 11:16:18 CDT 2007
Shannon Skerritt wrote:
> Hi all,
>
> Recently a client of mine has experienced a server freeze. The server
> is a new Dell PE2950 running RHEL4 ES U4 which is the 2.6.9-5.ELsmp
> kernel with 16GB of RAM.
One bit of information: RHEL4 U4 shipped with 2.6.9-42.EL. Are you sure
about the kernel version and release?
>
> This server was delivered a few months ago, and on delivery but before
> production it was noted there may have been a memory problem as the
> server posted (booted) with less than 16GB of RAM.
If you are running the i386 distro, you should really be running the
hugemem variant, not the smp variant, of the kernel. If you are running
the x86_64 distro, then the smp variant will be fine.
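
A quick way to check both points (which kernel is actually running, and
whether its variant fits the architecture) is sketched below. This is
only an illustration: the helper names are hypothetical, the suffix list
is an assumption about how RHEL4 kernel releases are named, and the
mapping simply restates the i386/x86_64 rule above for a box with this
much RAM.

```python
import platform

# Hypothetical helpers for illustration only; not a Red Hat or Dell tool.

def running_variant(release: str) -> str:
    """Extract the variant suffix from a release string like '2.6.9-42.ELsmp'."""
    # Assumption: variants appear as a trailing suffix on the release string.
    # 'largesmp' is checked before 'smp' because it also ends in 'smp'.
    for suffix in ("hugemem", "largesmp", "smp"):
        if release.endswith(suffix):
            return suffix
    return "up"  # assumption: no suffix means the uniprocessor kernel


def suggested_variant(machine: str) -> str:
    """Restate the rule above: hugemem on 32-bit x86, smp on x86_64."""
    if machine == "x86_64":
        return "smp"
    if machine in ("i386", "i486", "i586", "i686"):
        return "hugemem"
    return "unknown"


if __name__ == "__main__":
    release = platform.release()   # same string as `uname -r`
    machine = platform.machine()   # same string as `uname -m`
    print("kernel release   :", release)
    print("architecture     :", machine)
    print("running variant  :", running_variant(release))
    print("suggested variant:", suggested_variant(machine))
```

If the running and suggested variants differ on the i386 distro, that
would be worth sorting out before chasing the hardware any further.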
>
> Reseating the DIMMs and running mpmemory diags (a Dell utility) did
> not find a problem, so it was suspected the problem was possibly due to
> movement in transit to the customer.
>
> More recently, the server has completely frozen (no response to ping
> or network connectivity) and no access via the local console KVM - X
> was also frozen. Upon a hard reboot, examination of the system logs
> and key services (SAMBA, NFS) revealed nothing at all, the absence of
> which is starting to make me think of the previously suspected memory
> issue. It is believed that the server was not under considerable
> load at the time. Re-running mpmemory and the Dell server diagnostics
> has again revealed nothing to date.
I understand this is a production server, but you may want to run memory
diagnostics.
>
> Dell have advised the kernel (2.6.9-5.ELsmp) is quite old and there
> have been (via Red Hat Network) many VM/memory issues addressed in
> later kernels, the latest of which I can see is effectively RHEL 4 ES
> U5 which is 2.6.9-55.ELsmp. They have recommended a kernel upgrade.
>
> Can anyone knowledgeable or with experience comment on which direction
> I should be considering? The server in question is a key file server
> which is in production.
>
> Is (RHEL 4 ES U4) 2.6.9-5.ELsmp considered particularly old or
> problematic and can anyone comment if there are significant known
> issues which have been addressed? I have looked at the kernel change
> logs and there appears to be reference to at least a couple.
No, it is not old. We are currently at U5, and U6 is in the beta phase now.
>
> Regards,
> Shannon.
>