Clearing ECC error without rebooting?
Trond Hasle Amundsen
t.h.amundsen at usit.uio.no
Mon Dec 15 06:20:03 CST 2008
Dirk Taggesell <dirk.taggesell at proximic.com> writes:
> on one of our DELL servers here (a 1950) an ECC mem error occurred, it
> was non-critical and I want to delete the event so that the NAGIOS check
> (check_dell_sensors) doesn't complain anymore (until another ECC error
> occurs).
>
> Of course I could reboot the machine, but that doesn't make sense when
> the RAM error was only a single event and does not occur frequently.
>
> Yet I couldn't figure a way to "reset" the internal error log. I already
> deleted the event log - to no avail.
>
> How do I clear the error log without having to reboot the machine?
>
> The DELL software (5.2) is installed (omreport, omconfig, omexec,
> omupdate), the machine is running OpenSuSE 10.2 64Bit.
There are two logs, and you may have to clear both:
omconfig system alertlog action=clear
omconfig system esmlog action=clear
Then restart the OM services:
srvadmin-services.sh restart
That should be enough to remove the error.
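If you need to do this on more than one box, the three steps above can be
wrapped in a small shell function. This is only a sketch: it assumes
omconfig and srvadmin-services.sh are in root's PATH, and the DRY_RUN
switch and the function names are made up for illustration, not part of
the DELL tools.

```shell
#!/bin/sh
# Sketch: clear both OpenManage logs, then restart the OM services.
# DRY_RUN is a hypothetical switch (not an OMSA feature): when set to 1,
# each command is printed instead of executed.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

clear_om_logs() {
    # Stop at the first failing step so the services are not restarted
    # after a failed clear.
    run omconfig system alertlog action=clear &&
    run omconfig system esmlog action=clear &&
    run srvadmin-services.sh restart
}
```

Source the file and call clear_om_logs as root; setting DRY_RUN=1 first
only prints the commands. Afterwards, "omreport system esmlog" should
show whether the log is really empty.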
On the other hand, you should probably replace the failed memory
module instead of pretending that it never happened... ;)
Cheers,
--
Trond H. Amundsen <t.h.amundsen at usit.uio.no>
Center for Information Technology Services, University of Oslo
More information about the Linux-PowerEdge mailing list