scanning for bad ram

Nick_Parrott at Dell.com
Thu Nov 8 04:55:01 CST 2007


Again, with reference to Kuba's response:

" To do proper RAM testing in an on-line system requires kernel support
(kernel pages need to be moved around in physical memory for the
duration of the test). I don't know that such a thing exists, or if it's
used by OMSA.

The only sane way to test RAM is by booting with memtest86. There are
really no alternatives."

I have to agree entirely: omdiag system memory will not be able to test
the DIMMs in full, and I don't believe it does as extreme a test as
MPmemory. If you want an accurate result, you have to do the test
offline.

NB: I've NEVER utilised results of omdiag system memory for
troubleshooting. I've always used the offline test; it's the only way to
know for sure.
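
As a rough illustration of the swap test described further down this
thread (swap the suspect DIMM with a known-good one, clear the log, and
re-run the diagnostic), the decision logic might be sketched in Python
as below. The TestRun record, the diagnose_swap function and the slot
names are hypothetical, made up for this sketch; they are not part of
MPmemory, OMSA or any Dell tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TestRun:
    """Outcome of one offline memory test pass (hypothetical record)."""
    failing_slot: Optional[str]  # e.g. "DIMM1", or None if the pass was clean


def diagnose_swap(suspect_slot: str, good_slot: str,
                  before: TestRun, after: TestRun) -> str:
    """Compare the failing slot before and after swapping two DIMMs.

    suspect_slot: slot that held the suspect DIMM during the first run.
    good_slot:    slot that held the known-good DIMM during the first run.
    After the swap, the suspect DIMM sits in good_slot and vice versa.
    """
    if before.failing_slot != suspect_slot:
        return "inconclusive: first run did not fail in the suspect slot"
    if after.failing_slot == good_slot:
        # The error moved together with the module.
        return "fault follows the DIMM"
    if after.failing_slot == suspect_slot:
        # The error stayed put even with a good module installed.
        return "fault stays with the slot"
    if after.failing_slot is None:
        return "inconclusive: fault did not reproduce after the swap"
    return "inconclusive: failure appeared in an unrelated slot"


if __name__ == "__main__":
    first = TestRun(failing_slot="DIMM1")
    second = TestRun(failing_slot="DIMM2")
    print(diagnose_swap("DIMM1", "DIMM2", first, second))
```

The sketch only encodes the comparison; the failing slot for each run
would still have to be read by hand from the diagnostic output or the
SEL after clearing it between runs.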


-----Original Message-----
From: Kewley, David 
Sent: 07 November 2007 20:54
To: Parrott, Nick
Subject: RE: scanning for bad ram 

Nick,

OMSA 5.0.0 has 'omdiag system memory'.  Would you not recommend that to
customers?  I don't remember using it, only know about its existence.

David 

> -----Original Message-----
> From: linux-poweredge-bounces at dell.com 
> [mailto:linux-poweredge-bounces at dell.com] On Behalf Of 
> Nick_Parrott at dell.com
> Sent: Wednesday, November 07, 2007 7:55 AM
> To: razor at meganet.net; linux-poweredge-Lists
> Subject: RE: scanning for bad ram 
> 
> Hi Paul,
> 
> There are no diags within OMSA; you need to boot off some diagnostics
> media and run the MPmemory diagnostic. 
> 
> Clear the ESM/SEL log first, then run the test to see if anything fails.
> If it does, you need to pop the lid and swap the faulty DIMM with a good
> DIMM, clear the log again, run the diagnostic and confirm whether the
> fault follows the DIMM or the slot, then call support to get parts
> replaced if under warranty.
> 
> MPmemory shows the SEL events, so clearing the log is a good idea to
> avoid confusion.
> 
> Regards,
> 
> Nick
> 
> -----Original Message-----
> From: Paul A [mailto:razor at meganet.net] 
> Sent: 06 November 2007 19:43
> To: Parrott, Nick; linux-poweredge-Lists
> Subject: RE: scanning for bad ram 
> 
> Nick, thanks for the information. 
> 
> The reason I'm asking is because we have 3 1900's bought refurbished and
> one application is exiting with status 11 (SIGSEGV) on two of the
> servers. The provider of the software tells me it's probably due to
> hardware or RAM failure.
> 
> Can I run OMSA and test the RAM hardware while the server is up, and
> will it affect data stored in memory? I can always take one of the
> servers I'm testing offline if it does. 
> 
> paul
> ________________________________________
> From: Nick_Parrott at Dell.com [mailto:Nick_Parrott at Dell.com] 
> Sent: Tuesday, November 06, 2007 1:19 PM
> To: razor at meganet.net; linux-poweredge at lists.us.dell.com
> Subject: RE: scanning for bad ram 
> 
> It will indeed, both single-bit errors and multi-bit errors.
> 
> Single-bit errors are ECC recoverable, so you won't see any effect on
> applications; multi-bit errors you'll probably know about.
> 
> Any fault that the BMC (Baseboard Management Controller) logs will be
> displayed in OMSA under "Logs" > "BMC/SEL"
> 
> Use Dell MPmemory to test DIMMs offline; it's on the 32-bit Diagnostics
> CDs. It's a modified memtest86; memtest86 itself, however, will fail
> immediately as the system BIOS reserves a very small portion of memory
> space upon boot.
> 
> Nick
> 
> From: linux-poweredge-bounces at dell.com
> [mailto:linux-poweredge-bounces at dell.com] On Behalf Of Paul A
> Sent: 06 November 2007 17:49
> To: linux-poweredge-Lists
> Subject: scanning for bad ram 
> 
> I have yet to install OMSA on my 1950's. I know it will monitor disks,
> but will it monitor and report problems with RAM?
> Paul
> 
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> http://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq
> 


