[Linux-PowerEdge] C6145 ECC Error, how to find bad DIMM?
griznog at gmail.com
Mon Jan 28 15:15:21 CST 2013
I have a C6145 server that has all 32 DIMM slots filled and is
spontaneously rebooting several times per week. Each reboot shows up in the
SEL log as this (as viewed with ipmitool):
76 | 01/28/2013 | 13:03:35 | Unknown #0x81 |
77 | 01/28/2013 | 13:03:37 | Memory #0x60 | Uncorrectable ECC | Asserted
78 | 01/28/2013 | 13:04:55 | System Firmware Error #0x06 | Unknown Error |
79 | 01/28/2013 | 13:05:06 | System Event #0x85 | OEM System boot event |
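Since these entries are pipe-delimited, a short script can pull out just the memory events and tally them by sensor number across many reboots. The following is only a rough sketch: it assumes the field layout is exactly what is shown in the four lines above, and the names `parse_sel` and `memory_events` are made up for illustration. It does not map #0x60 to a slot; it only shows whether the same sensor number is reported each time.

```python
from collections import Counter

# Sample lines copied from the SEL output above.
SEL_TEXT = """\
76 | 01/28/2013 | 13:03:35 | Unknown #0x81 |
77 | 01/28/2013 | 13:03:37 | Memory #0x60 | Uncorrectable ECC | Asserted
78 | 01/28/2013 | 13:04:55 | System Firmware Error #0x06 | Unknown Error |
79 | 01/28/2013 | 13:05:06 | System Event #0x85 | OEM System boot event |
"""


def parse_sel(text):
    """Split each SEL line on '|' and return a list of dicts (hypothetical helper)."""
    entries = []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        # The fourth field looks like "Memory #0x60": sensor type, then sensor number.
        sensor_type, sep, sensor_num = fields[3].rpartition(" #")
        if not sep:
            sensor_type, sensor_num = fields[3], ""
        entries.append({
            "id": fields[0],
            "date": fields[1],
            "time": fields[2],
            "sensor_type": sensor_type,
            "sensor_num": sensor_num,
            "event": fields[4] if len(fields) > 4 else "",
            "state": fields[5] if len(fields) > 5 else "",
        })
    return entries


def memory_events(entries):
    """Keep only entries whose sensor type is 'Memory' (hypothetical helper)."""
    return [e for e in entries if e["sensor_type"] == "Memory"]


if __name__ == "__main__":
    events = memory_events(parse_sel(SEL_TEXT))
    # Count how often each memory sensor number appears.
    for sensor, count in Counter(e["sensor_num"] for e in events).items():
        print(sensor, count)
```

Feeding it the full SEL output instead of the four sample lines would show whether every reboot reports the same sensor number or several different ones.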
Does anyone know how I can map #0x60 back to a specific DIMM slot or even
to a specific bank/CPU? I'm really not looking forward to searching through
32 DIMMs, swapping them one at a time and waiting to see if I get another