PowerEdge 1900 IERR 1410 CPU 1 IERR

Jonathan Dill jonathan at nerds.net
Wed Dec 6 11:22:49 CST 2006

Peter Kjellstrom wrote:
> Yes I would guess hardware problem. Have you tried stressing the system with 
> memtest, cpuburn or the likes?
Good point.  I am going to try those things tomorrow morning, and 
probably the Dell diags from the util partition.  This is a production 
Samba server at another location, and it seems stable as long as 
BackupPC isn't running, so I don't want to bring it down right now.

The advice I have from the Dell tech so far is that this error typically 
occurs when there is a problem with a PCI card, which in this case could 
be the PERC 5/i, the Intel PRO/1000, or possibly the riser.  He suggested 
removing the gigabit ethernet card to see whether the problem still 
occurs.

For my part, I have determined that the problem occurs when 
BackupPC_dump runs, and the system seems otherwise stable, so the 
problem may be triggered by heavy disk or network I/O.  I am going to 
try some stress tests in those areas tonight after business hours, then 
run BackupPC_dump manually with debugging enabled.  Also, I have 
switched back to the -server kernel rather than the -xeon kernel.  It's 
also possible that the problem is a CPU race condition triggered by 
BackupPC / Perl, or that something is not SMP-safe.
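
As a rough illustration of the disk side of such a stress test, here is 
a minimal Python sketch.  It is not part of BackupPC or the Dell diags; 
the function name disk_stress and its parameters are made up for this 
example.  It writes files of random data, reads them back, and counts 
checksum mismatches.  With the small default sizes the read-back will 
mostly come from the page cache, so the total would need to be raised 
well beyond installed RAM to exercise the controller and disks.

```python
import hashlib
import os
import tempfile


def disk_stress(directory, file_count=4, file_size=1024 * 1024,
                chunk_size=64 * 1024):
    """Write random files, read them back, return the number of
    files whose read-back checksum differs from what was written.

    Hypothetical helper for illustration only.
    """
    mismatches = 0
    for index in range(file_count):
        path = os.path.join(directory, "stress_%d.bin" % index)

        # Write phase: hash the data as it goes out to disk.
        written = hashlib.sha256()
        with open(path, "wb") as handle:
            remaining = file_size
            while remaining > 0:
                chunk = os.urandom(min(chunk_size, remaining))
                written.update(chunk)
                handle.write(chunk)
                remaining -= len(chunk)
            handle.flush()
            os.fsync(handle.fileno())  # push the data out of Python/OS buffers

        # Read phase: hash what comes back and compare.
        read_back = hashlib.sha256()
        with open(path, "rb") as handle:
            while True:
                chunk = handle.read(chunk_size)
                if not chunk:
                    break
                read_back.update(chunk)

        if written.digest() != read_back.digest():
            mismatches += 1
        os.remove(path)
    return mismatches


if __name__ == "__main__":
    # Assumption: a scratch directory on the volume under test.
    with tempfile.TemporaryDirectory() as scratch:
        print("checksum mismatches:", disk_stress(scratch))
```

Pointing the directory argument at the RAID volume and increasing 
file_count and file_size would generate sustained load; a machine that 
throws the IERR under this but not when idle would point toward the 
disk path rather than BackupPC itself.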


