Clovertown HPL scores

Brent_Clements at Dell.com
Tue Mar 27 04:08:26 CST 2007


Kilian,
 
I have only two suggestions. First, play around with the settings more and you will be rewarded. Second, don't use lam-mpi; you will want to use mvapich-0.9.8 or higher.
 
Brent Clements
 
 

________________________________

From: linux-poweredge-bounces at dell.com on behalf of Kilian CAVALOTTI
Sent: Mon 3/26/2007 9:38 PM
To: linux-poweredge-Lists
Subject: Clovertown HPL scores



Hi all,

I don't know if that's the right place to ask, but I guess some of the
readers of this mailing-list have a good knowledge of HPC benchmarking.

I'm trying to evaluate our cluster, and began to launch some Linpack runs.
Before scaling to the full-range cluster, I'd like to get the best
efficiency out of one single host. They are PowerEdge 1950, with two E5345
(Clovertown, quad-core, 2.33GHz) and 16GB memory each. So, if I made no
mistake, for a single host, the theoretical performance should be:
2 (CPUs) x 4 (cores) x 4 (ops/cycle) x 2.33G (cycles/s) = 74.56 Gflop/s

I compiled xhpl against the GotoBLAS library, and use LAM MPI. I played a
little bit with HPL.dat values (see below), and experimentally,
the best score I get, running 8 jobs on the same host, is 52 Gflop/s.
That's about 70% efficiency, which seems a little low to me. I would have
expected something more in the 80-90% range.
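
The arithmetic above can be double-checked with a short sketch. It only
uses the figures quoted in this message (2 sockets, 4 cores, 4 ops/cycle,
2.33 GHz, 52 Gflop/s measured); the variable names are illustrative.

```python
# Theoretical peak and HPL efficiency for one node, using the figures
# quoted in the message (not measured independently).
sockets = 2
cores_per_socket = 4
flops_per_cycle = 4       # ops per core per cycle, as stated in the message
clock_ghz = 2.33

peak_gflops = sockets * cores_per_socket * flops_per_cycle * clock_ghz
measured_gflops = 52.0    # best HPL score reported in the message
efficiency = measured_gflops / peak_gflops

print(f"peak       = {peak_gflops:.2f} Gflop/s")
print(f"efficiency = {efficiency:.1%}")
```

This gives a peak of 74.56 Gflop/s and an efficiency just under 70%,
matching the numbers in the text.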

Is 70% efficiency reasonable for local jobs? Or should I try to get more?
And if so, what would you advise to improve the score?

-----------------------------------------------------------------------
# HPLinpack benchmark input file
# Innovative Computing Laboratory, University of Tennessee
HPL.out     output file name (if any)
6           device out (6=stdout,7=stderr,file)
1           # of problems sizes (N)
44000       Ns
1           # of NBs
200         NBs
0           PMAP process mapping (0=Row-,1=Column-major)
1           # of process grids (P x Q)
2           Ps
4           Qs
16.0        threshold
1           # of panel fact
2           PFACTs (0=left, 1=Crout, 2=Right)
1           # of recursive stopping criterium
8           NBMINs (>= 1)
1           # of panels in recursion
2           NDIVs
1           # of recursive panel fact.
0           RFACTs (0=left, 1=Crout, 2=Right)
1           # of broadcast
1           BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1           # of lookahead depth
0           DEPTHs (>=0)
0           SWAP (0=bin-exch,1=long,2=mix)
64          swapping threshold
0           L1 in (0=transposed,1=no-transposed) form
0           U in (0=transposed,1=no-transposed) form
1           Equilibration (0=no,1=yes)
8           memory alignment in double (> 0)
-----------------------------------------------------------------------
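
As a rough sanity check on the Ns, Ps and Qs values above, the sketch
below computes how much memory the N x N double-precision matrix needs
and how many processes the grid implies. It assumes that "16GB" means
16 GiB; that reading is not confirmed by the message.

```python
# Memory footprint of the HPL matrix (N x N doubles, 8 bytes each) and
# process count for the grid in the HPL.dat above.
n = 44000                       # Ns from the HPL.dat above
p, q = 2, 4                     # Ps and Qs from the HPL.dat above
node_mem_bytes = 16 * 2**30     # assumption: "16GB" means 16 GiB

matrix_bytes = n * n * 8
fraction = matrix_bytes / node_mem_bytes

print(f"process grid = {p} x {q} = {p * q} processes")
print(f"matrix       = {matrix_bytes / 2**30:.2f} GiB "
      f"({fraction:.0%} of node memory)")
```

Under that assumption the matrix takes about 14.4 GiB, roughly 90% of the
node's memory, and the 2 x 4 grid matches the 8 cores of the host.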

Thanks for any hint you could provide,
--
Kilian

_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge at dell.com
http://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq 