Clovertown HPL scores

Brent_Clements at Dell.com
Sat Mar 31 13:20:48 CST 2007


My general rule of thumb is to set aside one month for Top500 tuning runs. Intel's HPL helps because its ASYOUGO2 and ENDEARLY options can be used together for quick tuning runs.
 
 

________________________________

From: linux-poweredge-bounces at dell.com on behalf of Robin Humble
Sent: Sat 3/31/2007 4:17 AM
To: linux-poweredge-Lists
Subject: Re: Clovertown HPL scores



On Tue, Mar 27, 2007 at 05:47:08PM +0200, Peter Kjellstrom wrote:
>On Tuesday 27 March 2007, Kilian CAVALOTTI wrote:
>...
>> I was rather looking for a way to determine HPL.dat parameters in a
>> rational way. Playing around with the settings is not really easy when you
>> don't know if the parameters are correlated, and in which direction you
>> should go. For instance, are N and NB related in a way, or are they
>> totally independent?
>I think that it would be nice if N/P and N/Q are evenly divisible by NB. But
>then again you have to keep N as high as possible to use as much RAM as you
>can... I typically loop over a bunch of NB for quite large N and then for the
>best candidate I optimize N. With that N/P/Q/NB config I try some other more
>exotic options (mostly by feeling or dice ;-).
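
[The two-stage search Peter describes (sweep NB at a fixed large N, then
tune N for the winning NB) could be sketched roughly as below. This is only
an illustration: run_hpl is a hypothetical callable, not part of HPL, that
is assumed to run one job for a given (N, NB) and return its GFLOPS, and
the candidate lists are whatever you choose to try.]

```python
def tune_hpl(run_hpl, n_large, nb_candidates, n_candidates):
    """Two-stage search: pick NB first, then N.

    run_hpl(n, nb) is a hypothetical user-supplied function that runs
    one HPL job and returns the measured GFLOPS (higher is better).
    """
    # Stage 1: hold N fixed at a fairly large value and sweep NB.
    best_nb = max(nb_candidates, key=lambda nb: run_hpl(n_large, nb))

    # Stage 2: keep the winning NB and sweep N.
    best_n = max(n_candidates, key=lambda n: run_hpl(n, best_nb))

    return best_n, best_nb
```

[This treats N and NB as if they could be tuned one after the other, which
is the shortcut described above, not a guarantee of finding the best pair.]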

cool! I've never tried tweaking N except to fill up ram... thanks for
that.
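
[For the "fill up ram" starting point, a minimal sketch of sizing N so that
the N x N matrix of 8-byte doubles fits in a chosen fraction of total RAM,
rounded down so that N/P and N/Q are both multiples of NB. The 80% default
is an assumption (a common rule of thumb, not a figure from this thread),
and the function name is made up for illustration.]

```python
import math


def hpl_problem_size(total_ram_bytes, nb, p, q, ram_fraction=0.8):
    """Largest N that fits in RAM with N/P and N/Q divisible by NB.

    ram_fraction is an assumed safety margin for the OS and MPI buffers.
    """
    # N must be a multiple of NB*P and of NB*Q, i.e. of NB*lcm(P, Q).
    step = nb * (p * q // math.gcd(p, q))

    # The matrix holds N*N doubles at 8 bytes each.
    max_elements = int(total_ram_bytes * ram_fraction) // 8
    n_max = math.isqrt(max_elements)

    n = (n_max // step) * step
    if n == 0:
        raise ValueError("not enough memory for even one block step")
    return n


# Example: 64 cores with 2 GiB each, an 8 x 8 grid and NB=212.
print(hpl_problem_size(64 * 2 * 2**30, nb=212, p=8, q=8))
```

[The result is only a starting value; the best N may well be smaller.]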

something to bear in mind is that the best NB for a cluster might not be
the best NB on a core or single node, as NB is related to the MPI
message size. eg. NB=212 is a good choice for the 64 Xeon 5150 cores of
a 16-node quad-core IB cluster with 2G ram/core. OpenMPI works well.
the best choice between the threaded and non-threaded Goto library also
changes as you add more nodes - generally threaded Goto is best within a
node, and unthreaded on large numbers of nodes - I'm not sure why that is
exactly, but possibly it changes the amount of asynchronous communication
that can be done. using the most recent Goto library (1.12 at the moment)
can make a huge difference, as Goto continues to tweak it brilliantly for
the newer cores.
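
[To make the link between NB and MPI message size concrete, here is a rough
estimate of the largest panel piece one process sends during the panel
broadcast: about N/P local rows by NB columns of 8-byte doubles. This is a
back-of-the-envelope sketch under those assumptions. It ignores pivot
information and block-cyclic remainders, and the panels shrink as the
factorization proceeds.]

```python
def first_panel_bytes_per_process(n, nb, p, bytes_per_element=8):
    """Approximate size in bytes of one process's share of the first panel.

    Assumes the N x NB panel is split evenly over the P process rows.
    """
    local_rows = -(-n // p)  # ceiling division
    return local_rows * nb * bytes_per_element


# Example: doubling NB doubles the estimated message size.
small = first_panel_bytes_per_process(100000, 106, 8)
large = first_panel_bytes_per_process(100000, 212, 8)
print(small, large)
```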

all this sort of HPL fiddling can take a LONG time. for a benchmark
that's not very related to most real codes, it's hard to justify the
time spent doing it.... OTOH it's cool to get higher on the top500 :)

cheers,
robin




