[Wien] Benchmark on opteron - test on a fully loaded computer

Michael Gurnett michael.gurnett at kau.se
Fri Oct 20 16:51:57 CEST 2006


When I split the k-points over two cores in lapw1, I max out both cores.
Running the benchmark test with NUMTHREADS=2, I get only about 150%. So
Wien is much more efficient with k-point splitting than with 2 threads
on a single k-point. I will test the benchmark with two k-points and let
you know how it goes.

Conroe 6600 @ 3150 MHz (FSB 350x9, DDR2 700), benchmark test: 79 seconds.
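
For readers comparing the numbers in this thread, here is a minimal Python sketch of the efficiency arithmetic. The function name is hypothetical, and reading the "150%" figure as the combined CPU utilisation of two cores is an assumption, not something stated in the message. Using the timings quoted in Florent's message below, the stated 91% matches a 215 s single-job baseline; with the 211 s baseline the ratio comes out nearer 89%, and the message does not say which baseline was used.

```python
def loaded_efficiency(t_single, t_loaded):
    """Efficiency of a fully loaded node: time of one job running alone
    divided by the time when every core runs one such job at once."""
    return t_single / t_loaded


# Timings quoted in the message below (seconds).
T_LOADED = 237.0          # 4 k-points on 4 cores
T_SINGLE_SERIAL = 215.0   # official benchmark, OMP_NUM_THREADS=1
T_SINGLE_KPOINT = 211.0   # one k-point via the parallel script

eff_215 = loaded_efficiency(T_SINGLE_SERIAL, T_LOADED)
eff_211 = loaded_efficiency(T_SINGLE_KPOINT, T_LOADED)

# ASSUMPTION: "about 150%" is taken as total CPU utilisation over 2 cores,
# i.e. each core is busy about 75% of the time in the threaded run.
thread_utilisation = 1.5 / 2

print(f"k-point efficiency (215 s baseline): {eff_215:.1%}")
print(f"k-point efficiency (211 s baseline): {eff_211:.1%}")
print(f"per-core utilisation with 2 threads: {thread_utilisation:.0%}")
```

Either baseline puts k-point splitting well above the threaded run under that assumption, which is consistent with the observation above.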

On Fri, 2006-10-20 at 16:03 +0200, Florent Boucher wrote:
> Dear Wien user,
> I am currently running some benchmarks on an Opteron system.
> Here is the configuration of the node:
> 
> - dual Opteron 2214 (dual-core at 2.2 GHz, 64-bit, 2 x 1 MB cache)
> - 4 GB of DDR2 ECC Reg memory, PC5300, 667 MHz
> 
> I compiled with the following options:
> Intel Fortran for EM64T: 9.1.036
> FOPT =  -FR -w -mp1 -prec-div -pc80 -pad -ip -O3
> R_LIBS = ../SRC_lib/liblapack_lapw.a /opt/goto/64/libgoto_opteronp-r1.07.so -lpthread
> 
> For the official benchmark I got a CPU time of 215 s
> (OMP_NUM_THREADS=1), which seems consistent with the published
> result for
> AMD-Opteron, single cpu, 2.4 Ghz   196 sec    ifort64 + libgoto_opteron64p-r1.00.so
> (we have 2.2 GHz and I use OMP_NUM_THREADS=1)
> 
> It is clear that the Pentium processors have the best performance in
> the single-CPU test, but I think one should also look at the behavior
> on a fully loaded computer.
> 
> To test whether the memory bandwidth of the node saturates (which
> could be the major difference between the AMD and Pentium
> architectures), I ran a k-point-parallel job using the official
> WIEN benchmark. As I have access to 4 cores (2 CPUs x 2 cores), I
> put 4 k-points into the klist file (one can simply repeat the first
> k-point 4 times) and ran
> x lapw1 -c -p
> with the following options:
> USEREMOTE=0
> .machines file:
> 1:localhost
> 1:localhost
> 1:localhost
> 1:localhost
> 
> The results are very good (efficiency is 91%).
> The calculation on 4 cores for 4 k-points took 237 s (total CPU
> usage 100%), compared to 211 s for one k-point (total CPU usage 25%).
> 
> This is really the type of benchmark we are interested in: one where
> the CPU is fully loaded.
> 
> For those who have access to the new Pentium processors: would it be
> possible to see how they behave when the node is fully loaded? Do they
> still perform better than AMD? And what is the efficiency?
> 
> I will soon have access to a node with 4 AMD CPUs (2 cores each), so
> I will be able to test with 8 k-points at the same time.
> I will let you know about the results.
> Regards,
> Florent
> 
> PS: I would be really interested in load tests on the Pentium
> architecture, and I think such results should be mentioned on the
> WIEN2K page. Also, for the benefit of other users, all the compiler
> options should be stated precisely.
> 
> 
> 
> -- 
>  -------------------------------------------------------------------------
> | Florent BOUCHER                    |                                    |
> | Institut des Matériaux Jean Rouxel | Mailto:Florent.Boucher at cnrs-imn.fr |
> | 2, rue de la Houssinière           | Phone: (33) 2 40 37 39 24          |
> | BP 32229                           | Fax:   (33) 2 40 37 39 95          |
> | 44322 NANTES CEDEX 3 (FRANCE)      | http://www.cnrs-imn.fr             |
>  -------------------------------------------------------------------------
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
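
To repeat the fully loaded test quoted above on a node with a different core count (for example the 8-core node mentioned), the `.machines` lines can be generated rather than typed. This is only a sketch: the helper name `machines_lines` is hypothetical, it produces nothing beyond the `1:localhost` entries shown in the message, and a real `.machines` file may need further lines that the message does not show.

```python
def machines_lines(n_jobs, host="localhost", weight=1):
    """Return one 'weight:host' entry per k-point job, matching the
    '1:localhost' lines quoted in the message."""
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    return [f"{weight}:{host}" for _ in range(n_jobs)]


# Entries for the 4-core test described in the quoted message.
four_core = machines_lines(4)
print("\n".join(four_core))
```

Redirecting the printed output into `.machines` in the case directory, with one entry per core, reproduces the setup described in the message.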

