[Wien] Benchmark on opteron - test on a fully loaded computer
Michael Gurnett
michael.gurnett at kau.se
Fri Oct 20 16:51:57 CEST 2006
When I split the k-points over two cores in lapw1, I max out both cores.
Running the benchmark test with NUMTHREADS=2, I get only about 150%. So
Wien is much more efficient with k-point splitting than with 2 threads on
a single k-point. I will run the benchmark with two k-points and let you
know how it goes.
Conroe 6600 @ 3150 MHz (FSB 350x9, DDR2 700) benchmark test: 79 seconds.
On Fri, 2006-10-20 at 16:03 +0200, Florent Boucher wrote:
> Dear Wien user,
> I am currently running some benchmarks on an Opteron system.
> Here is the configuration of the node:
>
> - dual Opteron 2214 (dual-core at 2.2 GHz, 64-bit, 2 x 1 MB cache)
> - 4 GB of DDR2 ECC Reg PC5300 667 MHz memory
>
> I compiled with the following options:
> Intel Fortran for EM64T: 9.1.036
> FOPT = -FR -w -mp1 -prec-div -pc80 -pad -ip -O3
> R_LIBS = ../SRC_lib/liblapack_lapw.a /opt/goto/64/libgoto_opteronp-r1.07.so -lpthread
>
> For the official benchmark I got a CPU time of 215 s
> (OMP_NUM_THREADS=1), which seems to me to agree with the published
> result:
> AMD-Opteron, single cpu, 2.4 Ghz 196 sec ifort64 + libgoto_opteron64p-r1.00.so
> (we have 2.2 GHz and I use OMP_NUM_THREADS=1)
>
> It is clear that the Pentium processors have the best performance in
> the single-CPU test, but I think one should also look at the behavior
> on a fully loaded computer.
>
> To test for saturation of the memory bandwidth on the node (which
> could be the main difference between the AMD and Pentium
> architectures), I ran a k-point-parallel job using the official WIEN
> benchmark. As I have access to 4 cores (2 CPUs x 2 cores), I put
> 4 k-points into the klist file (one can simply repeat the first
> k-point 4 times) and ran
> x lapw1 -c -p
> with the following settings:
> USEREMOTE=0
> .machines file:
> 1:localhost
> 1:localhost
> 1:localhost
> 1:localhost
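
For other core counts, a .machines file of the same shape can be generated rather than typed by hand. The Python sketch below is illustrative only: the helper name `machines_text` and its parameters are hypothetical, and it reproduces just the `1:localhost` lines quoted above. A real setup may need further entries that this message does not show.

```python
def machines_text(n_jobs, host="localhost", weight=1):
    """Return n_jobs lines of the form 'weight:host', one per k-point job.

    Hypothetical helper: it only mirrors the four-line file quoted in the post.
    """
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    return "".join(f"{weight}:{host}\n" for _ in range(n_jobs))


if __name__ == "__main__":
    # Four jobs on the local node, matching the 2 CPUs x 2 cores case.
    print(machines_text(4), end="")
```

Redirecting the printed output to `.machines` in the case directory would give the file listed above; for the planned 8-core node one would presumably call `machines_text(8)`.
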
>
> The results are very good (efficiency is 91%).
> The calculation on 4 cores for 4 k-points took 237 s (total CPU used:
> 100%), compared to 211 s for one k-point (total CPU used: 25%).
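
The efficiency figure can be checked with a few lines of Python. This is a minimal sketch that assumes efficiency means the single-job time divided by the time of the fully loaded run, since each core does the same amount of work in both cases; the function name `parallel_efficiency` is hypothetical. Note that the quoted 91% corresponds to the 215 s single-job time reported earlier in the message, while the 211 s figure gives about 89%; the message does not say which baseline was intended.

```python
def parallel_efficiency(t_single, t_loaded):
    """Single-job time divided by the per-job time on the fully loaded node."""
    return t_single / t_loaded


T_LOADED = 237.0  # seconds: 4 k-points on 4 cores, from the post

# Both single-job times appear in the post.
for t_single in (211.0, 215.0):
    eff = parallel_efficiency(t_single, T_LOADED)
    print(f"{t_single:.0f} s / {T_LOADED:.0f} s = {eff:.1%}")
# prints:
# 211 s / 237 s = 89.0%
# 215 s / 237 s = 90.7%
```
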
>
> This is really the type of benchmark we are interested in: one where
> the CPU is fully loaded.
>
> For those who have access to the new Pentium processors, would it be
> possible to see how they behave when the node is fully loaded? Do they
> still perform better than AMD? And what is the efficiency?
>
> I will soon have access to a node with 4 dual-core AMD CPUs, so I will
> be able to test with 8 k-points at the same time.
> I will let you know the results.
> Regards,
> Florent
>
> PS: I would be very interested in a test under load on the Pentium
> architecture, and I think such results should be mentioned on the
> WIEN2K page. Also, for the benefit of other users, all the compiler
> options should be stated precisely.
>
>
>
> --
> -------------------------------------------------------------------------
> | Florent BOUCHER | |
> | Institut des Matériaux Jean Rouxel | Mailto:Florent.Boucher at cnrs-imn.fr |
> | 2, rue de la Houssinière | Phone: (33) 2 40 37 39 24 |
> | BP 32229 | Fax: (33) 2 40 37 39 95 |
> | 44322 NANTES CEDEX 3 (FRANCE) | http://www.cnrs-imn.fr |
> -------------------------------------------------------------------------