<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Dear WIEN users,<br>
I am currently running some benchmarks on an Opteron system.<br>
Here is the configuration of the node:<br>
<font face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial;"><br>
- dual Opteron 2214
(dual-core at 2.2 GHz, 64-bit, 2 x 1 MB cache)</span></font><br>
<font face="Arial" size="2"><span
style="font-size: 10pt; font-family: Arial;">- 4 GB of DDR2
ECC Reg PC5300 667 MHz memory<br>
</span></font><br>
I compiled with the following compiler and options:<br>
Intel Fortran for EM64T, version 9.1.036<br>
<tt>FOPT = -FR -w -mp1 -prec-div -pc80 -pad -ip -O3<br>
R_LIBS = ../SRC_lib/liblapack_lapw.a
/opt/goto/64/libgoto_opteronp-r1.07.so -lpthread<br>
</tt><br>
For the official benchmark I got a CPU time of 215 s (OMP_NUM_THREADS=1),
which seems to me to agree with the published result for <br>
<pre>AMD-Opteron, single cpu, 2.4 Ghz 196 sec ifort64 + libgoto_opteron64p-r1.00.so
</pre>
(our CPUs run at 2.2 GHz, and I use OMP_NUM_THREADS=1)<br>
<br>
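As a rough cross-check of these two numbers, here is a minimal Python sketch. It assumes that the run time scales inversely with the clock frequency, which is only an approximation (it ignores memory speed and the different libgoto versions). The variable and function names are purely illustrative.<br>
<pre>
```python
# Rough clock-scaling check.
# ASSUMPTION: run time scales as 1 / clock frequency (a simplification).
published_time_s = 196.0    # published Opteron result at 2.4 GHz
published_clock_ghz = 2.4
our_clock_ghz = 2.2         # Opteron 2214 used in this test
measured_time_s = 215.0     # measured here with OMP_NUM_THREADS=1


def scale_time(time_s, from_ghz, to_ghz):
    """Scale a run time from one clock frequency to another (1/f model)."""
    return time_s * from_ghz / to_ghz


expected_time_s = scale_time(published_time_s, published_clock_ghz, our_clock_ghz)
deviation_pct = 100.0 * (measured_time_s - expected_time_s) / expected_time_s

print(f"expected at {our_clock_ghz} GHz: {expected_time_s:.1f} s")
print(f"measured: {measured_time_s:.1f} s ({deviation_pct:+.1f}% vs. estimate)")
```
</pre>
Under this assumption the published 196 s corresponds to about 214 s at 2.2 GHz, which is close to the 215 s measured here.<br>
<br>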
It is clear that the Pentium processors have the best performance in the
single-CPU test, but I think one should also care about the behavior on a
fully loaded computer.<br>
<br>
In order to test the saturation of the memory bandwidth on the node
(which could be the major difference between the AMD and Pentium
architectures), I ran a k-point parallel job using the
official WIEN benchmark. As I have access to 4 cores (2 CPUs x 2 cores),
I put 4 k-points into the klist file (one can simply repeat the first
k-point 4 times) and ran<br>
<tt>x lapw1 -c -p</tt><br>
with the following options:<br>
<tt>USEREMOTE=0<br>
.machines file:<br>
1:localhost<br>
1:localhost<br>
1:localhost<br>
1:localhost<br>
<br>
</tt>The results are very good (efficiency is 91%).<br>
The calculation on 4 cores for 4 k-points took 237 s (total CPU usage 100%),
compared to 211 s for one k-point (total CPU usage 25%).<br>
<br>
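The efficiency figure can be reproduced with a short Python sketch. It assumes the usual definition for independent jobs, efficiency = (time for one k-point alone) / (time for n k-points on n cores). The names are illustrative, and both single-run times quoted in this message (215 s and 211 s) are evaluated.<br>
<pre>
```python
# Efficiency of the k-point parallel run on a fully loaded node.
# ASSUMPTION: efficiency = t_single / t_loaded for n independent jobs on n cores.
n_cores = 4
t_loaded_s = 237.0  # 4 k-points on 4 cores, from this message


def efficiency(t_single_s, t_loaded_s):
    """Ratio of the single-job time to the time with all cores busy."""
    return t_single_s / t_loaded_s


# Both single-run times quoted in this message are tried.
for t_single_s in (215.0, 211.0):
    eff = efficiency(t_single_s, t_loaded_s)
    speedup = n_cores * eff  # throughput gain compared with one core
    print(f"t_single = {t_single_s:.0f} s: efficiency {100 * eff:.1f}%, "
          f"throughput speed-up {speedup:.2f} on {n_cores} cores")
```
</pre>
With 215 s this gives about 91%; with 211 s it gives about 89%.<br>
<br>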
This is really the type of benchmark we are interested in: one where the
CPU is fully loaded.<br>
<br>
<big><b>For those who have access to the new Pentium processors, would
it be possible to see how they behave when the node is fully loaded? Do
they still perform better than AMD? And what is the efficiency?</b></big><br>
<br>
I will soon have access to a node with 4 dual-core AMD CPUs, so I will be
able to test with 8 k-points at the same time.<br>
I will let you know the results.<br>
Regards,<br>
Florent<tt><br>
<br>
PS: I would be really interested in a test under load on the Pentium
architecture, and I think such results should be mentioned on the WIEN2k
page. Also, for the benefit of other users, all the compiler options should
be stated precisely.<br>
<br>
<br>
<br>
</tt>
<pre class="moz-signature" cols="80">--
-------------------------------------------------------------------------
| Florent BOUCHER | |
| Institut des Matériaux Jean Rouxel | <a class="moz-txt-link-freetext"
href="Mailto:Florent.Boucher@cnrs-imn.fr">Mailto:Florent.Boucher@cnrs-imn.fr</a> |
| 2, rue de la Houssinière | Phone: (33) 2 40 37 39 24 |
| BP 32229 | Fax: (33) 2 40 37 39 95 |
| 44322 NANTES CEDEX 3 (FRANCE) | <a class="moz-txt-link-freetext"
href="http://www.cnrs-imn.fr">http://www.cnrs-imn.fr</a> |
-------------------------------------------------------------------------</pre>
</body>
</html>