[Wien] Quad Core Benchmark

Gerhard Fecher fecher at uni-mainz.de
Mon Jan 7 10:05:13 CET 2008


Similar things happen to me.
Four threads seem to be very unstable (also when using the new MKL_NUM_THREADS instead of OMP_NUM_THREADS).

Do you have all memory slots filled? Which mainboard are you using?

My timings, on 2 dual-core Xeon 5160 CPUs with Wien2k 07.3, ifort 10.1.008 (-FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -xT) and mkl 10.0.011:
1 Job   1 Thread          115 Secs
2 Jobs  1 Thread Each     119 Secs Each
1 Job   2 Threads          75 Secs
2 Jobs  2 Threads Each     93 Secs Each
4 Jobs  1 Thread Each     178 Secs Each

This is slightly slower than expected from the pure clock rate (3 GHz versus 2.13 GHz) when compared to your results (I have not yet checked the 08.1 version).
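
As a rough check of the clock-rate comparison, the two single-job timings quoted in this thread can be scaled by the clock ratio. This is only a sketch: it assumes run time is inversely proportional to clock frequency and ignores memory bandwidth, cache size and library-version differences. The variable names are illustrative.

```python
# Timings quoted in this thread for 1 job with 1 thread (seconds).
t_x3210 = 140.0   # Xeon X3210, 2.13 GHz
t_5160 = 115.0    # Xeon 5160, 3 GHz

f_x3210 = 2.13    # clock in GHz
f_5160 = 3.0      # clock in GHz

# Assumption: run time is inversely proportional to clock frequency.
t_expected = t_x3210 * f_x3210 / f_5160

print(f"expected from clock rate: {t_expected:.1f} s")
print(f"measured on Xeon 5160:    {t_5160:.1f} s")
print(f"measured / expected:      {t_5160 / t_expected:.2f}")
```

Under that assumption the Xeon 5160 would need about 99 seconds, so the measured 115 seconds is roughly 16% above the pure clock-rate estimate.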

Ciao
Gerhard

PS.: Quad Core may be a little misleading because of the Quad Core Xeons

________________________________________
From: wien-bounces at zeus.theochem.tuwien.ac.at [wien-bounces at zeus.theochem.tuwien.ac.at] on behalf of Laurence Marks [L-marks at northwestern.edu]
Sent: Monday, 7 January 2008 01:32
To: A Mailing list for WIEN2k users
Subject: [Wien] Quad Core Benchmark

Minor format correction (for clarity)

Xeon X3210 2.13GHz Quad Core (2x2Duo Core)
ifort 10.1.008 cmkl 10.0.1.014
FOPT =  -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -xT
-mtune=core2 -O3 -thread -fminshared
LDFLAGS = $(FOPT) -L/opt/intel/mkl/10.0.1.014/lib/em64t -static
R_LIBS = -lmkl_lapack -lguide -pthread

1 Job   1 Thread          140 Secs
2 Jobs  1 Thread Each     150 Secs Each
1 Job   2 Threads          88 Secs
2 Jobs  2 Threads Each    112 Secs Each
4 Jobs  1 Thread Each     228 Secs Each
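
One way to compare these configurations is by total throughput of the machine rather than by time per job. The sketch below uses only the X3210 timings listed above. It assumes that concurrent jobs are identical and all finish in the quoted per-job time; the function and variable names are illustrative.

```python
# (jobs, threads per job, seconds per job) for the Xeon X3210 timings.
runs = [
    (1, 1, 140),
    (2, 1, 150),
    (1, 2, 88),
    (2, 2, 112),
    (4, 1, 228),
]

def jobs_per_hour(jobs, seconds_each):
    """Throughput when `jobs` identical jobs run side by side."""
    return jobs * 3600.0 / seconds_each

throughput = {(j, t): jobs_per_hour(j, s) for j, t, s in runs}

for (j, t), rate in throughput.items():
    print(f"{j} job(s) x {t} thread(s): {rate:5.1f} jobs/hour")

# Configuration with the highest throughput for these numbers.
best = max(throughput, key=throughput.get)
print("highest throughput:", best)
```

With these numbers, 2 jobs with 2 threads each and 4 jobs with 1 thread each come out close together (about 64 and 63 jobs per hour), so the difference may well be within run-to-run variation.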

MPI Performance
Times only (Note: CPU times for 2 Threads are not correct; they are a sum over threads)
1 MPI, 1 Node, 1 Thread     1423 Secs   HAMILT (CPU ) = 223.0, HNS = 174.1, DIAG = 1021.4
1 MPI, 1 Node, 2 Threads    1038 Secs   HAMILT (CPU ) = 385.4, HNS = 194.3, DIAG = 1430.8
2 MPI, 1 Node, 1 Thread     1242 Secs   HAMILT (WALL) = 120.3, HNS = 129.8, DIAG =  988.4
2 MPI, 1 Node, 2 Threads    1105 Secs   HAMILT (WALL) = 130.6, HNS = 112.1, DIAG =  859.5
4 MPI, 1 Node, 1 Thread     1175 Secs   HAMILT (WALL) =  80.1, HNS = 116.8, DIAG =  977.9
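
The total times can be turned into speedups relative to the 1 MPI, 1 Thread run. This sketch assumes that the first time column is the total elapsed time of each run, which the table does not state explicitly. The per-routine columns are left out because they mix CPU and wall-clock times.

```python
# (MPI processes, threads per process) -> total time in seconds.
total_secs = {
    (1, 1): 1423,
    (1, 2): 1038,
    (2, 1): 1242,
    (2, 2): 1105,
    (4, 1): 1175,
}

baseline = total_secs[(1, 1)]

# Speedup relative to the single-process, single-thread run.
speedup = {cfg: baseline / secs for cfg, secs in total_secs.items()}

for (mpi, threads), s in speedup.items():
    print(f"{mpi} MPI x {threads} thread(s): speedup {s:.2f}")
```

For these numbers every configuration stays below a speedup of 1.4, with 1 MPI and 2 Threads the fastest.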

Comments:
1) Intel has introduced a host of new environment variables, so it
might be possible to do better than this with the "right" options, but
probably not by much.
2) Even though OMP_NUM_THREADS was set to 1 or 2, the documentation
indicates that this setting may not be honored.
3) 1 Job with 4 Threads is unstable. At best it takes perhaps 80
seconds; at worst it crashes.
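
For repeatable thread-count tests, both variables mentioned in this thread (OMP_NUM_THREADS and MKL_NUM_THREADS) can be set explicitly for each run. The following is a minimal sketch: the helper name `threaded_env` is hypothetical, and the child process is only a placeholder that echoes the variables, to be replaced by the actual program under test.

```python
import os
import subprocess
import sys

def threaded_env(n_threads):
    """Return a copy of the environment with both thread variables set."""
    env = dict(os.environ)
    env["OMP_NUM_THREADS"] = str(n_threads)
    env["MKL_NUM_THREADS"] = str(n_threads)
    return env

# Placeholder command: it only prints the variables the child process sees.
# Replace `cmd` with the real benchmark command.
cmd = [
    sys.executable,
    "-c",
    "import os; print(os.environ['OMP_NUM_THREADS'], os.environ['MKL_NUM_THREADS'])",
]

result = subprocess.run(
    cmd, env=threaded_env(2), capture_output=True, text=True, check=True
)
print(result.stdout.strip())
```

Setting the variables only controls what the child process is told. As comment 2 notes, the library may still choose a different number of threads.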


--

Laurence Marks
Department of Materials Science and Engineering
MSE Rm 2036 Cook Hall
2220 N Campus Drive
Northwestern University
Evanston, IL 60208, USA
Tel: (847) 491-3996 Fax: (847) 491-7820
email: L-marks at northwestern dot edu
Web: www.numis.northwestern.edu
Commission on Electron Diffraction of IUCR
www.numis.northwestern.edu/IUCR_CED



_______________________________________________
Wien mailing list
Wien at zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien

