[Wien] Quad Core Benchmark
Gerhard Fecher
fecher at uni-mainz.de
Mon Jan 7 10:05:13 CET 2008
Similar things happen to me.
4 threads seem to be very unstable (also when using the new MKL_NUM_THREADS instead of OMP_NUM_THREADS).
Do you have all memory slots filled? Which mainboard?
My timings are on 2 dual-core Xeon 5160 CPUs for Wien2k 07.3, ifort 10.1.008 (-FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -xT), mkl 10.0.011:
1 Job   1 Thread          115 Secs
2 Jobs  1 Thread Each     119 Secs Each
1 Job   2 Threads          75 Secs
2 Jobs  2 Threads Each     93 Secs Each
4 Jobs  1 Thread Each     178 Secs Each
This is slightly slower than expected from the pure clock rate (3 GHz versus 2.13 GHz) compared to your results (I have not yet checked the 08.1 version).
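
As a rough check of the clock-rate remark, the sketch below scales the X3210 timings quoted further down in this message by 2.13/3.0 and compares them with the Xeon 5160 timings above. It assumes run time scales inversely with clock frequency, which ignores memory bandwidth, cache and chipset differences. The function and variable names are illustrative only.

```python
# Timings in seconds, copied from the two benchmark lists in this thread.
CASES = [
    "1 job, 1 thread",
    "2 jobs, 1 thread each",
    "1 job, 2 threads",
    "2 jobs, 2 threads each",
    "4 jobs, 1 thread each",
]
X3210_SECS = [140.0, 150.0, 88.0, 112.0, 228.0]  # Xeon X3210, 2.13 GHz
X5160_SECS = [115.0, 119.0, 75.0, 93.0, 178.0]   # 2 x Xeon 5160, 3 GHz


def clock_scaled(seconds, f_from_ghz=2.13, f_to_ghz=3.0):
    """Estimate the run time at f_to_ghz, assuming time ~ 1 / clock."""
    return seconds * f_from_ghz / f_to_ghz


def slowdown_vs_expected():
    """Ratio of the measured 3 GHz time to the clock-scaled estimate."""
    return [
        measured / clock_scaled(reference)
        for reference, measured in zip(X3210_SECS, X5160_SECS)
    ]


if __name__ == "__main__":
    for case, ratio in zip(CASES, slowdown_vs_expected()):
        print(f"{case:24s} measured / expected = {ratio:.2f}")
```

With these numbers the measured times come out roughly 10 to 20 percent above the clock-scaled estimate.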
Ciao
Gerhard
PS: "Quad Core" may be a little misleading because of the Quad Core Xeons.
________________________________________
From: wien-bounces at zeus.theochem.tuwien.ac.at [wien-bounces at zeus.theochem.tuwien.ac.at] on behalf of Laurence Marks [L-marks at northwestern.edu]
Sent: Monday, 7 January 2008 01:32
To: A Mailing list for WIEN2k users
Subject: [Wien] Quad Core Benchmark
Minor format correction (for clarity)
Xeon X3210 2.13 GHz Quad Core (2 x Core 2 Duo)
ifort 10.1.008 cmkl 10.0.1.014
FOPT = -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -xT -mtune=core2 -O3 -thread -fminshared
LDFLAGS = $(FOPT) -L/opt/intel/mkl/10.0.1.014/lib/em64t -static
R_LIBS = -lmkl_lapack -lguide -pthread
1 Job   1 Thread          140 Secs
2 Jobs  1 Thread Each     150 Secs Each
1 Job   2 Threads          88 Secs
2 Jobs  2 Threads Each    112 Secs Each
4 Jobs  1 Thread Each     228 Secs Each
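
To compare these configurations on equal footing, it may help to convert them to aggregate throughput (jobs finished per unit time) relative to a single one-thread job. This is a minimal sketch using only the five timings above; the helper names are made up for illustration.

```python
# (label, number of simultaneous jobs, seconds per job) from the list above.
RUNS = [
    ("1 job, 1 thread", 1, 140.0),
    ("2 jobs, 1 thread each", 2, 150.0),
    ("1 job, 2 threads", 1, 88.0),
    ("2 jobs, 2 threads each", 2, 112.0),
    ("4 jobs, 1 thread each", 4, 228.0),
]


def throughput(jobs, seconds_each):
    """Jobs completed per second when `jobs` run side by side."""
    return jobs / seconds_each


def relative_throughput(runs):
    """Throughput of each run divided by that of the first (baseline) run."""
    base = throughput(runs[0][1], runs[0][2])
    return {label: throughput(jobs, secs) / base for label, jobs, secs in runs}


if __name__ == "__main__":
    for label, rel in relative_throughput(RUNS).items():
        print(f"{label:24s} {rel:.2f} x baseline")
```

On this basis, 2 jobs with 2 threads each and 4 single-threaded jobs both reach about 2.5 times the single-job throughput on the four cores.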
MPI Performance
Times only (Note: CPU Times for 2 Threads are not correct, they are a sum over threads)
1 MPI, 1 Node, 1 Thread    1423 Secs   HAMILT (CPU ) = 223.0, HNS = 174.1, DIAG = 1021.4
1 MPI, 1 Node, 2 Threads   1038 Secs   HAMILT (CPU ) = 385.4, HNS = 194.3, DIAG = 1430.8
2 MPI, 1 Node, 1 Thread    1242 Secs   HAMILT (WALL) = 120.3, HNS = 129.8, DIAG = 988.4
2 MPI, 1 Node, 2 Threads   1105 Secs   HAMILT (WALL) = 130.6, HNS = 112.1, DIAG = 859.5
4 MPI, 1 Node, 1 Thread    1175 Secs   HAMILT (WALL) = 80.1, HNS = 116.8, DIAG = 977.9
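
The MPI totals can be turned into speedup and per-core efficiency relative to the 1 MPI, 1 thread run. The sketch below assumes the "Secs" column is total elapsed time and that each MPI process and each thread occupies one core; both are assumptions, and the names are illustrative.

```python
# (label, MPI processes, threads per process, total seconds) from the table above.
MPI_RUNS = [
    ("1 MPI, 1 thread", 1, 1, 1423.0),
    ("1 MPI, 2 threads", 1, 2, 1038.0),
    ("2 MPI, 1 thread", 2, 1, 1242.0),
    ("2 MPI, 2 threads", 2, 2, 1105.0),
    ("4 MPI, 1 thread", 4, 1, 1175.0),
]


def scaling(runs):
    """Return {label: (speedup, efficiency)} relative to the first run."""
    base_secs = runs[0][3]
    result = {}
    for label, procs, threads, secs in runs:
        cores = procs * threads          # assumed: one core per process/thread
        speedup = base_secs / secs
        result[label] = (speedup, speedup / cores)
    return result


if __name__ == "__main__":
    for label, (speedup, eff) in scaling(MPI_RUNS).items():
        print(f"{label:18s} speedup {speedup:.2f}  efficiency {eff:.2f}")
```

Under these assumptions no configuration reaches a speedup of 1.4 on this node, and the best result comes from 1 MPI process with 2 threads.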
Comments:
1) Intel has introduced a host of new environment variables, so it
might be possible to do better than this with the "right" options, but
probably not by much.
2) Even though OMP_NUM_THREADS was set to 1 or 2, the documentation
indicates that this may not be honored.
3) 1 Job with 4 Threads is unstable. At best it takes perhaps 80 seconds;
at worst it crashes.
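
For anyone repeating these tests, one way to pin the thread count for a single run is to set the variables explicitly in the environment of the launched process. The sketch below is only an illustration: OMP_NUM_THREADS and MKL_NUM_THREADS are the variables discussed in this thread, while setting MKL_DYNAMIC to FALSE is an extra assumption on my part (it asks MKL not to reduce the thread count on its own) and was not tested here. The command in the example is a placeholder, not a WIEN2k program.

```python
import os
import subprocess
import sys


def thread_env(num_threads, base=None):
    """Copy an environment and fix the OpenMP/MKL thread count in it."""
    env = dict(os.environ if base is None else base)
    env["OMP_NUM_THREADS"] = str(num_threads)
    env["MKL_NUM_THREADS"] = str(num_threads)
    # Assumption: disable MKL's dynamic thread adjustment so the
    # requested count is more likely to be used as given.
    env["MKL_DYNAMIC"] = "FALSE"
    return env


def run_with_threads(cmd, num_threads):
    """Run `cmd` (a list of arguments) with the thread variables set."""
    return subprocess.run(cmd, env=thread_env(num_threads), check=True)


if __name__ == "__main__":
    # Placeholder command: the child just reports what it sees.
    run_with_threads(
        [sys.executable, "-c",
         "import os; print(os.environ['OMP_NUM_THREADS'])"],
        2,
    )
```

Whether the library then really runs with that many threads still has to be checked from the timings.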
--
Laurence Marks
Department of Materials Science and Engineering
MSE Rm 2036 Cook Hall
2220 N Campus Drive
Northwestern University
Evanston, IL 60208, USA
Tel: (847) 491-3996 Fax: (847) 491-7820
email: L-marks at northwestern dot edu
Web: www.numis.northwestern.edu
Commission on Electron Diffraction of IUCR
www.numis.northwestern.edu/IUCR_CED
_______________________________________________
Wien mailing list
Wien at zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien