<p>Hi, I ran the serial benchmark after compiling with Intel oneAPI Fortran.
The eigenvalues are similar to those of the test cases, although the output
contains additional lines.</p><p>I'm still unsure whether it is useful to run the
parallel benchmark on a single machine.</p><p>Hardware: Intel Core i9-9900K (4.70
GHz), ASUSTek PRIME Z390-P, 32 GiB memory (2 x DIMM DDR4
Synchronous 2666 MHz in banks 0 and 2, HMA82GU6JJR8N-VK).</p><p>ifort
(IFORT) 2021.1 Beta 20201112</p><p>cc gcc (Debian 8.3.0-6)
8.3.0</p><p> </p><p>Wall time (s) by number of cores ("-" = not measured)<br />
#N WIEN2k_21.1 WIEN2k_19.2<br />
1 15.45 15.39<br />
2 9.14 9.10<br />
3 7.64 -<br />
4 7.09 7.08<br />
6 6.57 -<br />
8 6.43 6.42<br />
16 6.53 6.54</p><p> </p><p>WIEN2k 19.2<br />
<br />
<br />
OMP_NUM_THREADS=1<br />
15.234u 0.115s 0:15.39 99.6% 0+0k 704+37840io 3pf+0w<br />
<br />
OMP_NUM_THREADS=2<br />
17.708u 0.191s 0:09.10 196.5% 0+0k 0+37824io 0pf+0w<br />
<br />
OMP_NUM_THREADS=4<br />
27.267u 0.188s 0:07.08 387.5% 0+0k 0+37840io 0pf+0w<br />
<br />
OMP_NUM_THREADS=8<br />
48.819u 0.536s 0:06.42 768.5% 0+0k 0+37840io 0pf+0w<br />
<br />
OMP_NUM_THREADS=16<br />
56.134u 1.275s 0:06.54 877.6% 0+0k 0+37840io 0pf+0w<br />
<br />
<br />
<br />
WIEN2k_21.1 (Release 14/4/2021)<br />
<br />
serial benchmark test x lapw1, matrix size 3481<br />
OMP_NUM_THREADS=1<br />
15.230u 0.200s 0:15.45 99.8% 0+0k 0+37840io 0pf+0w<br />
<br />
OMP_NUM_THREADS=2<br />
17.746u 0.264s 0:09.14 196.9% 0+0k 0+37840io 0pf+0w<br />
<br />
OMP_NUM_THREADS=3<br />
22.057u 0.344s 0:07.64 293.0% 0+0k 0+37840io 0pf+0w<br />
<br />
OMP_NUM_THREADS=4<br />
27.145u 0.399s 0:07.09 388.2% 0+0k 0+37840io 0pf+0w<br />
<br />
OMP_NUM_THREADS=6<br />
37.519u 0.567s 0:06.57 579.4% 0+0k 0+37840io 0pf+0w<br />
<br />
OMP_NUM_THREADS=8<br />
49.004u 0.620s 0:06.43 771.6% 0+0k 0+37840io 0pf+0w<br />
<br />
OMP_NUM_THREADS=10<br />
50.762u 0.939s 0:06.49 796.4% 0+0k 0+37840io 0pf+0w<br />
<br />
OMP_NUM_THREADS=12<br />
53.143u 1.075s 0:06.54 828.8% 0+0k 0+37840io 0pf+0w<br />
<br />
OMP_NUM_THREADS=16<br />
55.914u 1.493s 0:06.53 879.0% 0+0k 0+37840io 0pf+0w<br />
<br />
<br />
grep HORB *output1*<br />
test_case.output1_10core: TIME HAMILT (CPU) = 7.6, HNS = 7.3, HORB = 0.0, DIAG = 36.7, SYNC = 0.0<br />
test_case.output1_10core: TIME HAMILT (WALL) = 0.8, HNS = 0.9, HORB = 0.0, DIAG = 4.7, SYNC = 0.0<br />
test_case.output1_16core: TIME HAMILT (CPU) = 11.5, HNS = 8.9, HORB = 0.0, DIAG = 36.7, SYNC = 0.0<br />
test_case.output1_16core: TIME HAMILT (WALL) = 0.8, HNS = 0.9, HORB = 0.0, DIAG = 4.7, SYNC = 0.0<br />
test_case.output1_1core: TIME HAMILT (CPU) = 2.7, HNS = 1.6, HORB = 0.0, DIAG = 11.0, SYNC = 0.0<br />
test_case.output1_1core: TIME HAMILT (WALL) = 2.7, HNS = 1.6, HORB = 0.0, DIAG = 10.9, SYNC = 0.0<br />
test_case.output1_2core: TIME HAMILT (CPU) = 2.8, HNS = 2.0, HORB = 0.0, DIAG = 12.9, SYNC = 0.0<br />
test_case.output1_2core: TIME HAMILT (WALL) = 1.4, HNS = 1.0, HORB = 0.0, DIAG = 6.5, SYNC = 0.0<br />
test_case.output1_4core: TIME HAMILT (CPU) = 3.5, HNS = 3.4, HORB = 0.0, DIAG = 20.3, SYNC = 0.0<br />
test_case.output1_4core: TIME HAMILT (WALL) = 0.9, HNS = 0.9, HORB = 0.0, DIAG = 5.1, SYNC = 0.0<br />
test_case.output1_8core: TIME HAMILT (CPU) = 5.5, HNS = 7.0, HORB = 0.0, DIAG = 36.7, SYNC = 0.0<br />
test_case.output1_8core: TIME HAMILT (WALL) = 0.7, HNS = 0.9, HORB = 0.0, DIAG = 4.7, SYNC = 0.0<br />
</p><p>Best wishes,</p><p>Mathew</p>
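<p>P.S. To make the scaling easier to read, here is a minimal sketch that turns the wall-clock times into speedup and parallel efficiency relative to the single-thread run. The numbers are copied from the WIEN2k_21.1 runs above; the names <code>wall_times</code> and <code>scaling</code> are my own and not part of WIEN2k.</p>

```python
# Wall-clock times (s) of "x lapw1" with WIEN2k_21.1, keyed by OMP_NUM_THREADS.
# Values are copied by hand from the timing lines in this post.
wall_times = {
    1: 15.45, 2: 9.14, 3: 7.64, 4: 7.09, 6: 6.57,
    8: 6.43, 10: 6.49, 12: 6.54, 16: 6.53,
}


def scaling(times):
    """Return {threads: (speedup, efficiency)} relative to the 1-thread run.

    speedup    = T(1) / T(n)
    efficiency = speedup / n
    """
    t1 = times[1]
    return {n: (t1 / t, t1 / (t * n)) for n, t in sorted(times.items())}


if __name__ == "__main__":
    for n, (speedup, efficiency) in scaling(wall_times).items():
        print(f"{n:2d} threads: speedup {speedup:4.2f}, efficiency {efficiency:6.1%}")
```

<p>With these numbers the speedup levels off at about 2.4 around 8 threads, and going to 10, 12 or 16 threads gives no further gain.</p>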