[Wien] benchmark test with i9-12900K

SANDEEP ARORA sndparora4 at gmail.com
Tue Jan 31 11:02:29 CET 2023


Hi, we have run the test case on machines with the following configuration:
processor: i9-12900K, 16 cores (8 cores @ 3.2 GHz and 8 cores @ 2.4 GHz)
RAM: 64 GB
compiler: Intel oneAPI 2023
WIEN2k_21.1
OS: Ubuntu 20.04.5

The results obtained are:
omp=1
12.937u 0.332s 0:13.30 99.6% 0+0k 75712+37848io 316pf+0w

omp=2
13.334u 0.403s 0:07.44 184.5% 0+0k 76192+37840io 332pf+0w

omp=4
14.894u 0.513s 0:04.64 331.8% 0+0k 0+37840io 2pf+0w

omp=8
19.320u 0.775s 0:03.65 550.4% 0+0k 0+37840io 11pf+0w
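
To put the four serial runs above on a common scale, here is a minimal Python sketch (not part of WIEN2k; the names `runs` and `scaling_table` are only illustrative). It takes the wall-clock time and the reported CPU percentage from each `time` line and derives the speedup relative to omp=1, the parallel efficiency (speedup divided by thread count), and the CPU percentage per thread. It assumes the wall-clock column is the right quantity to compare and ignores the fact that the 8 fast and 8 slow cores are not equivalent.

```python
# Wall-clock seconds and reported CPU% copied from the omp=1, 2, 4, 8 runs above.
runs = {
    1: (13.30, 99.6),
    2: (7.44, 184.5),
    4: (4.64, 331.8),
    8: (3.65, 550.4),
}


def scaling_table(runs):
    """Return a list of (threads, speedup, efficiency, cpu_percent_per_thread)."""
    base_wall = runs[1][0]  # the omp=1 run is the reference
    rows = []
    for threads in sorted(runs):
        wall, cpu_percent = runs[threads]
        speedup = base_wall / wall
        efficiency = speedup / threads
        cpu_per_thread = cpu_percent / threads
        rows.append((threads, speedup, efficiency, cpu_per_thread))
    return rows


if __name__ == "__main__":
    for threads, speedup, efficiency, per_thread in scaling_table(runs):
        print(
            f"omp={threads}: speedup {speedup:.2f}, "
            f"efficiency {efficiency:.0%}, CPU per thread {per_thread:.1f}%"
        )
```

The efficiency column is the more direct measure of how well the extra threads are used; the CPU percentage only says how busy the threads were, not how much useful work they did.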



omp=8 k=2
     dop4(1) 49.984u 2.287s 10.17 513.67%      0+0k 0+0io 0pf+0w
     dop4(1) 50.970u 2.312s 10.21 521.40%      0+0k 0+0io 0pf+0w
        6.74s/k-point
   Summary of lapw1para:
   dop4 k=2 user=100.954 wallclock=2257.87
0.295u 0.403s 0:13.49 5.1% 0+0k 0+440io 0pf+0w

omp=4 k=4
     dop4(1) 41.603u 1.267s 13.96 306.98%      0+0k 0+0io 0pf+0w
     dop4(1) 41.659u 0.931s 13.50 315.32%      0+0k 0+0io 0pf+0w
     dop4(1) 43.051u 1.581s 14.07 317.15%      0+0k 0+0io 0pf+0w
     dop4(1) 43.629u 1.245s 14.16 316.82%      0+0k 0+0io 0pf+0w
   Summary of lapw1para:
   dop4 k=4 user=169.942 wallclock=4597.67
0.450u 0.581s 0:17.06 6.0% 0+0k 0+776io 0pf+0w

omp=2 k=8 jobs=8
     dop4(1) 43.421u 0.613s 25.02 175.95%      0+0k 0+0io 0pf+0w
     dop4(1) 47.340u 0.553s 26.99 177.40%      0+0k 0+0io 0pf+0w
     dop4(1) 44.956u 0.978s 26.10 175.95%      0+0k 0+0io 0pf+0w
     dop4(1) 47.209u 1.005s 26.85 179.56%      0+0k 0+0io 0pf+0w
     dop4(1) 45.912u 0.917s 26.68 175.47%      0+0k 0+0io 0pf+0w
     dop4(1) 44.177u 0.588s 25.29 177.01%      0+0k 0+0io 0pf+0w
     dop4(1) 46.621u 0.839s 26.91 176.31%      0+0k 0+0io 0pf+0w
     dop4(1) 45.358u 0.680s 25.89 177.77%      0+0k 0+0io 0pf+0w
   Summary of lapw1para:
   dop4 k=8 user=364.994 wallclock=13999.2
0.710u 1.170s 0:29.86 6.2% 0+0k 0+1448io 1pf+0w


omp=1 k=16 jobs=16
     dop4(1) 49.945u 0.344s 50.31 99.95%      0+0k 0+0io 0pf+0w
     dop4(1) 50.970u 0.360s 51.35 99.94%      0+0k 0+0io 0pf+0w
     dop4(1) 51.441u 0.344s 51.80 99.96%      0+0k 0+0io 0pf+0w
     dop4(1) 50.983u 0.356s 51.37 99.93%      0+0k 0+0io 0pf+0w
     dop4(1) 45.376u 0.284s 45.68 99.95%      0+0k 0+0io 0pf+0w
     dop4(1) 48.617u 0.367s 49.04 99.87%      0+0k 0+0io 0pf+0w
     dop4(1) 50.989u 0.404s 51.40 99.98%      0+0k 0+0io 0pf+0w
     dop4(1) 48.594u 0.416s 49.01 99.99%      0+0k 0+0io 0pf+0w
     dop4(1) 51.466u 0.355s 51.82 100.00%      0+0k 0+0io 0pf+0w
     dop4(1) 50.587u 0.492s 51.08 100.00%      0+0k 0+0io 0pf+0w
     dop4(1) 51.218u 0.300s 51.51 100.00%      0+0k 0+0io 0pf+0w
     dop4(1) 50.410u 0.411s 50.81 100.01%      0+0k 0+0io 0pf+0w
     dop4(1) 50.112u 0.376s 50.49 99.99%      0+0k 0+0io 0pf+0w
     dop4(1) 50.595u 0.396s 50.99 99.99%      0+0k 0+0io 0pf+0w
     dop4(1) 49.458u 0.380s 49.84 99.98%      0+0k 0+0io 0pf+0w
     dop4(1) 50.372u 0.352s 50.75 99.94%      0+0k 0+0io 0pf+0w
   Summary of lapw1para:
   dop4 k=16 user=801.133 wallclock=50034.5
1.301u 1.935s 0:55.18 5.8% 0+0k 0+2840io 2pf+0w
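
The four k-point parallel runs above can be compared by dividing the total wall-clock time of each run by its number of k-points, as in the 6.74 s/k-point figure printed for the omp=8, k=2 case. The following Python sketch does that under the assumption that every k-point costs about the same and that the last `time` line of each block is the total wall-clock time of the run; the names `parallel_runs`, `seconds_per_kpoint`, and `best_configuration` are illustrative only.

```python
# (omp threads, k-points run in parallel, total wall-clock seconds),
# taken from the final `time` line of each parallel run above.
parallel_runs = [
    (8, 2, 13.49),
    (4, 4, 17.06),
    (2, 8, 29.86),
    (1, 16, 55.18),
]


def seconds_per_kpoint(runs):
    """Map each (omp, k) configuration to its wall-clock seconds per k-point."""
    return {(omp, k): wall / k for omp, k, wall in runs}


def best_configuration(runs):
    """Return the (omp, k) pair with the lowest wall-clock time per k-point."""
    per_k = seconds_per_kpoint(runs)
    return min(per_k, key=per_k.get)


if __name__ == "__main__":
    for (omp, k), secs in seconds_per_kpoint(parallel_runs).items():
        print(f"omp={omp} k={k}: {secs:.2f} s/k-point")
    print("lowest time per k-point:", best_configuration(parallel_runs))
```

This only ranks the configurations by throughput on this one test case; a case with fewer k-points than cores would not be able to use the k=16 layout.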

With a small cluster of 4 CPUs connected via a 1 Gb/s network switch:

     dop4(1) 46.301u 2.303s 9.51 510.71%      0+0k 0+0io 0pf+0w
     dop4(1) 48.897u 2.177s 9.69 527.08%      0+0k 0+0io 0pf+0w
     dop1(1) 49.091u 2.452s 10.16 507.21%      0+0k 0+0io 0pf+0w
     dop1(1) 48.378u 2.567s 10.00 509.35%      0+0k 0+0io 0pf+0w
     dop2(1) 49.628u 2.146s 10.34 500.47%      0+0k 0+0io 0pf+0w
     dop2(1) 48.050u 2.626s 10.21 496.14%      0+0k 0+0io 0pf+0w
     dop3(1) 47.159u 2.655s 9.86 504.91%      0+0k 0+0io 0pf+0w
     dop3(1) 48.033u 2.089s 9.72 515.50%      0+0k 0+0io 0pf+0w
   Summary of lapw1para:
   dop4 k=2 user=95.198 wallclock=2189.79
   dop1 k=2 user=97.469 wallclock=2226.16
   dop2 k=2 user=97.678 wallclock=2229.61
   dop3 k=2 user=95.192 wallclock=2195.21
0.686u 1.194s 0:14.27 13.1% 0+0k 0+1456io 0pf+0w
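
The cluster run can be compared with the single-machine omp=8, k=2 run, since each node again handles 2 k-points. The Python sketch below computes a weak-scaling efficiency (single-node wall time divided by cluster wall time at equal work per node) and the aggregate throughput. It assumes that the four nodes have the same hardware and that the cluster run also used omp=8, which is inferred from the roughly 500% CPU usage per job and is not stated explicitly; the function names are illustrative.

```python
def weak_scaling_efficiency(single_node_wall, multi_node_wall):
    """Ratio of single-node to multi-node wall time at equal work per node."""
    return single_node_wall / multi_node_wall


def aggregate_throughput(total_kpoints, wall_seconds):
    """k-points completed per second of wall-clock time."""
    return total_kpoints / wall_seconds


single_wall = 13.49   # one node, 2 k-points (omp=8 k=2 run)
cluster_wall = 14.27  # four nodes, 2 k-points each (omp=8 assumed)

if __name__ == "__main__":
    eff = weak_scaling_efficiency(single_wall, cluster_wall)
    print(f"weak-scaling efficiency: {eff:.1%}")
    print(f"single node: {aggregate_throughput(2, single_wall):.3f} k-points/s")
    print(f"cluster:     {aggregate_throughput(8, cluster_wall):.3f} k-points/s")
```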

I have a query regarding this.
In both serial and parallel calculations, when omp is increased from 1
to 8, the CPU usage percentage does not increase proportionally (omp=2: 170 to
180%; omp=4: 300 to 330%; omp=8: only 500 to 550%).
Is something wrong in the configuration or compilation of the software, or is
this due to some limitation of the hardware?
Any suggestions?
Regards,
Sandeep Arora