[Wien] Parallel execution on new Intel CPUs

Peter Blaha peter.blaha at tuwien.ac.at
Tue Feb 14 10:23:20 CET 2023


I have no experience with such a hybrid CPU that mixes fast and slow cores.

Simply test which setup gives the fastest turnaround for a fixed number 
of k-points: vary the number of processes (it should be compatible with 
your k-points) and try OMP_NUM_THREADS=1 or 2 (possibly 4).
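
A minimal sketch of how the candidate setups for such a test could be listed. It assumes that "compatible" means the number of k-points is divisible by the number of processes, and that the OMP setting is written as omp_global into .machines. The function names, the k-point count and the thread limit are only illustrative and are not part of WIEN2k.

```python
def candidate_setups(n_kpoints, max_threads, omp_values=(1, 2, 4)):
    """Return (n_processes, omp) pairs worth timing.

    Assumption: a process count is "compatible" when it divides the
    number of k-points evenly, so every process gets the same share.
    """
    setups = []
    for omp in omp_values:
        for n_proc in range(1, n_kpoints + 1):
            if n_kpoints % n_proc != 0:
                continue  # uneven k-point distribution, skip
            if n_proc * omp > max_threads:
                continue  # would need more threads than allowed
            setups.append((n_proc, omp))
    return setups


def machines_text(n_proc, omp, host="localhost"):
    """Build a simple k-parallel .machines body for one candidate setup."""
    lines = [f"omp_global:{omp}"]
    lines += [f"1:{host}"] * n_proc
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical case: 24 k-points on a machine with 24 hardware threads.
    for n_proc, omp in candidate_setups(n_kpoints=24, max_threads=24):
        print(f"{n_proc:2d} processes x OMP={omp} -> {n_proc * omp:2d} threads")
```

Each listed pair would then be timed on the same case with the same k-mesh, and the fastest one kept.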

Previously, overloading (using more threads than there are physical 
cores) was NOT a good idea, but I don't know how this "fused" CPU 
behaves. Maybe some "small" overloading is ok. This all depends on the 
number of k-points and the available cores.
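
To make the arithmetic concrete, here is a small sketch that computes how many threads per physical core each layout from the quoted message requests. The core count (8 P-cores + 8 E-cores = 16) is taken from that message; the function name is only illustrative, and the numbers say nothing about which layout is actually fastest.

```python
def overload_factor(n_proc, omp, physical_cores):
    """Threads requested per physical core; above 1.0 means overloading."""
    return (n_proc * omp) / physical_cores


# The three layouts proposed in the quoted message: (processes, OMP threads).
layouts = {
    "24 x OMP=1": (24, 1),
    "8 x OMP=2": (8, 2),
    "16 x OMP=2": (16, 2),
}

for name, (n_proc, omp) in layouts.items():
    factor = overload_factor(n_proc, omp, physical_cores=16)
    print(f"{name}: {factor:.2f} threads per physical core")
```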

PS:

I cannot reproduce your omp_lapwso:2 failure. My tests run fine and the 
OMP setting is picked up properly.




> I am now using a machine with an i7-13700K. This CPU has 8 performance 
> cores (P-cores) and 8 efficient cores (E-cores). In addition, each 
> P-core has 2 threads, so there are 24 threads altogether. It is hard 
> to find reasonable info online, but a P-core is probably approx. 
> 2x faster than an E-core:
> https://www.anandtech.com/show/17047/the-intel-12th-gen-core-i912900k-review-hybrid-performance-brings-hybrid-complexity/10 
>
> This will of course depend on what is being calculated...
>
> Do you have suggestions on how to optimize the .machines file for the 
> parallel execution of an scf cycle?
>
> On my machine, using OMP_NUM_THREADS leads to oscillations in CPU 
> usage (for a large slab maybe 40% of the time is spent on a single 
> thread), suggesting that a large OMP value is not the optimal strategy.
>
> Some examples of strategies:
>
> One strategy would be to repeat the line
> 1:localhost
> 24 times, to have all the threads busy, and set OMP_NUM_THREADS=1.
>
> Another would be to repeat the line
> 1:localhost
> 8 times and set OMP_NUM_THREADS=2, which would mean using all 16 
> physical cores.
>
> Or perhaps one should better "overload" the CPU e.g. by doing 
> 1:localhost 16 times and OMP=2 ?
>
> Over time I will try to benchmark some of the different options, but 
> perhaps there is some logic to how one should think about this.
>
> In addition, I have a comment on the .machines file. It seems that for 
> FM+SOC (runsp -so) calculations the
>
> omp_global
>
> setting in .machines is ignored. The
>
> omp_lapw1
> omp_lapw2
>
> settings seem to work fine. So I tried to set OMP for lapwso 
> separately, by including a line like:
>
> omp_lapwso:2
>
> but this gives an error when executing parallel scf.
>
> Best,
> Lukasz

-- 
-----------------------------------------------------------------------
Peter Blaha,  Inst. f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-158801165300
Email: peter.blaha at tuwien.ac.at
WWW:   http://www.imc.tuwien.ac.at      WIEN2k: http://www.wien2k.at
-------------------------------------------------------------------------


