[Wien] Parallel execution on new Intel CPUs
pluto
pluto at physics.ucdavis.edu
Wed Feb 22 11:43:47 CET 2023
Dear Prof. Blaha, Prof. Marks, dear All,
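Below are some benchmark results. For a serial calculation, 8 OMP
threads seems to be optimal: lapw1 takes 0:04.51 wall-clock time with
8 threads, versus 0:12.82 with 1 thread and about 0:05.5 with 12, 16,
or 24 threads. This probably has something to do with having 8 fast
and 8 slow cores.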
Hardware:
13th Gen Intel(R) Core(TM) i7-13700K
64 GB of RAM DDR4-3600
2 TB drive Samsung NVMe
ASUS Z690-P D4 mainboard
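One way to check whether the 8-thread optimum really comes from the 8 fast cores would be to pin a run to the P-cores only. The following is a rough sketch, not something I have verified on this machine. It assumes a Linux kernel that exposes the hybrid topology under /sys/devices/cpu_core/cpus, and p_core_cpus is a hypothetical helper name. taskset (from util-linux) and the OMP_PLACES / OMP_PROC_BIND variables are standard, but whether the OpenMP runtime used by lapw1 honors them in the expected way is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the logical CPU list belonging to the P-cores.
# Assumption: the kernel exposes hybrid topology in /sys/devices/cpu_core/cpus.
# An alternative file can be passed as $1 (useful for testing).
p_core_cpus() {
    local f="${1:-/sys/devices/cpu_core/cpus}"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "cannot read $f (no hybrid topology info?)" >&2
        return 1
    fi
}

# Example use (commented out, since it needs WIEN2k and a hybrid CPU):
#   cpus=$(p_core_cpus) || exit 1
#   export OMP_NUM_THREADS=8
#   export OMP_PLACES=cores      # ask for one place per physical core
#   export OMP_PROC_BIND=spread  # spread threads over those places
#   taskset -c "$cpus" x lapw1
```

With hyperthreading enabled the P-core list contains two logical CPUs per core, which is why the sketch also sets OMP_PLACES=cores rather than relying on taskset alone.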
I also looked at mpi-benchmark, but I don't have mpi, so I think these
tests make no sense.
Let me know if I should add anything to this.
Best,
Lukasz
bash-5.1$ pwd
(...)/WIEN2k_benchmark/Serial/test_case
bash-5.1$ export OMP_NUM_THREADS=1
bash-5.1$ echo $OMP_NUM_THREADS
1
bash-5.1$ x lapw1
LAPW1 END
12.567u 0.216s 0:12.82 99.6% 0+0k 464+37840io 2pf+0w
bash-5.1$ export OMP_NUM_THREADS=2
bash-5.1$ echo $OMP_NUM_THREADS
2
bash-5.1$ x lapw1
LAPW1 END
14.844u 0.248s 0:07.65 197.1% 0+0k 0+37840io 2pf+0w
bash-5.1$ export OMP_NUM_THREADS=4
bash-5.1$ echo $OMP_NUM_THREADS
4
bash-5.1$ x lapw1
LAPW1 END
21.091u 0.372s 0:05.51 389.4% 0+0k 0+37840io 10pf+0w
bash-5.1$ export OMP_NUM_THREADS=6
bash-5.1$ echo $OMP_NUM_THREADS
6
bash-5.1$ x lapw1
LAPW1 END
27.765u 0.490s 0:04.87 580.0% 0+0k 0+37824io 19pf+0w
bash-5.1$ export OMP_NUM_THREADS=8
bash-5.1$ echo $OMP_NUM_THREADS
8
bash-5.1$ x lapw1
LAPW1 END
34.099u 0.605s 0:04.51 769.1% 0+0k 0+37824io 27pf+0w
bash-5.1$ x lapw1
LAPW1 END
34.087u 0.616s 0:04.51 769.1% 0+0k 0+37824io 33pf+0w
bash-5.1$ x lapw1
LAPW1 END
34.119u 0.629s 0:04.52 768.3% 0+0k 0+37824io 26pf+0w
bash-5.1$ x lapw1
LAPW1 END
34.234u 0.579s 0:04.53 768.2% 0+0k 0+37824io 26pf+0w
bash-5.1$ export OMP_NUM_THREADS=12
bash-5.1$ echo $OMP_NUM_THREADS
12
bash-5.1$ x lapw1
LAPW1 END
61.638u 2.193s 0:05.54 1151.9% 0+0k 0+37840io 44pf+0w
bash-5.1$ export OMP_NUM_THREADS=16
bash-5.1$ echo $OMP_NUM_THREADS
16
bash-5.1$ x lapw1
LAPW1 END
82.629u 2.636s 0:05.55 1536.0% 0+0k 0+37840io 63pf+0w
bash-5.1$ export OMP_NUM_THREADS=24
bash-5.1$ echo $OMP_NUM_THREADS
24
bash-5.1$ x lapw1
LAPW1 END
86.794u 3.724s 0:05.48 1651.6% 0+0k 0+37840io 57pf+0w
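To make the scaling easier to read, the wall-clock times from the serial runs above can be converted into speedup and parallel efficiency relative to the 1-thread run. This is only a small sketch and not part of the WIEN2k benchmark. speedup_table is a hypothetical helper name, and the input is simply the elapsed times copied from the output above (for 8 threads, the first of the four runs).

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "threads seconds" pairs from stdin and print
# threads, seconds, speedup vs. the first row, and efficiency in percent.
speedup_table() {
    awk 'NR == 1 { base = $2 }
         {
             s = base / $2                      # speedup relative to first row
             printf "%d %.2f %.2f %.0f\n", $1, $2, s, 100 * s / $1
         }'
}

# Elapsed (wall-clock) times of "x lapw1" from the serial test case above.
speedup_table <<'EOF'
1 12.82
2 7.65
4 5.51
6 4.87
8 4.51
12 5.54
16 5.55
24 5.48
EOF
```

The columns are threads, wall-clock seconds, speedup, and efficiency in percent. In this data the largest speedup is at 8 threads, while the user CPU time keeps growing with the thread count.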
bash-5.1$ pwd
(...)/WIEN2k_benchmark/mpi-benchmark
bash-5.1$ export OMP_NUM_THREADS=1
bash-5.1$ echo $OMP_NUM_THREADS
1
bash-5.1$ x lapw1
LAPW1 END
117.827u 0.921s 1:58.88 99.8% 0+0k 432+162616io 2pf+0w
On 2023-02-15 01:11, Laurence Marks wrote:
> Two things:
>
> 1) The CPU you have looks interesting. Can you please run and post the
> benchmark from the Wien2k page for different omp (and mpi would be
> good). It would be good to know what the "Hybrid Core" architecture
> does with Wien2k. For mpi elpa is much better -- it can also be better
> for non-mpi.
>
> 2) It is established lore in the DFT community that increasing the
> "smearing" assists convergence. However, not all lore is true. I am
> aware of zero evidence for this with the current Wien2k mixer, so I
> suggest sticking with room temperature rather than 1500K. More
> important is a well-posed problem. For more see
> http://www.numis.northwestern.edu/Presentations/DFT_Mixing_For_Dummies.pdf
>
> On Tue, Feb 14, 2023 at 5:18 PM pluto via Wien
> <wien at zeus.theochem.tuwien.ac.at> wrote:
>
>> Dear Prof. Blaha,
>>
>> Thank you for comments.
>>
>> At the moment I have 56 k-points in a big slab of one of the ternary
>> magnetic 2D materials. Perhaps I can reduce k-points, something to
>> test.
>> Also now I see that my 56 k-points are compatible with 1:localhost
>> lines :-)
>>
>> Also, for now it does not want to converge after 40 iterations with
>> TEMP 0.002, for a while I was trying TEMP 0.004, and now I am trying
>> TEMP 0.01. Maybe I should start with a smaller slab...
>>
>> Some info you asked for:
>>
>> The i7-13700K CPU has 8 P-cores (fast) and 8 E-cores (slow), so 16
>> total physical cores. Each P-core has 2 threads, so there are total
>> of 24 threads. Many other new Intel CPUs are the same. I don't think
>> there is an easy way to enforce certain task on a certain core, and
>> probably it makes no sense, because the CPU for sure has thermal
>> control over different cores etc.
>
> --
>
> Professor Laurence Marks
> Department of Materials Science and Engineering
> Northwestern University
> https://scholar.google.com/citations?user=zmHhI9gAAAAJ&hl=en [1]
> "Research is to see what everybody else has seen, and to think what
> nobody else has thought", Albert Szent-Györgyi
>
> Links:
> ------
> [1] http://www.numis.northwestern.edu