[Wien] [WIEN2k] abort of CPU core parallel jobs in NMR calculations of the current
Michael Fechtelkord
Michael.Fechtelkord at ruhr-uni-bochum.de
Fri May 10 18:15:23 CEST 2024
Dear all,
I have run into the following problem with the NMR part of WIEN2k (23.2)
on an openSUSE Leap 15.5 Intel platform. WIEN2k was compiled with
oneAPI 2024.1 ifort and gcc 13.2.1. I am using ELPA 2024.03.01, Libxc
6.22, FFTW 3.3.10, MPICH 4.2.1 and the oneAPI 2024.1 MKL libraries.
The CPU is an i9-14900K with 24 cores, of which I use eight for the
calculations. The machine has 130 GB of RAM and a 16 GB swap file on a
Samsung PCIe 4.0 NVMe SSD. The memory transfer rate is 5600 MT/s.
The structure is a layer silicate, and to simulate a Si:Al ratio of
3:1 I currently use a 1:1:2 supercell. The new structure has monoclinic
symmetry P 2/c (the original is C 2/c) and contains 40 atoms (K, Al,
Si, O, and F).
I use 3 NMR LOs for K and O and 10 for Si, Al, and F (the nuclei for
which I need the chemical shifts). The k-mesh has 40 k-points.
Interestingly, the RAM is sufficient during the NMR vector
calculations (always under 100 GB occupied) and at the beginning of
the electron current calculation. However, at some point in the current
calculation the RAM use rises to a critical level, and more and more
data is swapped out to the swap file, which is sometimes 80% occupied.
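
To pin down at which stage of the current calculation the memory use
takes off, RAM and swap occupancy could be logged at fixed intervals
while the job runs. The following is only a minimal sketch, assuming a
Linux system that exposes /proc/meminfo; the function names and the
sampling interval are illustrative and not part of WIEN2k.

```python
import time


def parse_meminfo(text):
    """Parse the contents of /proc/meminfo into a dict of integer values.

    Most fields are reported in kB; the unit suffix is dropped here.
    """
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def usage_gib(info):
    """Return (used RAM, used swap) in GiB from a parsed meminfo dict."""
    kb_per_gib = 1024 ** 2
    ram = (info["MemTotal"] - info["MemAvailable"]) / kb_per_gib
    swap = (info["SwapTotal"] - info["SwapFree"]) / kb_per_gib
    return ram, swap


def monitor(samples, interval_s=60, path="/proc/meminfo"):
    """Print a timestamped RAM/swap reading `samples` times."""
    for _ in range(samples):
        with open(path) as fh:
            ram, swap = usage_gib(parse_meminfo(fh.read()))
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp}  RAM {ram:6.1f} GiB  swap {swap:6.1f} GiB", flush=True)
        time.sleep(interval_s)
```

Calling, for example, monitor(samples=600, interval_s=60) in a second
terminal during the run would give a ten-hour trace that can be compared
with the timestamps in the nmr output files.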
As you can see in the log below, this time only one core failed because
of memory overflow (job 8, reported as "Abgebrochen", i.e. aborted;
"Fertig" means done). But with 48 k-points three cores crashed, and
with them the whole current calculation. The cause of the crash is
clear to me. What I do not understand is why the current calculation is
so sensitive with so few atoms and such a small k-mesh. I have run
calculations with more atoms and a 1000 k-point mesh on 4 cores, and
they worked fine. Could the Intel MKL library be the source of the
failure? Or should I rather go back to 4 cores, even with longer
calculation times?
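
As a back-of-the-envelope check on the 8-core versus 4-core question,
the memory available per parallel job can be estimated from the figures
above. This is only a rough sketch: it assumes the memory is shared
evenly between the jobs and ignores what the operating system itself
needs, neither of which is confirmed for the NMR current step.

```python
RAM_GB = 130   # installed RAM, as stated above
SWAP_GB = 16   # size of the swap file, as stated above


def per_job_budget_gb(n_jobs, include_swap=False):
    """Memory available per job if it were split evenly (an assumption)."""
    total = RAM_GB + (SWAP_GB if include_swap else 0)
    return total / n_jobs


for n_jobs in (8, 4):
    ram_only = per_job_budget_gb(n_jobs)
    with_swap = per_job_budget_gb(n_jobs, include_swap=True)
    print(f"{n_jobs} jobs: {ram_only:.2f} GB RAM each, "
          f"{with_swap:.2f} GB including swap")
```

Under these assumptions each of eight jobs has about 16 GB of RAM to
itself, and each of four jobs about twice that.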
Have a nice weekend, everyone!
Best wishes from
Michael Fechtelkord
-----------------------------------------------
cd ./ ... x lcore -f MS_2M1_Al2
CORE END
0.685u 0.028s 0:00.71 98.5% 0+0k 2336+16168io 5pf+0w
lcore .... ready
EXECUTING: /usr/local/WIEN2k/nmr_mpi -case MS_2M1_Al2 -mode
current -green -scratch /scratch/WIEN2k/ -noco
[1] 20253
[2] 20257
[3] 20261
[4] 20265
[5] 20269
[6] 20273
[7] 20277
[8] 20281
[8] + Abgebrochen ( cd $dir; $exec2 >>
nmr.out.${loop} ) >& nmr.err.$loop
[7] + Fertig ( cd $dir; $exec2 >>
nmr.out.${loop} ) >& nmr.err.$loop
[6] + Fertig ( cd $dir; $exec2 >>
nmr.out.${loop} ) >& nmr.err.$loop
[5] + Fertig ( cd $dir; $exec2 >>
nmr.out.${loop} ) >& nmr.err.$loop
[4] + Fertig ( cd $dir; $exec2 >>
nmr.out.${loop} ) >& nmr.err.$loop
[3] + Fertig ( cd $dir; $exec2 >>
nmr.out.${loop} ) >& nmr.err.$loop
[2] + Fertig ( cd $dir; $exec2 >>
nmr.out.${loop} ) >& nmr.err.$loop
[1] + Fertig ( cd $dir; $exec2 >>
nmr.out.${loop} ) >& nmr.err.$loop
EXECUTING: /usr/local/WIEN2k/nmr -case MS_2M1_Al2 -mode sumpara
-p 8 -green -scratch /scratch/WIEN2k/
current .... ready
EXECUTING: mpirun -np 1 -machinefile .machine_nmrinteg
/usr/local/WIEN2k/nmr_mpi -case MS_2M1_Al2 -mode integ -green
nmr: integration ... done in 4032.3s
stop
--
Dr. Michael Fechtelkord
Institut für Geologie, Mineralogie und Geophysik
Ruhr-Universität Bochum
Universitätsstr. 150
D-44780 Bochum
Phone: +49 (234) 32-24380
Fax: +49 (234) 32-04380
Email: Michael.Fechtelkord at ruhr-uni-bochum.de
Web Page: https://www.ruhr-uni-bochum.de/kristallographie/kc/mitarbeiter/fechtelkord/