[Wien] [WIEN2k] abort of CPU core parallel jobs in NMR calculations of the current

Michael Fechtelkord Michael.Fechtelkord at ruhr-uni-bochum.de
Fri May 10 18:15:23 CEST 2024


Dear all,


I have run into the following problem with the NMR part of WIEN2k (23.2) 
on an openSUSE Leap 15.5 Intel platform. WIEN2k was compiled with 
oneAPI 2024.1 ifort and gcc 13.2.1. I am using ELPA 2024.03.01, Libxc 
6.22, FFTW 3.3.10, MPICH 4.2.1, and the oneAPI 2024.1 MKL libraries. 
The CPU is an i9-14900K with 24 cores, of which I use eight for the 
calculations. The machine has 130 GB of RAM and a 16 GB swap file on a 
Samsung PCIe 4.0 NVMe SSD. The memory transfer rate is 5600 MT/s.

The structure is a layer silicate, and to simulate a ratio of Si:Al = 
3:1 I currently use a 1:1:2 supercell. The new structure is monoclinic 
with space group P 2/c (the original is C 2/c) and contains 40 atoms 
(K, Al, Si, O, and F).

I use 3 NMR LOs for K and O and 10 for Si, Al, and F (the nuclei whose 
chemical shifts I need). The k-mesh has 40 k-points.

The interesting thing is that the RAM is sufficient during the NMR vector 
calculations (always under 100 GB occupied) and at the beginning of 
the electron current calculation. However, at some point in the current 
calculation the RAM use rises to a critical level, and more and more 
data is moved to the swap file, which is sometimes 80% full.
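
To pin down when the RAM use starts to climb, the occupied RAM and swap 
can be logged while the current calculation runs. The following is only 
a minimal sketch, assuming a Linux system that provides /proc/meminfo; 
the function names and the sampling interval are hypothetical and not 
part of WIEN2k.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def usage_summary(info):
    """Return (RAM in use in GiB, fraction of swap in use)."""
    kb_per_gib = 1024 ** 2
    ram_used_gib = (info["MemTotal"] - info["MemAvailable"]) / kb_per_gib
    swap_total = info.get("SwapTotal", 0)
    swap_used = swap_total - info.get("SwapFree", 0)
    swap_fraction = swap_used / swap_total if swap_total else 0.0
    return ram_used_gib, swap_fraction


def log_usage(samples, interval_s=60, path="/proc/meminfo"):
    """Print one time-stamped line per sample (hypothetical helper)."""
    for _ in range(samples):
        with open(path) as fh:
            ram_gib, swap_fraction = usage_summary(parse_meminfo(fh.read()))
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp}  RAM used {ram_gib:6.1f} GiB  swap {swap_fraction:5.1%}")
        time.sleep(interval_s)
```

Calling, for example, log_usage(samples=120, interval_s=60) in a second 
terminal would record two hours of memory use next to the NMR run.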

As the log below shows, this time only one of the eight parallel jobs 
was aborted because of memory overflow. With 48 k-points, however, 
three jobs crashed, and with them the whole current calculation. The 
reason for the crash is clear to me. What I do not understand is why 
the current calculation is so sensitive with so few atoms and such a 
small k-mesh. I have run calculations with more atoms and a 1000 
k-point mesh on 4 cores, and they worked fine. Could the Intel MKL 
library be the source of the failure? Or should I go back to 4 cores, 
even though the calculations then take longer?
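
The trade-off between 8 and 4 parallel jobs can be put into numbers 
with a rough budget. This is only a sketch, assuming that all jobs 
reach a similar peak memory and that the peak per job does not depend 
on how many jobs run at once; the actual per-job usage of the current 
calculation was not measured here, and the function name is 
hypothetical.

```python
def per_job_budget_gb(total_ram_gb, n_jobs):
    """RAM available to each job if it is shared equally (assumption)."""
    return total_ram_gb / n_jobs


ram_gb = 130.0  # installed RAM as stated above
for n_jobs in (8, 4):
    budget = per_job_budget_gb(ram_gb, n_jobs)
    print(f"{n_jobs} jobs: {budget} GB per job before swapping starts")
# 8 jobs: 16.25 GB per job before swapping starts
# 4 jobs: 32.5 GB per job before swapping starts
```

Under these assumptions, halving the number of jobs doubles the RAM 
each job may use before the swap file is touched.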

Have a nice weekend, everyone!


Best wishes from

Michael Fechtelkord

-----------------------------------------------

cd ./  ...  x lcore  -f MS_2M1_Al2
  CORE  END
0.685u 0.028s 0:00.71 98.5%     0+0k 2336+16168io 5pf+0w

lcore        ....  ready


  EXECUTING:     /usr/local/WIEN2k/nmr_mpi -case MS_2M1_Al2 -mode 
current    -green         -scratch /scratch/WIEN2k/ -noco

[1] 20253
[2] 20257
[3] 20261
[4] 20265
[5] 20269
[6] 20273
[7] 20277
[8] 20281
[8]  + Aborted                       ( cd $dir; $exec2 >> 
nmr.out.${loop} ) >& nmr.err.$loop
[7]  + Done                          ( cd $dir; $exec2 >> 
nmr.out.${loop} ) >& nmr.err.$loop
[6]  + Done                          ( cd $dir; $exec2 >> 
nmr.out.${loop} ) >& nmr.err.$loop
[5]  + Done                          ( cd $dir; $exec2 >> 
nmr.out.${loop} ) >& nmr.err.$loop
[4]  + Done                          ( cd $dir; $exec2 >> 
nmr.out.${loop} ) >& nmr.err.$loop
[3]  + Done                          ( cd $dir; $exec2 >> 
nmr.out.${loop} ) >& nmr.err.$loop
[2]  + Done                          ( cd $dir; $exec2 >> 
nmr.out.${loop} ) >& nmr.err.$loop
[1]  + Done                          ( cd $dir; $exec2 >> 
nmr.out.${loop} ) >& nmr.err.$loop

  EXECUTING:     /usr/local/WIEN2k/nmr -case MS_2M1_Al2 -mode sumpara  
-p 8    -green -scratch /scratch/WIEN2k/


current        ....  ready


  EXECUTING:     mpirun -np 1 -machinefile .machine_nmrinteg 
/usr/local/WIEN2k/nmr_mpi -case MS_2M1_Al2 -mode integ -green


nmr:  integration  ... done in   4032.3s


stop

-- 
Dr. Michael Fechtelkord

Institut für Geologie, Mineralogie und Geophysik
Ruhr-Universität Bochum
Universitätsstr. 150
D-44780 Bochum

Phone: +49 (234) 32-24380
Fax:  +49 (234) 32-04380
Email: Michael.Fechtelkord at ruhr-uni-bochum.de
Web Page: https://www.ruhr-uni-bochum.de/kristallographie/kc/mitarbeiter/fechtelkord/


