[Wien] [WIEN2k] abort of CPU core parallel jobs in NMR calculations of the current

Peter Blaha peter.blaha at tuwien.ac.at
Sat May 11 14:58:37 CEST 2024


Hmm?

Are you using k-parallel AND mpi-parallel? This could overload 
the machine.

What does the .machines file look like?
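
To make the overload question concrete, here is a minimal Python sketch (not part of WIEN2k) that counts how many processes a .machines file would start. The function name count_parallel_load is hypothetical. It assumes the usual layout in which each k-parallel job is a line of the form weight:host or weight:host:n, with n the number of MPI ranks on that host, and it skips keyword lines such as granularity: or lapw0:. The sample contents are invented for illustration and are not the file from this thread.

```python
def count_parallel_load(machines_text):
    """Return (k_jobs, total_ranks) for .machines-style content.

    Assumption: k-point job lines look like 'weight:host[:n] [host[:n] ...]'
    with a numeric weight. Lines whose first field is not a number
    (granularity:, extrafine:, lapw0:, ...) are ignored.
    """
    k_jobs = 0
    total_ranks = 0
    for raw in machines_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        head, rest = line.split(":", 1)
        if not head.strip().isdigit():
            continue  # keyword line, not a k-parallel job
        k_jobs += 1
        for entry in rest.split():
            _host, _, n = entry.partition(":")
            total_ranks += int(n) if n.isdigit() else 1
    return k_jobs, total_ranks


# Hypothetical example 1: eight k-parallel jobs, one process each.
pure_k = "\n".join(["1:localhost"] * 8) + "\ngranularity:1\nextrafine:1\n"

# Hypothetical example 2: eight k-parallel jobs, each with two MPI ranks.
mixed = "\n".join(["1:localhost:2"] * 8) + "\ngranularity:1\n"

print(count_parallel_load(pure_k))  # (8, 8)
print(count_parallel_load(mixed))   # (8, 16)
```

If the second number is larger than the number of cores set aside for the run, k-parallel and mpi-parallel are being combined in a way that oversubscribes the machine. Each process also needs its own memory, so the total grows with the process count.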


On 10.05.2024 at 18:15, Michael Fechtelkord via Wien wrote:
> Dear all,
>
>
> the following problem occurs when I use the NMR part of WIEN2k (23.2) 
> on an openSUSE Leap 15.5 Intel platform. WIEN2k was compiled using 
> oneAPI 2024.1 ifort and gcc 13.2.1. I am using ELPA 2024.03.01, Libxc 
> 6.22, fftw 3.3.10 and MPICH 4.2.1 and the oneAPI 2024.1 MKL 
> libraries. The CPU is an i9-14900K with 24 cores, of which I use eight 
> for the calculations. The RAM is 130 GB, plus a swap file of 16 GB on a 
> Samsung PCIe 4.0 NVMe SSD. The bus speed is 5600 MT/s.
>
> The structure is a layer silicate, and to simulate the ratio of Si:Al = 
> 3:1 I currently use a 1:1:2 supercell. The new structure has monoclinic 
> symmetry P 2/c (the original is C 2/c) and contains 40 atoms (K, 
> Al, Si, O, and F).
>
> I use 3 NMR LOs for K and O and 10 for Si, Al, and F (where I need the 
> chemical shifts). The k-mesh has 40 k-points.
>
> The interesting thing is that the RAM is sufficient during the NMR vector 
> calculations (always under 100 GB of RAM occupied) and at the beginning 
> of the electron current calculation. However, the RAM use increases to 
> a critical point during the calculation, and more and more data is 
> swapped out to the swap file, which is sometimes 80% occupied.
>
> As you can see, this time only one core failed because of memory overflow. 
> But with 48 k-points, three cores crashed, and with them the whole current 
> calculation. The reason for the crash is clear to me. But I do not 
> understand why the current calculation reacts so sensitively with so 
> few atoms and a small k-mesh. I made calculations with more atoms and 
> a 1000 k-point mesh on 4 cores, and they worked fine. So can it be that 
> the Intel MKL library is the source of the failure? Should I go back 
> to 4 cores, even with longer calculation times?
>
> Have a nice weekend, everyone!
>
>
> Best wishes from
>
> Michael Fechtelkord
>
> -----------------------------------------------
>
> cd ./  ...  x lcore  -f MS_2M1_Al2
>  CORE  END
> 0.685u 0.028s 0:00.71 98.5%     0+0k 2336+16168io 5pf+0w
>
> lcore        ....  ready
>
>
>  EXECUTING:     /usr/local/WIEN2k/nmr_mpi -case MS_2M1_Al2 -mode 
> current    -green         -scratch /scratch/WIEN2k/ -noco
>
> [1] 20253
> [2] 20257
> [3] 20261
> [4] 20265
> [5] 20269
> [6] 20273
> [7] 20277
> [8] 20281
> [8]  + Abgebrochen                   ( cd $dir; $exec2 >> 
> nmr.out.${loop} ) >& nmr.err.$loop
> [7]  + Fertig                        ( cd $dir; $exec2 >> 
> nmr.out.${loop} ) >& nmr.err.$loop
> [6]  + Fertig                        ( cd $dir; $exec2 >> 
> nmr.out.${loop} ) >& nmr.err.$loop
> [5]  + Fertig                        ( cd $dir; $exec2 >> 
> nmr.out.${loop} ) >& nmr.err.$loop
> [4]  + Fertig                        ( cd $dir; $exec2 >> 
> nmr.out.${loop} ) >& nmr.err.$loop
> [3]  + Fertig                        ( cd $dir; $exec2 >> 
> nmr.out.${loop} ) >& nmr.err.$loop
> [2]  + Fertig                        ( cd $dir; $exec2 >> 
> nmr.out.${loop} ) >& nmr.err.$loop
> [1]  + Fertig                        ( cd $dir; $exec2 >> 
> nmr.out.${loop} ) >& nmr.err.$loop
>
>  EXECUTING:     /usr/local/WIEN2k/nmr -case MS_2M1_Al2 -mode sumpara  
> -p 8    -green -scratch /scratch/WIEN2k/
>
>
> current        ....  ready
>
>
>  EXECUTING:     mpirun -np 1 -machinefile .machine_nmrinteg 
> /usr/local/WIEN2k/nmr_mpi -case MS_2M1_Al2 -mode integ -green
>
>
> nmr:  integration  ... done in   4032.3s
>
>
> stop
>
-- 
-----------------------------------------------------------------------
Peter Blaha,  Inst. f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-158801165300
Email: peter.blaha at tuwien.ac.at
WWW:   http://www.imc.tuwien.ac.at      WIEN2k: http://www.wien2k.at
-------------------------------------------------------------------------