[Wien] Fwd: Re: MPI stuck at lapw0
Peter Blaha
pblaha at theochem.tuwien.ac.at
Tue Nov 7 16:58:12 CET 2017
The different runtime fractions for small and large systems are due to the
different scaling of the programs: lapw0 scales essentially linearly with the
number of atoms, but lapw1 scales cubically with the size of the basis set.
And here is the second problem: for your nanowire you get a matrix size of
about 130000 x 130000, and this for just 97 atoms.
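To put this into numbers (a rough estimate, taking 16 bytes per complex
matrix element, consistent with the allocations in your output1_1 below): a
136632 x 136632 matrix needs about 136632^2 * 16 bytes, i.e. roughly 300 GB
for H alone, and the same again for S, before eigenvectors and the ScaLAPACK
workspace are even counted.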
It is not the number of atoms that determines the memory, but the plane-wave
basis set. This info is printed in the :RKM line of the scf file, and you can
even get it using
x lapw1 -nmat_only
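For example, a quick way to check the current matrix size is to grep the last
:RKM line from the scf file (assuming your case is simply called "case"):

   grep :RKM case.scf | tail -1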
So your cell dimensions / RMT settings must be very bad. Remember: also
"vacuum" costs a lot in plane-wave methods. You have to optimize your RMTs
and reduce the cell parameters (vacuum).
lapw2: you can add a line to .machines:
lapw2_vector_split:4 (or 8 or 16)
which will reduce the memory consumption of lapw2.
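A minimal .machines sketch for such an MPI run (the node names n001/n002 and
the core counts are only placeholders; the exact layout should follow the
parallelization section of the WIEN2k user's guide):

1:n001:20 n002:20
lapw0:n001:20 n002:20
granularity:1
lapw2_vector_split:4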
On 11/07/2017 04:09 PM, Luigi Maduro - TNW wrote:
>>>There are 2 different things:
>
>>>lapw0para executes:
>
>>>    $remote $machine "cd $PWD;$t $exe $def.def"
>
>>>where $remote is either ssh or rsh (depending on your configuration setup).
>
>>>Once this is defined, it goes to the remote node and executes $exe,
>>>which usually refers to mpirun.
>
>>>mpirun is a script on your system, and it may acknowledge this
>>>I_MPI_HYDRA_BOOTSTRAP=rsh variable, while by default it seems to use ssh
>>>(even if your system does not support this). WIEN2k does not know about
>>>such a variable and assumes that a plain mpirun will do the correct thing.
>>>The sysadmin should set up the system such that rsh is used by default
>>>with mpirun, or should tell people which mpi commands/variables they
>>>should set.
>
>>>PS: I do not quite understand how it can happen that you get rsh in
>>>lapw1para, but ssh in lapw0para??
>
> I do not understand either, because when I check the lapw2para script I
> see that “set remote = rsh”
>
> I have a couple of questions concerning the parallel version of WIEN2k,
> one concerning insufficient virtual memory and the other concerning lapw1.
>
> I’ve been trying to do simulations of MoS2 in two types of
> configurations. One is a monolayer calculation (4x4x1 unit cells) with
> 48 atoms,
>
> and another calculation deals with a “nanowire” (13x2x1 unit cells) with
> 97 atoms.
>
> For the 4x4x1 unit cell I have an rkmax of 6.0 and a 10 k-point mesh.
> For the calculation I used 2 nodes and 20 processors per node (so 40 in
> total).
> The command run is: run_lapw -p -nlvdw -ec 0.0001.
>
> What I noticed is that both lapw1 and nlvdw take a long time to run.
> Lapw0 takes about a minute, as does lapw2. Lapw1 and nlvdw take about
> 16-19 minutes to run.
> When I log into the nodes and use the 'top' command to check the CPU%, I
> see that all processors are at 100%; however, I've been notified that
> only 2% of the requested CPU time is actually used.
>
> I don't really understand why there is such a big discrepancy in
> computation time between lapw1 and lapw2. In smaller calculations lapw1
> and lapw2 take computation times of the same order of magnitude.
>
> For the nanowire calculation I chose an rkmax of 6.0 and a single
> k-point, and only used LDA because I want to compare LDA with NLVDW later
> on. I always get a "forrtl: severe (41): insufficient virtual memory"
> error at lapw1 or lapw2 in the first SCF cycle, no matter how many nodes
> I request, from 1 node to 20 nodes.
>
> Each time I requested 20 processors per node. Only with 20 nodes and
> 20 processors per node did the SCF cycle make it to lapw2, but it crashed
> not long after reaching lapw2. Each node is equipped with 128 GB of memory,
> and the end of output1_1 looks like this:
>
> MPI-parallel calculation using 400 processors
> Scalapack processors array (row,col): 20 20
> Matrix size 136632
> Nice Optimum Blocksize 112 Excess % 0.000D+00
> allocate H 712.2 MB dimensions 6832 6832
> allocate S 712.2 MB dimensions 6832 6832
> allocate spanel 11.7 MB dimensions 6832 112
> allocate hpanel 11.7 MB dimensions 6832 112
> allocate spanelus 11.7 MB dimensions 6832 112
> allocate slen 5.8 MB dimensions 6832 112
> allocate x2 5.8 MB dimensions 6832 112
> allocate legendre 75.9 MB dimensions 6832 13 112
> allocate al,bl (row) 2.3 MB dimensions 6832 11
> allocate al,bl (col) 0.0 MB dimensions 112 11
> allocate YL 1.7 MB dimensions 15 6832 1
> Time for al,bl (hamilt, cpu/wall) : 14.7 14.7
> Time for legendre (hamilt, cpu/wall) : 4.1 4.1
> Time for phase (hamilt, cpu/wall) : 29.7 30.2
> Time for us (hamilt, cpu/wall) : 38.8 39.2
> Time for overlaps (hamilt, cpu/wall) : 115.6 116.3
> Time for distrib (hamilt, cpu/wall) : 0.3 0.3
> Time sum iouter (hamilt, cpu/wall) : 203.5 205.7
> number of local orbitals, nlo (hamilt) 749
> allocate YL 33.4 MB dimensions 15 136632 1
> allocate phsc 2.1 MB dimensions 136632
> Time for los (hamilt, cpu/wall) : 0.4 0.4
> Time for alm (hns) : 1.0
> Time for vector (hns) : 7.2
> Time for vector2 (hns) : 6.8
> Time for VxV (hns) : 114.8
> Wall Time for VxV (hns) : 1.2
> Scalapack Workspace size 100.38 and 804.35 Mb
>
> Any help is appreciated.
> Kind regards,
> Luigi
>
--
P.Blaha
--------------------------------------------------------------------------
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300 FAX: +43-1-58801-165982
Email: blaha at theochem.tuwien.ac.at WIEN2k: http://www.wien2k.at
WWW: http://www.imc.tuwien.ac.at/TC_Blaha
--------------------------------------------------------------------------