[Wien] Large memory consumption of MPI k-point parallel version

Peter Blaha pblaha at theochem.tuwien.ac.at
Mon Apr 7 12:22:20 CEST 2008


The program does of course NOT use more memory per k-point job.

My guess is that your queuing system limits the total memory used by
ALL(!) processes spawned by this job.

And of course, when 1 k-point needs 3 GB, 4 k-parallel jobs need in
total 12 GB (distributed over 16 instead of just 4 nodes).
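
A back-of-the-envelope sketch of that bookkeeping (the 3 GB figure is
the one you quote for a single k-point job; that the scheduler caps the
sum over all processes of one job is an assumption about your Grid
Engine setup, not a WIEN2k property):

    # Memory bookkeeping, assuming the queue caps the SUM over all
    # processes of one job rather than the memory of any single node.
    mem_per_group_gb = 3.0   # one MPI LAPW1 instance (from the first run)
    n_groups = 4             # "1:..." lines in the second .machines file
    nodes_per_group = 4

    total_gb = mem_per_group_gb * n_groups
    print(f"sum over the whole job: {total_gb:.1f} GB")     # 12.0 > 9 GB cap
    per_node_gb = total_gb / (n_groups * nodes_per_group)
    print(f"per node, spread evenly: {per_node_gb:.2f} GB") # 0.75 GB

So the per-job cap is exceeded although each individual node needs no
more memory than in the single-group run.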


Oleg Rubel wrote:
> Dear Wien2k Community,
> 
> I have a question about the memory consumption of the MPI version of 
> LAPW1: with k-point parallelization it is much larger than without 
> k-point parallelization.
> 
> I am calculating a GaAs surface passivated by pseudohydrogen on one side. 
> The number of nonequivalent atoms in the supercell totals 40. I have 12 
> k-points in the irreducible BZ; RKMAX is 2.10 (because of small Rmt of 
> hydrogen); there is no inversion; the matrix size is approx. 14200.
> 
> I run WIEN2k_08.1 (Release 14/12/2007) using the command
> 
>     min -i 100 -s 10 -j 'run_lapw -p -I -i 40 -fc 0.5 -ec 0.0001 -cc 0.001'
> 
> once using the following .machines file
> 
>     marc-hn:~/wien_work/GaAsBeta2_2x4> cat .machines
>     granularity:1
>     1:node009 node008 node126 node128
>     lapw0:node009:1 node008:1 node126:1 node128:1
> 
> In that case the program needs about 3 GB of memory but takes a lot of 
> time (12 k-points, one after the other). So I decided to add k-point 
> parallelization 
> and used the new .machines file
> 
>     marc-hn:~/wien_work/GaAsBeta2_2x4> cat .machines
>     granularity:1
>     1:node009 node008 node126 node128
>     1:node130 node127 node131 node118
>     1:node124 node136 node132 node135
>     1:node120 node119 node134 node125
>     lapw0:node009:1 node008:1 node126:1 node128:1 node130:1 node127:1 node131:1 node118:1 node124:1 node136:1 node132:1 node135:1 node120:1 node119:1 node134:1 node125:1
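> 
> In other words (a toy count; I assume each "1:host1 host2 ..." line 
> defines one k-parallel group whose hosts together run a single MPI 
> instance of LAPW1):
> 
>     # Toy count of what this .machines file requests, assuming each
>     # "1:..." line starts one k-parallel group running its own MPI
>     # instance of LAPW1 (the lapw0 line does not define a group).
>     kparallel_lines = [
>         "1:node009 node008 node126 node128",
>         "1:node130 node127 node131 node118",
>         "1:node124 node136 node132 node135",
>         "1:node120 node119 node134 node125",
>     ]
>     groups = [ln.split(":", 1)[1].split() for ln in kparallel_lines]
>     print(len(groups), "groups x", len(groups[0]), "hosts =",
>           sum(map(len, groups)), "LAPW1 processes under one job")
>     # -> 4 groups x 4 hosts = 16 LAPW1 processes under one job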
> 
> My naive thought was that 4 GB would be enough. But it turns out that 9 GB 
> was not enough. Why???
> 
> With a 9 GB limit the job is killed by the queuing system, since LAPW1 
> wants more than 9 GB, although the version without k-point 
> parallelization runs fine with a 4 GB limit. Here is the tail of the 
> GaAsBeta2_2x4.output1_1 file:
> 
>     Matrix size        14201
>     scalapack processors array (row,col):   2   2
>               allocate H       782.5 MB          dimensions  7161  7161
>               allocate S       782.5 MB          dimensions  7161  7161
>          allocate spanel        14.0 MB          dimensions  7161   128
>          allocate hpanel        14.0 MB          dimensions  7161   128
>        allocate spanelus        14.0 MB          dimensions  7161   128
>            allocate slen         7.0 MB          dimensions  7161   128
>              allocate x2         7.0 MB          dimensions  7161   128
>        allocate legendre        90.9 MB          dimensions  7161    13   128
>     allocate al,bl (row)         2.4 MB          dimensions  7161    11
>     allocate al,bl (col)         0.0 MB          dimensions   128    11
>              allocate YL         3.5 MB          dimensions    15  7161     2
>     Time for al,bl    (hamilt) :         12.4
>     Time for legendre (hamilt) :          6.4
>     Time for phase    (hamilt) :        124.7
>     Time for us       (hamilt) :         79.9
>     Time for overlaps (hamilt) :        275.4
>     Time for distrib  (hamilt) :          2.0
>     Time for iouter   (hamilt) :        504.4
>      number of local orbitals, nlo (hamilt)      744
>     Time for los      (hamilt) :          8.7
>     Time for alm         (hns) :          5.3
>     Time for vector      (hns) :         38.7
>     Time for vector2     (hns) :         37.0
>     Time for VxV         (hns) :        811.1
>     Wall Time for VxV    (hns) :        885.8
>     ********* end of GaAsBeta2_2x4.output1_1 ************
> 
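> A quick consistency check on these numbers (assuming 16 bytes per 
> matrix element, i.e. double complex, since the cell has no inversion):
> 
>     # The 782.5 MB lines match a double-complex 7161 x 7161 local block,
>     # i.e. the 14201 x 14201 matrix spread over the 2 x 2 ScaLAPACK grid.
>     local_dim, full_dim, bytes_per_elem = 7161, 14201, 16
>     print(f"local H (or S) block: "
>           f"{local_dim**2 * bytes_per_elem / 2**20:.1f} MiB")  # ~782.5
>     print(f"full H (or S) matrix: "
>           f"{full_dim**2 * bytes_per_elem / 2**30:.2f} GiB")   # ~3.0
> 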
> I do not see where the 9 GB comes from, nor why the memory requirement 
> of the k-point parallel version is so different from that of the 
> sequential one.
> 
> I would be thankful for any pointers.
> 
> Oleg Rubel
> 
> 
> P.S.
> Some system details:
> CPU(s): Dual Opteron 270 (DualCore 2.0GHz)
> Operating System: Debian GNU/Linux v4.0 ("etch")
> Queuing System: SUN GridEngine 6.0u9
> Compiler: ifort 10.0
> Libraries: ScaLAPACK-1.8.0 from netlib; the rest is MKL 10.0

-- 

                                       P.Blaha
--------------------------------------------------------------------------
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-15671             FAX: +43-1-58801-15698
Email: blaha at theochem.tuwien.ac.at    WWW: http://info.tuwien.ac.at/theochem/
--------------------------------------------------------------------------

