[Wien] RAM usage for large k-list of a big slab

Peter Blaha peter.blaha at tuwien.ac.at
Mon Jun 10 09:54:51 CEST 2024


Yes, I can confirm that RAM increases from k-point to k-point in lapw1 
using ifort+mkl.

However, this happens ONLY with ifort+mkl, not with gfortran+openblas.

Thus I conclude it is an "mkl-problem".

However, switching to gfortran+openblas is not the best solution,
because it seems that on the latest Intel i9 cores mkl is
substantially faster than openblas (shown below with omp_lapw:8, but it
also happens without omp):
gfortran:
        TIME HAMILT (CPU)  =    23.2, HNS =    31.9, HORB =     0.0, DIAG =   144.3, SYNC =     0.0
        TIME HAMILT (WALL) =     3.1, HNS =     5.8, HORB =     0.0, DIAG =    47.8, SYNC =     0.0
ifort:
        TIME HAMILT (CPU)  =    29.4, HNS =    43.6, HORB =     0.0, DIAG =    89.4, SYNC =     0.0
        TIME HAMILT (WALL) =     3.7, HNS =     5.5, HORB =     0.0, DIAG =    20.5, SYNC =     0.0

As you can see from these numbers, gfortran is as good as (or even
better than) ifort for hamilt and hns, but the diagonalization with
openblas is much slower (47.8 vs. 20.5 wall time, i.e. more than a
factor of 2).

The solution to your problem is, however, quite simple. Use
granularity:2 (or 3) in your .machines file (you have to use a global
scratch directory, i.e. SCRATCH=./).
Instead of spawning 4 lapw1 jobs with 650 k-points each, this decomposes
the k-list further, so that each lapw1 run calculates fewer k-points (use
testpara to check, but note that at most 4 jobs will still run at
the same time). This way, the memory increase can be limited, because
each lapw1 process releases its memory when it finishes.
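
As a rough illustration of the effect of granularity (a sketch under
assumptions, not WIEN2k's actual splitting logic): assuming the k-list
is divided evenly into n_jobs * granularity chunks, the Python snippet
below estimates the number of k-points per lapw1 run and prints a
matching .machines file. The helper names kpoints_per_run and
machines_text are hypothetical; only the "1:localhost" and
"granularity:" lines correspond to the setup discussed here, and
testpara shows the real distribution.

```python
import math


def kpoints_per_run(n_kpoints, n_jobs, granularity):
    """Estimate the k-points handled by one lapw1 run.

    ASSUMPTION: the k-list is split evenly into n_jobs * granularity
    chunks. The real split is done by WIEN2k's parallel scripts and
    may differ; check it with testpara.
    """
    return math.ceil(n_kpoints / (n_jobs * granularity))


def machines_text(n_jobs, granularity, host="localhost"):
    """Build the text of a simple k-point-parallel .machines file."""
    lines = [f"1:{host}" for _ in range(n_jobs)]
    lines.append(f"granularity:{granularity}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # 51x51 = 2601 k-points on 4x localhost, as in the original question.
    for g in (1, 2, 3):
        print(f"granularity:{g} -> about {kpoints_per_run(2601, 4, g)} k-points per run")
    print(machines_text(4, 2), end="")
```

Under this even-split assumption, 2601 k-points on 4 jobs give about
651, 326 and 217 k-points per run for granularity 1, 2 and 3, so the
memory growth within a single run is cut roughly in proportion.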

Best regards
Peter Blaha




On 07.06.2024 at 09:27, pluto via Wien wrote:
> Dear All,
> 
> I would appreciate it if you could comment on the RAM use during the band 
> calculation.
> 
> I attach a graph of RAM use over several hours (this is now on i9 14900k 
> with approx. 110 GB of RAM). This is during the calculation of 
> 51x51=2601 k-points for a very large slab (60 non-equivalent atoms). 
> This is running x lapw1 -band -up -p with 4x localhost, and without omp:
> 
> Wed Jun  5 01:25:27 PM CEST 2024> (x) lapw1 -band -up -p
> 
> You can see that at some point swap activates, and then after 
> some time one of the localhost runs crashes.
> 
> Is the behavior normal? Can something be changed by adjusting some 
> settings? Possible problem with the WIEN2k compilation or with this 
> particular calculation?
> 
> Best,
> Lukasz
> 

-- 
--------------------------------------------------------------------------
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300
Email: peter.blaha at tuwien.ac.at    WIEN2k: http://www.wien2k.at
WWW:   http://www.imc.tuwien.ac.at
-------------------------------------------------------------------------

