[Wien] [Extern] Re: RAM issues in lapw1 -bands
Coriolan TIUSAN
coriolan.tiusan at phys.utcluj.ro
Thu Nov 29 13:10:52 CET 2018
Thanks for the suggestion of dividing the band calculation.
Actually, I would like to make a 'zoom' around the Gamma point (for the
X-G-X direction) with a resolution of about 0.001 Bohr^-1, to get enough
accuracy for small Rashba splittings (k_0 < 0.01 Bohr^-1). I guess I could
simply make the 'zoom' calculation?
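Just to fix the numbers for this 'zoom', here is a minimal Python sketch of
the arithmetic, assuming the X-G-X line runs along k_x and taking the in-plane
lattice constant a = 5.725872 Bohr from the struct file below (the fixed-format
columns of case.klist_band are not reproduced, so the fractional coordinates
would still have to go into a k-list written e.g. by xcrysden):
-------------------------------------
# Sketch: step and number of k-points for a zoom of +/-0.01 Bohr^-1 around
# Gamma with a resolution of 0.001 Bohr^-1 (values from this thread).
import math

a = 5.725872                 # in-plane lattice constant (Bohr)
b1 = 2.0 * math.pi / a       # length of the in-plane reciprocal vector (Bohr^-1)

dk = 0.001                   # desired resolution (Bohr^-1)
kmax = 0.01                  # zoom range around Gamma (Bohr^-1), ~ expected k_0

print(f"|b1| = {b1:.5f} Bohr^-1")
print(f"fractional step  = {dk / b1:.6f}")
print(f"number of points = {2 * int(round(kmax / dk)) + 1}")
for i in range(2 * int(round(kmax / dk)) + 1):
    kx = -kmax + i * dk      # Cartesian k_x in Bohr^-1
    print(f"  k_x = {kx:+.3f} Bohr^-1  ->  fractional ({kx / b1:+.6f}, 0, 0)")
-------------------------------------
This gives a fractional step of about 0.0009, i.e. only 21 k-points for the
whole +/-0.01 Bohr^-1 window, so such a zoom should be much cheaper than the
200-point line.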
The .machines file, given that I have only one node (computer) with 48
available CPUs, is:
-------------------------------------
1:localhost:48
granularity:1
extrafine:1
lapw0:localhost:48
dstart:localhost:48
nlvdw:localhost:48
--------------------------------------
For the supercell attached here, I was trying to make a band-structure
calculation along the X-G-X direction with at least 200 points, which
corresponds to a step of only about 0.005 Bohr^-1, still not fine enough for a
Rashba splitting of the same order of magnitude.
For my calculations I get MATRIX SIZE 2606, LOs: 138, RKM = 6.99, and the
64 GB of RAM is 100% filled, plus about 100 GB of swap...
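For orientation, here is a back-of-the-envelope estimate of the raw H/S matrix
storage at this size (ignoring eigenvector workspace and any MPI/ScaLAPACK
buffers, so only a lower bound):
-------------------------------------
# Sketch: lower bound on per-k-point matrix memory in the complex case.
matrix_size = 2606                 # from :RKM (MATRIX SIZE 2606, RKM = 6.99)
bytes_per_element = 16             # double-precision complex
one_matrix = matrix_size**2 * bytes_per_element
print(f"H or S matrix : {one_matrix / 1024**2:.0f} MB")
print(f"H + S together: {2 * one_matrix / 1024**3:.2f} GB per k-point")
-------------------------------------
That is roughly 0.2 GB per k-point for H and S, which is why the >160 GB of
RAM plus swap surprises me.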
Beyond all these aspects, what I would also like to understand is why in the
scf calculation I have no memory 'overload' for 250 k-points (13 13 1 mesh),
while when running 'lapw1para_mpi -p -band' the memory issue seems much more
dramatic?
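In the meantime, following your suggestion of dividing the klist_band, here is
a minimal sketch of how I would split it (assuming only that the k-list is
terminated by an 'END' line and that each piece must keep its own 'END'; the
case name is hypothetical):
-------------------------------------
# Sketch: split case.klist_band into n_pieces smaller k-lists.
n_pieces = 4
case = "case"                          # hypothetical case name

with open(f"{case}.klist_band") as f:
    lines = [ln for ln in f if not ln.startswith("END")]

chunk = (len(lines) + n_pieces - 1) // n_pieces
for i in range(n_pieces):
    part = lines[i * chunk:(i + 1) * chunk]
    if not part:
        break
    with open(f"{case}.klist_band_{i + 1}", "w") as out:
        out.writelines(part)
        out.write("END\n")
-------------------------------------
Each piece would then be copied over case.klist_band, lapw1/lapwso run with
-band -p as usual, and the resulting output files concatenated with cat, as you
describe below.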
If necessary, my struct file is:
------------------
VFeMgO-vid s-o calc. M|| 1.00 0.00 0.00
P 14
RELA
5.725872 5.725872 61.131153 90.000000 90.000000 90.000000
ATOM -1: X=0.50000000 Y=0.50000000 Z=0.01215444
MULT= 1 ISPLIT= 8
V 1 NPT= 781 R0=.000050000 RMT= 2.18000 Z: 23.00000
LOCAL ROT MATRIX: 1.0000000 0.0000000 0.0000000
0.0000000 1.0000000 0.0000000
0.0000000 0.0000000 1.0000000
ATOM -2: X=0.00000000 Y=0.00000000 Z=0.05174176
MULT= 1 ISPLIT= 8
V 2 NPT= 781 R0=.000050000 RMT= 2.18000 Z: 23.00000
LOCAL ROT MATRIX: 1.0000000 0.0000000 0.0000000
0.0000000 1.0000000 0.0000000
0.0000000 0.0000000 1.0000000
ATOM -3: X=0.50000000 Y=0.50000000 Z=0.09885823
MULT= 1 ISPLIT= 8
V 3 NPT= 781 R0=.000050000 RMT= 2.18000 Z: 23.00000
LOCAL ROT MATRIX: 1.0000000 0.0000000 0.0000000
0.0000000 1.0000000 0.0000000
0.0000000 0.0000000 1.0000000
ATOM -4: X=0.00000000 Y=0.00000000 Z=0.13971867
MULT= 1 ISPLIT= 8
Fe1 NPT= 781 R0=.000050000 RMT= 1.95000 Z: 26.00000
LOCAL ROT MATRIX: 1.0000000 0.0000000 0.0000000
0.0000000 1.0000000 0.0000000
0.0000000 0.0000000 1.0000000
ATOM -5: X=0.50000000 Y=0.50000000 Z=0.18164479
MULT= 1 ISPLIT= 8
Fe2 NPT= 781 R0=.000050000 RMT= 1.95000 Z: 26.00000
LOCAL ROT MATRIX: 1.0000000 0.0000000 0.0000000
0.0000000 1.0000000 0.0000000
0.0000000 0.0000000 1.0000000
ATOM -6: X=0.00000000 Y=0.00000000 Z=0.22284885
MULT= 1 ISPLIT= 8
Fe3 NPT= 781 R0=.000050000 RMT= 1.95000 Z: 26.00000
LOCAL ROT MATRIX: 1.0000000 0.0000000 0.0000000
0.0000000 1.0000000 0.0000000
0.0000000 0.0000000 1.0000000
ATOM -7: X=0.50000000 Y=0.50000000 Z=0.26533335
MULT= 1 ISPLIT= 8
Fe4 NPT= 781 R0=.000050000 RMT= 1.95000 Z: 26.00000
LOCAL ROT MATRIX: 1.0000000 0.0000000 0.0000000
0.0000000 1.0000000 0.0000000
0.0000000 0.0000000 1.0000000
ATOM -8: X=0.00000000 Y=0.00000000 Z=0.30245527
MULT= 1 ISPLIT= 8
Fe5 NPT= 781 R0=.000050000 RMT= 1.95000 Z: 26.00000
LOCAL ROT MATRIX: 1.0000000 0.0000000 0.0000000
0.0000000 1.0000000 0.0000000
0.0000000 0.0000000 1.0000000
ATOM -9: X=0.00000000 Y=0.00000000 Z=0.36627712
MULT= 1 ISPLIT= 8
O 1 NPT= 781 R0=.000100000 RMT= 1.68000 Z: 8.00000
LOCAL ROT MATRIX: 1.0000000 0.0000000 0.0000000
0.0000000 1.0000000 0.0000000
0.0000000 0.0000000 1.0000000
ATOM -10: X=0.50000000 Y=0.50000000 Z=0.36416415
MULT= 1 ISPLIT= 8
Mg1 NPT= 781 R0=.000100000 RMT= 1.87000 Z: 12.00000
LOCAL ROT MATRIX: 1.0000000 0.0000000 0.0000000
0.0000000 1.0000000 0.0000000
0.0000000 0.0000000 1.0000000
ATOM -11: X=0.50000000 Y=0.50000000 Z=0.43034285
MULT= 1 ISPLIT= 8
O 2 NPT= 781 R0=.000100000 RMT= 1.68000 Z: 8.00000
LOCAL ROT MATRIX: 1.0000000 0.0000000 0.0000000
0.0000000 1.0000000 0.0000000
0.0000000 0.0000000 1.0000000
ATOM -12: X=0.00000000 Y=0.00000000 Z=0.43127365
MULT= 1 ISPLIT= 8
Mg2 NPT= 781 R0=.000100000 RMT= 1.87000 Z: 12.00000
LOCAL ROT MATRIX: 1.0000000 0.0000000 0.0000000
0.0000000 1.0000000 0.0000000
0.0000000 0.0000000 1.0000000
ATOM -13: X=0.00000000 Y=0.00000000 Z=0.49684798
MULT= 1 ISPLIT= 8
O 3 NPT= 781 R0=.000100000 RMT= 1.68000 Z: 8.00000
LOCAL ROT MATRIX: 1.0000000 0.0000000 0.0000000
0.0000000 1.0000000 0.0000000
0.0000000 0.0000000 1.0000000
ATOM -14: X=0.50000000 Y=0.50000000 Z=0.49541730
MULT= 1 ISPLIT= 8
Mg3 NPT= 781 R0=.000100000 RMT= 1.87000 Z: 12.00000
LOCAL ROT MATRIX: 1.0000000 0.0000000 0.0000000
0.0000000 1.0000000 0.0000000
0.0000000 0.0000000 1.0000000
4 NUMBER OF SYMMETRY OPERATIONS
-1 0 0 0.00000000
0 1 0 0.00000000
0 0 1 0.00000000
1 A 1 so. oper. type orig. index
1 0 0 0.00000000
0 1 0 0.00000000
0 0 1 0.00000000
2 A 2
-1 0 0 0.00000000
0-1 0 0.00000000
0 0 1 0.00000000
3 B 3
1 0 0 0.00000000
0-1 0 0.00000000
0 0 1 0.00000000
4 B 4
---------------------------
On 29/11/2018 13:05, Peter Blaha wrote:
> You never listed your .machines file, nor do we know how many k-points
> are in the scf and the bandstructure cases and what the matrix
> size (:RKM) / real / complex details are.
>
> The memory leakage of Intel's MPI seems to be very version dependent,
> but there is nothing we can do against it from the wien2k side.
>
> Besides installing a different mpi version, one could more easily run
> the bandstructure in pieces. Simply divide your klist_band file into
> several pieces and calculate one after the other.
>
> The resulting case.outputso_1,2,3.. files can simply be concatenated
> (cat file1 file2 file3 > file) together.
>
>
>
> On 11/28/18 1:41 PM, Coriolan TIUSAN wrote:
>> Dear wien2k users,
>>
>> I am running WIEN2k 18.2 on Ubuntu 18.04, installed on an HP workstation:
>> 64 GB RAM, Intel® Xeon(R) Gold 5118 CPU @ 2.30GHz × 48.
>>
>> The Fortran compiler and math library are ifc and the Intel MKL. For
>> parallel execution I have MPI + ScaLAPACK and FFTW.
>>
>> For parallel execution (-p option + .machines), I have dimensioned
>> NMATMAX/NUME according to the user guide. Standard calculations in the
>> SCF loop therefore run well, without any memory-paging issues, with
>> about 90% of the physical RAM being used.
>>
>> However, in supercells, once the case.vector files are obtained, when
>> calculating bands (lapw1c -bands -p) with a fine k-mesh (e.g. above
>> 150-200 k-points along the line X-G-X), which is necessary because I am
>> looking at small Rashba shifts at metal-insulator interfaces... all the
>> available physical memory plus a huge amount of swap (>100 GB) gets
>> filled/used...
>>
>> Any suggestion/idea for overcoming this issue... without adding
>> additional RAM?
>>
>> Why does the memory look sufficient in lapw1 -p for self-consistency,
>> while with the -band switch it gets overloaded?
>>
>> With thanks in advance,
>>
>> C. Tiusan
>>
>>
>>
>> _______________________________________________
>> Wien mailing list
>> Wien at zeus.theochem.tuwien.ac.at
>> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>> SEARCH the MAILING-LIST at:
>> http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
>
--
__________________________________________________________________
| Prof. Dr. Eng. Habil. Coriolan Viorel TIUSAN |
|--------------------------------------------------------------- |
| |
| Department of Physics and Chemistry |
| Technical University of Cluj-Napoca |
| |
| Center of Superconductivity, Spintronics and Surface Science |
| Str. Memorandumului No. 28, RO-400114 Cluj-Napoca, ROMANIA |
| |
| Tel: +40-264-401-465 Fax: +40-264-592-055 |
| Cell: +40-732-893-750 |
| e-mail: coriolan.tiusan at phys.utcluj.ro |
| web: http://www.c4s.utcluj.ro/ |
|_______________________________________________________________ |
| |
| Senior Researcher |
| National Center of Scientific Research - FRANCE |
| e-mail: coriolan.tiusan at ijl.nancy-universite.fr |
| web: http://www.c4s.utcluj.ro/webperso/tiusan/welcome.html |
|_________________________________________________________________|