[Wien] Wien post from pascal.boulet at univ-amu.fr (Errors with DGEMM and hybrid calculations)

Laurence Marks laurence.marks at gmail.com
Sat Aug 15 14:40:09 CEST 2020


It would be good to have a brief summary of what matters with hf:
a) For speed
b) For accuracy

It is probably somewhere in the documentation, but a pointer posted to the
list would be useful (to me as well as others).

_____
Professor Laurence Marks
"Research is to see what everybody else has seen, and to think what nobody
else has thought", Albert Szent-Gyorgi
www.numis.northwestern.edu

On Sat, Aug 15, 2020, 07:36 Tran, Fabien <fabien.tran at tuwien.ac.at> wrote:

> Since hybrids are expensive, you should do the SCF calculation with, let's
> say, a 6x6x6 k-mesh (from experience, such a mesh should be enough for a
> converged electron density), and then do just one iteration on a denser
> k-mesh if needed (e.g., for DOS or transport properties). To do this one
> iteration with hybrids, you have to use the options "-newklist -i 1" (see
> the user's guide).
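>
> A minimal sketch of that workflow (assuming a non-spin-polarized case
> driven by run_lapw; -ec is the usual energy-convergence switch):
>
> x kgen                        # generate the 6x6x6 mesh interactively
> run_lapw -hf -ec 0.0001       # converge the hybrid SCF on the coarse mesh
> x kgen                        # generate the denser mesh, e.g. 12x12x12
> run_lapw -hf -newklist -i 1   # one hybrid iteration on the new k-list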
>
>
> Concerning case.inhf, you should check the convergence of the results with
> respect to nband in particular.
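>
> For orientation, case.inhf has roughly this layout (a hedged sketch of
> the structure described in the user's guide; the values, in particular
> nband, are placeholders to be converged for your system):
>
> 0.25     alpha (fraction of HF exchange)
> T        screened (T, HSE06-like) or unscreened (F, PBE0-like)
> 0.165    lambda (screening parameter, read only if screened)
> 14       nband (number of bands in the HF part; increase until converged)
> 3        gmax
> 3        lmaxe
> 3        lmaxv
> 1d-3     tolu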
>
>
>
> ------------------------------
> From: Wien <wien-bounces at zeus.theochem.tuwien.ac.at> on behalf of
> pboulet <pascal.boulet at univ-amu.fr>
> Sent: Saturday, August 15, 2020 2:22 PM
> To: A Mailing list for WIEN2k users
> Subject: Re: [Wien] Wien post from pascal.boulet at univ-amu.fr (Errors
> with DGEMM and hybrid calculations)
>
> Hi Fabien,
>
> Mg2Si is a small structure with two inequivalent atomic positions. Here
> is the struct file:
> Mg2Si cubic 225 Fm-3m
> F   LATTICE,NONEQUIV.ATOMS   2  225 Fm-3m
> MODE OF CALC=RELA unit=bohr
>  11.999761 11.999761 11.999761 90.000000 90.000000 90.000000
> ATOM   1: X=0.00000000 Y=0.00000000 Z=0.00000000
>           MULT= 1          ISPLIT= 2
> Si         NPT=  781  R0=.000050000 RMT= 2.39        Z:  14.00000
> LOCAL ROT MATRIX:    1.0000000 0.0000000 0.0000000
>                      0.0000000 1.0000000 0.0000000
>                      0.0000000 0.0000000 1.0000000
> ATOM   2: X=0.25000000 Y=0.25000000 Z=0.25000000
>           MULT= 2          ISPLIT= 2
>        2: X=0.25000000 Y=0.25000000 Z=0.75000000
> Mg         NPT=  781  R0=.000050000 RMT= 2.50000     Z:  12.00000
> LOCAL ROT MATRIX:    1.0000000 0.0000000 0.0000000
>                      0.0000000 1.0000000 0.0000000
>                      0.0000000 0.0000000 1.0000000
>   48      NUMBER OF SYMMETRY OPERATIONS
>
> The number of k-points is 72 (12x12x12), RKmax is 7, and GMAX=12 (for
> testing purposes; usually I set it to 24).
>
> Best,
> Pascal
>
>
> Pascal Boulet
> Professor in computational materials - DEPARTMENT OF CHEMISTRY
> University of Aix-Marseille - Avenue Escadrille Normandie Niemen -
> F-13013 Marseille - FRANCE
> Tel: +33(0)4 13 55 18 10 - Fax: +33(0)4 13 55 18 50
> Email: pascal.boulet at univ-amu.fr
>
>
>
>
>
>
>
>
> On 15 August 2020 at 11:40, Tran, Fabien <fabien.tran at tuwien.ac.at> wrote:
>
> Hi,
>
> For calculations on small cells with many k-points, it is preferable
> (for speed) to use k-point parallelization instead of MPI
> parallelization. And, as mentioned by PB, MPI applied to small matrices may
> not work.
>
> Of course, if the number of cores at your disposal is larger (twice as
> large, for instance) than the number of k-points in the IBZ, then you can
> combine the k-point and MPI parallelizations (two cores for each k-point).
>
> An example of a .machines file for k-point parallelization is (supposing
> that you have 6 k-points in the IBZ and want to use one machine with 6
> cores):
>
> lapw0: n1071 n1071 n1071 n1071 n1071 n1071
> dstart: n1071 n1071 n1071 n1071 n1071 n1071
> nlvdw: n1071 n1071 n1071 n1071 n1071 n1071
> 1:n1071
> 1:n1071
> 1:n1071
> 1:n1071
> 1:n1071
> 1:n1071
> granularity:1
> extrafine:1
>
> The six lines "1:n1071" mean that lapw1, lapw2 and hf are k-point (not MPI)
> parallelized (one line for each core). lapw0, dstart and nlvdw are MPI
> parallelized. In this example, OpenMP parallelization is ignored by
> assuming that OMP_NUM_THREADS is set to 1.
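>
> If you instead had twelve cores for those six k-points, each "1:" line
> could request two MPI cores per k-point with the same host:count syntax
> as in your own .machines file, i.e. six lines of
>
> 1:n1071:2
>
> with the rest of the file unchanged (and OMP_NUM_THREADS still set to 1,
> e.g. "export OMP_NUM_THREADS=1" in the job script).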
>
> Besides, I find your computational time with HF very large. What is the
> number of atoms in the cell, the number of k-points (please specify the
> n1 x n2 x n3 k-mesh), RKmax, etc.?
>
> Best,
> FT
>
> From: Wien <wien-bounces at zeus.theochem.tuwien.ac.at> on behalf of pboulet
> <pascal.boulet at univ-amu.fr>
> Sent: Saturday, August 15, 2020 10:35 AM
> To: A Mailing list for WIEN2k users
> Subject: Re: [Wien] Wien post from pascal.boulet at univ-amu.fr (Errors with
> DGEMM and hybrid calculations)
>
> Dear Peter,
>
> Thank you for your response. It clarifies some points for me.
>
> I have run another calculation for Mg2Si, which is a small system, on 12
> cores (no MKL errors!). The job ran for 12 hours (the CPU time limit I set)
> and completed only 3 SCF cycles without converging.
>
> The .machines file I use looks like this:
>
> 1:n1071:12
> lapw0: n1071 n1071
> dstart: n1071 n1071
> nlvdw: n1071 n1071
> granularity:1
> extrafine:1
>
> I guess I am not optimising the number of cores w.r.t. the size of the
> problem (72 k-points, 14 HF bands: 12 occupied + 2 unoccupied).
>
> I changed the number of processors to 72, hoping for a one-k-point-per-core
> parallelisation, and commented out all the lines of .machines except
> granularity and extrafine. I got less than one cycle in 12 hours.
>
> What should I do to run the HF part with k-point parallelisation only (no
> MPI)? This point is not clear to me from the manual.
>
> Thank you
> Best regards
> Pascal
>
> Pascal Boulet
> Professor in computational materials - DEPARTMENT OF CHEMISTRY
>
> University of Aix-Marseille - Avenue Escadrille Normandie Niemen - F-13013
> Marseille - FRANCE
> Tel: +33(0)4 13 55 18 10 - Fax: +33(0)4 13 55 18 50
> Email: pascal.boulet at univ-amu.fr
>
> On 12 August 2020 at 14:12, Peter Blaha <pblaha at theochem.tuwien.ac.at>
> wrote:
>
> Your message is too big to be accepted.
>
> Anyway, the DGEMM messages seem to be a relic of the MKL you are using,
> and are most likely related to the use of too many MPI cores for such a
> small matrix. At least when I continue your Mg2Si calculations (in
> k-parallel mode), the :DIS and :ENE are continuous, which means that the
> previous results are ok.
>
> Concerning hf, I don't know. Again, running this in sequential (k-point
> parallel) mode gives no problems and converges quickly.
>
> I suggest that you change your setup to a k-parallel run for such small
> systems.
>
> Best regards
> Peter Blaha
>
> ---------------------------------------------------------------------
> Subject: Errors with DGEMM and hybrid calculations
> From: pboulet <pascal.boulet at univ-amu.fr>
> Date: 8/11/20, 6:31 PM
> To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>
> Dear all,
>
> I have a strange problem with LAPACK. I get an error message about wrong
> parameters passed to DGEMM, but WIEN2k (19.2) still seems to converge the
> SCF. Is that possible? What could the "problem" be?
>
> I have attached an archive containing the SCF summary, the compilation
> options, and the SLURM output file. The error message is in the dayfile.
> The same error shows up with WIEN2k 18.1.
>
> Actually, this is a test case for hybrid calculations, as I have problems
> with my real system, which is found to be metallic with PBE. Mg2Si, at
> least, is a small-band-gap semiconductor.
>
> When I go ahead with the HSE06 functional and Mg2Si, I get a different
> error: a segmentation fault during the hf run. As Mg2Si is a small system,
> I guess this is not a memory problem: the node has 128 GB.
> Note that the same problem occurs for my real case, but there the LAPACK
> problem does not occur.
>
> The second archive contains some files related to the hybrid calculation.
>
> Some hints would be welcome, as I am completely lost with these (unrelated)
> errors!
>
> Thank you.
> Best regards,
> Pascal
>
>
> Pascal Boulet
> --
>
>                                      P.Blaha
> --------------------------------------------------------------------------
> Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
> Phone: +43-1-58801-165300             FAX: +43-1-58801-165982
> Email: blaha at theochem.tuwien.ac.at    WIEN2k: http://www.wien2k.at
> WWW:   http://www.imc.tuwien.ac.at/TC_Blaha
> --------------------------------------------------------------------------
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
> SEARCH the MAILING-LIST at:
> http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html