[Wien] Large cell instgen_lapw : word too long

Chouaib AHMANI FERDI ahmaniferdichouaib at gmail.com
Fri Nov 10 16:26:03 CET 2017


Dear all,

Thank you for the replies.

I installed Intel Parallel Studio 2017, which contains a Fortran compiler,
MKL, ScaLAPACK, BLACS and Intel MPI, then I configured FFTW3 with ifort as
the F77 compiler and MPI enabled. After that I compiled the WIEN2k
programs with no errors in sight. So far so good.

I now wanted to test it on a 4-core Core i3 machine (I did not want to risk
anything with the program already installed on the 32-core shared-memory
machine).
I set 10 k-points (3 inequivalent atoms for the test), which left me with
klist = 1.
I set up the .machines file like so:
1:localhost:2     #so that 2 cores would take the job
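
For reference, the line above can be read as weight:host:cores. Below is a minimal Python sketch of that reading, assuming the usual three-field form; MachineEntry and parse_machines_line are hypothetical names for illustration, not part of WIEN2k. The trailing "#..." is stripped only because it appears in this email. Whether WIEN2k itself accepts a trailing comment on that line is not something this sketch checks.

```python
from typing import NamedTuple, Optional


class MachineEntry(NamedTuple):
    weight: int   # number before the first colon (relative speed/weight)
    host: str     # machine name
    cores: int    # cores requested on that host (1 if the field is omitted)


def parse_machines_line(line: str) -> Optional[MachineEntry]:
    """Parse one 'weight:host[:cores]' line; return None for any other line."""
    text = line.split("#", 1)[0].strip()   # drop the trailing annotation
    parts = [p.strip() for p in text.split(":")]
    if len(parts) not in (2, 3) or not parts[0].isdigit():
        return None                        # blank line or keyword line
    if len(parts) == 3 and not parts[2].isdigit():
        return None                        # third field is not a core count
    cores = int(parts[2]) if len(parts) == 3 else 1
    return MachineEntry(int(parts[0]), parts[1], cores)


entry = parse_machines_line("1:localhost:2     #so that 2 cores would take the job")
print(entry)
```

Read this way, the line asks for one job on localhost using 2 cores, which matches the intent stated in the annotation.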

The SCF cycle stops with this error in the dayfile:

running LAPW1 in parallel mode (using .machines)
1 number_of_parallel_jobs
       localhost localhost(1)
---------------------------------
Primary job terminated normally, but 1 process returned a non-zero exit
code.. Per user-direction, the job has been aborted.
--------------------
mpirun detected that one or more processes exited with non-zero status,
thus the job to be terminated. The first process to do so was:

     Exit code: 127
...
LAPW2 crashed!

FYI: I tried OMP_NUM_THREADS=2 and it worked (~189% CPU),
but giving one k-point job to 2 cores through the .machines file did not.
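
A note on "Exit code: 127" in the dayfile above: on POSIX shells this status conventionally means that a command was not found (it can also appear when a binary cannot be loaded). The Python sketch below only illustrates that convention and a PATH check. The executable names in the last call are assumptions about the local installation, the function names are made up for illustration, and nothing here confirms the actual cause of this crash.

```python
import shutil
import subprocess


def shell_exit_status(command: str) -> int:
    """Run `command` through the shell and return its exit status."""
    result = subprocess.run(command, shell=True,
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode


def missing_from_path(names):
    """Return the executables in `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


# A command that does not exist (the name is made up for illustration).
print(shell_exit_status("hypothetical_missing_binary_xyz"))

# Assumed names of the MPI launcher and MPI executables; adjust as needed.
print(missing_from_path(["mpirun", "lapw1_mpi", "lapw2_mpi"]))
```

On a typical Linux shell the first call reports 127. Any name listed by the second call is not reachable through PATH in the environment where the script runs, which may differ from the environment the parallel scripts use.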

I read all the mailing-list posts I could find on this subject, but none of
them really helped.

Could someone point out where the problem is?

Faithfully,


On Mon, Nov 6, 2017 at 11:37 AM, Laurence Marks <L-marks at northwestern.edu>
wrote:

> Many, many points:
>
> 1) mpi and k-point parallel are different. For mpi you need software, e.g.
> openmpi or the Intel impi and you need to include this in the
> configuration. You will need to learn some Linux/computing basics and/or
> get help from someone local. It is not really appropriate to ask this list
> to help install mpi.
>
> 2) Increasing rkmax is not as efficient as mpi in my experience, although
> this depends upon the computer architecture (Peter has the opposite
> conclusion for small problems).
>
> 3) mpi allows you to share the load, for instance have 16 cores working on
> the job not one. 16 cores can be ten times faster than one, although note
> 2) above. For 300 atoms without mpi your problem may not converge before
> 2018.
>
> 4) 1 Re atom in fcc Fe has at least cubic symmetry. If you do not enter
> positions right, symmetry/sgroup won't find the symmetry. I would use
> cryscon which is cheap, and allows one to generate CIF files with symmetry.
> There are other codes.
>
> 5) For 300 atoms probably only 1 k-point is needed.
>
> 6) Walk before you run. Understand a small model first.
>
> _____
> Professor Laurence Marks
> "Research is to see what everybody else has seen, and to think what nobody
> else has thought", Albert Szent-Gyorgi
> www.numis.northwestern.edu
>
>
> On Nov 6, 2017 3:54 AM, "Chouaib AHMANI FERDI" <
> ahmaniferdichouaib at gmail.com> wrote:
>
> Thank you for your feedback.
>
> Mr Gavin Abo,
> Unfortunately, I do not have access to a supercomputer or a cluster. I
> will have to stick with my PC for a while. I thought that MPI was for
> running jobs on multiple (actual) machines, but since I can run lapw1 on
> multiple cores of my single PC (by editing the .machines file), I guess I
> already have MPI installed.
>
> Mr Peter Blaha,
> I realize that 300 atoms is quite a lot, including magnetic atoms such as
> Fe and RE elements (4f). In fact, 1 cycle took 5 hours to complete, with 40
> min for lapw0, 1h50min for lapw1 (on 2 cores) for up and down, and 16 min
> for lapw2 (2 cores) for up and down.
> I think the problem is that I am stuck with klist : 2
>
> and if I increase the number of k-points so that, for example, 10 machines
> would take the job (klist 10), I think that lapw0 will take too long since
> I cannot run it on multiple cores.
>
> About the structure, the unit cell without doping consists of 56 atoms
> with 16 atoms of Fe (fcc lattice) the doping consists of 0.01 RE + 0.01 Re
> element doping so that the smallest supercell would be one with 96 Fe
> atoms.
> I will consider your advice and hope that results from doping a smaller
> supercell will help explain the experimental data we obtained.
>
> Faithfully,
>
>
>
>
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
> SEARCH the MAILING-LIST at:  http://www.mail-archive.com/
> wien at zeus.theochem.tuwien.ac.at/index.html
>
>


-- 
AHMANI FERDI Chouaïb
"Laboratoire Matériaux Nanomatériaux Nanomagnétisme
  et Enseignement des Sciences"
Ecole Normale Supérieure
Université Mohammed V, Rabat.
Tel : +212 6 94 59 57 60