[Wien] .machines for several nodes

Christian Søndergaard Pedersen chrsop at dtu.dk
Thu Oct 15 20:44:41 CEST 2020


Dear Professors Blaha and Marks,


Thank you kindly for your explanations; the code is now running smoothly - and efficiently! For the sake of future WIEN2k newcomers, I will summarize the mistakes I made; hopefully this will save someone else some time:


1: calling 'mpirun run_lapw -p'

As Professor Marks explained, calling mpirun explicitly overloaded the compute node I was using: mpirun spawned one process per CPU on the node, and each of those spawned the additional processes specified in the .machines file. The correct way is to call 'run_lapw -p'.
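As a minimal sketch of the difference (assuming an otherwise ordinary batch script; the scheduler-specific header is omitted):

# wrong: every MPI rank started by mpirun runs its own copy of run_lapw,
#        and each copy then spawns the processes listed in .machines
mpirun run_lapw -p

# correct: a single run_lapw reads .machines and calls mpirun internally
run_lapw -p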


2: Issuing only one job in the .machines file, regardless of how many nodes/cores the job was using. For instance, for a Xeon16 node, I would write:


1:node1:4


which uses 4 cores for lapw1/lapw2 while leaving the remaining 12 cores idle. I corrected this to:


1:node1:4

1:node1:4

1:node1:4

1:node1:4


... which works, and which is explained in the example for OMP on page 86 of the manual.
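For comparison, here is a hedged sketch of an OMP-style variant on the same 16-core node (assuming a recent WIEN2k version that understands the omp_global switch in .machines; 'node1' is just a placeholder hostname):

# four k-parallel jobs, each a single process with 4 OpenMP threads
omp_global:4
1:node1
1:node1
1:node1
1:node1

Whether this or the 4x4 MPI layout above is faster depends on the machine, as discussed further down in the thread.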


Best regards

Christian

________________________________
From: Wien <wien-bounces at zeus.theochem.tuwien.ac.at> on behalf of Laurence Marks <laurence.marks at gmail.com>
Sent: 15 October 2020 17:15:30
To: A Mailing list for WIEN2k users
Subject: Re: [Wien] .machines for several nodes

Let me expand on why you should not call mpirun yourself unless you are doing something "special".

Wien2k uses the .machines file to set up how to use mpi and (in the most recent versions) omp. As discussed by Peter, in most cases mpi works best with a close-to-square decomposition of the matrix, often with a power-of-2 number of cores. OMP is good for having 2-4 cores collaborate, not more. Depending upon your architecture, OMP may be better or worse than mpi. (On my nodes mpi is always best; I know that on some of Peter's, OMP is better.)

The code internally sets the number of threads to use (for omp), and will call mpirun or its equivalent depending upon what you have in parallel_options. While many codes are structured so that they are launched under mpi via "mpirun MyCode", Wien2k is not. The danger is that you will end up with multiple copies of run_lapw running, which is not what you want.
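For illustration, a typical $WIENROOT/parallel_options might look roughly like the sketch below (the file is generated by siteconfig_lapw, so the exact contents depend on your installation; the values shown here are only an assumed example):

setenv TASKSET "no"
setenv USE_REMOTE 1
setenv MPI_REMOTE 0
setenv WIEN_GRANULARITY 1
setenv WIEN_MPIRUN "mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_"

The WIEN_MPIRUN line is what run_lapw substitutes when it launches the mpi binaries itself, which is why wrapping run_lapw in yet another mpirun multiplies the process count.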

There might be special cases where you would want to use "mpirun run_lapw" to remotely start a single instance, but until you know how to use Wien2k, do not attempt anything this complicated; it is likely to create problems.


On Thu, Oct 15, 2020 at 5:01 AM Laurence Marks <laurence.marks at gmail.com> wrote:
As an addendum to what Peter said, "mpirun run_lapw" is totally wrong. Remove the mpirun.

_____
Professor Laurence Marks
"Research is to see what everybody else has seen, and to think what nobody else has thought", Albert Szent-Gyorgi
www.numis.northwestern.edu

On Thu, Oct 15, 2020, 03:35 Peter Blaha <pblaha at theochem.tuwien.ac.at> wrote:
Well, 99% cpu efficiency does not mean that you run efficiently; my estimate is that you run at least 2 times slower than what is possible.

Anyway, please save the dayfile and compare the wall times of the different parts across different setups.

At least now we know that you have 24 cores/node. So the lapw0/dstart
lines are perfectly ok.
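(For context, a lapw0/dstart line of that kind would typically look something like the hypothetical example below, spanning both 24-core nodes; the actual line is whatever is in the poster's .machines file.)

# hypothetical lapw0 line using all 24 cores on each of the two nodes
lapw0:x073:24 x082:24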

However, you run lapw1 on 3 mpi cores. This is "maximally inefficient": it gives a division of your matrix into 3x1, but the decomposition should be as close to square as possible, so 4x4=16 or 8x8=64 cores is optimal. With your 24 cores and 96 atoms/cell I'd probably go for 12 mpi cores and 2 k-parallel jobs per node:

1:x073:12
1:x082:12
1:x073:12
1:x082:12

Maybe one can even overload the nodes a bit by using 16 instead of 12 cores, but this could be dangerous on some machines because your admins might have enforced cpu-binding, .... (You can even change the .machines file (12-->16) "by hand" while your job is running, and maybe change it back once you have seen whether the timing is better or worse.)

In any case, compare the timings in the dayfile in order to find the optimal setup.
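As a quick way of doing that comparison (a sketch only; replace "case" with the actual case name), something like the following pulls out the dayfile lines for the parallel steps so that their reported times can be compared between runs with different .machines files:

# show the dayfile lines for the lapw0/lapw1/lapw2 steps
grep -e "lapw0" -e "lapw1" -e "lapw2" case.dayfile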


--
Professor Laurence Marks
Department of Materials Science and Engineering
Northwestern University
www.numis.northwestern.edu
Corrosion in 4D: www.numis.northwestern.edu/MURI
Co-Editor, Acta Cryst A
"Research is to see what everybody else has seen, and to think what nobody else has thought"
Albert Szent-Gyorgi

