[Wien] Pathscale+OpenMPI support
Scott Beardsley
scott at cse.ucdavis.edu
Wed Nov 19 19:42:14 CET 2008
Peter Blaha wrote:
> The description given at your wiki-site is really outstanding and I believe it
> could be very helpful for others, even serve as a general template for system admins.
> If you agree, I'd like to add this link to our "faq" pages.
No problem, link away... I also took some advice from the NERSC page[1]
(we have a similar setup).
> 2 small comments:
> SRC_tetra: I've already noticed this problem, but I'd expect that one needs only
> the modification in the first line (since it is a continuation within a string),
> but not the second one, i.e.
> & " for Atom",i4," col=",i3," Energy=",2f8.4,/)') &
> IDOS(1,1),IDOS(1,2),emin,emax
Makes sense. IANA Fortran programmer ;)
BTW, reproducing a build of WIEN2k seems somewhat difficult. A more
traditional ./configure, make, make install workflow would help a lot.
> Parallel runs: Since I do not know what your $machines variable looks like, I cannot judge your
> script in detail; however, I do not see any tools for the following task:
Right now the .machines file that gets generated for a 16-CPU job
*always* looks like this:
lapw0: compute-0-0:8 compute-0-1:8
1:compute-0-0:8
1:compute-0-1:8
granularity:1
extrafine:1
A 32-CPU job would always look like this:
lapw0: compute-0-0:8 compute-0-1:8 compute-0-2:8 compute-0-3:8
1:compute-0-0:8
1:compute-0-1:8
1:compute-0-2:8
1:compute-0-3:8
granularity:1
extrafine:1
AFAICT I'm currently using N MPI jobs on 8 cores each on N nodes (where
N = NSLOTS/8). Our setup doesn't prevent users from generating their own
.machines file, but it would be nice to test a few methods out to
determine (and provide) the best generic rule.
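
The rule behind the two files above can be sketched in a few lines of
Python. This is only an illustration, not our actual queue script: the
function name build_machines and the way the host list is passed in are
hypothetical, and the default of 8 cores per node simply matches our
hardware.

```python
def build_machines(hosts, cores_per_node=8):
    """Return .machines text in the layout shown above.

    hosts is a list of node names (hypothetical input; how the list is
    obtained from the batch system is not covered here).
    """
    # lapw0 gets every node on a single line.
    lapw0 = "lapw0: " + " ".join(f"{h}:{cores_per_node}" for h in hosts)
    # One "1:host:cores" line per node.
    per_node = [f"1:{h}:{cores_per_node}" for h in hosts]
    lines = [lapw0, *per_node, "granularity:1", "extrafine:1"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_machines(["compute-0-0", "compute-0-1"]), end="")
```

Called with two hosts it prints the 16-CPU file; with four hosts, the
32-CPU one.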
> WIEN2k has two parallel modes, a "k-point" and a "fine-grain mpi" mode.
> In many "real" applications one has more than ONE k-point in the case.klist file and
> then k-point parallelism (sometimes in addition to mpi-parallelism) is very useful and
> efficient.
We have a very low latency interconnect (~1.5us) and a lot of CPUs per
machine (8), so I'd like to exploit those features as much as possible.
> If one requests 16 cores (#$ -pe mpi 16)
>
> a machines file like
>
> 1:node1:4 node2:4 node3:4 node4:4 (or 1:node1 node1 node1 node1 node2 node2 ....)
>
> would run ONE mpi job on 16 cores (doing one k-point after the other).
>
> However, since mpi-parallelization is often not so perfect (in particular for smaller cases),
> and one often has many k-points, a .machines file like
>
> 1:node1:4 node2:4
> 1:node3:4 node4:4
>
> can be more efficient. It will run 2 mpi-jobs on 8 cores each, working on the k-point list in
> case.klist in parallel.
Ahh ok. Thanks for the explanation. I now understand the WIEN2k .machines
format a bit better.
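
To check my understanding, here is a rough Python sketch of the two
layouts Peter describes: the nodes are split into groups, and each group
becomes one "1:" line, i.e. one MPI job working on its share of the
k-points. The function name kpoint_lines and its parameters are made up
for illustration, and it assumes every node contributes the same number
of cores.

```python
def kpoint_lines(hosts, cores_per_node, nodes_per_job):
    """Return one '1:host:cores ...' line per MPI job.

    Each line lists the nodes of one MPI job; several lines mean the
    k-point list is processed by several MPI jobs in parallel.
    """
    if nodes_per_job < 1 or len(hosts) % nodes_per_job != 0:
        raise ValueError("hosts must divide evenly into groups of nodes_per_job")
    lines = []
    for start in range(0, len(hosts), nodes_per_job):
        group = hosts[start:start + nodes_per_job]
        lines.append("1:" + " ".join(f"{h}:{cores_per_node}" for h in group))
    return lines


if __name__ == "__main__":
    nodes = ["node1", "node2", "node3", "node4"]
    # One MPI job spanning all 16 cores.
    print("\n".join(kpoint_lines(nodes, 4, 4)))
    # Two MPI jobs on 8 cores each, k-points shared between them.
    print("\n".join(kpoint_lines(nodes, 4, 2)))
```

Setting nodes_per_job to 1 gives the one-line-per-node layout our
script generates today.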
Also, is w2web batch-queue aware? If not, I'll have to
discourage people from using it.
Thanks,
Scott
----------------------
[1] http://www.nersc.gov/nusers/resources/software/apps/materials_science/wien2k/