[Wien] Pathscale+OpenMPI support

Scott Beardsley scott at cse.ucdavis.edu
Wed Nov 19 19:42:14 CET 2008


Peter Blaha wrote:
> The description given at your wiki-site is really outstanding and I believe it
> could be very helpful for others, even serve as a general template for sysadmins.
> If you agree, I'd like to add this link to our "faq" pages.

No problem, link away... I also took some advice from the NERSC page[1] 
(we have a similar setup).

> 2 small comments:
> SRC_tetra: I've already noticed this problem, but I'd expect that one needs only
> the modification in the first line (since it is a continuation within a string),
> but not the second one, i.e.
>                     & " for Atom",i4,"  col=",i3,"  Energy=",2f8.4,/)') &
>                       IDOS(1,1),IDOS(1,2),emin,emax

Makes sense. IANA Fortran Programmer ;)

BTW, reproducing a build of WIEN2k seems somewhat difficult. A more 
traditional ./configure, make, make install workflow would help a lot.

> Parallel runs: Since I do not know what your $machines variable looks like, I cannot judge your
> script in detail, however I do not see any tools for the following task:

Right now the .machines file that gets generated for a 16cpu job 
*always* looks like this:

lapw0: compute-0-0:8 compute-0-1:8
1:compute-0-0:8
1:compute-0-1:8
granularity:1
extrafine:1

A 32cpu job would always look like this:

lapw0: compute-0-0:8 compute-0-1:8 compute-0-2:8 compute-0-3:8
1:compute-0-0:8
1:compute-0-1:8
1:compute-0-2:8
1:compute-0-3:8
granularity:1
extrafine:1

AFAICT I'm currently using N MPI jobs on 8 cores each on N nodes (where 
N = NSLOTS/8). Our setup doesn't prevent users from generating their own 
.machines file, but it would be nice to test a few methods out to 
determine (and provide) the best generic rule.
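
A minimal sketch of the rule described above (one "1:" line per node, lapw0 
spread over all nodes), written in Python. The function name build_machines 
and the way the host list is obtained are assumptions for illustration; this 
is not the actual script from the wiki page.

```python
def build_machines(hosts, cores_per_node=8):
    """Return .machines text: lapw0 on all nodes, one "1:" line per node.

    hosts is a list of node names; how that list is obtained from the
    batch system is left out here (assumption: the caller provides it).
    """
    lapw0 = "lapw0: " + " ".join(f"{h}:{cores_per_node}" for h in hosts)
    per_node = [f"1:{h}:{cores_per_node}" for h in hosts]
    lines = [lapw0, *per_node, "granularity:1", "extrafine:1"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Two 8-core nodes, i.e. the 16cpu case shown above.
    print(build_machines(["compute-0-0", "compute-0-1"]), end="")
```

Called with two hosts this reproduces the 16cpu layout above, and with four 
hosts the 32cpu layout.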

> WIEN2k has two parallel modes, a "k-point" and a "fine-grain mpi" mode.
> In many "real" applications one has more than ONE k-point in the case.klist file and
> then k-point parallelism (sometimes in addition to mpi-parallelism) is very useful and
> efficient.

We have a very low latency interconnect (~1.5us) and a lot of cpus per 
machine (8) so I'd like to exploit those features as much as possible.

> If one requests 16 cores (#$ -pe mpi 16)
> 
> a machines file like
> 
> 1:node1:4 node2:4 node3:4 node4:4     (or 1:node1 node1 node1 node1 node2 node2 ....)
> 
> would run ONE mpi job on 16 cores (doing one k-point after the other).
> 
> However, since mpi-parallelization is often not so perfect (in particular for smaller cases),
> and one often has many k-points, a .machines file like
> 
> 1:node1:4 node2:4
> 1:node3:4 node4:4
> 
> can be more efficient. It will run 2 mpi-jobs on 8 cores each, working on the k-point list in
> case.klist in parallel.

Ahh ok. Thanks for the explanation. I now understand the WIEN2k machines 
format a bit more.
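
To make the two layouts concrete, here is a rough Python sketch that splits a 
node list into equal groups, one "1:" line (one mpi job) per group. The name 
grouped_machine_lines is hypothetical, and the sketch assumes the number of 
nodes divides evenly by the number of groups.

```python
def grouped_machine_lines(hosts, cores_per_node, n_groups):
    """Return one "1:" line per group of nodes.

    n_groups=1 gives a single mpi job over all nodes; a larger n_groups
    gives several smaller mpi jobs that share the k-point list.
    """
    if n_groups < 1 or len(hosts) % n_groups != 0:
        raise ValueError("number of hosts must be divisible by n_groups")
    size = len(hosts) // n_groups
    lines = []
    for start in range(0, len(hosts), size):
        group = hosts[start:start + size]
        lines.append("1:" + " ".join(f"{h}:{cores_per_node}" for h in group))
    return lines


if __name__ == "__main__":
    nodes = ["node1", "node2", "node3", "node4"]
    # One mpi job on all 16 cores.
    print("\n".join(grouped_machine_lines(nodes, 4, 1)))
    # Two mpi jobs on 8 cores each.
    print("\n".join(grouped_machine_lines(nodes, 4, 2)))
```

Which grouping is faster presumably depends on the case size and the number 
of k-points, so it would have to be measured.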

Also, is w2web batch-queue aware? If not, I'll have to 
discourage people from using it.

Thanks,
Scott
----------------------
[1] http://www.nersc.gov/nusers/resources/software/apps/materials_science/wien2k/

