[Wien] Wien on Cray XT4
Matyáš Novák
novakmat at fzu.cz
Wed Oct 10 11:01:02 CEST 2007
Dear Wien users,
I have set up WIEN on a Cray XT4 cluster
(http://www.csc.fi/english/pages/louhi_guide/hardware/index_html,
http://www.cray.com/products/xt4/), so I am sharing some of my experience.
Best Regards,
Matyas Novak
---------------------------------------------------------------------
There are two sorts of nodes in the Cray XT4: a few login (or service)
nodes and many compute nodes (called Catamount nodes after their
operating system). Each node contains one dual-core Opteron and some
memory (2 GB in my case).
The main problem is that the Catamount microkernel on the compute nodes
does not allow running scripts or launching subprocesses. The only way
I have found to run a process on a compute node is the yod command,
issued from a service node.
So the question is: the *_lapw scripts must run on a service node, so
how can the executables run on the compute nodes? (Some kind of "ssh
replacement script" is not possible, because many executables are called
directly from the *_lapw scripts.)
I think one of the better ways to set up Wien on the Cray XT4 (without
rewriting all the *_lapw scripts) is to replace every executable in Wien
with a script that calls yod, which in turn runs the real executable on
the compute nodes. The steps are:
1) Compile Wien
My compiler switches with the PGI Fortran compiler 7.0 are
-fastsse -Mfree
except for lapw1 and lapw2, which need
-O1 -Mfree
because -fastsse or -O2 causes wrong results in lapw1 and a segmentation
fault in lapw2.
2) Split Wien into two directories. The first holds the scripts and
configuration files (we will call it <script directory>), the second all
the compiled executables (<executables directory>).
The splitting can be done e.g. in Midnight Commander (the largest script
is x_lapw; all larger files are executables) or with a simple shell
script.
3) In the <script directory>, replace each (removed) executable with a
symbolic link to the following script:
#!/bin/sh
yod -sz 1 <executables directory>/`basename $0` "$@"
(We will call this script YodScript.)
The links can be created with the following script. It relies on the
fact that the largest script is smaller than 100 kB and the smallest
compiled executable is larger than 100 kB:
ls -l <executables directory> | awk '
{ if (NF >= 9) next;           # skip symlinks
  if (int($5) < 100000) next;  # skip scripts
  # the file name is taken from field 8 of the ls -l output
  system("ln -s <pathAndNameOfYodScript> <script directory>/" $8) }'
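As a variant of the linking script above, the following sketch avoids
parsing ls -l output, whose column layout differs between systems. The
function name link_executables and its three arguments are illustrative
names, not part of Wien; the 100 kB threshold is the one used above.

```shell
#!/bin/sh
# Sketch: link every large regular file in EXEC_DIR to WRAPPER inside
# SCRIPT_DIR. All three names are placeholders for this example.
link_executables() {
    exec_dir=$1
    script_dir=$2
    wrapper=$3
    for f in "$exec_dir"/*; do
        [ -f "$f" ] || continue          # skip directories and empty globs
        [ -L "$f" ] && continue          # skip symlinks
        size=$(( $(wc -c < "$f") ))      # file size in bytes
        [ "$size" -lt 100000 ] && continue   # skip scripts (under 100 kB)
        ln -s "$wrapper" "$script_dir/$(basename "$f")"
    done
}
```

It would be called as
link_executables <executables directory> <script directory> <pathAndNameOfYodScript>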
4) Set shared memory mode in Wien (so that YodScript is not called
through ssh).
5) Generate a proper .machines file:
Because yod launches the processes on the proper hosts automatically,
the .machines file only needs to tell Wien how many processors are
assigned. It can therefore contain "dummy" machine names; the only thing
that matters is the right number of machines.
The .machines file can be generated by this script (the argument is the
number of assigned nodes):
#!/bin/sh
hnm=`hostname`
for i in `seq 1 $1` ; do
echo "1:$hnm-$i";
done
echo "extrafine: 1"
6) Memory can be a problem on the Catamount nodes: the microkernel has
to be told how to divide memory between heap and stack (the default does
not seem to suit Wien). The yod switches '-heap' and '-stack' (see man
yod) help; add them to YodScript. I have had good experience with
-heap 18000000
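With the heap switch added, YodScript could look like the sketch below.
It is written as a function so that it can be tried on a machine without
yod. The function name and the YOD and WIEN_EXEC_DIR variables are
assumptions made for this example; they are not Wien or Cray settings.

```shell
#!/bin/sh
# Sketch of YodScript with the -heap switch from step 6.
# YOD and WIEN_EXEC_DIR are hypothetical variables for this example:
#   YOD            launcher command (default: yod)
#   WIEN_EXEC_DIR  the <executables directory>
run_on_compute_node() {
    prog=$1                      # name of the executable to launch
    shift                        # remaining arguments are passed through
    ${YOD:-yod} -sz 1 -heap 18000000 \
        "${WIEN_EXEC_DIR:-/path/to/executables}/$prog" "$@"
}
```

In the real wrapper file the last line would be
run_on_compute_node "`basename $0`" "$@"
Setting YOD="echo yod" prints the command instead of running it.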
A disadvantage of the Cray architecture is that there is no way to run
two independent (non-MPI) executables on one node, so each process
occupies a whole node (two cores) and one core stays idle. (There is a
workaround using the yod -F parameter, but it would require a complete
rewrite of the *_para scripts.) However, the Cray is built for MPI jobs
(or, more generally, parallel jobs that need a lot of communication
between nodes), and running other jobs on it wastes expensive hardware.
So if you want to use both cores, you must use MPI.
(The Cray has a fast Lustre filesystem, but for running Wien I found no
difference compared with NFS.)
MPI
-----
On the Cray where I set up WIEN, no specific compiler option was needed
to use MPI. But one must ensure that the MPI processes start properly,
so the previous steps have to be modified as follows:
1) Enable MPI in ./siteconfig (parallel execution) and compile the MPI
version of WIEN.
(For me it was sufficient to add only the -DParallel switch.)
2) Generate a proper .machines file:
Same as before, but the structure is a bit more complicated, and one
should also generate a line for lapw0 (this is not strictly necessary,
because lapw0 does not scale very well). The script can look like this
(the arguments are the number of chunks and the number of processors per
chunk):
#!/bin/sh
# $1 = number of chunks, $2 = number of processors per chunk
hnm=`hostname`
seq=`seq 1 $2`
lapw0=lapw0:
for i in `seq 1 $1` ; do
t=1:
for y in $seq ; do
t="$t $hnm-$i-$y:1";
lapw0="$lapw0 $hnm:1"
done
echo $t
done
echo $lapw0
echo "extrafine: 1"
Note that to use two chunks of nine processors each, we must allocate
ten nodes, because yod cannot launch two different MPI processes on one
node (so one chunk requires five nodes).
lapw2_vector_split did not work for me.
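The node count in the example above follows from rounding each chunk up
to whole dual-core nodes. A minimal sketch of that arithmetic, assuming
two cores per node as on this machine (the function name nodes_needed is
made up for this example):

```shell
#!/bin/sh
# Sketch: number of nodes to allocate for CHUNKS chunks of CORES
# processors each, assuming two cores per node and no node shared
# between chunks.
nodes_needed() {
    chunks=$1
    cores=$2
    nodes_per_chunk=$(( (cores + 1) / 2 ))   # round up to whole nodes
    echo $(( chunks * nodes_per_chunk ))
}
```

For two chunks of nine processors, nodes_needed 2 9 gives 10.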
3) Set the proper mpirun command
There is probably a simple (but not so flexible) way: set MPIRUN to
MPIRUN=yod -sz $WIEN_MPISIZE <executables directory>/_EXEC_ $@
but I have not tried it. I use the same approach as for the non-MPI job:
replace all MPI executables (in <script directory>) with a symbolic link
to the following script:
yod -sz $WIEN_MPISIZE <executables directory>/`basename $0` "$@"
and set MPIRUN to
MPIRUN=_EXEC_
All MPI chunks should have the same size, and the number of cores in one
chunk must be stored in the exported variable $WIEN_MPISIZE (in
csh-family shells, set it with setenv). The script for replacing both
MPI and non-MPI executables can look like this:
ls -l <executables directory> | awk '
{ if (NF >= 9) next;           # skip symlinks
  if (int($5) < 100000) next;  # skip scripts
  if ($8 ~ /mpi/)              # MPI executables get the MPI wrapper
    system("ln -s <pathAndNameOfMpiYodScript> <script directory>/" $8);
  else
    system("ln -s <pathAndNameOfYodScript> <script directory>/" $8) }'
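Since all MPI chunks must have the same size and the MPI YodScript reads
that size from $WIEN_MPISIZE, it may help to derive the value from the
.machines file instead of typing it twice. The sketch below assumes the
layout produced by the generator script above (one "name:1" entry per
processor on each chunk line); the function name chunk_size is invented
for this example.

```shell
#!/bin/sh
# Sketch: print the common chunk size found in a .machines file, or
# fail if the chunk lines differ in size or none is found.
# Assumes each processor appears as its own "name:1" entry.
chunk_size() {
    awk '
    /^[0-9]+:/ {
        sub(/^[0-9]+:[ \t]*/, "")      # drop the leading "1:" weight
        n = NF                          # one field per processor
        if (size == "") size = n
        else if (n != size) bad = 1     # chunks of unequal size
    }
    END { if (bad || size == "") exit 1; print size }' "$1"
}
```

A possible use in a Bourne-type shell is
WIEN_MPISIZE=`chunk_size .machines` && export WIEN_MPISIZE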