[Wien] Parallel Options
Ghosh SUDDHASATTWA
ssghosh at igcar.gov.in
Wed Sep 7 07:36:17 CEST 2011
Dear Wien2k users,
We have compiled Wien2k_11.1 with the following parallel options:
setenv USE_REMOTE 1
setenv MPI_REMOTE 1
setenv WIEN_GRANULARITY 1
setenv WIEN_MPIRUN "mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_"
The k-point parallel start-up script is:
#!/bin/bash
#
# RJ: Startup for Wien2k-kpoint parallel conforming with Grid Engine
# parallel environment interface
#
# usage: start_kpoint.sh <pe_hostfile>
#
PeHostfile2Wien2kMachineFile()
{
    cat $1 | while read line; do
        # echo $line
        host=`echo $line|cut -f1 -d" "|cut -f1 -d"."`
        nslots=`echo $line|cut -f2 -d" "`
        # add here code to map regular hostnames into IB hostnames
        for ((i=0; i < $nslots; i=i+1)); do
            echo 1:i$host
        done
    done
    echo 'granularity:1'
    echo 'extrafine:1'
}
# useful to control parameters passed to us
echo $*
SLEEPTIME=5
RETRIES=10
me=`basename $0`
# test number of args
if [ $# -lt 1 ]; then
    echo "$me: got wrong number of arguments" >&2
    exit 1
fi
# get arguments
pe_hostfile=$1
# ensure pe_hostfile is readable
if [ ! -r $pe_hostfile ]; then
    echo "$me: can't read $pe_hostfile" >&2
    exit 1
fi
# create machine-file
# remove column with number of slots per queue
# mpi does not support them in this form
machines="$TMPDIR/machines.wien2k-kpoint"
pwdir=`pwd`
PeHostfile2Wien2kMachineFile $pe_hostfile >> $machines
cat $machines
hostname
#scp $machines nx0:$pwdir/machines
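To illustrate what the start-up script writes, here is a minimal sketch of the same conversion run on a made-up two-host hostfile. The host names, the queue name, and the function name pe_hostfile_to_machines are hypothetical; the four-column line layout (host, slots, queue, processor range) is the usual Grid Engine pe_hostfile format. The "i" prefix is kept exactly as in the script above.

```shell
#!/bin/bash
# Sketch only: mirrors PeHostfile2Wien2kMachineFile, but reads the
# hostfile from standard input so it can be tried without Grid Engine.
pe_hostfile_to_machines() {
    while read -r host nslots _rest; do
        short=${host%%.*}            # strip the domain part, like cut -f1 -d"."
        i=0
        while [ "$i" -lt "$nslots" ]; do
            echo "1:i$short"         # one k-point entry per slot
            i=$((i + 1))
        done
    done
    echo 'granularity:1'
    echo 'extrafine:1'
}

# Hypothetical hostfile: 16 slots spread over a 12-slot and a 4-slot host.
sample_hostfile='node1.example.org 12 all.q@node1.example.org UNDEFINED
node2.example.org 4 all.q@node2.example.org UNDEFINED'

machines_out=$(printf '%s\n' "$sample_hostfile" | pe_hostfile_to_machines)
echo "$machines_out"
```

With this input the output has twelve "1:inode1" lines, four "1:inode2" lines, and the two trailing option lines.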
The SGE job script is:
#!/bin/bash
#
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -V
#$ -pe kpoint 2-
# RJ: Script to run Wien2k-kpoint parallel job thru SGE
# use kpoint PE
#echo "Hostname: "
#hostname
#echo "No. of Slots"
#echo $NSLOTS
# machines.wien2k-kpoint would be created by
# start_kpoint.sh PE script at $TMPDIR
echo "Wien2k Machine file $TMPDIR/machines.wien2k-kpoint"
mf=`cat $TMPDIR/machines.wien2k-kpoint`
echo $mf
cp $TMPDIR/machines.wien2k-kpoint .machines
# RJ: command for kpoint parallel run
runsp_lapw -cc 0.0001 -ec 0.00001 -in1ef -i 200 -p
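Since the job script copies the generated file to .machines, a small check along the following lines could be placed before the runsp_lapw call to print how many k-point entries each host received. This is only a sketch: the function name summarize_machines and the sample host names are hypothetical, and the sample file stands in for the real .machines.

```shell
#!/bin/bash
# Sketch only: count the "1:<host>" lines per host in a machines file.
summarize_machines() {
    awk -F: '$1 == "1" { n[$2]++ } END { for (h in n) print h, n[h] }' "$1" | sort
}

# Hypothetical machines file for a 16-slot request over two hosts.
sample_machines=$(mktemp)
{
    for i in 1 2 3 4 5 6 7 8 9 10 11 12; do echo "1:inode1"; done
    for i in 1 2 3 4; do echo "1:inode2"; done
    echo 'granularity:1'
    echo 'extrafine:1'
} > "$sample_machines"

summary=$(summarize_machines "$sample_machines")
echo "$summary"
rm -f "$sample_machines"
```

In the job script itself the call would simply be summarize_machines .machines, which shows whether a request for more slots than one node offers actually ends up listing a second host.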
We have 12 processors on one node.
When we submit with
qsub -pe kpoint 12 kpoint.sh
the script works, but when we submit with
qsub -pe kpoint 16 kpoint.sh
it does not.
Can anybody suggest what the problem is, and whether any changes to the job script
are required?
Thanks in advance,
Suddhasattwa