[Wien] Trouble setting up parallel jobs
PGanesh
pganesh at ciw.edu
Fri Dec 19 19:53:05 CET 2008
Hi,
We have installed WIEN2k on our local cluster (it runs ROCKS) and it
runs beautifully in serial. But when I run it in a parallel
environment, I get the following errors in jobs.err:
Got 11 slots.
Without hostfile option, hostnames must be specified on command line.
usage: mpirun_rsh [-verbose] [-v] [-rsh|-ssh] [-paramfile=pfile]
[-timeout=N][-debug] -[tv] [-xterm] [-show] -np N (-machinefile mfile |
-hostfile hfile | h1 h2 ... hN) [a.out args]
Where:
verbose => verbose
v => Show version and exit
rsh => to use rsh for connecting
ssh => to use ssh for connecting
paramfile => file containing run-time MVICH parameters
debug => run each process under the control of gdb
tv => run each process under the control of totalview
xterm => run remote processes under xterm
show => show command for remote execution but dont run it
np => specify the number of processes
h1 h2... => names of hosts where processes should run
or hostfile => name of file contining hosts, one per line
or machinefile => name of file contining host and MPI binary, one per
line. If MPI binary is empty for 1 or many hosts then
the default is executed
timeout => Timeout for child processes to terminate
a.out => name of (default) MPI binary.
It is a mandatory parameter if machinefile is not specified
OR if machinefile has empty MPI Binary entries for 1 or
more hosts
args => arguments for MPI binary
And this is what I get in my *.dayfile:
Calculating BSCCO in /home/pganesh/WIEN2k/BSCCO
on compute-0-27.local with PID 20199
start (Fri Dec 19 13:43:53 EST 2008) with lapw0 (40/99 to go)
cycle 1 (Fri Dec 19 13:43:53 EST 2008) (40/99 to go)
> lapw0 -p (13:43:53) starting parallel lapw0 at Fri Dec 19 13:43:53 EST 2008
-------- .machine1 : 3 processors
compute-0-27 compute-0-2 compute-0-17
--------
0.008u 0.032s 0:00.20 15.0% 0+0k 0+0io 6pf+0w
error: command /home/pganesh/WIEN2k/lapw0para lapw0.def failed
> stop error
I copied the script from the WIEN2k website that creates the .machines
file and then executes the command: run_lapw -NI -p -fc 3
This is what my .machines file looks like:
#
lapw0:compute-0-27 compute-0-2 compute-0-17
1:compute-0-27
1:compute-0-27
1:compute-0-27
1:compute-0-27
1:compute-0-2
1:compute-0-2
1:compute-0-2
1:compute-0-2
1:compute-0-17
1:compute-0-17
1:compute-0-17
granularity:1
extrafine:1
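For reference, the layout of the .machines file above can be reproduced with a few lines of Python. This is only a sketch and not the script from the WIEN2k website: the function name build_machines and the (hostname, slots) input format are assumptions made for illustration. It writes one "lapw0:" line listing each host once, then one "1:host" line per slot, then the granularity and extrafine lines.

```python
def build_machines(slots, granularity=1, extrafine=1):
    """Return the text of a .machines file.

    slots is a list of (hostname, n_slots) pairs in the order the
    scheduler granted them (hypothetical input format).
    """
    hosts = [host for host, _ in slots]
    lines = ["#"]
    # lapw0 gets each host listed once on a single line.
    lines.append("lapw0:" + " ".join(hosts))
    # One "1:host" line per granted slot on that host.
    for host, n_slots in slots:
        lines.extend("1:" + host for _ in range(n_slots))
    lines.append("granularity:%d" % granularity)
    lines.append("extrafine:%d" % extrafine)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The 11 slots from the job above: 4 + 4 + 3.
    granted = [("compute-0-27", 4), ("compute-0-2", 4), ("compute-0-17", 3)]
    print(build_machines(granted), end="")
```

Called with the 11 slots from this job, it produces the same text as the file shown above, so the file contents can be checked independently of the queueing script.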
Thank you for the help.
regards,
Ganesh