[Wien] Parallel Wien2k using Intel MPI?

Stefan Becuwe stefan.becuwe at ua.ac.be
Sun Nov 14 10:19:17 CET 2010


Hello,

Our problem is more or less related to Wei Xie's postings of two weeks
ago.  We can't get Wien2k 10.1 to run with the MPI setup.  The serial
version and the ssh-based parallel version do work.  Since his solution
does not seem to work for us, I'll describe our problem and setup.

FYI: the Intel MPI setup does work for lots of other programs on our 
cluster, so I guess it must be an Intel MPI-Wien2k(-Torque-MOAB) specific 
problem.

Software environment:

icc/ifort: 11.1.073
impi:      4.0.0.028
imkl:      10.2.6.038
FFTW:      2.1.5
Torque/MOAB


$ cat parallel_options
setenv USE_REMOTE 1
setenv MPI_REMOTE 1
setenv WIEN_GRANULARITY 1
setenv WIEN_MPIRUN "mpirun -r ssh -np _NP_ _EXEC_"


Call:

clean_lapw -s
run_lapw -p -ec 0.00001 -i 1000


$ cat .machines
lapw0: cn002:8 cn004:8 cn016:8 cn018:8
1: cn002:8
1: cn004:8
1: cn016:8
1: cn018:8
granularity:1
extrafine:1


The corresponding .machine1, .machine2, etc. files are also generated.
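
As a sanity check on the layout above, here is a minimal Python sketch
that counts the host:slots entries per line of the .machines file.  It is
not part of Wien2k: the function name count_slots is hypothetical, the
file content is pasted in as a string, and I assume that an entry without
":n" means one slot.

```python
# Hypothetical helper, not part of Wien2k: count the slots requested on
# each line of a .machines file (content pasted from the listing above).

MACHINES = """\
lapw0: cn002:8 cn004:8 cn016:8 cn018:8
1: cn002:8
1: cn004:8
1: cn016:8
1: cn018:8
granularity:1
extrafine:1
"""


def count_slots(text):
    """Return (slots on the lapw0 line, list of slots per parallel job line)."""
    lapw0_slots = 0
    job_slots = []
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        key = key.strip()
        if not sep:
            continue  # no "key:" prefix, nothing to count
        # Sum the slots of all "host:n" entries on this line.
        # Assumption: a bare "host" without ":n" counts as one slot.
        total = 0
        for entry in rest.split():
            _host, _, n = entry.partition(":")
            total += int(n) if n else 1
        if key == "lapw0":
            lapw0_slots = total
        elif key.isdigit():
            # Lines starting with a number (here "1:") are parallel job lines.
            job_slots.append(total)
        # Other keys (granularity, extrafine) are settings and are skipped.
    return lapw0_slots, job_slots


lapw0_slots, job_slots = count_slots(MACHINES)
print(lapw0_slots, job_slots)
```

For the file above this gives 32 slots for lapw0 and four job lines with
8 slots each, which matches the "32 processors" and
"4 number_of_parallel_jobs" lines in the dayfile below.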


$ cat TiC.dayfile
[...]
>   lapw0 -p    (09:59:34) starting parallel lapw0 at Sun Nov 14 09:59:34 CET 2010
-------- .machine0 : 32 processors
0.428u 0.255s 0:05.12 13.0%     0+0k 0+0io 0pf+0w
>   lapw1  -p   (09:59:39) starting parallel lapw1 at Sun Nov 14 09:59:39 CET 2010
->  starting parallel LAPW1 jobs at Sun Nov 14 09:59:39 CET 2010
running LAPW1 in parallel mode (using .machines)
4 number_of_parallel_jobs
      cn002 cn002 cn002 cn002 cn002 cn002 cn002 cn002(1) WARNING: Unable to read mpd.hosts or list of hosts isn't provided. MPI job will be run on the current machine only.
rank 5 in job 1  cn002_55855   caused collective abort of all ranks
   exit status of rank 5: killed by signal 9
rank 4 in job 1  cn002_55855   caused collective abort of all ranks
   exit status of rank 4: killed by signal 9
rank 3 in job 1  cn002_55855   caused collective abort of all ranks
   exit status of rank 3: killed by signal 9
[...]


Specifying -hostfile in the WIEN_MPIRUN variable results in the following
error:

invalid "local" arg: -hostfile


Thanks in advance for helping us get Wien2k running in an MPI setup ;-)

Regards


Stefan Becuwe

