[Wien] mpif90
Saeid Jalali
s_jalali_a at yahoo.com
Thu May 31 16:54:28 CEST 2012
Dear All,
I compiled the latest version of the code with ifort and mpif90 on a cluster made up of several dual-core AMD Opteron nodes.
There are no errors or warnings in the SRC_*/compile.msg files. The code runs well on the nodes if we parallelize only over the k-points, for example with the following .machines file:
lapw0:node3 node20
1:node3
1:node20
granularity:1
extrafine:1
However, the program stops with the following error:
ifort: command line warning #10159: invalid argument for option '-m'
ifort: command line error: option '-n' is ambiguous
once we use fine-grained parallelization, for example with the following .machines file:
lapw0:node3:2 node20:2
1:node3:2
1:node20:2
granularity:1
extrafine:1
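To make the difference between the two .machines files explicit, here is a minimal Python sketch that reads such a file and reports how many processes each k-point job requests. It is only an illustration: the function names parse_machines and uses_mpi are hypothetical, and the parser handles just the line forms shown in this message, not everything the real scripts accept.

```python
# Illustrative sketch only; not the parser used by the WIEN2k scripts.
K_POINT_ONLY = """lapw0:node3 node20
1:node3
1:node20
granularity:1
extrafine:1
"""

FINE_GRAINED = """lapw0:node3:2 node20:2
1:node3:2
1:node20:2
granularity:1
extrafine:1
"""


def parse_machines(text):
    """Return (lapw0_hosts, kpoint_jobs) for .machines-style text.

    Each host entry is a (hostname, nprocs) tuple; nprocs defaults to 1
    when the host has no ':N' suffix.
    """
    def host_entry(token):
        name, _, count = token.partition(":")
        return name, int(count) if count else 1

    lapw0, jobs = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, rest = line.partition(":")
        if key == "lapw0":
            lapw0 = [host_entry(t) for t in rest.split()]
        elif key.isdigit():
            # A leading integer is the weight of one k-point job.
            jobs.append([host_entry(t) for t in rest.split()])
        # Other keys (granularity, extrafine) are ignored in this sketch.
    return lapw0, jobs


def uses_mpi(jobs):
    """True if any k-point job requests more than one process in total."""
    return any(sum(n for _, n in job) > 1 for job in jobs)
```

With the first file every job asks for a single process, so no MPI launcher is needed; with the second file each job asks for two processes on its node, which is the case where the MPIRUN command comes into play.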
I used l_cprof_p_11.1.073_intel64, fftw-2.1.5, and mpich2 to compile the code, with the following settings:
-----------------------------------------
Current settings:
O Compiler options: -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -traceback
L Linker Flags: $(FOPT) -L/home/softs/intel/ifort11/mkl/lib/em64t -pthread
P Preprocessor flags '-DParallel'
R R_LIB (LAPACK+BLAS): -lmkl_lapack -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lpthread -lguide
-------------------------------------------
RP RP_LIB(SCALAPACK+PBLAS): -lmkl_scalapack_lp64 -lmkl_solver_lp64 -lmkl_blacs_lp64 -L/home/softs/mpich2/lib -lmpich -L/home/softs/fftw-2.1.5/mpi/.libs/ -lfftw_mpi -L/home/softs/fftw-2.1.5/fftw/.libs/ -lfftw $(R_LIBS)
FP FPOPT(par.comp.options): -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -traceback -I/home/softs/mpch2/include
MP MPIRUN commando : /home/softs/mpich2/bin/mpif90 -machinefile _HOSTS_ -n _NP_ _EXEC_
---------------------------------------------
The parallel_options file is:
---------------------------------------------
setenv USE_REMOTE 1
setenv MPI_REMOTE 1
setenv WIEN_GRANULARITY 1
setenv WIEN_MPIRUN "/home/softs/mpich2/bin/mpif90 -machinefile _HOSTS_ -n _NP_ _EXEC_"
-------------------------------
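For reference, the following Python sketch shows what the WIEN_MPIRUN template above turns into once the three placeholders are filled in. It assumes a plain textual substitution of _HOSTS_, _NP_ and _EXEC_; the helper name expand_mpirun and the example values (.machine1, lapw1_mpi lapw1_1.def) are hypothetical and chosen only for illustration.

```python
import shlex


def expand_mpirun(template, hosts_file, nprocs, executable):
    """Fill in the _HOSTS_, _NP_ and _EXEC_ placeholders and split into argv.

    Assumes simple textual substitution; the real scripts may differ.
    """
    command = (
        template.replace("_HOSTS_", hosts_file)
        .replace("_NP_", str(nprocs))
        .replace("_EXEC_", executable)
    )
    return shlex.split(command)


# Template copied from the parallel_options file above.
TEMPLATE = "/home/softs/mpich2/bin/mpif90 -machinefile _HOSTS_ -n _NP_ _EXEC_"

# Hypothetical values for the hosts file, process count and executable.
argv = expand_mpirun(TEMPLATE, ".machine1", 2, "lapw1_mpi lapw1_1.def")
print(" ".join(argv))
```

In the expanded line the first word is mpif90, which in MPICH2 is normally the Fortran compiler wrapper rather than a process launcher, so -machinefile and -n would be handed on to ifort. That would be consistent with the two ifort messages quoted above, although whether it is the actual cause depends on the local installation.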
As a test, I changed mpif90 to mpirun in parallel_options only; I did not recompile the code.
The result is as follows:
LAPW0 END
LAPW0 END
Fatal error in PMPI_Comm_size: Invalid communicator, error stack:
PMPI_Comm_size(111): MPI_Comm_size(comm=0x5b, size=0x8aa96c) failed
PMPI_Comm_size(69).: Invalid communicator
real 0m0.050s
user 0m0.010s
sys 0m0.038s
test.scf1_1: No such file or directory.
FERMI - Error
cp: cannot stat `.in.tmp': No such file or directory
rm: cannot remove `.in.tmp': No such file or directory
rm: cannot remove `.in.tmp1': No such file or directory
A similar error occurred when I used mpiexec in parallel_options without recompiling the code.
I found that the "Invalid communicator" error originates from the mpi.h of mpirun or mpiexec being incompatible with that of mpif90.
So I changed it back to mpif90.
Since I guessed that the problem originates from the MPI version, I tried different MPI versions, i.e., mpich2-1.0.6, mpich2-1.3.1, mpich2-1.4, and openmpi-1.4.2.
Any comment on why the code compiles with no errors or warnings but then stops with an error at run time would be highly appreciated.
Are there any restrictions on compiling the code with mpif90 on AMD systems, as discussed above?
Sincerely yours,
S. Jalali
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
Saeid Jalali Asadabadi,
Department of Physics, Faculty of Science,
University of Isfahan (UI), Hezar Gerib Avenue,
81744 Isfahan, Iran.
Phones:
Dep. of Phys. :+98-0311-793 2435
Office :+98-0311-793 4776
Fax No. :+98-0311-793 4800
E-mail :sjalali at phys.ui.ac.ir
:sjalali at sci.ui.ac.ir
:sjalali at mailaps.org
:saeid.jalali.asadabadi at gmail.com
:s_jalali_a at yahoo.com
Homepage :http://sci.ui.ac.ir/~sjalali
www :http://www.ui.ac.ir
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/