[Wien] Two kinds of MPI run error
Laurence Marks
L-marks at northwestern.edu
Sat Aug 10 22:07:17 CEST 2013
By default, Open MPI does not export LD_LIBRARY_PATH or any other
environment variables, so you have to tell it to do so, e.g. in
parallel_options use
setenv WIEN_MPIRUN "mpirun -x LD_LIBRARY_PATH -x PATH -np _NP_ -machinefile _HOSTS_ _EXEC_"
That may solve part of the problem associated with the shared library.
I prefer to avoid shared libraries as much as possible as this avoids
such problems, e.g. use -static or -i-static when compiling.
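One way to check whether the loader can actually find libmpi_f90.so.1
is to run ldd on the executable, both locally and through the same
routes WIEN2k uses. This is only a sketch: node2 and hosts.txt are
placeholder names, not taken from your setup, and I am assuming
$WIENROOT points at your wien2k_11 directory.

```shell
# Locally: list any shared libraries the loader cannot resolve
ldd $WIENROOT/lapw1_mpi | grep "not found"

# Through ssh, roughly the route taken with MPI_REMOTE=1 (node2 is a placeholder host)
ssh node2 ldd $WIENROOT/lapw1_mpi | grep "not found"

# Through mpirun, first without and then with -x (hosts.txt is a placeholder machinefile)
mpirun -np 2 -machinefile hosts.txt ldd $WIENROOT/lapw1_mpi | grep "not found"
mpirun -x LD_LIBRARY_PATH -x PATH -np 2 -machinefile hosts.txt ldd $WIENROOT/lapw1_mpi | grep "not found"
```

If libmpi_f90.so.1 is reported as "not found" by one of these, that
route is not seeing your LD_LIBRARY_PATH.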
The ".machine5" error may go away once you correct the LD_LIBRARY_PATH
issue, or it could indicate an error in your .machines file.
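For comparison, a .machines file for a single MPI job spread over two
nodes would look roughly like the following. The host names node1 and
node2 and the count of 4 cores per node are made up for illustration;
they must match the names your machinefile and queuing system actually
use.

```
# hypothetical hosts node1 and node2, 4 cores each
lapw0:node1:4 node2:4
1:node1:4 node2:4
granularity:1
extrafine:1
```

A host name here that is not among the resources mpirun has been given
would be one way to end up with the "no allocated resources" message.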
On Sat, Aug 10, 2013 at 12:12 PM, 贾亚磊 <jia_yalei at 163.com> wrote:
> Dear all,
> I compile wien2k 11 on linux centos 5.5 with icc , ifort 11.1, openmpi
> mpif90, and mkl(combined with ifort compiler,) with the following
> parameter in $WIENROOT/OPTIONS:
>
> current:FOPT:-FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -traceback
> -i-static
> current:FPOPT:-FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -traceback
> -i-static
> current:LDFLAGS:$(FOPT) -L/home/yljia/intel/Compiler/11.1/072/mkl/lib/em64t
> -pthread
> current:DPARALLEL:'-DParallel'
> current:R_LIBS:$(MKLROOT)/lib/em64t/libmkl_lapack95_lp64.a -Wl,--start-group
> $(MKLROOT)/lib/em64t/libmkl_intel_lp64.a
> $(MKLROOT)/lib/em64t/libmkl_intel_thread.a
> $(MKLROOT)/lib/em64t/libmkl_core.a -Wl,--end-group -openmp -lpthread -lm
> -lguide
> current:RP_LIBS:$(MKLROOT)/lib/em64t/libmkl_scalapack_lp64.a
> $(MKLROOT)/lib/em64t/libmkl_solver_lp64.a -Wl,--start-group
> $(MKLROOT)/lib/em64t/libmkl_intel_lp64.a
> $(MKLROOT)/lib/em64t/libmkl_intel_thread.a
> $(MKLROOT)/lib/em64t/libmkl_core.a
> $(MKLROOT)/lib/em64t/libmkl_blacs_openmpi_lp64.a -Wl,--end-group -lpthread
> -lm /home/yljia/compiler_library/fftw-2.1.5/lib/libfftw_mpi.a
> /home/yljia/compiler_library/fftw-2.1.5/lib/libfftw.a $(R_LIBS)
> current:MPIRUN:mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_
>
> and in my submitted shell script I add
>
> source ~/.cshrc
> source /home/yljia/intel/Compiler/11.1/073/bin/iccvars.csh intel64
> source /home/yljia/intel/Compiler/11.1/072/bin/ifortvars.csh intel64
> source /home/yljia/intel/Compiler/11.1/072/mkl/tools/environment/mklvarsem64t.csh
> setenv LD_LIBRARY_PATH
> /home/yljia/compiler_library/fftw-2.1.5/lib:$LD_LIBRARY_PATH
> set path = (/home/yljia/compiler_library/openmpi-1.6.1/bin $path)
> setenv LD_LIBRARY_PATH
> /home/yljia/compiler_library/openmpi-1.6.1/lib:$LD_LIBRARY_PATH
> setenv OMP_NUM_THREADS 1
> setenv MKL_NUM_THREADS 1
>
> The program can run in non-parallel mode and in k-point parallel mode (one
> node and multiple nodes, USE_REMOTE=0 and 1). But in MPI parallel mode there
> are two cases:
> 1). On one node: run_lapw runs with MPI_REMOTE=0, but fails at lapw1 when
> MPI_REMOTE=1, with error messages in STDOUT like (NOTE: there is a
> libmpi_f90.so.1 in $OpenmpiRoot/lib/):
>
> /home/yljia/software/wien2k_11/lapw1_mpi: error while loading shared
> libraries: libmpi_f90.so.1: cannot open shared object file: No such file or
> directory
> /home/yljia/software/wien2k_11/lapw1_mpi: error while loading shared
> libraries: libmpi_f90.so.1: cannot open shared object file: No such file or
> directory
>
> 2). On two nodes--The run_lapw program can not run at lapw1 with
> MPI_REMOTE=0 or MPI_REMOTE=1. When MPI_REMOTE=0 the error messages are like:
>
> There are no allocated resources for the application
> /home/yljia/software/wien2k_11/lapw1_mpi
> that match the requested mapping:
> .machine5
> Verify that you have mapped the allocated resources properly using the
> --host or --hostfile specification.
>
> When MPI_REMOTE=1 the error messages are like:
>
> /home/yljia/software/wien2k_11/lapw1_mpi: error while loading shared
> libraries: libmpi_f90.so.1: cannot open shared object file: No such file or
> directory
> /home/yljia/software/wien2k_11/lapw1_mpi: error while loading shared
> libraries: libmpi_f90.so.1: cannot open shared object file: No such file
> or directory
>
> Best regards,
> Jia Yalei
--
Professor Laurence Marks
Department of Materials Science and Engineering
Northwestern University
www.numis.northwestern.edu 1-847-491-3996
"Research is to see what everybody else has seen, and to think what
nobody else has thought"
Albert Szent-Györgyi