[Wien] MPI parallelization failure for lapw1
Peter Blaha
pblaha at theochem.tuwien.ac.at
Wed Nov 27 09:20:27 CET 2019
Using the srun setup of WIEN2k means that you are tightly
integrated into your system and have to follow all of your system's default
settings.
For instance, you configured CORES_PER_NODE = 1, but I very much doubt
that your cluster has only one core per node, and srun will probably make
certain assumptions about that.
Two suggestions for tests:
a) Run it on only ONE node, but on all cores of this node. The
corresponding .machines file should have
1:machine1:YY where YY is the number of cores (16 or 24, ...)
b) If your queuing system setup allows the use of mpirun, reconfigure WIEN2k
(siteconfig) with the default intel+mkl option (not the srun option). It
will then suggest using mpirun ... for starting jobs.
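
For test a), the single-node .machines file could be generated with a small
helper like the one below. This is only a sketch: the function name
write_machines is hypothetical (not part of WIEN2k), and the granularity and
extrafine lines are simply carried over from the .machines file in the
original message.

```shell
# Hypothetical helper (not part of WIEN2k): write a .machines file that
# runs one MPI job on a single node using NCORES cores.
# Usage: write_machines HOST NCORES [FILE]
write_machines() {
    host=$1
    ncores=$2
    out=${3:-.machines}
    {
        echo "granularity:1"
        # one k-point job, MPI-parallel over ncores cores of host
        echo "1:${host}:${ncores}"
        echo "extrafine:1"
    } > "$out"
}
```

For a 16-core node named machine1, calling `write_machines machine1 16`
writes the line `1:machine1:16` into .machines in the current directory.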
Make sure that in your batch job (I assume you are using one) the proper
modules are loaded (intel, mkl, intel-mpi).
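
One way to catch a missing module early is to check, at the top of the batch
job, that the expected commands are on the PATH. The following is a minimal
sketch; the function name missing_tools is hypothetical, and which commands
to check (mpirun is used as an example here) depends on how the modules are
set up at your site.

```shell
# Hypothetical helper: print the names of the given commands that are
# NOT found on PATH (space-separated; empty output if all are found).
missing_tools() {
    missing=""
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    # strip the leading space left by the concatenation above
    echo "$missing" | sed 's/^ //'
}

# Example use in a batch job, after the site-specific module loads:
#   m=$(missing_tools mpirun)
#   [ -z "$m" ] || { echo "not on PATH: $m"; exit 1; }
```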
On 11/26/19 7:07 PM, Hanning Chen wrote:
> Dear WIEN2K community,
>
> I am a new user of WIEN2K, and just compiled it using the following
> options:
>
> current:FOPT:-O -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML
> -traceback -assume buffered_io -I$(MKLROOT)/include
>
> current:FPOPT:-O -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML
> -traceback -assume buffered_io -I$(MKLROOT)/include
>
> current:OMP_SWITCH:-qopenmp
>
> current:LDFLAGS:$(FOPT) -L$(MKLROOT)/lib/$(MKL_TARGET_ARCH) -lpthread
> -lm -ldl -liomp5
>
> current:DPARALLEL:'-DParallel'
>
> current:R_LIBS:-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core
>
> current:FFTWROOT:/home/ec2-user/FFTW338/
>
> current:FFTW_VERSION:FFTW3
>
> current:FFTW_LIB:lib
>
> current:FFTW_LIBNAME:fftw3
>
> current:LIBXCROOT:
>
> current:LIBXC_FORTRAN:
>
> current:LIBXC_LIBNAME:
>
> current:LIBXC_LIBDNAME:
>
> current:SCALAPACKROOT:$(MKLROOT)/lib/
>
> current:SCALAPACK_LIBNAME:mkl_scalapack_lp64
>
> current:BLACSROOT:$(MKLROOT)/lib/
>
> current:BLACS_LIBNAME:mkl_blacs_intelmpi_lp64
>
> current:ELPAROOT:
>
> current:ELPA_VERSION:
>
> current:ELPA_LIB:
>
> current:ELPA_LIBNAME:
>
> current:MPIRUN:srun -K -N_nodes_ -n_NP_ -r_offset_ _PINNING_ _EXEC_
>
> current:CORES_PER_NODE:1
>
> current:MKL_TARGET_ARCH:intel64
>
> setenv TASKSET "no"
>
> if ( ! $?USE_REMOTE ) setenv USE_REMOTE 1
>
> if ( ! $?MPI_REMOTE ) setenv MPI_REMOTE 0
>
> setenv WIEN_GRANULARITY 1
>
> setenv DELAY 0.1
>
> setenv SLEEPY 1
>
> setenv WIEN_MPIRUN "srun -K -N_nodes_ -n_NP_ -r_offset_ _PINNING_ _EXEC_"
>
> if ( ! $?CORES_PER_NODE) setenv CORES_PER_NODE 1
>
> # if ( ! $?PINNING_COMMAND) setenv PINNING_COMMAND "--cpu_bind=map_cpu:"
>
> # if ( ! $?PINNING_LIST ) setenv PINNING_LIST "0,8,1,9,2,10,3,11,4,12,5,13,6,14,7,15"
>
> Then, I ran a k-point parallelization with the .machines file below,
> and it worked perfectly:
>
> granularity:1
>
> 1:machine1
>
> 2:machine2
>
> extrafine:1
>
> But, when I tried to parallelize it over MPI with the new .machines file:
>
> granularity:1
>
> 1:machine1 machine2
>
> extrafine:1
>
> lapw1 crashed with the error message as
>
> ** Error in Parallel LAPW1
>
> **. LAPW1 STOPPED
>
> ** check ERROR FILES!
>
> SEP INFO = -21
>
> ‘SECLR4’. -SYEVX (Scalapack/LAPACK) failed
>
> Although I understand that the 21st parameter of the SYEVX subroutine is
> incorrect, I am not sure how to fix the problem. I actually have linked
> WIEN2K with NETLIB’s SCALAPACK/LAPACK/BLAS instead of MKL. But the same
> error appeared again.
>
> Please help me out. Thanks.
>
> Hanning Chen, Ph.D.
>
> Department of Chemistry
>
> American University
>
> Washington, DC 20016
>
--
P.Blaha
--------------------------------------------------------------------------
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300 FAX: +43-1-58801-165982
Email: blaha at theochem.tuwien.ac.at WIEN2k: http://www.wien2k.at
WWW: http://www.imc.tuwien.ac.at/TC_Blaha
--------------------------------------------------------------------------