[Wien] MPI parallelization failure for lapw1

Peter Blaha pblaha at theochem.tuwien.ac.at
Wed Nov 27 09:20:27 CET 2019


When you use the srun setup of WIEN2k, you are tightly integrated into 
your system and have to follow all of your system's default settings.

For instance, you configured CORES_PER_NODE = 1, but I very much doubt 
that your cluster has only one core per node, and srun will probably make 
certain assumptions about that.

Two suggestions for tests:

a) Run it on only ONE node, but on all cores of this node. The 
corresponding .machines file should contain
1:machine1:YY    where YY is the number of cores (16 or 24, ...)

b) If your queuing system setup allows the use of mpirun, reconfigure 
WIEN2k (siteconfig) with the default intel+mkl option (not the srun 
option). It will then suggest using mpirun ... for starting jobs.
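
For test a), a minimal sketch of how such a .machines file could be 
written from inside a SLURM batch job. It assumes a POSIX shell and that 
SLURM sets SLURM_CPUS_ON_NODE in the job; the helper name write_machines 
and the fallback of 16 cores are only illustrative. The granularity and 
extrafine lines are copied from your existing file.

```shell
#!/bin/bash
# Hypothetical helper: print a .machines file for ONE MPI job that
# uses all cores of a single node (test a).
#   $1 = node name (e.g. machine1), $2 = number of cores on it (YY)
write_machines() {
    echo "granularity:1"
    echo "1:$1:$2"
    echo "extrafine:1"
}

# Assumption: SLURM_CPUS_ON_NODE is set inside the job;
# fall back to 16 cores if it is not.
write_machines "$(hostname)" "${SLURM_CPUS_ON_NODE:-16}" > .machines
```

Check the resulting file by hand before starting the job; the node name 
must be the one your queuing system actually assigned.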

Make sure that in your batch job (I assume you are using it) the proper 
modules are loaded (intel, mkl, intel-mpi).
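
A small pre-flight check along these lines may help to see whether the 
modules really took effect in the batch job. This is only a sketch: the 
function name check_env is hypothetical, and it assumes that the MPI 
module puts mpirun on the PATH and that the mkl module sets MKLROOT, 
which is typical but site-dependent.

```shell
#!/bin/bash
# Hypothetical pre-flight check; call it in the batch script after the
# "module load" lines (module names differ from site to site).
check_env() {
    status=0
    # Assumption: the MPI module puts mpirun on PATH.
    if ! command -v mpirun >/dev/null 2>&1; then
        echo "mpirun not found: is the MPI module loaded?"
        status=1
    fi
    # Assumption: the mkl module sets MKLROOT.
    if [ -z "${MKLROOT:-}" ]; then
        echo "MKLROOT not set: is the mkl module loaded?"
        status=1
    fi
    return "$status"
}

check_env || echo "environment incomplete, fix the modules before starting the job"
```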


On 11/26/19 7:07 PM, Hanning Chen wrote:
> Dear WIEN2K community,
> 
>    I am a new user of WIEN2K, and just compiled it using the following 
> options:
> 
> current:FOPT:-O -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML 
> -traceback -assume buffered_io -I$(MKLROOT)/include
> 
> current:FPOPT:-O -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML 
> -traceback -assume buffered_io -I$(MKLROOT)/include
> 
> current:OMP_SWITCH:-qopenmp
> 
> current:LDFLAGS:$(FOPT) -L$(MKLROOT)/lib/$(MKL_TARGET_ARCH) -lpthread 
> -lm -ldl -liomp5
> 
> current:DPARALLEL:'-DParallel'
> 
> current:R_LIBS:-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core
> 
> current:FFTWROOT:/home/ec2-user/FFTW338/
> 
> current:FFTW_VERSION:FFTW3
> 
> current:FFTW_LIB:lib
> 
> current:FFTW_LIBNAME:fftw3
> 
> current:LIBXCROOT:
> 
> current:LIBXC_FORTRAN:
> 
> current:LIBXC_LIBNAME:
> 
> current:LIBXC_LIBDNAME:
> 
> current:SCALAPACKROOT:$(MKLROOT)/lib/
> 
> current:SCALAPACK_LIBNAME:mkl_scalapack_lp64
> 
> current:BLACSROOT:$(MKLROOT)/lib/
> 
> current:BLACS_LIBNAME:mkl_blacs_intelmpi_lp64
> 
> current:ELPAROOT:
> 
> current:ELPA_VERSION:
> 
> current:ELPA_LIB:
> 
> current:ELPA_LIBNAME:
> 
> current:MPIRUN:srun -K -N_nodes_ -n_NP_ -r_offset_ _PINNING_ _EXEC_
> 
> current:CORES_PER_NODE:1
> 
> current:MKL_TARGET_ARCH:intel64
> 
> setenv TASKSET "no"
> 
> if ( ! $?USE_REMOTE ) setenv USE_REMOTE 1
> 
> if ( ! $?MPI_REMOTE ) setenv MPI_REMOTE 0
> 
> setenv WIEN_GRANULARITY 1
> 
> setenv DELAY 0.1
> 
> setenv SLEEPY 1
> 
> setenv WIEN_MPIRUN "srun -K -N_nodes_ -n_NP_ -r_offset_ _PINNING_ _EXEC_"
> 
> if ( ! $?CORES_PER_NODE) setenv CORES_PER_NODE 1
> 
> # if ( ! $?PINNING_COMMAND) setenv PINNING_COMMAND "--cpu_bind=map_cpu:"
> 
> # if ( ! $?PINNING_LIST ) setenv PINNING_LIST "0,8,1,9,2,10,3,11,4,12,5,13,6,14,7,15"
> 
>    Then, I ran a k-point parallelization with the .machines file below, 
> and it worked perfectly:
> 
>      granularity:1
> 
> 1:machine1
> 
> 2:machine2
> 
> extrafine:1
> 
>    But, when I tried to parallelize it over MPI with the new .machines file:
> 
>        granularity:1
> 
>        1:machine1 machine2
> 
> extrafine:1
> 
> lapw1 crashed with the error message as
> 
> **   Error in Parallel LAPW1
> 
> **.  LAPW1 STOPPED
> 
> ** check ERROR FILES!
> 
>    SEP INFO = -21
> 
> ‘SECLR4’. -SYEVX (Scalapack/LAPACK) failed
> 
> Although I understand that the 21st parameter of the SYEVX subroutine is 
> incorrect, I am not sure how to fix the problem. I actually have linked 
> WIEN2K with NETLIB’s SCALAPACK/LAPACK/BLAS instead of MKL. But the same 
> error appeared again.
> 
> Please help me out. Thanks.
> 
> Hanning Chen, Ph.D.
> 
> Department of Chemistry
> 
> American University
> 
> Washington, DC 20016
> 
> 
> 

-- 

                                       P.Blaha
--------------------------------------------------------------------------
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300             FAX: +43-1-58801-165982
Email: blaha at theochem.tuwien.ac.at    WIEN2k: http://www.wien2k.at
WWW:   http://www.imc.tuwien.ac.at/TC_Blaha
--------------------------------------------------------------------------

