[Wien] slurm mpi
webfinder at ukr.net
Tue May 7 16:30:47 CEST 2019
Dear Prof. Blaha,
I'm using Intel MPI 2019.3.199.
The ScaLAPACK and BLACS libraries are located in the Intel compilers_and_libraries_2019.3.199 folder.
OPTIONS file:
current:FOPT:-O1 -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -traceback -assume buffered_io -I$(MKLROOT)/include
current:FPOPT:-O1 -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -traceback -assume buffered_io -I$(MKLROOT)/include
current:LDFLAGS:$(FOPT) -L$(MKLROOT)/lib/intel64 -lpthread -lm -ldl -liomp5
current:DPARALLEL:-DParallel
current:R_LIBS:-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core
current:FFTWROOT:/gpfs/home/ser/Install/FFTW/
current:FFTW_VERSION:FFTW3
current:FFTW_LIB:lib
current:FFTW_LIBNAME:fftw3
current:LIBXCROOT:/gpfs/home/ser/Install/LIBXC/
current:LIBXC_FORTRAN:xcf03
current:LIBXC_LIBNAME:xc
current:LIBXC_LIBDNAME:lib/
current:SCALAPACKROOT:/gpfs/softs/cluster/intel/psxe/2019.3/compilers_and_libraries_2019.3.199/linux/mkl/lib/
current:SCALAPACK_LIBNAME:mkl_scalapack_lp64
current:BLACSROOT:/gpfs/softs/cluster/intel/psxe/2019.3/compilers_and_libraries_2019.3.199/linux/mkl/lib/
current:BLACS_LIBNAME:mkl_blacs_intelmpi_lp64
current:ELPAROOT:
current:ELPA_VERSION:
current:MPIRUN:srun -K -N_nodes_ -n_NP_ -r_offset_ _PINNING_ _EXEC_ (changed to mpirun)
current:CORES_PER_NODE:1
current:MKL_TARGET_ARCH:intel64
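A minimal sketch for checking that the libraries named in SCALAPACK_LIBNAME and BLACS_LIBNAME actually exist where the linker will look. The helper name find_lib is hypothetical, and the assumption that the final search directory is SCALAPACKROOT followed by MKL_TARGET_ARCH (i.e. .../mkl/lib/intel64) is mine, not confirmed by the OPTIONS file itself.

```shell
#!/usr/bin/env bash
# find_lib DIR NAME
# Print every file DIR/libNAME.so or DIR/libNAME.a that exists.
# Returns 0 if at least one was found, 1 otherwise.
# (Hypothetical helper, not part of WIEN2k.)
find_lib() {
    local dir="$1" name="$2" found=1 f
    for f in "$dir/lib$name.so" "$dir/lib$name.a"; do
        if [ -e "$f" ]; then
            echo "$f"
            found=0
        fi
    done
    return "$found"
}

# Possible usage, ASSUMING the libraries live in <root>/intel64:
#   root=/gpfs/softs/cluster/intel/psxe/2019.3/compilers_and_libraries_2019.3.199/linux/mkl/lib
#   find_lib "$root/intel64" mkl_scalapack_lp64
#   find_lib "$root/intel64" mkl_blacs_intelmpi_lp64
```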
Part of .bashrc:
module load gcc/7.3.0
module add compiler/intel/2019.3.199
module load mpi/intel/2019.3.199
source /gpfs/softs/cluster/intel/psxe/2019.3/compilers_and_libraries_2019.3.199/linux/bin/compilervars.sh intel64
source /gpfs/softs/cluster/intel/psxe/2019.3/compilers_and_libraries_2019.3.199/linux/mpi/intel64/bin/mpivars.sh
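After loading modules and sourcing the Intel scripts like this, it can help to confirm that the expected tools are really on PATH in the shell that runs the job. The following is only a sketch; check_tools is a hypothetical helper, and the tool names in the usage comment are examples rather than a list taken from this setup.

```shell
#!/usr/bin/env bash
# check_tools TOOL...
# Report, for each named tool, whether it is found on PATH and where.
# Returns 0 if all tools were found, 1 if any is missing.
# (Hypothetical helper, nothing here is WIEN2k-specific.)
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok      $tool -> $(command -v "$tool")"
        else
            echo "missing $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Possible usage (tool names are examples):
#   check_tools ifort mpirun
```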
In interactive mode, the command
mpirun -np 4 $WIENROOT/lapw0_mpi lapw0.def
finishes with LAPW0 END.
Actually, after I commented out the following line in my script
"if($natom < $nproc) set nproc0=$natom"
the "permission denied" error disappeared, and MPI starts with the following output:
32 nodes for this job: n073 n073 n073 n073 n073 n073 n073 n073 n073 n073 n073 n073 n073 n073 n073 n073 n074 n074 n074 n074 n074 n074 n074 n074 n074 n074 n074 n074 n074 n074 n074 n074
LAPW0 END
[1] Done mpirun -n 32 -machinefile .machine0 /gpfs/home/ser/wienroot_v18/lapw0_mpi lapw0.def >> .time00
Force-convergence not possible. Forces not present.
LAPW1 END
[1] + Done ( cd $PWD; $t $ttt; rm -f .lock_$lockfile[$p] ) >> .time1_$loop
LAPW1 END
[1] + Done ( cd $PWD; $t $ttt; rm -f .lock_$lockfile[$p] ) >> .time1_$loop
LAPW2 - FERMI; weights written
LAPW2 - FERMI; weights written
CORE END
CORE END
MIXER END
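The job output above lists 16 slots each on n073 and n074. For reference, here is a sketch of how a .machines file for one MPI job spanning all allocated nodes could be written. The job script actually used is not shown in this message, so the layout (lapw0 line, a single "1:" line, granularity and extrafine) is an assumption based on the usual WIEN2k conventions, and make_machines is a hypothetical helper.

```shell
#!/usr/bin/env bash
# make_machines CORES_PER_NODE
# Read one hostname per line on stdin and print a .machines file that
# runs lapw0 and a single MPI-parallel lapw1/lapw2 job on all hosts.
# (Hypothetical helper; the file layout is an assumed typical one.)
make_machines() {
    local cores="$1" host hosts=""
    while read -r host; do
        if [ -n "$host" ]; then
            hosts="$hosts $host:$cores"
        fi
    done
    hosts="${hosts# }"          # drop the leading space
    echo "lapw0:$hosts"
    echo "1:$hosts"
    echo "granularity:1"
    echo "extrafine:1"
}

# Under SLURM, scontrol can expand the compact node list:
#   scontrol show hostnames "$SLURM_JOB_NODELIST" | make_machines 16 > .machines
```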
At the same time, the dayfile contains:
Intel MKL ERROR: Parameter 3 was incorrect on entry to DGEMM .
Intel MKL ERROR: Parameter 3 was incorrect on entry to DGEMM .
Intel MKL ERROR: Parameter 3 was incorrect on entry to DGEMM .
Intel MKL ERROR: Parameter 8 was incorrect on entry to DGEMM .
Intel MKL ERROR: Parameter 8 was incorrect on entry to DGEMM .
....
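In the reference BLAS interface DGEMM(TRANSA, TRANSB, M, N, K, ALPHA, A, LDA, ...), parameter 3 is M and parameter 8 is LDA. To see how often each message occurs, the errors can be tallied with standard tools. This is a sketch only: count_mkl_errors is a hypothetical helper, and the file name in the usage comment is a placeholder.

```shell
#!/usr/bin/env bash
# count_mkl_errors FILE
# Count "Intel MKL ERROR" lines in FILE, grouped by routine name and
# parameter number, printing e.g. "DGEMM parameter 3: 3".
# (Hypothetical helper built from grep, sed, sort, uniq and awk.)
count_mkl_errors() {
    grep 'Intel MKL ERROR' "$1" \
        | sed -E 's/.*Parameter ([0-9]+) was incorrect on entry to ([A-Za-z0-9_]+).*/\2 parameter \1/' \
        | sort | uniq -c \
        | awk '{print $2, $3, $4 ": " $1}'
}

# Possible usage (placeholder file name):
#   count_mkl_errors case.dayfile
```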
The scf file contains:
:seclit_par: estimate of singular value, factor: 0.3125E-01 0.1000E-14
:WARN :seclit_par-stability trick active for: eigenvalue, sproj_ii 1 0.8207E-20
:WARN :seclit_par-stability trick active for: eigenvalue, sproj_ii 2 0.6228E-19
:WARN :seclit_par-stability trick active for: eigenvalue, sproj_ii 3 0.6073E-19
:WARN :seclit_par-stability trick active for: eigenvalue, sproj_ii 4 0.6256E-19
:WARN :seclit_par-stability trick active for: eigenvalue, sproj_ii 5 0.9136E-19
:WARN :seclit_par-stability trick active for: eigenvalue, sproj_ii 6 0.9098E-19
:WARN :seclit_par-stability trick active for: eigenvalue, sproj_ii 7 0.7724E-19
:WARN :seclit_par-stability trick active for: eigenvalue, sproj_ii 8 0.7724E-19
:WARN :seclit_par-stability trick active for: eigenvalue, sproj_ii 9 0.7534E-19
:WARN :seclit_par-stability trick active for: eigenvalue, sproj_ii 10 0.2265E-19
:WARN :seclit_par-stability trick active for: eigenvalue, sproj_ii 11 0.2059E-19
:WARN :seclit_par-stability trick active for: eigenvalue, sproj_ii 12 0.2059E-19
:WARN :seclit_par-stability trick active for: eigenvalue, sproj_ii 13 0.8401E-18
:WARN :seclit_par-stability trick active for: eigenvalue, sproj_ii 14 0.8294E-18
:WARN :seclit_par-stability trick active for: eigenvalue, sproj_ii 15 0.1019E-19
:WARN :seclit_par-stability trick active for: eigenvalue, sproj_ii 16 0.2041E-19
These messages are absent when only k-point parallelization is used.
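To compare an MPI run with a k-point-parallel run, the seclit_par warnings in the scf file can be condensed into one line. A sketch, assuming the warning lines look exactly like those quoted above, with sproj_ii as the last field; summarize_seclit_warnings is a hypothetical helper.

```shell
#!/usr/bin/env bash
# summarize_seclit_warnings FILE
# Count the ":WARN :seclit_par-stability trick" lines in FILE and
# report the largest sproj_ii value (taken from the last field).
# (Hypothetical helper; assumes the line format shown in this post.)
summarize_seclit_warnings() {
    awk '/:WARN/ && /seclit_par-stability trick/ {
             n++
             v = $NF + 0                 # awk parses 0.8207E-20 as a number
             if (n == 1 || v > max) max = v
         }
         END {
             if (n) printf "%d warnings, largest sproj_ii %.4E\n", n, max
             else   print "no seclit_par warnings"
         }' "$1"
}

# Possible usage (placeholder file name):
#   summarize_seclit_warnings case.scf
```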