[Wien] delays in parallel work

Lyudmila Dobysheva lyuka17 at mail.ru
Tue Oct 6 17:09:56 CEST 2020


Dear all,

I have started working on a supercomputer, and sometimes I see delays 
during execution. They occur at random, most often during lapw0 but 
also in other programs (an extra 7-20 min). The administrators say that 
the network can sometimes be slow.
But I do not understand: at the moment I use only one node with 16 
processors. I would expect that if the task is sent to a single node, 
problems with the network between computers should have no effect 
until the whole task ends.
Maybe I have set the scratch variable wrongly?
In .bashrc:
export SCRATCH=./
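
For comparison, here is a minimal sketch of how SCRATCH might instead 
point to a disk local to the node. The /tmp location and the helper 
name pick_scratch are my assumptions, not something taken from the 
WIEN2k documentation or from this cluster's setup, and whether local 
scratch is appropriate at all depends on the cluster:

```shell
# Sketch only: pick a scratch directory under a (presumably node-local) base
# directory if it can be created and written to; otherwise fall back to the
# current directory, i.e. the "./" setting shown above.
# pick_scratch is a hypothetical helper; /tmp being local is an assumption.
pick_scratch() {
    base="${1:-/tmp}"
    dir="$base/${USER:-wien2k}_scratch"
    if mkdir -p "$dir" 2>/dev/null && [ -w "$dir" ]; then
        printf '%s\n' "$dir"
    else
        printf '%s\n' "./"
    fi
}

SCRATCH="$(pick_scratch /tmp)"
export SCRATCH
echo "SCRATCH=$SCRATCH"
```

With "./" the scratch files go to the working directory, wherever that 
is mounted; the sketch only changes this when the local directory is 
usable.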

During execution I can follow the progress of the cycle: after lapw0 
finishes, I see its output files. This means that after lapw0 the 
compute node sends the files to the head node, and maybe this is where 
it waits? Is this behavior correct? I expected not to see the 
intermediate stages until the job ends.
And the programs themselves (lapw0, lapw1, lapw2, lcore, mixer): are 
they perhaps reloaded onto the compute node anew in every cycle?
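
One thing that could be checked on the compute node is whether the 
working directory itself sits on a network file system, since in that 
case every output file would travel over the network even in a 
one-node job. A rough sketch follows; the helper names fs_type and 
is_network_fs are hypothetical, the list of file system names is 
illustrative and incomplete, and `stat -f -c %T` assumes GNU coreutils:

```shell
# Print the file system type of a directory (GNU stat assumed).
fs_type() {
    stat -f -c %T "${1:-.}"
}

# Succeed if the given type name looks like a network or parallel
# file system. The list is illustrative, not exhaustive.
is_network_fs() {
    case "$1" in
        nfs*|lustre|gpfs|ceph|cifs|smb*) return 0 ;;
        *) return 1 ;;
    esac
}

# Fall back to "unknown" if stat is unavailable or fails.
type_here="$(fs_type . 2>/dev/null)" || type_here="unknown"
if is_network_fs "$type_here"; then
    echo "working directory is on a network file system ($type_here)"
else
    echo "working directory looks local or unrecognized ($type_here)"
fi
```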

Best regards
Lyudmila Dobysheva

Some details:
WIEN2k_19.2
ifort 64 19.1.0.166
---------------
parallel_options:
setenv TASKSET "srun "
if ( ! $?USE_REMOTE ) setenv USE_REMOTE 1
if ( ! $?MPI_REMOTE ) setenv MPI_REMOTE 0
setenv WIEN_GRANULARITY 1
setenv DELAY 0.1
setenv SLEEPY 1
if ( ! $?WIEN_MPIRUN) setenv WIEN_MPIRUN "srun -K -N_nodes_ -n_NP_ -r_offset_ _PINNING_ _EXEC_"
if ( ! $?CORES_PER_NODE) setenv CORES_PER_NODE  16
--------------
WIEN2k_OPTIONS:
current:FOPT:-O -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -traceback -assume buffered_io -I$(MKLROOT)/include
current:FPOPT:-O -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -traceback -assume buffered_io -I$(MKLROOT)/include
current:OMP_SWITCH:-qopenmp
current:LDFLAGS:$(FOPT) -L$(MKLROOT)/lib/$(MKL_TARGET_ARCH) -lpthread -lm -ldl -liomp5
current:DPARALLEL:'-DParallel'
current:R_LIBS:-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core
current:FFTWROOT:/home/uffff/.local/
current:FFTW_VERSION:FFTW3
current:FFTW_LIB:lib
current:FFTW_LIBNAME:fftw3
current:LIBXCROOT:
current:LIBXC_FORTRAN:
current:LIBXC_LIBNAME:
current:LIBXC_LIBDNAME:
current:SCALAPACKROOT:$(MKLROOT)/lib/
current:SCALAPACK_LIBNAME:mkl_scalapack_lp64
current:BLACSROOT:$(MKLROOT)/lib/
current:BLACS_LIBNAME:mkl_blacs_intelmpi_lp64
current:ELPAROOT:
current:ELPA_VERSION:
current:ELPA_LIB:
current:ELPA_LIBNAME:
current:MPIRUN:srun -K -N_nodes_ -n_NP_ -r_offset_ _PINNING_ _EXEC_
current:CORES_PER_NODE:16
current:MKL_TARGET_ARCH:intel64

------------------
http://ftiudm.ru/content/view/25/103/lang,english/
Physics-Techn.Institute,
Udmurt Federal Research Center, Ural Br. of Rus.Ac.Sci.
426000 Izhevsk Kirov str. 132
Russia
---
Tel. +7 (34I2)43-24-59 (office), +7 (9I2)OI9-795O (home)
Skype: lyuka18 (office), lyuka17 (home)
E-mail: lyuka17 at mail.ru (office), lyuka17 at gmail.com (home)

