[Wien] MPI Error

Peter Blaha pblaha at theochem.tuwien.ac.at
Sat May 29 16:37:15 CEST 2021


The difference between lapw0para and lapw1para is that
lapw0para always executes mpirun on the original node, while lapw1para may not.

The behavior of lapw1para depends on MPI_REMOTE (set in
WIEN2k_parallel_options in w2k21.1, or in parallel_options in earlier
versions). With MPI_REMOTE=1 it will first issue a   ssh nodename   and
run mpirun there. This does not work with your settings, probably
because you do not load the modules in your .bashrc (or .cshrc), but
only in your slurm job, and your system does not transfer the
environment with ssh.
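A quick way to check this, and a possible fix, sketched under the assumption that the cluster uses environment modules (the module names below are placeholders, not taken from this thread):

```shell
# With MPI_REMOTE=1, lapw1para runs mpirun inside a non-interactive
# ssh session. If modules are loaded only in the slurm job script,
# mpirun is not on PATH there; this reproduces the symptom:
ssh e0017 'which mpirun'

# Possible fix: load the same modules in ~/.bashrc (or ~/.cshrc) so
# that non-interactive ssh shells also get them, e.g. add a line like:
#   module load openmpi intel-mkl   # placeholder module names
```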

The recommended option for MPI version 2 (all modern MPIs) is to set
MPI_REMOTE to zero. The mpirun command will then be issued on the
original node, but the lapw1_mpi executables will run on the nodes
given in .machines.
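In practice this means editing the parallel options file in $WIENROOT. A minimal sketch, assuming the csh "setenv" syntax these WIEN2k scripts use (the exact mpirun template may differ on your system):

```shell
# $WIENROOT/WIEN2k_parallel_options (w2k21.1; "parallel_options" in
# earlier versions):
setenv MPI_REMOTE 0       # issue mpirun on the original node, no ssh first
setenv WIEN_GRANULARITY 1
# mpirun template; _NP_, _HOSTS_ and _EXEC_ are substituted by the
# *para scripts from the .machines file:
setenv WIEN_MPIRUN "mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_"
```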

This should solve your problem.

Am 29.05.2021 um 08:39 schrieb leila mollabashi:
> Dear all wien2k users,
> Following the previous comment referring me to the admin, I contacted 
> the cluster admin. Based on the admin's advice, I recompiled WIEN2k 
> successfully using the cluster modules.
>>Once the blacs problem has been fixed,
> For example, is the following correct?
> libmkl_blacs_openmpi_lp64.so => 
> /opt/exp_soft/local/generic/intel/mkl/lib/intel64/libmkl_blacs_openmpi_lp64.so 
> (0x00002b21efe03000)
>>the next step is to run lapw0 in
> sequential and parallel mode.
>>Add:
> x lapw0     and check the case.output0 and case.scf0 files (copy them to
> a different name) as well as the message from the queuing system. ...
> The “x lapw0” and “mpirun -np 4 $WIENROOT/lapw0_mpi lapw0.def” commands 
> both execute correctly when run interactively.
> The “x lapw0 -p” is also correctly executed using the following 
> “.machines” file:
> lapw0:e0017:4
>>The same thing could be made with lapw1
> The “x lapw1” and “mpirun -np 4 $WIENROOT/lapw1_mpi lapw1.def” commands 
> also execute correctly when run interactively. But “x lapw1 -p” stops 
> when I use the following “.machines” file:
> 1:e0017:2
> 1:e0017:2
> bash: mpirun: command not found
> The output files are gathered at https://files.fm/u/7cssehdck.
> Would you, please, help me to fix the parallel problem too?
> Sincerely yours,
> Leila
> 
> 
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
> SEARCH the MAILING-LIST at:  http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
> 

-- 
--------------------------------------------------------------------------
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300             FAX: +43-1-58801-165982
Email: blaha at theochem.tuwien.ac.at    WIEN2k: http://www.wien2k.at
WWW:   http://www.imc.tuwien.ac.at
-------------------------------------------------------------------------

