[Wien] Error in MPI run

Laurence Marks laurence.marks at gmail.com
Mon Mar 29 00:11:42 CEST 2021


Almost certainly the issue is spelled out in the error messages themselves. For the remote case:

"bash: lapw0: command not found"

Presumably you have not exported WIENROOT when you started your job, and/or
it is not being exported by Open MPI to the remote processes. Check how to
use mpi on your system, including how to export PATH.
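
As a minimal sketch (the mechanism is site-specific; the WIENROOT path below
is simply taken from your error output), something like this near the top of
submit_script.sl is usually what is missing:

   # make the WIEN2k scripts and binaries visible to the job
   export WIENROOT=/home/users/mollabashi/v19.2
   export PATH=$WIENROOT:$PATH
   # with Open MPI you can also forward variables explicitly, e.g.
   #   mpirun -x PATH ...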

___

For the manual case:

"There are not enough slots available in the system to satisfy the 4 slots
 that were requested by the application:"

Since this is system-specific, you will need to look at what is needed
locally to allocate cores for an interactive job.
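
As a sketch only (standard SLURM/Open MPI options; your site may also
require a partition or account), an interactive test could look like:

   # reserve 4 tasks on one node for an interactive session
   salloc -N 1 -n 4
   # then launch through the scheduler ...
   srun -n 4 /home/users/mollabashi/v19.2/lapw0_mpi lapw0.def
   # ... or let mpirun pick up the allocation
   mpirun -np 4 /home/users/mollabashi/v19.2/lapw0_mpi lapw0.def

Without an allocation, Open MPI only sees the cores it is given, which is
one common reason for the "not enough slots" message.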

_____
Professor Laurence Marks
"Research is to see what everybody else has seen, and to think what nobody
else has thought", Albert Szent-Györgyi
www.numis.northwestern.edu

On Sun, Mar 28, 2021, 15:38 leila mollabashi <le.mollabashi at gmail.com>
wrote:

> Dear Wien2k users,
>
> I have a problem with MPI parallelization, although I compiled the code
> with no errors. WIEN2k v19.2 was compiled with ifort, cc and Open MPI; the
> MKL and FFTW libraries were also used. On the SLURM cluster I can run in
> k-point parallel mode, but I could not run in mpi-parallel mode, even on 1
> node. I used this script to run:
>
> sbatch submit_script.sl
>
> where the submit_script.sl file is, for example, as follows:
>
> #!/bin/bash -l
>
> hostname
> rm -fr .machines
>
> # for 4 cpus and kpoints (in input file)
> nproc=4
>
> # write .machines file
> echo '#' .machines
>
> # example for an MPI parallel lapw0
> echo 'lapw0:'`hostname`':'$nproc >> .machines
>
> # k-point and mpi parallel lapw1/2
> echo '1:'`hostname`':2' >> .machines
> echo '1:'`hostname`':2' >> .machines
> echo 'granularity:1' >> .machines
> echo 'extrafine:1' >> .machines
>
> run_lapw -p
>
> Then this error appears:
>
> error: command   /home/users/mollabashi/v19.2/lapw0para lapw0.def   failed
>
> The slurm-17032361.out file is as follows:
>
> # .machines
> bash: lapw0: command not found
>
> real    0m0.001s
> user    0m0.000s
> sys     0m0.001s
>
> grep: *scf1*: No such file or directory
> grep: lapw2*.error: No such file or directory
>
> >   stop error
>
> Then, when I run it manually, this error appears:
>
> There are not enough slots available in the system to satisfy the 4
> slots that were requested by the application:
>
>   /home/users/mollabashi/v19.2/lapw0_mpi
>
> Either request fewer slots for your application, or make more slots
> available for use.
>
> A "slot" is the Open MPI term for an allocatable unit where we can
> launch a process.  The number of slots available are defined by the
> environment in which Open MPI processes are run:
>
>   1. Hostfile, via "slots=N" clauses (N defaults to number of
>      processor cores if not provided)
>   2. The --host command line parameter, via a ":N" suffix on the
>      hostname (N defaults to 1 if not provided)
>   3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
>   4. If none of a hostfile, the --host command line parameter, or an
>      RM is present, Open MPI defaults to the number of processor cores
>
> In all the above cases, if you want Open MPI to default to the number
> of hardware threads instead of the number of processor cores, use the
> --use-hwthread-cpus option.
>
> Alternatively, you can use the --oversubscribe option to ignore the
> number of available slots when deciding the number of processes to
> launch.
>