[Wien] Error in MPI run
Laurence Marks
laurence.marks at gmail.com
Mon Mar 29 00:11:42 CEST 2021
Almost certainly the issue is in the error messages. For the remote case:
"bash: lapw0: command not found"
Presumably you have not exported WIENROOT when you started your job, and/or
it is not exported by openmpi. Check how to use mpi on your system,
including how to export PATH.
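For example, near the top of the batch script you could set the environment
explicitly. As a rough sketch (using the installation directory that appears
in your own error output; adjust to your setup):

export WIENROOT=/home/users/mollabashi/v19.2   # directory seen in your error message
export PATH=$WIENROOT:$PATH

and, if the remote processes still do not see it, export it through mpirun
as well (OpenMPI's mpirun accepts -x, e.g. -x PATH).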
___
For the manual case:
"There are not enough slots available in the system to satisfy the 4 slots
that were requested by the application:"
Since this is system-specific, you will need to look at what is needed
locally to allocate cores for an interactive job.
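As a rough sketch (the exact commands depend on how SLURM is configured on
your cluster), an interactive test might look like:

salloc -N 1 -n 4                 # ask SLURM for 4 tasks on one node
mpirun -np 4 /home/users/mollabashi/v19.2/lapw0_mpi lapw0.def

so that Open MPI sees the four slots that were actually allocated, instead
of being started on a login node with no allocation.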
_____
Professor Laurence Marks
"Research is to see what everybody else has seen, and to think what nobody
else has thought", Albert Szent-Györgyi
www.numis.northwestern.edu
On Sun, Mar 28, 2021, 15:38 leila mollabashi <le.mollabashi at gmail.com>
wrote:
> Dear Wien2k users,
>
> I have a problem with MPI parallelization, although I have compiled the code
> with no errors. WIEN2k v19.2 was compiled with ifort, cc and openmpi, and the
> mkl and FFTW libraries were also used. On the SLURM queue cluster I can run
> in k-point parallel mode, but I could not run in mpi parallel mode even on
> one node. I used this script to run:
>
> sbatch submit_script.sl
>
> where submit_script.sl is, for example, as follows:
>
> #! /bin/bash -l
>
> hostname
> rm -fr .machines
>
> # for 4 cpus and kpoints (in input file)
> nproc=4
>
> # write .machines file
> echo '#' .machines
>
> # example for an MPI parallel lapw0
> echo 'lapw0:'`hostname`'
> #:'$nproc >> .machines
>
> # k-point and mpi parallel lapw1/2
> echo '1:'`hostname`':2' >> .machines
> echo '1:'`hostname`':2' >> .machines
> echo 'granularity:1' >>.machines
> echo 'extrafine:1' >>.machines
>
> run_lapw -p
>
> Then this error appears:
>
> error: command /home/users/mollabashi/v19.2/lapw0para lapw0.def failed
>
> The slurm-17032361.out file is as follows:
>
> # .machines
> bash: lapw0: command not found
>
> real 0m0.001s
> user 0m0.000s
> sys 0m0.001s
>
> grep: *scf1*: No such file or directory
> grep: lapw2*.error: No such file or directory
> >   stop error
>
> Then, when I run it manually, this error appears:
>
> There are not enough slots available in the system to satisfy the 4
> slots that were requested by the application:
>
>   /home/users/mollabashi/v19.2/lapw0_mpi
>
> Either request fewer slots for your application, or make more slots
> available for use.
>
> A "slot" is the Open MPI term for an allocatable unit where we can
> launch a process. The number of slots available are defined by the
> environment in which Open MPI processes are run:
>
>   1. Hostfile, via "slots=N" clauses (N defaults to number of
>      processor cores if not provided)
>   2. The --host command line parameter, via a ":N" suffix on the
>      hostname (N defaults to 1 if not provided)
>   3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
>   4. If none of a hostfile, the --host command line parameter, or an
>      RM is present, Open MPI defaults to the number of processor cores
>
> In all the above cases, if you want Open MPI to default to the number
> of hardware threads instead of the number of processor cores, use the
> --use-hwthread-cpus option.
>
> Alternatively, you can use the --oversubscribe option to ignore the
> number of available slots when deciding the number of processes to
> launch.
>