[Wien] Error in MPI run
leila mollabashi
le.mollabashi at gmail.com
Sun Mar 28 22:37:40 CEST 2021
Dear WIEN2k users,
I have a problem with MPI parallelization, although the code compiled
without errors. WIEN2k version 19.2 was compiled with ifort, cc and
Open MPI; the MKL and FFTW libraries were also used. On the SLURM cluster
I can run in k-point parallel mode, but I cannot run in MPI parallel mode,
even on one node. I submit the job with:
sbatch submit_script.sl
where submit_script.sl is, for example, as follows:
#! /bin/bash -l
hostname
rm -fr .machines
# for 4 cpus and kpoints (in input file)
nproc=4
#write .machines file
echo '#' .machines
# example for an MPI parallel lapw0
echo 'lapw0:'`hostname`'
#:'$nproc >> .machines
# k-point and mpi parallel lapw1/2
echo '1:'`hostname`':2' >> .machines
echo '1:'`hostname`':2' >> .machines
echo 'granularity:1' >>.machines
echo 'extrafine:1' >>.machines
run_lapw -p
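For comparison, a minimal sketch of how the .machines file could be written so that every line ends up in the file. It rests on two assumptions that the script above does not confirm: that the line echo '#' .machines was meant to redirect into .machines (the "# .machines" line in the SLURM output below suggests it printed to stdout instead), and that the lapw0 entry was meant to be the single line lapw0:<host>:<nproc>. The function name write_machines is hypothetical and not part of WIEN2k.

```shell
#!/bin/bash
# Hypothetical helper (not part of WIEN2k): write a one-node .machines file
# with an MPI-parallel lapw0 and two k-point jobs that share the cores.
write_machines() {
    local host=$1 nproc=$2 out=${3:-.machines}
    local per_job=$((nproc / 2))      # cores for each of the two k-point jobs
    {
        echo '#'
        echo "lapw0:${host}:${nproc}"     # MPI-parallel lapw0
        echo "1:${host}:${per_job}"       # k-point and MPI parallel lapw1/2
        echo "1:${host}:${per_job}"
        echo 'granularity:1'
        echo 'extrafine:1'
    } > "$out"                        # a single redirect covers every line
}

# Same values as in the script above: 4 cores on the current host.
write_machines "$(hostname)" 4
cat .machines
```

Grouping the echo commands under one redirect avoids the case where one line is missing its ">>" and silently goes to the job output instead of the file.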
Then this error appears:
error: command /home/users/mollabashi/v19.2/lapw0para lapw0.def failed
The slurm-17032361.out file is as follows:
# .machines
bash: lapw0: command not found
real 0m0.001s
user 0m0.000s
sys 0m0.001s
grep: *scf1*: No such file or directory
grep: lapw2*.error: No such file or directory
> stop error
Then, when I run it manually, this error appears:
There are not enough slots available in the system to satisfy the 4
slots that were requested by the application:
/home/users/mollabashi/v19.2/lapw0_mpi
Either request fewer slots for your application, or make more slots
available for use.
A "slot" is the Open MPI term for an allocatable unit where we can
launch a process. The number of slots available are defined by the
environment in which Open MPI processes are run:
1. Hostfile, via "slots=N" clauses (N defaults to number of
processor cores if not provided)
2. The --host command line parameter, via a ":N" suffix on the
hostname (N defaults to 1 if not provided)
3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
4. If none of a hostfile, the --host command line parameter, or an
RM is present, Open MPI defaults to the number of processor cores
In all the above cases, if you want Open MPI to default to the number
of hardware threads instead of the number of processor cores, use the
--use-hwthread-cpus option.
Alternatively, you can use the --oversubscribe option to ignore the
number of available slots when deciding the number of processes to
launch.
--------------------------------------------------------------------------
[1] Exit 1 mpirun -np 4 -machinefile .machine0
/home/users/mollabashi/v19.2/lapw0_mpi lapw0.def >> .time00
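The message above says that 4 slots were requested but fewer were available. A rough way to check this is to compare what .machines asks for with what SLURM reports for the job. The sketch below is only a diagnostic under stated assumptions: the function name peak_procs is hypothetical, it assumes one host per .machines line in the form weight:host:count, and it assumes that the SLURM variables SLURM_NTASKS and SLURM_JOB_CPUS_PER_NODE reflect the slots that Open MPI sees on this cluster.

```shell
#!/bin/bash
# Hypothetical diagnostic (not part of WIEN2k or SLURM).
# Print the largest number of MPI processes a .machines file asks for at
# once: either the lapw0 line alone, or the sum over all k-point lines.
peak_procs() {
    awk -F: '
        # n is the process count in the third field, or 1 if it is absent
        { n = (NF >= 3) ? $3 + 0 : 1 }
        $1 == "lapw0"   { lapw0 += n }
        $1 ~ /^[0-9]+$/ { kpar += n }
        END { m = (lapw0 > kpar) ? lapw0 : kpar; print m + 0 }
    ' "$1"
}

# Inside a job, compare the request with what the scheduler reports.
if [ -f .machines ]; then
    echo "requested by .machines: $(peak_procs .machines)"
    echo "SLURM_NTASKS: ${SLURM_NTASKS:-unset}"
    echo "SLURM_JOB_CPUS_PER_NODE: ${SLURM_JOB_CPUS_PER_NODE:-unset}"
fi
```

If the scheduler reports fewer tasks than .machines requests, one thing to try is asking for them explicitly at the top of the submit script, for example with "#SBATCH --nodes=1" and "#SBATCH --ntasks=4". Whether this matches the site's queue configuration is something only the cluster documentation or administrators can confirm.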
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 2
slots that were requested by the application:
/home/users/mollabashi/v19.2/lapw1_mpi
Either request fewer slots for your application, or make more slots
available for use.
A "slot" is the Open MPI term for an allocatable unit where we can
launch a process. The number of slots available are defined by the
environment in which Open MPI processes are run:
1. Hostfile, via "slots=N" clauses (N defaults to number of
processor cores if not provided)
2. The --host command line parameter, via a ":N" suffix on the
hostname (N defaults to 1 if not provided)
3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
4. If none of a hostfile, the --host command line parameter, or an
RM is present, Open MPI defaults to the number of processor cores
In all the above cases, if you want Open MPI to default to the number
of hardware threads instead of the number of processor cores, use the
--use-hwthread-cpus option.
Alternatively, you can use the --oversubscribe option to ignore the
number of available slots when deciding the number of processes to
launch.
--------------------------------------------------------------------------
[1] + Done ( cd $PWD; $t $ttt; rm -f
.lock_$lockfile[$p] ) >> .time1_$loop
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 2
slots that were requested by the application:
/home/users/mollabashi/v19.2/lapw1_mpi
Either request fewer slots for your application, or make more slots
available for use.
A "slot" is the Open MPI term for an allocatable unit where we can
launch a process. The number of slots available are defined by the
environment in which Open MPI processes are run:
1. Hostfile, via "slots=N" clauses (N defaults to number of
processor cores if not provided)
2. The --host command line parameter, via a ":N" suffix on the
hostname (N defaults to 1 if not provided)
3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
4. If none of a hostfile, the --host command line parameter, or an
RM is present, Open MPI defaults to the number of processor cores
In all the above cases, if you want Open MPI to default to the number
of hardware threads instead of the number of processor cores, use the
--use-hwthread-cpus option.
Alternatively, you can use the --oversubscribe option to ignore the
number of available slots when deciding the number of processes to
launch.
--------------------------------------------------------------------------
[1] + Done ( cd $PWD; $t $ttt; rm -f
.lock_$lockfile[$p] ) >> .time1_$loop
ce.scf1_1: No such file or directory.
grep: *scf1*: No such file or directory
LAPW2 - Error. Check file lapw2.error
cp: cannot stat ‘.in.tmp’: No such file or directory
grep: *scf1*: No such file or directory
> stop error
Would you please kindly guide me?
Sincerely yours,
Leila Mollabashi