[Wien] Error in MPI run

leila mollabashi le.mollabashi at gmail.com
Sun Mar 28 22:37:40 CEST 2021


Dear Wien2k users,

I have a problem with MPI parallelization, although the code compiled
without errors. WIEN2k v19.2 was compiled with ifort, cc and Open MPI,
and the MKL and FFTW libraries were also used. On the SLURM cluster I can
run in k-point parallel mode, but I cannot run in MPI parallel mode, even
on 1 node. I submit the job with:

sbatch submit_script.sl

The submit_script.sl file is, for example, as follows:

#! /bin/bash -l
hostname
rm -fr .machines

# for 4 cpus and kpoints (in input file)
nproc=4

#write .machines file
echo '#' .machines

# example for an MPI parallel lapw0
echo 'lapw0:'`hostname`'
#:'$nproc >> .machines

# k-point and mpi parallel lapw1/2
echo '1:'`hostname`':2' >> .machines
echo '1:'`hostname`':2' >> .machines
echo 'granularity:1' >>.machines
echo 'extrafine:1' >>.machines

run_lapw -p
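For comparison, here is a minimal sketch of the .machines generation step with every echo explicitly redirected into the file. It assumes the intended layout is one MPI-parallel lapw0 line using $nproc cores plus two k-point lines with 2 cores each. The joined lapw0 line, the `>` redirection for the first line, and the `host` variable are assumptions, not the script as submitted.

```shell
#!/bin/bash
# Sketch only: writes a .machines file for 4 cores on the current host.
# Assumption: one MPI-parallel lapw0 line and two 2-core lapw1/2 lines are intended.
nproc=4
host=$(hostname)

# Group the echo commands so that all output goes into the file in one redirection.
{
  echo '#'
  echo "lapw0:${host}:${nproc}"   # MPI-parallel lapw0 on nproc cores
  echo "1:${host}:2"              # first k-point job, 2 MPI processes
  echo "1:${host}:2"              # second k-point job, 2 MPI processes
  echo 'granularity:1'
  echo 'extrafine:1'
} > .machines

# Print the result so it can be checked in the SLURM output file.
cat .machines
```

Printing the file at the end makes it easy to see in the job output whether the lapw0 line really ended up inside .machines.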

Then this error appears:

error: command   /home/users/mollabashi/v19.2/lapw0para lapw0.def   failed

The slurm-17032361.out file is as follows:

# .machines
bash: lapw0: command not found

real    0m0.001s
user    0m0.000s
sys     0m0.001s
grep: *scf1*: No such file or directory
grep: lapw2*.error: No such file or directory
>   stop error

Then, when I run it manually, this error appears:

There are not enough slots available in the system to satisfy the 4
slots that were requested by the application:

  /home/users/mollabashi/v19.2/lapw0_mpi

Either request fewer slots for your application, or make more slots
available for use.

A "slot" is the Open MPI term for an allocatable unit where we can
launch a process.  The number of slots available are defined by the
environment in which Open MPI processes are run:

  1. Hostfile, via "slots=N" clauses (N defaults to number of
     processor cores if not provided)
  2. The --host command line parameter, via a ":N" suffix on the
     hostname (N defaults to 1 if not provided)
  3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
  4. If none of a hostfile, the --host command line parameter, or an
     RM is present, Open MPI defaults to the number of processor cores

In all the above cases, if you want Open MPI to default to the number
of hardware threads instead of the number of processor cores, use the
--use-hwthread-cpus option.

Alternatively, you can use the --oversubscribe option to ignore the
number of available slots when deciding the number of processes to
launch.
--------------------------------------------------------------------------
[1]    Exit 1                        mpirun -np 4 -machinefile .machine0 /home/users/mollabashi/v19.2/lapw0_mpi lapw0.def >> .time00

--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 2
slots that were requested by the application:

  /home/users/mollabashi/v19.2/lapw1_mpi

Either request fewer slots for your application, or make more slots
available for use.

A "slot" is the Open MPI term for an allocatable unit where we can
launch a process.  The number of slots available are defined by the
environment in which Open MPI processes are run:

  1. Hostfile, via "slots=N" clauses (N defaults to number of
     processor cores if not provided)
  2. The --host command line parameter, via a ":N" suffix on the
     hostname (N defaults to 1 if not provided)
  3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
  4. If none of a hostfile, the --host command line parameter, or an
     RM is present, Open MPI defaults to the number of processor cores

In all the above cases, if you want Open MPI to default to the number
of hardware threads instead of the number of processor cores, use the
--use-hwthread-cpus option.

Alternatively, you can use the --oversubscribe option to ignore the
number of available slots when deciding the number of processes to
launch.
--------------------------------------------------------------------------
[1]  + Done                          ( cd $PWD; $t $ttt; rm -f .lock_$lockfile[$p] ) >> .time1_$loop

--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 2
slots that were requested by the application:

  /home/users/mollabashi/v19.2/lapw1_mpi

Either request fewer slots for your application, or make more slots
available for use.

A "slot" is the Open MPI term for an allocatable unit where we can
launch a process.  The number of slots available are defined by the
environment in which Open MPI processes are run:

  1. Hostfile, via "slots=N" clauses (N defaults to number of
     processor cores if not provided)
  2. The --host command line parameter, via a ":N" suffix on the
     hostname (N defaults to 1 if not provided)
  3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
  4. If none of a hostfile, the --host command line parameter, or an
     RM is present, Open MPI defaults to the number of processor cores

In all the above cases, if you want Open MPI to default to the number
of hardware threads instead of the number of processor cores, use the
--use-hwthread-cpus option.

Alternatively, you can use the --oversubscribe option to ignore the
number of available slots when deciding the number of processes to
launch.
--------------------------------------------------------------------------
[1]  + Done                          ( cd $PWD; $t $ttt; rm -f .lock_$lockfile[$p] ) >> .time1_$loop

ce.scf1_1: No such file or directory.
grep: *scf1*: No such file or directory
LAPW2 - Error. Check file lapw2.error
cp: cannot stat ‘.in.tmp’: No such file or directory
grep: *scf1*: No such file or directory
>   stop error
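
The Open MPI messages above say that the number of slots is defined by the hostfile or by the resource manager. A rough way to compare the two, sketched under assumptions: sum the process counts on the k-point lines of a .machines-style file and compare the total with SLURM_NTASKS, which SLURM sets inside a job when a task count was requested. The helper name count_requested, the placeholder host node01, and the fallback of 1 when SLURM_NTASKS is unset are illustrative choices, not part of WIEN2k or SLURM.

```shell
#!/bin/bash
# Sum the MPI process counts on k-point lines such as "1:hostname:2".
# Lines without a trailing ":N" contribute 0 in this simple sketch.
count_requested() {
  awk -F: '/^[0-9]+:/ { n += $NF } END { print n + 0 }' "$1"
}

# Hypothetical sample mirroring the layout used in this post (node01 is a placeholder).
sample=$(mktemp)
printf '%s\n' '#' 'lapw0:node01:4' '1:node01:2' '1:node01:2' \
  'granularity:1' 'extrafine:1' > "$sample"

requested=$(count_requested "$sample")
allocated=${SLURM_NTASKS:-1}   # assumption: fall back to 1 when the variable is unset

echo "requested=$requested allocated=$allocated"
if [ "$requested" -gt "$allocated" ]; then
  echo "more MPI processes requested than slots allocated"
fi
rm -f "$sample"
```

Inside a real job, the sample file would be replaced by the actual .machines file in the case directory.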

Would you please kindly guide me?

Sincerely yours,

Leila Mollabashi