[Wien] MPI error

Fecher, Gerhard fecher at uni-mainz.de
Mon May 3 13:48:15 CEST 2021


Dear Leila
In your first mail you mentioned that you use
     slurm.job
with the added lines 
   module load openmpi/4.1.0_gcc620
   module load ifort
   module load mkl

Sorry that my Sunday evening answer was too short; here is a little more detail.
I guess your login shell is bash, and you run the command
     sbatch slurm.job
The job file is written for tcsh, but tcsh does not know where the module command is. Therefore the job file should tell it where it is,
e.g. the beginning of slurm.job should look like

#!/bin/tcsh
#
# Load the respective software module you intend to use, here for tcsh shell
# NOTE: you may need to edit the source line !
source /usr/share/lmod/lmod/init/tcsh
module load openmpi/4.1.0_gcc620
module load ifort
module load mkl 

Take care to use the correct location of the init file; on your system it may instead be /usr/share/modules/init/csh
If you cannot find its correct location, ask your administrator.
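
One way to check which of these init files exists on a given cluster is a small bash sketch like the one below. The helper name find_module_init is hypothetical, and the two candidate paths are only the ones mentioned in this mail; other sites may install the init scripts elsewhere.

```shell
#!/bin/bash
# Print the first readable file among the candidate paths given as arguments.
# find_module_init is a hypothetical helper name, not part of Lmod or WIEN2k.
find_module_init() {
    local candidate
    for candidate in "$@"; do
        if [ -r "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Candidates taken from this mail; extend the list for your own cluster.
find_module_init /usr/share/lmod/lmod/init/tcsh /usr/share/modules/init/csh \
    || echo "no init file found in the candidate list; ask your administrator"
```

Whatever path it prints is the one to put on the source line of the tcsh job file.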

I am surprised that you have only a single module each for ifort and mkl and not different versions.
I guess those are defaults, but which ones? Ask your administrator.
You may also wish to make a single module file to be loaded, and
you may also wish to send the output to the data nirvana by using >& /dev/null
In that case you may need only the lines (as an example)
source /usr/share/lmod/lmod/init/tcsh
module load Wien2k/wien2k_21_intel19 >& /dev/null
echo -n "Running Wien2k" $WienVersion


PS: maybe one should mention this tcsh "problem" in the slurm.job example on the FAQ page by adding the following (or something similar):
  #  NOTE: you may need to edit the following line !
  #  source /usr/share/lmod/lmod/init/tcsh
since modules are frequently used on clusters and make it easy to switch between different versions.
On our cluster we have different W2k modules that have been compiled with different libraries, compilers, and/or settings.

PSS.: I am not aware of typos ;-)

Ciao
Gerhard

DEEP THOUGHT in D. Adams; Hitchhikers Guide to the Galaxy:
"I think the problem, to be quite honest with you,
is that you have never actually known what the question is."

====================================
Dr. Gerhard H. Fecher
Institut of Physics
Johannes Gutenberg - University
55099 Mainz
________________________________________
From: Wien [wien-bounces at zeus.theochem.tuwien.ac.at] on behalf of leila mollabashi [le.mollabashi at gmail.com]
Sent: Monday, 3 May 2021 00:35
To: A Mailing list for WIEN2k users
Subject: Re: [Wien] MPI error

Thank you.

On Mon, May 3, 2021, 3:04 AM Laurence Marks <laurence.marks at gmail.com> wrote:
You have to solve the "mpirun not found". That is due to your path/nfs/module -- we do not know.

---
Prof Laurence Marks
"Research is to see what everyone else has seen, and to think what nobody else has thought", Albert Szent-Györgyi
www.numis.northwestern.edu

On Sun, May 2, 2021, 17:12 leila mollabashi <le.mollabashi at gmail.com> wrote:
>You have an error in the LD_LIBRARY_PATH def you sent -- it needs to be "...:$LD_LIB..."
Thank you. I have corrected it, but I still get the error “mpirun: command not found” in x lapw1.
>Why not load the modules in the script to run a job?
I have loaded them, but this error occurred: “bash: mpirun: command not found”.

On Mon, May 3, 2021 at 2:23 AM Laurence Marks <laurence.marks at gmail.com> wrote:
You have an error in the LD_LIBRARY_PATH def you sent -- it needs to be "...:$LD_LIB...".
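
To illustrate the difference, here is a minimal bash sketch. The directories under /example are placeholders, not real paths on any cluster, and prepend_path is a hypothetical helper. Without the $, the literal word LD_LIBRARY_PATH is appended and the previous value is discarded, so directories that were already on the search path are no longer searched.

```shell
#!/bin/bash
# Prepend directory $1 to the colon-separated list $2 (which may be empty).
# prepend_path is a hypothetical helper used only for this illustration.
prepend_path() {
    if [ -n "$2" ]; then
        printf '%s:%s\n' "$1" "$2"
    else
        printf '%s\n' "$1"
    fi
}

# Placeholder standing in for whatever value was set before (assumption).
old_value=/example/intel/lib

# Missing "$": the old value is replaced by the literal text LD_LIBRARY_PATH.
broken="/example/openmpi/lib:LD_LIBRARY_PATH"

# With the old value expanded (here via the helper), it is kept at the end.
fixed=$(prepend_path /example/openmpi/lib "$old_value")

echo "broken: $broken"
echo "fixed:  $fixed"
```

In .bashrc the corrected form is the same export line as before, ending in :$LD_LIBRARY_PATH instead of :LD_LIBRARY_PATH.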

Why not load the modules in the script to run a job?


On Sun, May 2, 2021, 16:35 leila mollabashi <le.mollabashi at gmail.com> wrote:
Dear all WIEN2k users,
Thank you for your reply.
>The error is exactly what it says -- mpirun not found. This has something to do with the modules, almost certainly the openmpi one. You need to find where mpirun is on your system, and ensure that it is in your PATH. This is an issue with your OS, not Wien2k. However...
which mpirun:
/opt/exp_soft/local/generic/openmpi/4.1.0_gcc620/bin/mpirun
I have installed WIEN2k after loading the ifort, mkl, openmpi/4.1.0_gcc620, and fftw/3.3.8_gcc620 modules. When I added the paths to my .bashrc file as follows:
export LD_LIBRARY_PATH=/opt/exp_soft/local/generic/openmpi/4.1.0_gcc620/lib:/opt/exp_soft/local/generic/fftw/3.3.8_gcc620/lib:LD_LIBRARY_PATH
export PATH=/opt/exp_soft/local/generic/openmpi/4.1.0_gcc620/bin:/opt/exp_soft/local/generic/fftw/3.3.8_gcc620/bin:$PATH
wien2k does not run:
error while loading shared libraries: libiomp5.so: cannot open shared object file: No such file or directory
0.000u 0.000s 0:00.00 0.0%      0+0k 0+0io 0pf+0w
But without these paths, and with the modules loaded, it runs.
> First do "x lapw0 -p", send the .machines file and the last few lines of your *.output0*. Then we can confirm if that worked right, did not or what.
.machines:
lapw0:e0183:4
1:e0183:4
1:e0183:4
Almost end of *output0000:
TOTAL VALUE = -10433.492442     (H)
:DEN  : DENSITY INTEGRAL  =        -20866.98488444   (Ry)
Almost end of *output0001
TOTAL VALUE = -10433.492442     (H)
>Assuming that you used gcc
Yes.
>For certain you cannot run lapw2 without first running lapw1.
Yes, you are right. When x lapw1 -p did not execute, I changed the .machines file and ran in k-point parallel mode, then changed the .machines file again and ran lapw2 -p.
>How? Do you mean that there are no error messages?
Yes, and I also checked compile.msg in SRC_lapw1.

Sincerely yours,

Leila

On Mon, May 3, 2021 at 12:42 AM Fecher, Gerhard <fecher at uni-mainz.de> wrote:
I guess that module does not work with tcsh

Ciao
Gerhard

________________________________________
From: Wien [wien-bounces at zeus.theochem.tuwien.ac.at] on behalf of Laurence Marks [laurence.marks at gmail.com]
Sent: Sunday, 2 May 2021 21:32
To: A Mailing list for WIEN2k users
Subject: Re: [Wien] MPI error

Inlined response and questions

On Sun, May 2, 2021 at 2:19 PM leila mollabashi <le.mollabashi at gmail.com> wrote:
Dear Prof. Peter Blaha and WIEN2k users,
Now I have loaded openmpi/4.1.0 and compiled WIEN2k. The admin told me that I can use your script at http://www.wien2k.at/reg_user/faq/slurm.job . I added these lines to it too:
module load openmpi/4.1.0_gcc620
module load ifort
module load mkl
but this error occurred: “bash: mpirun: command not found”.
The error is exactly what it says -- mpirun not found. This has something to do with the modules, almost certainly the openmpi one. You need to find where mpirun is on your system, and ensure that it is in your PATH. This is an issue with your OS, not Wien2k. However...
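
A quick way to test this from inside the job script is sketched below in bash. The helper name require_cmd is hypothetical; command -v is the standard shell builtin for looking up a command in PATH.

```shell
#!/bin/bash
# Report where a command is found in PATH, or fail with a message.
# require_cmd is a hypothetical helper name used only for this sketch.
require_cmd() {
    local location
    if location=$(command -v "$1"); then
        echo "$1 found at: $location"
    else
        echo "$1 not found in PATH" >&2
        return 1
    fi
}

# Run this after the "module load" lines of the job script; a failure
# suggests that the modules did not take effect in this shell.
require_cmd mpirun || echo "check that the module command itself is available in this shell"
```

Comparing the output of an interactive login with the output inside the batch job shows whether the two environments differ.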

In interactive mode, “x lapw0 -p” and “x lapw2 -p” run with MPI, but “x lapw1 -p” stops with the following error:
w2k_dispatch_signal(): received: Segmentation fault
Is this MPI mode? None of lapw0/1/2 can work in true parallel without mpirun, so there is something major wrong here. I doubt that anything really executed properly. For certain you cannot run lapw2 without first running lapw1. What is your .machines file? What is the content of the error files (cat *.error)?
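
Since cat *.error prints nothing for empty files and does not say which file a message came from, a small bash sketch like the following can help when reporting back. The helper name show_errors is hypothetical; it only lists the *.error files in a directory that are non-empty, together with their content.

```shell
#!/bin/bash
# Print name and content of every non-empty *.error file in directory $1
# (default: current directory). show_errors is a hypothetical helper name.
show_errors() {
    local dir=${1:-.} file found=0
    for file in "$dir"/*.error; do
        # -s is true only for files that exist and are not empty
        if [ -s "$file" ]; then
            echo "== $file"
            cat "$file"
            found=1
        fi
    done
    [ "$found" -eq 1 ] || echo "no non-empty *.error files in $dir"
}

# Run in the case directory after the failing step.
show_errors .
```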

First do "x lapw0 -p", send the .machines file and the last few lines of your *.output0*. Then we can confirm if that worked right, did not or what.

--------------------------------------------------------------------------
I noticed that the FFTW3 and OpenMPI installed on the cluster are both compiled with gfortran, but I have compiled WIEN2k with Intel ifort. I am not sure whether the problem originates from this inconsistency between gfortran and ifort.
Almost everything in FFTW3 and OpenMPI is in fact C. Assuming that you used gcc there should be no problem. In general there should be no problem.

I have checked that lapw1 has compiled correctly.
How? Do you mean that there are no error messages?


Sincerely yours,

Leila



_______________________________________________
Wien mailing list
Wien at zeus.theochem.tuwien.ac.at<mailto:Wien at zeus.theochem.tuwien.ac.at>
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html

