[Wien] MPI error
Laurence Marks
laurence.marks at gmail.com
Wed May 5 14:44:12 CEST 2021
I think we (collectively) may be confusing things by offering too much
advice!
Let's keep it simple, and focus on one thing at a time. The "mpirun not
found" has nothing to do with compilers. It is 100% due to your not having
the PATH variable set right. This is not an fftw issue; it is probably in the
general options and mpi module. (It is not in the compiler.)
The library not being found is similarly due to LD_LIBRARY_PATH not being
right, or perhaps nfs mounting issues. This may be from the compiler
variables (mpiifort for Intel), which may not be correctly set by the
module.
I suggest that you focus on the PATH first, using
which mpirun
which lapw1
echo $PATH
echo $WIENROOT
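The same checks can be wrapped in a few lines so that a missing command is flagged explicitly rather than silently skipped. This is only a minimal sketch, assuming a POSIX shell; check_cmd is a hypothetical helper name, and in a tcsh job script the plain "which" commands above do the same job.

```shell
#!/bin/sh
# Report where a command resolves on the current PATH, or flag it as missing.
# check_cmd is a hypothetical helper name used only for this illustration.
check_cmd() {
    if location=$(command -v "$1" 2>/dev/null); then
        echo "$1 -> $location"
    else
        echo "$1 -> NOT FOUND on PATH"
    fi
}

check_cmd mpirun
check_cmd lapw1
echo "PATH=$PATH"
echo "WIENROOT=${WIENROOT:-<not set>}"
```

Running it once interactively and once inside the batch job, then comparing the two outputs, should show which PATH entries are missing in the job.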
When this is correct in the script that runs your job we can move forward
and solve the library issue using
echo $LD_LIBRARY_PATH
ldd $WIENROOT/lapw1_mpi
ldd $WIENROOT/lapw0_mpi
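For the library step, the ldd output can be filtered so that only unresolved libraries are shown. Again a sketch under assumptions: a POSIX shell, and report_missing_libs is a hypothetical helper name. It relies only on the fact that ldd prints "not found" next to each library it cannot locate.

```shell
#!/bin/sh
# List shared libraries that the dynamic linker cannot resolve for a binary.
# report_missing_libs is a hypothetical helper name used only for illustration.
report_missing_libs() {
    if [ ! -x "$1" ]; then
        echo "$1: no such executable"
        return 0
    fi
    # Keep only the ldd lines that mark a library as unresolved.
    missing=$(ldd "$1" 2>/dev/null | grep "not found")
    if [ -n "$missing" ]; then
        echo "$1: unresolved libraries:"
        echo "$missing"
    else
        echo "$1: no unresolved libraries reported"
    fi
}

echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-<not set>}"
report_missing_libs "$WIENROOT/lapw1_mpi"
report_missing_libs "$WIENROOT/lapw0_mpi"
```

Any library listed as unresolved points to a directory that is missing from LD_LIBRARY_PATH in the job environment.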
Please focus just on the PATH first. If you have problems, find a way
(DropBox/Drive etc) to post the script and result.
_____
Professor Laurence Marks
"Research is to see what everybody else has seen, and to think what nobody
else has thought", Albert Szent-Györgyi
www.numis.northwestern.edu
On Tue, May 4, 2021, 22:24 Gavin Abo <gsabo at crimson.ua.edu> wrote:
> Three additional comments:
>
> 1) If you are running the slurm.job script as Non-Interactive [1,2], you
> might need a "source /etc/profile.d/ummodules.csh" line like that at [3].
> [1] https://slurm.schedmd.com/faq.html#sbatch_srun
> [2] https://wiki.umiacs.umd.edu/umiacs/index.php/SLURM/JobSubmission
> [3] https://wiki.umiacs.umd.edu/umiacs/index.php/Modules
>
> 2) To eliminate any possible conflicts between Intel compilers (ifort,
> icc) and GNU compilers (gfortran, gcc), such as those mentioned in posts
> [4,5], I suggest compiling WIEN2k with the same C compiler and Fortran
> compiler that Open MPI was compiled with.
>
> The commands [6] below might help you check that the Linux environment is
> using the intended Open MPI and mpi parallel compilers [7]:
>
> username at computername:~/Desktop$ mpirun -V
> mpirun (Open MPI) 4.1.1
>
> Report bugs to http://www.open-mpi.org/community/help/
> username at computername:~/Desktop$ mpicc --showme:version
> mpicc: Open MPI 4.1.1 (Language: C)
> username at computername:~/Desktop$ mpifort --showme:version
> mpifort: Open MPI 4.1.1 (Language: Fortran)
> username at computername:~/Desktop$ mpicc --version
> gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
> Copyright (C) 2019 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions. There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>
> username at computername:~/Desktop$ mpifort --version
> GNU Fortran (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
> Copyright (C) 2019 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions. There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> [4]
> https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg08315.html
> [5]
> https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg17052.html
> [6]
> https://stackoverflow.com/questions/10056898/how-do-you-check-the-version-of-openmpi
> [7] https://www.open-mpi.org/faq/?category=mpi-apps#general-build
>
> 3) Those cluster administrators are usually more savvy than I am with
> installation and optimization of software (using compiler documentation,
> e.g. [8,9]) on a high performance computing (hpc) supercomputer [10,11].
> They would know your situation better. For example, they could login to
> their administrator account on the cluster to install WIEN2k only in your
> user account directory (/home/users/mollabashi), and they would know how to
> set the appropriate access permissions [12]. Alternatively, if you are not
> using a personal laptop but a computer at the organization to remotely
> connect to the cluster then they might use remote desktop access [13] to
> help you with the installation within only your account. Or they might use
> another method.
> [8]
> https://software.intel.com/content/www/us/en/develop/articles/download-documentation-intel-compiler-current-and-previous.html
> [9] https://gcc.gnu.org/onlinedocs/
> [10]
> https://www.usgs.gov/core-science-systems/sas/arc/about/what-high-performance-computing
> [11] https://en.wikipedia.org/wiki/Supercomputer
> [12]
> https://www.oreilly.com/library/view/running-linux-third/156592469X/ch04s14.html
> [13] https://en.wikipedia.org/wiki/Desktop_sharing
>
> On 5/4/2021 3:40 PM, Laurence Marks wrote:
>
> For certain, "/opt/exp_soft/local/generic/openmpi/4.1.0_gcc620/bin/mpiexec
> /home/users/mollabashi/codes/v21.1/run_lapw -p" is completely wrong. You
> do not, repeat do not, use mpirun or mpiexec to start run_lapw. It has to be
> started by simply "run_lapw -p ..." by itself.
>
> I suggest that you create a very simple job which has the commands:
>
> which mpirun
> which lapw1_mpi
> echo $WIENROOT
> ldd $WIENROOT/lapw1_mpi
> ldd $WIENROOT/lapw1
> env
> echo $PATH
>
> Run this interactively as well as in a batch job and compare. You will
> find that there are some things which are not present when you are launching
> your slurm job that are present interactively. You need to repair these
> with the relevant PATH/LD_LIBRARY_PATH etc.
>
> Your problems are not Wien2k problems, they are due to incorrect
> modules/script/environment or similar. Have you asked your sysadmin for
> help? I am certain that someone local who is experienced with standard
> linux can tell you very quickly what to do.
>
> N.B., there is an error in your path setting.
>
> On Tue, May 4, 2021 at 3:38 PM leila mollabashi <le.mollabashi at gmail.com>
> wrote:
>
>> Dear all WIEN2k users,
>> Thank you for your guidance.
>> >take care on the correct location ...
>> It is the /usr/share/Modules/init
>> After adding the “source /usr/share/Modules/init/tcsh” line into the
>> script the same error appeared:
>> mpirun: command not found
>>
>> In fact, with and without “source /usr/share/Modules/init/tcsh” it is
>> written in slurm.out file that “ module load complete ”.
>>
>> I noticed that “export” is a bash command, so I used these commands
>> to set the paths for openmpi and fftw:
>> setenv LD_LIBRARY_PATH {$LD_LIBRARY_PATH}:/opt/exp_soft/local/generic/openmpi/4.1.0_gcc620/lib:/opt/exp_soft/local/generic/fftw/3.3.8_gcc620/lib
>> set path = ($path /opt/exp_soft/local/generic/openmpi/4.1.0_gcc620/bin:/opt/exp_soft/local/generic/fftw/3.3.8_gcc620/bin)
>> But the result is the same:
>> bash: mpirun: command not found
>>
>> By using this line in the script:
>> /opt/exp_soft/local/generic/openmpi/4.1.0_gcc620/bin/mpiexec
>> /home/users/mollabashi/codes/v21.1/run_lapw -p
>> The calculation stopped with the following error:
>> mpirun does not support recursive calls
>>
>> > I wonder that you have only single modules…
>> There are different versions of ifort and mkl: ifort/15.0.0,
>> ifort/15.0.3, ifort/17.0.1, ifort/19.1.3.304(default), mkl/11.2,
>> mkl/11.2.3.187, mkl/2017.1.132, mkl/2019.2.187, mkl/2020.0.4(default).
>> I used the defaults.
>> > you may also wish to make a single module file to be loaded…
>> That is a good idea.
>> > On our cluster we have different W2k modules ….
>> As you know WIEN2k is not a free code and the users of the cluster that I
>> am using are not registered WIEN2k users. Thus, according to my moral
>> commitment to the WIEN2k developers, I cannot ask the administrator to
>> install it on the cluster. I should install it on my user account.
>>
>> Sincerely yours,
>> Leila
>> >PS.: maybe one should mention this tcsh "problem" in the slurm.job
>> example on the FAQ page by adding (or similar)…
>> That is a good idea. Thank you for your suggestion.
>>
>> --
> Professor Laurence Marks
> Department of Materials Science and Engineering
> Northwestern University
> www.numis.northwestern.edu
> "Research is to see what everybody else has seen, and to think what nobody
> else has thought" Albert Szent-Györgyi
>
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
>
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
> SEARCH the MAILING-LIST at:
> http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
>