[Wien] error in running .machines file

Gavin Abo gsabo at crimson.ua.edu
Sat Jun 16 16:27:44 CEST 2018


The "ssh cn308 ldd $WIENROOT/lapw0_mpi" command is finding the files of your 
ifort installation, such as 
/THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/libmkl_scalapack_lp64.so, 
just fine.  So your environment variables seem to be set up and working 
on both nodes.  However, it looks like 
/opt/intel/impi/5.0.2.044/intel64/lib/libmpifort.so.12 exists on the 
login node (ln3, where you are logged in as renwei) but not on the cn308 
node.  It looks to me like Intel MPI (impi) is not installed on the cn308 
node.
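
To see at a glance which libraries fail to resolve on a node, the ldd 
output can be filtered for its "not found" entries.  The following is only 
a minimal sketch: the helper name missing_libs is hypothetical, and it 
assumes the usual "name => not found" layout that ldd printed in your 
message.

```shell
#!/bin/sh
# missing_libs (hypothetical helper): read ldd output on stdin and print
# the name of every shared library that ldd reported as "not found".
missing_libs() {
    awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1 }'
}

# Example use (commented out because it needs ssh access to the node):
#   ssh cn308 ldd "$WIENROOT/lapw0_mpi" 2>/dev/null | missing_libs
```

Running the same pipeline without the "ssh cn308" part on the login node 
should print nothing if all libraries resolve there, which would match 
the two outputs you posted.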

Perhaps the cn308 node is using a different partition or a different 
shared drive.  I have read that there are several possible solutions 
for the slurm cluster problem you seem to have, depending on how the 
cluster is configured [ 
https://lists.schedmd.com/pipermail/slurm-users/2017-December/000272.html 
].  You might be able to check which partition the ln3 node and the cn308 
node are using with sinfo [ https://slurm.schedmd.com/sinfo.html ].
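
As a rough sketch of that check, assuming the standard sinfo options -N 
(node-oriented listing), -h (no header), -n (node list) and -o (output 
format) behave on your cluster as documented: the wrapper name 
node_partitions is hypothetical, and a login node such as ln3 may not be 
listed by sinfo at all.

```shell
#!/bin/sh
# node_partitions (hypothetical wrapper): for each node name given as an
# argument, print "node partition state", one line per node/partition pair.
node_partitions() {
    for node in "$@"; do
        sinfo -N -h -n "$node" -o '%N %P %T'
    done
}

# Example use (needs a working Slurm installation):
#   node_partitions cn308
```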

Maybe you just need to have your cluster manager (administrator, help 
desk, ...) install impi on the cn308 node, as was done for ifort.  To 
remove the "manpath: command not found" message, the cluster manager 
probably just has to install the man or man-db package on the cn308 node 
(they should be able to check the documentation or forums for the OS that 
their cluster is using on how to install manpath, for example: yum 
install man or apt-get install man-db).  I have never administered a 
slurm cluster, so for additional help with your problem you may have to 
ask a slurm expert (e.g., your cluster manager or the slurm mailing list 
[ https://slurm.schedmd.com/mail.html ]).

On 6/16/2018 4:28 AM, venkatesh chandragiri wrote:
>
> Dear Prof. Marks,
>
> I did "ssh othernode ldd $WIENROOT/lapw0_mpi".
>
> =========
>
> [renwei@ln3 ~]$ ssh cn308 ldd $WIENROOT/lapw0_mpi
> /THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/bin/mklvars.sh: line 118: manpath: command not found
>         linux-vdso.so.1 =>  (0x00007fffd8fff000)
>         libfftw3_mpi.so.3 => /THFS/home/renwei/venky/soft/fftw/lib/libfftw3_mpi.so.3 (0x00007fd41621d000)
>         libmkl_scalapack_lp64.so => /THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/libmkl_scalapack_lp64.so (0x00007fd415947000)
>         libmkl_blacs_intelmpi_lp64.so => /THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so (0x00007fd41570a000)
>         libfftw3.so.3 => /THFS/home/renwei/venky/soft/fftw/lib/libfftw3.so.3 (0x00007fd4153fe000)
>         libmkl_intel_lp64.so => /THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/libmkl_intel_lp64.so (0x00007fd414cb0000)
>         libmkl_intel_thread.so => /THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/libmkl_intel_thread.so (0x00007fd413c90000)
>         libmkl_core.so => /THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/libmkl_core.so (0x00007fd41259c000)
>         libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fd412380000)
>         libmpifort.so.12 => not found
>         libmpi.so.12 => not found
>         libdl.so.2 => /lib64/libdl.so.2 (0x00007fd412172000)
>         librt.so.1 => /lib64/librt.so.1 (0x00007fd411f69000)
>         libm.so.6 => /lib64/libm.so.6 (0x00007fd411ce5000)
>         libiomp5.so => /opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64/libiomp5.so (0x00007fd4119ca000)
>         libc.so.6 => /lib64/libc.so.6 (0x00007fd411628000)
>         libgcc_s.so.1 => /THFS/home/sh-hzw2/software/Matlab2014a//sys/os/glnxa64/libgcc_s.so.1 (0x00007fd411413000)
>         libimf.so => /opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64/libimf.so (0x00007fd410f50000)
>         libsvml.so => /opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64/libsvml.so (0x00007fd410354000)
>         libirng.so => /opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64/libirng.so (0x00007fd41014d000)
>         libintlc.so.5 => /opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64/libintlc.so.5 (0x00007fd40fef7000)
>         /lib64/ld-linux-x86-64.so.2 (0x00007fd416436000)
>
> =========
>
> As shown here, libmpifort.so.12 and libmpi.so.12 are "not found" when I 
> run ldd on the cn308 node.
>
> But they have well-defined paths when I run ldd on "renwei":
>
>         libmpifort.so.12 => /opt/intel/impi/5.0.2.044/intel64/lib/libmpifort.so.12 (0x00002b3a37c98000)
>         libmpi.so.12 => /opt/intel/impi/5.0.2.044/intel64/lib/libmpi.so.12 (0x00002b3a37f21000)
>
> ===================
>
> [renwei@ln3 ~]$ ssh cn308 $WIENROOT/lapw0_mpi
> /THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/bin/mklvars.sh: line 118: manpath: command not found
> /THFS/home/renwei/venky/soft/wien2k/lapw0_mpi: error while loading shared libraries: libmpifort.so.12: cannot open shared object file: No such file or directory
> [renwei@ln3 ~]$
>
>
> ===============
>
> [renwei@ln3 ~]$ ssh cn308
> Last login: Sat Jun 16 17:59:04 2018 from ln3-gn0
> -bash: manpath: command not found
> [renwei@cn308 ~]$ $WIENROOT/lapw0_mpi
> /THFS/home/renwei/venky/soft/wien2k/lapw0_mpi: error while loading shared libraries: libmpifort.so.12: cannot open shared object file: No such file or directory
>
> ========================
>
> You also mentioned to "use static compilation". I don't understand 
> this. Do you mean static compilation of wien2k? How can I do it? (I am 
> sorry to ask this; I come from an experimental background and have not 
> come across these kinds of issues.)
>
>
> thank you.
>
> venkatesh