[Wien] Fwd: MPI segmentation fault

Laurence Marks L-marks at northwestern.edu
Fri Jan 29 16:47:53 CET 2010


I've edited your information down (too large for the list), and am
including it so others can see if they run into similar problems.

In essence, you have a mess and you are going to have to talk to your
sysadmin (hikmpn) to get things sorted out. Issues:

a) You have openmpi-1.3.3. This works for small problems but fails for
large ones. It needs to be updated to 1.4.0 or 1.4.1 (the older
versions of openmpi have bugs).
b) The openmpi was compiled with ifort 10.1, but you are using 11.1.064
for Wien2k -- this could lead to problems.
c) The openmpi was compiled with gcc and ifort 10.1, not with icc and
ifort, which could lead to problems.
d) The fftw library you are using was compiled with gcc, not icc; this
could lead to problems.
e) Some of the shared libraries are in your LD_LIBRARY_PATH, so you will
need to add -x LD_LIBRARY_PATH to how mpirun is called (in
$WIENROOT/parallel_options) -- look at man mpirun.
f) I still don't know what the stack limits are on your machine --
this can lead to severe problems in lapw0_mpi.
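
Items e) and f) can be checked from a shell before touching the WIEN2k configuration. The sketch below is only an illustration. The process count, the machine file name and the lapw0.def argument are assumed placeholders, not values from this thread, and the launch line is echoed rather than executed. The only Open MPI feature it relies on is mpirun's -x option, which exports the named environment variable to the launched processes.

```shell
# Soft and hard stack limits of the current shell, in kB or "unlimited".
soft=$(ulimit -S -s)
hard=$(ulimit -H -s)
echo "stack soft=${soft} hard=${hard}"

# Placeholder values, assumed for illustration only.
np=8                    # hypothetical number of MPI processes
hostfile=".machine0"    # hypothetical machine file name

# -x LD_LIBRARY_PATH forwards the caller's library path to the remote ranks.
launch="mpirun -x LD_LIBRARY_PATH -np ${np} -machinefile ${hostfile} lapw0_mpi lapw0.def"
echo "${launch}"
```

Running the two ulimit lines on a child node, with any "ulimit -s unlimited" line temporarily removed from .bashrc, shows the default limits on that node.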

---------- Forwarded message ----------
From: Fhokrul Islam <fhokrul.islam at lnu.se>
Date: Fri, Jan 29, 2010 at 9:16 AM
Subject: MPI segmentation fault
To: "L-marks at northwestern.edu" <L-marks at northwestern.edu>

Below is the information that you requested. I would like to mention
that MPI worked fine when I used it for a bulk 8-atom system, but for
a surface supercell of 96 atoms it crashes at lapw0.

Thanks,
Fhokrul

>> 1) Please do "ompi_info " and paste the output to the end of your
>> response to this email.

   1. [eishfh at milleotto s110]$ ompi_info
                Package: Open MPI hikmpn at milleotto.local Distribution
               Open MPI: 1.3.3
                 Prefix: /home/hikmpn/local
 Configured architecture: x86_64-unknown-linux-gnu
         Configure host: milleotto.local
          Configured by: hikmpn
 Fortran90 bindings size: small
             C compiler: gcc
    C compiler absolute: /usr/bin/gcc
           C++ compiler: g++
  C++ compiler absolute: /usr/bin/g++
     Fortran77 compiler: ifort
 Fortran77 compiler abs: /sw/pkg/intel/10.1/bin//ifort
     Fortran90 compiler: ifort
 Fortran90 compiler abs: /sw/pkg/intel/10.1/bin//ifort

>> 2) Also paste the output of "echo $LD_LIBRARY_PATH"

2. [eishfh at milleotto s110]$ echo $LD_LIBRARY_PATH
/home/eishfh/fftw-2.1.5-gcc/lib/:/home/hikmpn/local/lib/:/sw/pkg/intel/11.1.064//lib/intel64:/sw/pkg/mkl/10.0/lib/em64t:/lib64:/usr/lib64:/usr/X11R6/lib64:/lib:/usr/lib:/usr/X11R6/lib:/usr/local/lib
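
A long colon-separated path like the one above is easier to audit with one directory per line. Here the entries include a gcc-built fftw, the Open MPI install under /home/hikmpn/local, the Intel 11.1.064 runtime and MKL 10.0. A minimal sketch, where split_path is a hypothetical helper name and not part of WIEN2k or Open MPI:

```shell
# Print each entry of a colon-separated path list on its own line.
split_path() {
  printf '%s\n' "$1" | tr ':' '\n'
}

# Show the current library search path, one directory per line
# (prints a single empty line if the variable is unset).
split_path "${LD_LIBRARY_PATH:-}"
```

The search order is the printed order, so the first directory that provides a given library name is the one that is used.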

>> 3) If you have in your .bashrc a "ulimit -s unlimited" please edit
>> this (temporarily) out, then ssh into one of the child nodes.

After editing .bashrc file I did the following from the child node:

3. [eishfh at mn012 ~]$ which mpirun
/home/hikmpn/local/bin/mpirun

4. [eishfh at mn012 ~]$ which lapw0_mpi
/disk/global/home/eishfh/Wien2k_09_2/lapw0_mpi

5. [eishfh at mn012 ~]$ echo $LD_LIBRARY_PATH
-bash: home/eishfh/fftw-2.1.5-gcc/lib/:/home/hikmpn/local/lib/:/sw/pkg/intel/11.1.064//lib/intel64:/sw/pkg/mkl/10.0/lib/em64t:/lib64:/usr/lib64:/usr/X11R6/lib64:/lib:/usr/lib:/usr/X11R6/lib:/usr/local/lib

6. [eishfh at mn012 ~]$ ldd $WIENROOT/lapw0_mpi
       libmkl_intel_lp64.so => /sw/pkg/mkl/10.0/lib/em64t/libmkl_intel_lp64.so (0x00002ab5610d3000)
       libmkl_sequential.so => /sw/pkg/mkl/10.0/lib/em64t/libmkl_sequential.so (0x00002ab5613d9000)
       libmkl_core.so => /sw/pkg/mkl/10.0/lib/em64t/libmkl_core.so (0x00002ab561566000)
       libiomp5.so => /sw/pkg/intel/11.1.064//lib/intel64/libiomp5.so (0x00002ab561738000)
       libsvml.so => /sw/pkg/intel/11.1.064//lib/intel64/libsvml.so (0x00002ab5618e9000)
       libimf.so => /sw/pkg/intel/11.1.064//lib/intel64/libimf.so (0x00002ab562694000)
       libifport.so.5 => /sw/pkg/intel/11.1.064//lib/intel64/libifport.so.5 (0x00002ab562a28000)
       libifcoremt.so.5 => /sw/pkg/intel/11.1.064//lib/intel64/libifcoremt.so.5 (0x00002ab562b61000)
       libintlc.so.5 => /sw/pkg/intel/11.1.064//lib/intel64/libintlc.so.5 (0x00002ab562e05000)
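
The ldd listing above resolves libraries from two trees, MKL 10.0 and the Intel 11.1.064 runtime. One way to see this at a glance is to reduce the output to the distinct directories it resolves from. A minimal sketch, assuming the usual one-line-per-library ldd format "name => /path (address)"; ldd_dirs is a hypothetical helper name:

```shell
# Read ldd output on stdin and print the distinct directories
# that the resolved libraries come from, sorted.
ldd_dirs() {
  awk '$2 == "=>" && $3 ~ /^\// { sub(/\/[^\/]*$/, "", $3); print $3 }' |
    LC_ALL=C sort -u
}

# Typical use on a node (not run here):
#   ldd "$WIENROOT/lapw0_mpi" | ldd_dirs
```

Entries reported as "not found" have no absolute path and are skipped by this filter, so it is worth checking for them separately.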

-- 
Laurence Marks
Department of Materials Science and Engineering
MSE Rm 2036 Cook Hall
2220 N Campus Drive
Northwestern University
Evanston, IL 60208, USA
Tel: (847) 491-3996 Fax: (847) 491-7820
email: L-marks at northwestern dot edu
Web: www.numis.northwestern.edu
Chair, Commission on Electron Crystallography of IUCR
www.numis.northwestern.edu/
Electron crystallography is the branch of science that uses electron
scattering and imaging to study the structure of matter.

