[Wien] Fwd: MPI segmentation fault

Md. Fhokrul Islam fislam at hotmail.com
Fri Jan 29 17:17:13 CET 2010


Hi Marks,

    Thanks for pointing out possible problems with our system. I will talk to the 
system admin about these issues.
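
For item (f) in the quoted message, the stack limits can be read directly on a child node. The following is only a sketch: stack_limits is a made-up helper name, and it assumes bash, whose ulimit builtin reports the stack size in kilobytes.

```shell
#!/bin/bash
# Print the soft and hard stack limits of the current shell.
# stack_limits is a hypothetical helper name, not part of Wien2k or Open MPI.
stack_limits() {
    local soft hard
    soft="$(ulimit -S -s)"   # soft limit: value in kB, or "unlimited"
    hard="$(ulimit -H -s)"   # hard limit: ceiling the soft limit can be raised to
    echo "soft=${soft} hard=${hard}"
}

stack_limits
```

Running it on a child node such as mn012, with any "ulimit -s unlimited" line temporarily commented out of .bashrc, should show the default limits that a remotely started process would get.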

Fhokrul



> Date: Fri, 29 Jan 2010 09:47:53 -0600
> From: L-marks at northwestern.edu
> To: wien at zeus.theochem.tuwien.ac.at
> Subject: [Wien] Fwd: MPI segmentation fault
> 
> I've edited your information down (too large for the list), and am
> including it so others can see if they run into similar problems.
> 
> In essence, you have a mess and you are going to have to talk to your
> sysadmin (hikmpn) to get things sorted out. Issues:
> 
> a) You have openmpi-1.3.3. This works for small problems but fails for
> large ones. It needs to be updated to 1.4.0 or 1.4.1 (the older
> versions of openmpi have bugs).
> b) The openmpi was compiled with ifort 10.1 but you are using 11.1.064
> for Wien2k -- this could lead to problems.
> c) The openmpi was compiled with gcc and ifort 10.1, not icc and ifort,
> which could lead to problems.
> d) The fftw library you are using was compiled with gcc, not icc; this
> could lead to problems.
> e) Some of the shared libraries are in your LD_LIBRARY_PATH, so you will
> need to add -x LD_LIBRARY_PATH to the mpirun call (in
> $WIENROOT/parallel_options) -- look at man mpirun.
> f) I still don't know what the stack limits are on your machine --
> this can lead to severe problems in lapw0_mpi.
> 
> ---------- Forwarded message ----------
> From: Fhokrul Islam <fhokrul.islam at lnu.se>
> Date: Fri, Jan 29, 2010 at 9:16 AM
> Subject: MPI segmentation fault
> To: "L-marks at northwestern.edu" <L-marks at northwestern.edu>
> 
> Below is the information that you requested. I would like to mention
> that MPI worked fine when I used it for a bulk 8-atom system, but for
> a surface supercell of 96 atoms it crashes at lapw0.
> 
> Thanks,
> Fhokrul
> 
> >> 1) Please do "ompi_info " and paste the output to the end of your
> >> response to this email.
> 
>    1. [eishfh at milleotto s110]$ ompi_info
>                 Package: Open MPI hikmpn at milleotto.local Distribution
>                Open MPI: 1.3.3
>                  Prefix: /home/hikmpn/local
>  Configured architecture: x86_64-unknown-linux-gnu
>          Configure host: milleotto.local
>           Configured by: hikmpn
>  Fortran90 bindings size: small
>              C compiler: gcc
>     C compiler absolute: /usr/bin/gcc
>            C++ compiler: g++
>   C++ compiler absolute: /usr/bin/g++
>      Fortran77 compiler: ifort
>  Fortran77 compiler abs: /sw/pkg/intel/10.1/bin//ifort
>      Fortran90 compiler: ifort
>  Fortran90 compiler abs: /sw/pkg/intel/10.1/bin//ifort
> 
> >> 2) Also paste the output of "echo $LD_LIBRARY_PATH"
> 
> 2. [eishfh at milleotto s110]$ echo $LD_LIBRARY_PATH
> /home/eishfh/fftw-2.1.5-gcc/lib/:/home/hikmpn/local/lib/:/sw/pkg/intel/11.1.064//lib/intel64:/sw/pkg/mkl/10.0/lib/em64t:/lib64:/usr/lib64:/usr/X11R6/lib64:/lib:/usr/lib:/usr/X11R6/lib:/usr/local/lib
> 
> >> 3) If you have in your .bashrc a "ulimit -s unlimited" please edit
> >> this (temporarily) out, then ssh into one of the child nodes.
> 
> After editing the .bashrc file, I did the following from the child node:
> 
> 3. [eishfh at mn012 ~]$ which mpirun
> /home/hikmpn/local/bin/mpirun
> 
> 4. [eishfh at mn012 ~]$ which lapw0_mpi
> /disk/global/home/eishfh/Wien2k_09_2/lapw0_mpi
> 
> 5. [eishfh at mn012 ~]$ echo $LD_LIBRARY_PATH
> -bash: home/eishfh/fftw-2.1.5-gcc/lib/:/home/hikmpn/local/lib/:/sw/pkg/intel/11.1.064//lib/intel64:/sw/pkg/mkl/10.0/lib/em64t:/lib64:/usr/lib64:/usr/X11R6/lib64:/lib:/usr/lib:/usr/X11R6/lib:/usr/local/lib
> 
> 6. [eishfh at mn012 ~]$ ldd $WIENROOT/lapw0_mpi
>        libmkl_intel_lp64.so =>
> /sw/pkg/mkl/10.0/lib/em64t/libmkl_intel_lp64.so (0x00002ab5610d3000)
>        libmkl_sequential.so =>
> /sw/pkg/mkl/10.0/lib/em64t/libmkl_sequential.so (0x00002ab5613d9000)
>        libmkl_core.so => /sw/pkg/mkl/10.0/lib/em64t/libmkl_core.so
> (0x00002ab561566000)
>        libiomp5.so => /sw/pkg/intel/11.1.064//lib/intel64/libiomp5.so
> (0x00002ab561738000)
>        libsvml.so => /sw/pkg/intel/11.1.064//lib/intel64/libsvml.so
> (0x00002ab5618e9000)
>        libimf.so => /sw/pkg/intel/11.1.064//lib/intel64/libimf.so
> (0x00002ab562694000)
>        libifport.so.5 =>
> /sw/pkg/intel/11.1.064//lib/intel64/libifport.so.5
> (0x00002ab562a28000)
>        libifcoremt.so.5 =>
> /sw/pkg/intel/11.1.064//lib/intel64/libifcoremt.so.5
> (0x00002ab562b61000)
>        libintlc.so.5 =>
> /sw/pkg/intel/11.1.064//lib/intel64/libintlc.so.5 (0x00002ab562e05000)
> 
> -- 
> Laurence Marks
> Department of Materials Science and Engineering
> MSE Rm 2036 Cook Hall
> 2220 N Campus Drive
> Northwestern University
> Evanston, IL 60208, USA
> Tel: (847) 491-3996 Fax: (847) 491-7820
> email: L-marks at northwestern dot edu
> Web: www.numis.northwestern.edu
> Chair, Commission on Electron Crystallography of IUCR
> www.numis.northwestern.edu/
> Electron crystallography is the branch of science that uses electron
> scattering and imaging to study the structure of matter.
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
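
Item (e) above asks for -x LD_LIBRARY_PATH to be added to the mpirun call. As a rough sketch of the resulting command line: mpirun_template is a made-up helper that only prints the command, and the _NP_, _HOSTS_ and _EXEC_ placeholders assume the usual WIEN_MPIRUN layout in $WIENROOT/parallel_options, which may differ on your installation.

```shell
#!/bin/bash
# Print (do not run) an mpirun command line that forwards LD_LIBRARY_PATH.
# mpirun_template is a hypothetical helper; its arguments are the process
# count, the machine file, and the executable.
mpirun_template() {
    local np="$1" hosts="$2" exe="$3"
    # -x VAR exports the caller's value of VAR to the remote processes (Open MPI).
    echo "mpirun -x LD_LIBRARY_PATH -np ${np} -machinefile ${hosts} ${exe}"
}

# With the placeholders assumed above, this prints the template line.
mpirun_template _NP_ _HOSTS_ _EXEC_
```

The printed text is what would go inside the quotes of the mpirun definition in parallel_options; man mpirun on your system is the authority on the exact option spelling for your Open MPI version.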
 		 	   		  