[Wien] Fwd: MPI segmentation fault
Md. Fhokrul Islam
fislam at hotmail.com
Fri Jan 29 17:17:13 CET 2010
Hi Marks,
Thanks for pointing out possible problems with our system. I will talk to the
system admin about these issues.
Fhokrul
> Date: Fri, 29 Jan 2010 09:47:53 -0600
> From: L-marks at northwestern.edu
> To: wien at zeus.theochem.tuwien.ac.at
> Subject: [Wien] Fwd: MPI segmentation fault
>
> I've edited your information down (too large for the list), and am
> including it so others can see if they run into similar problems.
>
> In essence, you have a mess and you are going to have to talk to your
> sysadmin (hikmpn) to get things sorted out. Issues:
>
> a) You have openmpi-1.3.3. This works for small problems, fails for
> large ones. This needs to be updated to 1.4.0 or 1.4.1 (the older
> versions of openmpi have bugs).
> b) The openmpi was compiled with ifort 10.1 but you are using 11.1.064
> for Wien2k -- could lead to problems.
> c) The openmpi was compiled with gcc and ifort 10.1, not icc and ifort,
> which could lead to problems.
> d) The fftw library you are using was compiled with gcc, not icc; this
> could lead to problems.
> e) Some of the shared libraries are in your LD_LIBRARY_PATH, so you will
> need to add -x LD_LIBRARY_PATH to how mpirun is called (in
> $WIENROOT/parallel_options) -- look at man mpirun.
> f) I still don't know what the stack limits are on your machine --
> this can lead to severe problems in lapw0_mpi.
>
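Points (e) and (f) above can be sketched in shell. This is only a minimal sketch under stated assumptions, not the configuration shipped with WIEN2k: the helper names `add_ld_export` and `stack_limit` are hypothetical, and the template string used in the usage note is merely assumed to resemble the mpirun line in $WIENROOT/parallel_options. The `-x` option is Open MPI's mpirun flag for exporting an environment variable to the launched processes.

```shell
#!/usr/bin/env bash

# Hypothetical helper: given an mpirun command template, return it with
# "-x LD_LIBRARY_PATH" inserted right after "mpirun" unless already present.
add_ld_export() {
  local cmd="$1"
  case "$cmd" in
    *"-x LD_LIBRARY_PATH"*) printf '%s\n' "$cmd" ;;            # already exported
    mpirun\ *) printf 'mpirun -x LD_LIBRARY_PATH %s\n' "${cmd#mpirun }" ;;
    *) printf '%s\n' "$cmd" ;;                                 # not an mpirun line
  esac
}

# Hypothetical helper: print the soft stack limit of the current shell
# (a size in kB, or the word "unlimited").
stack_limit() {
  ulimit -s
}
```

For example, `add_ld_export "mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_"` prints the same template with the export flag added, and running `stack_limit` in a shell on a child node shows the limit that lapw0_mpi would inherit there.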
> ---------- Forwarded message ----------
> From: Fhokrul Islam <fhokrul.islam at lnu.se>
> Date: Fri, Jan 29, 2010 at 9:16 AM
> Subject: MPI segmentation fault
> To: "L-marks at northwestern.edu" <L-marks at northwestern.edu>
>
> Below is the information that you requested. I would like to mention
> that MPI worked fine when I used it for a bulk
> 8-atom system. But for a surface supercell of 96 atoms it crashes at lapw0.
>
> Thanks,
> Fhokrul
>
> >> 1) Please do "ompi_info " and paste the output to the end of your
> >> response to this email.
>
> 1. [eishfh at milleotto s110]$ ompi_info
> Package: Open MPI hikmpn at milleotto.local Distribution
> Open MPI: 1.3.3
> Prefix: /home/hikmpn/local
> Configured architecture: x86_64-unknown-linux-gnu
> Configure host: milleotto.local
> Configured by: hikmpn
> Fortran90 bindings size: small
> C compiler: gcc
> C compiler absolute: /usr/bin/gcc
> C++ compiler: g++
> C++ compiler absolute: /usr/bin/g++
> Fortran77 compiler: ifort
> Fortran77 compiler abs: /sw/pkg/intel/10.1/bin//ifort
> Fortran90 compiler: ifort
> Fortran90 compiler abs: /sw/pkg/intel/10.1/bin//ifort
>
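The compiler mismatch in point (b) can be read straight off this output. The following is a minimal sketch, assuming the `ompi_info` field label "Fortran90 compiler abs" shown above; the function name `ompi_fortran_compiler` is hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read ompi_info output on stdin and print the absolute
# path of the Fortran90 compiler that Open MPI was built with.
ompi_fortran_compiler() {
  # Split each line at the first ": " and print the value of the matching field.
  awk -F': *' '/Fortran90 compiler abs/ { print $2 }'
}
```

Comparing the result of `ompi_info | ompi_fortran_compiler` with `command -v ifort` in the shell used to build Wien2k would show whether the two were built with the same ifort installation (here 10.1 versus 11.1.064).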
> >> 2) Also paste the output of "echo $LD_LIBRARY_PATH"
>
> 2. [eishfh at milleotto s110]$ echo $LD_LIBRARY_PATH
> /home/eishfh/fftw-2.1.5-gcc/lib/:/home/hikmpn/local/lib/:/sw/pkg/intel/11.1.064//lib/intel64:/sw/pkg/mkl/10.0/lib/em64t:/lib64:/usr/lib64:/usr/X11R6/lib64:/lib:/usr/lib:/usr/X11R6/lib:/usr/local/lib
>
> >> 3) If you have in your .bashrc a "ulimit -s unlimited" please edit
> >> this (temporarily) out, then ssh into one of the child nodes.
>
> After editing .bashrc file I did the following from the child node:
>
> 3. [eishfh at mn012 ~]$ which mpirun
> /home/hikmpn/local/bin/mpirun
>
> 4. [eishfh at mn012 ~]$ which lapw0_mpi
> /disk/global/home/eishfh/Wien2k_09_2/lapw0_mpi
>
> 5. [eishfh at mn012 ~]$ echo $LD_LIBRARY_PATH
> -bash: home/eishfh/fftw-2.1.5-gcc/lib/:/home/hikmpn/local/lib/:/sw/pkg/intel/11.1.064//lib/intel64:/sw/pkg/mkl/10.0/lib/em64t:/lib64:/usr/lib64:/usr/X11R6/lib64:/lib:/usr/lib:/usr/X11R6/lib:/usr/local/lib
>
> 6. [eishfh at mn012 ~]$ ldd $WIENROOT/lapw0_mpi
> libmkl_intel_lp64.so =>
> /sw/pkg/mkl/10.0/lib/em64t/libmkl_intel_lp64.so (0x00002ab5610d3000)
> libmkl_sequential.so =>
> /sw/pkg/mkl/10.0/lib/em64t/libmkl_sequential.so (0x00002ab5613d9000)
> libmkl_core.so => /sw/pkg/mkl/10.0/lib/em64t/libmkl_core.so
> (0x00002ab561566000)
> libiomp5.so => /sw/pkg/intel/11.1.064//lib/intel64/libiomp5.so
> (0x00002ab561738000)
> libsvml.so => /sw/pkg/intel/11.1.064//lib/intel64/libsvml.so
> (0x00002ab5618e9000)
> libimf.so => /sw/pkg/intel/11.1.064//lib/intel64/libimf.so
> (0x00002ab562694000)
> libifport.so.5 =>
> /sw/pkg/intel/11.1.064//lib/intel64/libifport.so.5
> (0x00002ab562a28000)
> libifcoremt.so.5 =>
> /sw/pkg/intel/11.1.064//lib/intel64/libifcoremt.so.5
> (0x00002ab562b61000)
> libintlc.so.5 =>
> /sw/pkg/intel/11.1.064//lib/intel64/libintlc.so.5 (0x00002ab562e05000)
>
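A listing like the one above is easier to audit when reduced to the directories the libraries resolve from. The following is a minimal sketch, assuming the usual one-library-per-line `ldd` format (`name => /path/name (address)`); the function name `ldd_dirs` is hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read ldd output on stdin and print the unique
# directories that the shared libraries were resolved from.
ldd_dirs() {
  awk '$2 == "=>" && $3 ~ /^\// {
         sub("/[^/]*$", "", $3)   # strip the file name, keep the directory
         print $3
       }' | LC_ALL=C sort -u
}
```

Running `ldd $WIENROOT/lapw0_mpi | ldd_dirs` on a child node gives a short list that can be checked against the entries in LD_LIBRARY_PATH.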
> --
> Laurence Marks
> Department of Materials Science and Engineering
> MSE Rm 2036 Cook Hall
> 2220 N Campus Drive
> Northwestern University
> Evanston, IL 60208, USA
> Tel: (847) 491-3996 Fax: (847) 491-7820
> email: L-marks at northwestern dot edu
> Web: www.numis.northwestern.edu
> Chair, Commission on Electron Crystallography of IUCR
> www.numis.northwestern.edu/
> Electron crystallography is the branch of science that uses electron
> scattering and imaging to study the structure of matter.
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien