[Wien] Segfault in lapw1_mpi (SL_INIT)
Laurence Marks
L-marks at northwestern.edu
Tue Jul 3 15:10:26 CEST 2012
This is an issue with your openmpi, either a simple one or a nasty
one. Suggestions:
a. Check that you are linking against libmkl_blacs_openmpi_lp64 or similar;
the "blacs_openmpi" part is what matters. This is probably the reason, and
just changing this will fix everything.
b. Run "ompi_info" which is in the openmpi directory and look for
compatibility issues.
c. Recompile openmpi; I suggest using 1.4.4. Unfortunately there are some
bugs in the 1.3.X versions of openmpi and I never got them to work, but I
did get 1.4.4 to work.
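
For suggestion (a), a rough sketch of how one might check which BLACS
flavor a link line refers to. The helper name blacs_flavor is made up for
illustration (it is not part of WIEN2k, MKL or Open MPI), and the sketch
assumes only that the library name contains "mkl_blacs", as in
libmkl_blacs_openmpi_lp64:

```shell
# Hypothetical helper: classify the BLACS library named in a string,
# e.g. a linker line from a Makefile or the output of `ldd`.
# Prints "openmpi" if an mkl_blacs_openmpi library is named,
# "other" if some other mkl_blacs library is named, "none" otherwise.
blacs_flavor() {
    case "$1" in
        *mkl_blacs_openmpi*) echo "openmpi" ;;
        *mkl_blacs*)         echo "other" ;;
        *)                   echo "none" ;;
    esac
}

# Example with a made-up link line (assumption, not taken from any real Makefile):
blacs_flavor "-lmkl_scalapack_lp64 -lmkl_blacs_openmpi_lp64 -lmkl_intel_lp64"
```

If the binary is dynamically linked, something like
blacs_flavor "$(ldd ./lapw1_mpi)" might work as well; for a static build
the link options used at compile time are the thing to look at. With
Open MPI, anything other than "openmpi" is suspect.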
On Tue, Jul 3, 2012 at 3:25 AM, Elias Assmann <elias.assmann at gmail.com> wrote:
> Hello,
>
> When I execute lapw1_mpi, it dies on me immediately:
>
> $ ./lapw1_mpi
> w2k_dispatch_signal(): received: Segmentation fault
> Child id 0 SIGSEGV, contact developers
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
> with errorcode 6.
>
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --------------------------------------------------------------------------
>
> It turns out that the offending line is the first call to SL_INIT in
> INIT_PARALLEL (SRC_lapw1/modules.F),
>
> SUBROUTINE INIT_PARALLEL
> IMPLICIT NONE
> #ifdef Parallel
> include 'mpif.h'
> INTEGER :: IERR,i,j
> call MPI_INIT(IERR)
> call MPI_COMM_SIZE( MPI_COMM_WORLD, NPE, IERR)
> call MPI_COMM_RANK( MPI_COMM_WORLD, MYID, IERR)
> CALL BARRIER
> -> CALL SL_INIT(ICTXTALL, 1, NPE)
>
> which is called eventually via GTFNAM at the top of the main program
> LAPW1.
>
> I used ifort version 11.1 (specifically, I tried two revisions: 046
> and 072) and the corresponding MKL libraries (including ScaLAPACK).
> The MPI version is openmpi-1.3.2-icc, in case that matters. Neither
> lapw0_mpi nor lapw2_mpi have this problem (then again, they do not
> seem to use SL_INIT).
>
> Any pointers how I should proceed?
>
> Thanks,
>
> Elias
>
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
--
Professor Laurence Marks
Department of Materials Science and Engineering
Northwestern University
www.numis.northwestern.edu 1-847-491-3996
"Research is to see what everybody else has seen, and to think what
nobody else has thought"
Albert Szent-Gyorgi