[Wien] Segfault in lapw1_mpi (SL_INIT)
mbraga
mbraga at fe.up.pt
Tue Jul 3 15:17:32 CEST 2012
On 03.07.2012 14:10, Laurence Marks wrote:
> This is an issue with your openmpi, either a simple one or a nasty
> one. Suggestions:
>
> a. Check that you are linking libmkl_blacs_openmpi_lp64 or similar; the
> "blacs_openmpi" part is what matters. This is probably the reason, and
> just changing it will fix everything.
>
> b. Run "ompi_info" which is in the openmpi directory and look for
> compatibility issues.
>
> c. Recompile openmpi, and I suggest using 1.4.4. Unfortunately there
> are some bugs in the 1.3.X versions of openmpi and I never got them
> to work, but I did get 1.4.4 to work.
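
Suggestions (a) and (b) can be checked from the shell. What follows is a minimal sketch, assuming lapw1_mpi was linked dynamically against MKL (a statically linked binary shows no MKL entries in ldd output) and that ompi_info is on the PATH. The name blacs_flavour is a hypothetical helper for illustration, not part of WIEN2k or MKL.

```shell
# Hypothetical helper: read ldd-style output on stdin and print the first
# MKL BLACS library name that appears in it (prints nothing if none is found).
blacs_flavour() {
    grep -o 'libmkl_blacs_[a-z0-9_]*' | head -n 1
}

# Usage, assuming a dynamically linked binary in the current directory:
#   ldd ./lapw1_mpi | blacs_flavour
# For Open MPI the printed name should contain "blacs_openmpi".
#
# Open MPI's own build settings (version, compilers) can be listed with:
#   ompi_info | grep -i -e 'Open MPI:' -e 'compiler'
```

If the helper prints a BLACS name without "openmpi" in it, the binary was probably linked against a BLACS built for a different MPI, which is the situation suggestion (a) describes.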
>
> On Tue, Jul 3, 2012 at 3:25 AM, Elias Assmann
> <elias.assmann at gmail.com> wrote:
>> Hello,
>>
>> When I execute lapw1_mpi, it dies on me immediately:
>>
>> $ ./lapw1_mpi
>> w2k_dispatch_signal(): received: Segmentation fault
>> Child id 0 SIGSEGV, contact developers
>>
>> --------------------------------------------------------------------------
>> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
>> with errorcode 6.
>>
>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
>> You may or may not see output from other processes, depending on
>> exactly when Open MPI kills them.
>>
>> --------------------------------------------------------------------------
>>
>> It turns out that the offending line is the first call to SL_INIT in
>> INIT_PARALLEL (SRC_lapw1/modules.F),
>>
>>       SUBROUTINE INIT_PARALLEL
>>         IMPLICIT NONE
>> #ifdef Parallel
>>         include 'mpif.h'
>>         INTEGER :: IERR,i,j
>>         call MPI_INIT(IERR)
>>         call MPI_COMM_SIZE( MPI_COMM_WORLD, NPE, IERR)
>>         call MPI_COMM_RANK( MPI_COMM_WORLD, MYID, IERR)
>>         CALL BARRIER
>> ->      CALL SL_INIT(ICTXTALL, 1, NPE)
>>
>> which is called eventually via GTFNAM at the top of the main program
>> LAPW1.
>>
>> I used ifort version 11.1 (specifically, I tried two revisions: 046
>> and 072) and the corresponding MKL libraries (including ScaLAPACK).
>> The MPI version is openmpi-1.3.2-icc, in case that matters. Neither
>> lapw0_mpi nor lapw2_mpi has this problem (then again, they do not
>> seem to use SL_INIT).
>>
>> Any pointers on how I should proceed?
>>
>> Thanks,
>>
>> Elias
>>
>> _______________________________________________
>> Wien mailing list
>> Wien at zeus.theochem.tuwien.ac.at
>> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
--
Helena Braga
Engineering Physics Department
Engineering Faculty, Universidade do Porto
R. Dr. Roberto Frias, s/n
4200-465 Porto
Portugal
phone: +351 225081869
email: mbraga at fe.up.pt
URL 1: http://paginas.fe.up.pt/~mbraga/
URL 2:
https://sigarra.up.pt/feup/funcionarios_geral.formview?p_codigo=320005
Our book chapter:
http://www.intechopen.com/books/neutron-diffraction/hydrides-of-cu-and-mg-intermetallic-systems-characterization-catalytic-function-and-applications