[Wien] lapw0_mpi (MPI_Bcast) bug
Pawel Lesniak
lesniak at ifmpan.poznan.pl
Wed Feb 3 09:01:57 CET 2010
On 02.02.2010 20:21, Laurence Marks wrote:
> There is a bug in some MPI implementations that can lead to a SIGSEGV
> in lapw0_mpi for large problems. The symptom is a SIGSEGV at the
> MPI_Bcast call
> at about line 1574:
> #ifdef Parallel
> if (.not.coul) allocate(potk(nkk))
> call MPI_Bcast(potk, NKK, MPI_DOUBLE_COMPLEX, 0, MPI_COMM_WORLD, ierr)
> #endif
>
> A patch (which works) is to not do the MPI_Bcast all at once:
>
> #ifdef Parallel
>   if (.not.coul) allocate(potk(nkk))
>   ibbuff=32768   ! Could be optimized
>   i2=NKK/ibbuff
>   it1=1
>   do i=1,i2
>     call MPI_Bcast(potk(it1), 1024, MPI_DOUBLE_COMPLEX, 0, MPI_COMM_WORLD, ierr)
>     it1=it1+ibbuff
>   enddo
>   if(it1 .lt. nkk) then
>     it2=nkk-it1+1
>     call MPI_Bcast(potk(it1), it2, MPI_DOUBLE_COMPLEX, 0, MPI_COMM_WORLD, ierr)
>   endif
>   call MPI_BARRIER(MPI_COMM_WORLD,ierr)
> #endif
>
Are you sure that
call MPI_Bcast(potk(it1), 1024, MPI_DOUBLE_COMPLEX, 0, MPI_COMM_WORLD, ierr)
is correct? Why a length of 1024 when you step through the array in chunks
of ibbuff? With a fixed count of 1024, each pass broadcasts only 1024 of the
32768 elements in a chunk, so most of potk never reaches the non-root
processes. I believe the correct version is:
call MPI_Bcast(potk(it1), ibbuff, MPI_DOUBLE_COMPLEX, 0, MPI_COMM_WORLD, ierr)
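
For completeness, here is a minimal self-contained sketch of the chunked
broadcast with that fix applied (the subroutine name bcast_chunked is
illustrative, not actual WIEN2k code; it assumes MPI_Init has already been
called on all ranks and that rank 0 holds the data). Note it also tests the
remainder with .le. rather than .lt., since with .lt. a tail of exactly one
element would never be broadcast:

subroutine bcast_chunked(potk, nkk)
   implicit none
   include 'mpif.h'
   integer, intent(in)       :: nkk
   complex*16, intent(inout) :: potk(nkk)
   integer :: ibbuff, i, i2, it1, it2, ierr

   ibbuff = 32768            ! chunk size; could be tuned per MPI stack
   i2  = nkk/ibbuff          ! number of full chunks
   it1 = 1                   ! start index of the current chunk
   do i = 1, i2
      ! broadcast ibbuff elements per pass, not a fixed 1024
      call MPI_Bcast(potk(it1), ibbuff, MPI_DOUBLE_COMPLEX, 0, MPI_COMM_WORLD, ierr)
      it1 = it1 + ibbuff
   enddo
   if (it1 .le. nkk) then    ! .le., so a one-element tail is not skipped
      it2 = nkk - it1 + 1
      call MPI_Bcast(potk(it1), it2, MPI_DOUBLE_COMPLEX, 0, MPI_COMM_WORLD, ierr)
   endif
   call MPI_BARRIER(MPI_COMM_WORLD, ierr)
end subroutine bcast_chunked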
Pawel Lesniak