[Wien] Lapw1_mpi complains on Tru64 using Compaq ScaLAPACK/BLACS
Florent Boucher
Florent.Boucher at cnrs-imn.fr
Fri Dec 12 18:17:23 CET 2003
Dear developer,
I am trying to use the MPI version of lapw1 on a GS1280 Marvel (32xEV7)
computer.
The code compiles without problems, but when I start the job with
Compaq MPI 1.96 I get the following errors during the parallel
initialization.
azalee-m utest7> run -m hibiscus dmpirun -np 4 lapw1_mpi lapw1_1.def
Using 4 processors, My ID = 2
Using 4 processors, My ID = 1
Using 4 processors, My ID = 3
Using 4 processors, My ID = 0
1 - MPI_COMM_GROUP : Invalid communicator
0 - MPI_COMM_GROUP : Invalid communicator
[1] Aborting program !
2 - MPI_COMM_GROUP : Invalid communicator
3 - MPI_COMM_GROUP : Invalid communicator
[0] Aborting program !
[2] Aborting program !
[3] Aborting program !
MPI process 662009 exited with status 5
MPI process 662008 exited with status 5
MPI process 662012 exited with status 5
MPI process 662011 exited with status 5
I found that the crash occurs in modules.F:
#ifdef Parallel
include 'mpif.h'
INTEGER :: IERR
call MPI_INIT(IERR)
call MPI_COMM_SIZE( MPI_COMM_WORLD, NPE, IERR)
call MPI_COMM_RANK( MPI_COMM_WORLD, MYID, IERR)
write(6,*) 'Using ', NPE, ' processors, My ID = ', MYID
CALL BARRIER
CALL SL_INIT(ICTXTALL, 1, NPE)
--> It complains after !!
call blacs_gridinit(ICTXTALLR,'R',NPE,1)
#else
According to the BLACS documentation
(http://davinci01.man.ac.uk/pessl/pessl29.html#HDRXINITB), it seems that
blacs_get should be called before blacs_gridinit (this is how it is
done in SL_init). I have tried...
This solves the initialization problem, but then the program crashes with a SIGSEGV:
azalee-m utest7> run -m hibiscus dmpirun -np 4 lapw1_mpi lapw1_1.def
Using 4 processors, My ID = 0
Using 4 processors, My ID = 2
Using 4 processors, My ID = 3
Using 4 processors, My ID = 1
forrtl: severe (174): SIGSEGV, segmentation fault occurred
forrtl: severe (174): SIGSEGV, segmentation fault occurred
forrtl: severe (174): SIGSEGV, segmentation fault occurred
forrtl: severe (174): SIGSEGV, segmentation fault occurred
cannot read in lapw1_mpi
cannot read in lapw1_mpi
cannot read in cannot read in lapw1_mpi
lapw1_mpi
MPI process 661759 exited with status 174
MPI process 661758 exited with status 174
MPI process 661760 exited with status 174
MPI process 661757 exited with status 174
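For reference, the call order described in the BLACS documentation
(blacs_get to obtain the default system context, then blacs_gridinit)
can be sketched as below. This is only a minimal standalone
illustration, not the actual change made to modules.F: the program name
and the lower-case variable names are made up for the example, it
assumes free-form compilation and an mpif.h that is compatible with it,
and it says nothing about the cause of the later SIGSEGV.

```fortran
program blacs_init_sketch
  ! Hypothetical standalone test, not part of WIEN.
  implicit none
  include 'mpif.h'
  integer :: ierr, npe, myid
  integer :: ictxtall, ictxtallr

  call MPI_INIT(ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, npe, ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)

  ! 1 x NPE grid: SL_INIT fetches the default system context
  ! itself before creating the grid.
  call SL_INIT(ictxtall, 1, npe)

  ! NPE x 1 grid: BLACS_GRIDINIT expects a valid system context on
  ! input, so request the default one first (WHAT = 0; the first
  ! argument is ignored for this query).
  call BLACS_GET(-1, 0, ictxtallr)
  call BLACS_GRIDINIT(ictxtallr, 'R', npe, 1)

  write(6,*) 'Grids created on process ', myid

  ! Release both grids; a nonzero argument to BLACS_EXIT leaves MPI
  ! running so that MPI_FINALIZE can be called explicitly.
  call BLACS_GRIDEXIT(ictxtallr)
  call BLACS_GRIDEXIT(ictxtall)
  call BLACS_EXIT(1)
  call MPI_FINALIZE(ierr)
end program blacs_init_sketch
```

If a small test like this runs cleanly under dmpirun, the grid
initialization itself is probably fine and the segmentation fault would
have to come from a later step.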
So, if you have any idea about a solution, I would be grateful.
PS: for Peter, if you are interested in speed tests on the EV7 processor,
just let me know; I can run benchmarks for you.
Regards
Florent
--
--------------------------------------------------------------------------
| Florent BOUCHER | |
| Institut des Matériaux Jean Rouxel | Mailto:Florent.Boucher at cnrs-imn.fr |
| 2, rue de la Houssinière | Phone: (33) 2 40 37 39 24 |
| BP 32229 | Fax: (33) 2 40 37 39 95 |
| 44322 NANTES CEDEX 3 (FRANCE) | http://www.cnrs-imn.fr |
--------------------------------------------------------------------------