[Wien] Lapw1_mpi complains on Tru64 using Compaq ScaLAPACK/BLACS

Florent Boucher Florent.Boucher at cnrs-imn.fr
Fri Dec 12 18:17:23 CET 2003


Dear developer,
I am trying to use the MPI version of lapw1 on a GS1280 Marvel (32x EV7)
machine. It compiles without problems, but when I start the run with
Compaq MPI 1.96 I get the following errors during the parallel
initialization.

azalee-m utest7> run -m hibiscus  dmpirun -np 4 lapw1_mpi lapw1_1.def
 Using            4  processors, My ID =            2
 Using            4  processors, My ID =            1
 Using            4  processors, My ID =            3
 Using            4  processors, My ID =            0
1 - MPI_COMM_GROUP : Invalid communicator
0 - MPI_COMM_GROUP : Invalid communicator
[1] Aborting program !
2 - MPI_COMM_GROUP : Invalid communicator
3 - MPI_COMM_GROUP : Invalid communicator
[0] Aborting program !
[2] Aborting program !
[3] Aborting program !
MPI process 662009 exited with status 5
MPI process 662008 exited with status 5
MPI process 662012 exited with status 5
MPI process 662011 exited with status 5


I found that the crash occurs in modules.F:

#ifdef Parallel
          include 'mpif.h'
          INTEGER :: IERR
          call MPI_INIT(IERR)
          call MPI_COMM_SIZE( MPI_COMM_WORLD, NPE, IERR)
          call MPI_COMM_RANK( MPI_COMM_WORLD, MYID, IERR)
          write(6,*) 'Using ', NPE, ' processors, My ID = ', MYID
          CALL BARRIER
          CALL SL_INIT(ICTXTALL, 1, NPE)
--> It fails at the next call !!
          call blacs_gridinit(ICTXTALLR,'R',NPE,1)
#else
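
Note that SL_INIT itself (I sketch here the version distributed with the
ScaLAPACK example programs; the copy in WIEN2k may differ in details)
first fetches a default system context before building the grid:

          SUBROUTINE SL_INIT( ICTXT, NPROW, NPCOL )
          INTEGER            ICTXT, NPROW, NPCOL
          INTEGER            IAM, NPROCS
!         Determine my process number and the number of processes
          CALL BLACS_PINFO( IAM, NPROCS )
!         If the underlying system needs it, set up the processes
          IF( NPROCS.LT.1 ) THEN
             IF( IAM.EQ.0 ) NPROCS = NPROW * NPCOL
             CALL BLACS_SETUP( IAM, NPROCS )
          END IF
!         Get the default system context, then define the grid
          CALL BLACS_GET( -1, 0, ICTXT )
          CALL BLACS_GRIDINIT( ICTXT, 'Row-major', NPROW, NPCOL )
          RETURN
          END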

According to the BLACS documentation
(http://davinci01.man.ac.uk/pessl/pessl29.html#HDRXINITB), a first call
to blacs_get should be made before blacs_gridinit, in order to obtain
the default system context (this is exactly what SL_init does). I have
tried the same fix in modules.F.
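
The change was essentially the following (a sketch; ICTXTALLR and NPE
are the variables from the modules.F excerpt above):

          call blacs_get( -1, 0, ICTXTALLR )             ! default system context
          call blacs_gridinit( ICTXTALLR, 'R', NPE, 1 )  ! NPE x 1 row-major grid
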
This solves the initialization part, but the run then crashes with a SIGSEGV:

azalee-m utest7> run -m hibiscus  dmpirun -np 4 lapw1_mpi lapw1_1.def
 Using            4  processors, My ID =            0
 Using            4  processors, My ID =            2
 Using            4  processors, My ID =            3
 Using            4  processors, My ID =            1
forrtl: severe (174): SIGSEGV, segmentation fault occurred
forrtl: severe (174): SIGSEGV, segmentation fault occurred
forrtl: severe (174): SIGSEGV, segmentation fault occurred
forrtl: severe (174): SIGSEGV, segmentation fault occurred
cannot read in lapw1_mpi
cannot read in lapw1_mpi
cannot read in cannot read in lapw1_mpi
lapw1_mpi
MPI process 661759 exited with status 174
MPI process 661758 exited with status 174
MPI process 661760 exited with status 174
MPI process 661757 exited with status 174

 
So, if you have an idea about a solution, that would be nice.

PS: for Peter, if you are interested in speed tests on the EV7
processor, just let me know; I can run some benchmarks for you.

Regards
Florent

-- 
 --------------------------------------------------------------------------
| Florent BOUCHER                    |                                     |
| Institut des Matériaux Jean Rouxel | Mailto:Florent.Boucher at cnrs-imn.fr  |
| 2, rue de la Houssinière           | Phone: (33) 2 40 37 39 24           |
| BP 32229                           | Fax:   (33) 2 40 37 39 95           |
| 44322 NANTES CEDEX 3 (FRANCE)      | http://www.cnrs-imn.fr              |
 --------------------------------------------------------------------------




