[Wien] Lapw1_mpi complains on Tru64 using Compaq Scalapack/Blacs

Peter Blaha pblaha at zeus.theochem.tuwien.ac.at
Fri Dec 12 21:57:51 CET 2003


Dear Florent,

This is the second complaint about the new mpi-parallel lapw1 version.
This version was developed on an IBM SP4 in Garching and seems to run fine
on these machines. This new version should run faster in the HNS part,
where the old version had significant sequential overhead. However, it
seems to cause problems on both Linux PCs and this Alpha machine.

Please use  moduls.F_old and hns.F_old  (only these two "old" files, not
any others!!) and try to compile and run with the "old" version.

PS: To all others who are running lapw1mpi: Please let me know whether you
are able to run the new lapw1mpi version or have problems with it ("new"
means since September 2003). Unfortunately, I also have only an IBM machine
available for mpi runs, and there it runs fine.

I may eventually have to switch back to the old version.


PPS: Yes, a benchmark time on this EV7 machine would be interesting.


> I am trying to use the mpi version of lapw1 on a GS1280 Marvel (32xEV7)
> computer.
> I was able to compile without problems, but when I try to start the
> process with Compaq MPI 1.96 I get the following errors in the
> initialization of the parallel part.
>
> azalee-m utest7> run -m hibiscus  dmpirun -np 4 lapw1_mpi lapw1_1.def
>  Using            4  processors, My ID =            2
>  Using            4  processors, My ID =            1
>  Using            4  processors, My ID =            3
>  Using            4  processors, My ID =            0
> 1 - MPI_COMM_GROUP : Invalid communicator
> 0 - MPI_COMM_GROUP : Invalid communicator
> [1] Aborting program !
> 2 - MPI_COMM_GROUP : Invalid communicator
> 3 - MPI_COMM_GROUP : Invalid communicator
> [0] Aborting program !
> [2] Aborting program !
> [3] Aborting program !
> MPI process 662009 exited with status 5
> MPI process 662008 exited with status 5
> MPI process 662012 exited with status 5
> MPI process 662011 exited with status 5
>
>
> I found that the crash occurs in modules.F
>
> #ifdef Parallel
>           include 'mpif.h'
>           INTEGER :: IERR
>           call MPI_INIT(IERR)
>           call MPI_COMM_SIZE( MPI_COMM_WORLD, NPE, IERR)
>           call MPI_COMM_RANK( MPI_COMM_WORLD, MYID, IERR)
>           write(6,*) 'Using ', NPE, ' processors, My ID = ', MYID
>           CALL BARRIER
>           CALL SL_INIT(ICTXTALL, 1, NPE)
> --> It complains after !!
>           call blacs_gridinit(ICTXTALLR,'R',NPE,1)
> #else
>
> According to the BLACS documentation
> (http://davinci01.man.ac.uk/pessl/pessl29.html#HDRXINITB), it seems that
> blacs_get should be called before blacs_gridinit (this is how it is
> done in SL_init). I have tried...
> This solves the initialization part, but then it crashes with a SIGSEGV
>
> azalee-m utest7> run -m hibiscus  dmpirun -np 4 lapw1_mpi lapw1_1.def
>  Using            4  processors, My ID =            0
>  Using            4  processors, My ID =            2
>  Using            4  processors, My ID =            3
>  Using            4  processors, My ID =            1
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> cannot read in lapw1_mpi
> cannot read in lapw1_mpi
> cannot read in cannot read in lapw1_mpi
> lapw1_mpi
> MPI process 661759 exited with status 174
> MPI process 661758 exited with status 174
> MPI process 661760 exited with status 174
> MPI process 661757 exited with status 174
>
>
> So, if you have an idea for a solution, that would be great.
>
> PS: for Peter, if you are interested in a speed test on the EV7
> processor, just let me know; I can run a benchmark for you.
>
> Regards
> Florent
>
> --
>  --------------------------------------------------------------------------
> | Florent BOUCHER                    |                                     |
> | Institut des Matériaux Jean Rouxel | Mailto:Florent.Boucher at cnrs-imn.fr  |
> | 2, rue de la Houssinière           | Phone: (33) 2 40 37 39 24           |
> | BP 32229                           | Fax:   (33) 2 40 37 39 95           |
> | 44322 NANTES CEDEX 3 (FRANCE)      | http://www.cnrs-imn.fr              |
>  --------------------------------------------------------------------------
>
>
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>
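
A note on the blacs_get point raised in the quoted message: the following
is a minimal stand-alone sketch, not the WIEN2k source, of fetching a
default system context with BLACS_GET before calling BLACS_GRIDINIT. The
program name GRIDDEMO and its variable names are hypothetical. It assumes
an MPI-based BLACS library and the same NPE x 1 row-major grid as in the
quoted modules.F fragment. It only illustrates the call order and says
nothing about the segmentation fault reported afterwards.

```fortran
      PROGRAM GRIDDEMO
!     Hypothetical stand-alone test program, not part of WIEN2k.
      IMPLICIT NONE
      INCLUDE 'mpif.h'
      INTEGER :: IERR, NPE, MYID, ICTXT
      INTEGER :: NPROW, NPCOL, MYROW, MYCOL

      CALL MPI_INIT(IERR)
      CALL MPI_COMM_SIZE(MPI_COMM_WORLD, NPE, IERR)
      CALL MPI_COMM_RANK(MPI_COMM_WORLD, MYID, IERR)

!     Step 1: ask BLACS for the default system context (WHAT = 0).
!     The first argument is ignored for this query.
      CALL BLACS_GET(-1, 0, ICTXT)

!     Step 2: build an NPE x 1 row-major process grid on that context.
!     On return ICTXT holds the handle of the new grid.
      CALL BLACS_GRIDINIT(ICTXT, 'R', NPE, 1)

!     Query the grid to confirm that every process got a position.
      CALL BLACS_GRIDINFO(ICTXT, NPROW, NPCOL, MYROW, MYCOL)
      WRITE(6,*) 'ID ', MYID, ' grid ', NPROW, NPCOL,
     &           ' position ', MYROW, MYCOL

!     Release the grid. BLACS_EXIT(1) leaves MPI running, so that
!     MPI_FINALIZE is called explicitly here.
      CALL BLACS_GRIDEXIT(ICTXT)
      CALL BLACS_EXIT(1)
      CALL MPI_FINALIZE(IERR)
      END PROGRAM GRIDDEMO
```

The sketch is written in fixed source form, like a .F file. It would have
to be linked against the local MPI and BLACS libraries and started with
the same launcher that is used for lapw1_mpi.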


                                      P.Blaha
--------------------------------------------------------------------------
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-15671             FAX: +43-1-58801-15698
Email: blaha at theochem.tuwien.ac.at    WWW: http://info.tuwien.ac.at/theochem/
--------------------------------------------------------------------------



