[Wien] MPI problem for LAPW2

Duy Le ttduyle at gmail.com
Wed Sep 30 19:42:36 CEST 2009


Your comment sounds reasonable. However, our machines are pretty new and
they do have 4 GB of RAM per core. I can handle this job on a single core,
so I am not sure the problem is really memory. I will look at the memory
usage in more detail if the same problem appears again.

Anyway, it works now, and I understand that I did not do a stupid thing
when splitting lapw2 into too many parts.
Thank you all for the input.


On Wed, Sep 30, 2009 at 12:17 PM, Laurence Marks
<L-marks at northwestern.edu> wrote:

> It sounds like you are memory limited (RAM). If you use
> lapw2_vector_split:N, the large arrays used by lapw2 are split into N
> parts. If M is the size of those arrays, the total memory requirement
> is still M, but each process then only has to hold about M/N; with the
> method you are using now (no split) every process needs the full M. If
> you have 21 atoms (total) I am surprised that you have this problem;
> perhaps you need more memory. (If it is 21 unique atoms then it is
> possible, but still surprising.) Have a look at /proc/meminfo, and if
> you are running ganglia, look at your memory records. You need
> something like 2 GB per core, and more may be better for newer
> systems; with 1 GB or less per core you can easily run into this sort
> of problem.
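>
> For illustration, a hypothetical .machines fragment (not from your run)
> for one 4-process MPI job with the lapw2 arrays split in two would be:
>
>   1:compute-0-2:4
>   lapw2_vector_split:2
>
> which should cut the per-process memory for those arrays roughly in half.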
>
> 2009/9/30 Duy Le <ttduyle at gmail.com>:
> > Thank you all for your input.
> > I am running a test on a system of 21 atoms, spin polarized, with 2
> > k-points and without inversion symmetry. Of course this test is only a
> > small system, so there should be no problem with the matrix size. The
> > .machines file was given in my previous email.
> > Good news: the problem has been solved. By using
> > lapw2_vector_split:$NCPUS_per_MPI_JOB
> > I am able to finish the benchmark test with 1, 2, 4, 8, and 16 CPUs (on
> > the same nodes), both fully MPI and hybrid k-parallel & MPI.
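> > (That is, with 4 cores per MPI job the line becomes
> > lapw2_vector_split:4, with 8 cores lapw2_vector_split:8, and so on.)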
> >
> > I am really not sure that what I did
> > (lapw2_vector_split:$NCPUS_per_MPI_JOB) is correct.
> > Could anyone explain it to me? I am pretty new to Wien2k.
> > Thank you.
> > On Wed, Sep 30, 2009 at 3:12 AM, Peter Blaha
> > <pblaha at theochem.tuwien.ac.at> wrote:
> >>
> >> Very unusual; I cannot believe that 3 or 7 nodes run efficiently
> >> (lapw1) or are necessary.
> >> Maybe memory is an issue and you should try to set
> >>
> >> lapw2_vector_split:2
> >>
> >> (with an even number of processors!)
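> >>
> >> For example, with lapw2_vector_split:2 an 8-processor lapw2_mpi job
> >> would be divided into two groups of 4; the split only comes out even
> >> when the processor count is divisible by N.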
> >>
> >>> I can run MPI with lapw0, lapw1, and lapw2. However, lapw2 runs
> >>> without problem only with certain numbers of PROCESSORS PER MPI JOB
> >>> (in both cases: fully MPI and/or hybrid k-parallel + MPI). Those
> >>> numbers are 3 and 7. If I try to run with any other number of
> >>> PROCESSORS PER MPI JOB, it gives me a message like the one below.
> >>> This problem does not occur with lapw0 and lapw1. If any of you could
> >>> give me a suggestion for fixing this problem, it would be appreciated.
> >>>
> >>> [compute-0-2.local:08162] *** An error occurred in MPI_Comm_split
> >>> [compute-0-2.local:08162] *** on communicator MPI_COMM_WORLD
> >>> [compute-0-2.local:08162] *** MPI_ERR_ARG: invalid argument of some other kind
> >>> [compute-0-2.local:08162] *** MPI_ERRORS_ARE_FATAL (goodbye)
> >>> forrtl: error (78): process killed (SIGTERM)
> >>> Image              PC                Routine            Line        Source
> >>> libpthread.so.0    000000383440DE80  Unknown            Unknown     Unknown
> >>> ........... etc....
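> >>>
> >>> Aside: MPI_ERR_ARG from MPI_Comm_split generally means the call was
> >>> made with an invalid "color" or "key" argument. A minimal standalone C
> >>> sketch of such a communicator split, purely illustrative and not taken
> >>> from the WIEN2k sources:
> >>>
> >>> #include <mpi.h>
> >>> #include <stdio.h>
> >>>
> >>> int main(int argc, char **argv) {
> >>>     MPI_Init(&argc, &argv);
> >>>     int rank, size, nsplit = 2;      /* cf. lapw2_vector_split:2 */
> >>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> >>>     MPI_Comm_size(MPI_COMM_WORLD, &size);
> >>>     /* color must be >= 0 (or MPI_UNDEFINED); anything else triggers
> >>>        MPI_ERR_ARG, as in the log above. */
> >>>     int color = rank % nsplit;
> >>>     MPI_Comm sub;
> >>>     MPI_Comm_split(MPI_COMM_WORLD, color, rank, &sub);
> >>>     printf("rank %d of %d -> group %d\n", rank, size, color);
> >>>     MPI_Comm_free(&sub);
> >>>     MPI_Finalize();
> >>>     return 0;
> >>> }
> >>>
> >>> Built and run with, e.g., "mpicc split.c -o split && mpirun -np 4 ./split".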
> >>>
> >>>
> >>> Reference:
> >>> OPTIONS file:
> >>> current:FOPT:-FR -mp1 -w -prec_div -pc80 -pad -align -DINTEL_VML -traceback
> >>> current:FPOPT:$(FOPT)
> >>> current:LDFLAGS:$(FOPT) -L/share/apps/fftw-3.2.1/lib/ -lfftw3 -L/share/apps/intel/mkl/10.0.011/lib/em64t -i-static -openmp
> >>> current:DPARALLEL:'-DParallel'
> >>> current:R_LIBS:-lmkl_lapack -lmkl_core -lmkl_em64t -lguide -lpthread
> >>> current:RP_LIBS:-lmkl_scalapack_lp64 -lmkl_solver_lp64_sequential -Wl,--start-group -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_openmpi_lp64 -Wl,--end-group -lpthread -lmkl_em64t -L/share/apps/intel/fce/10.1.008/lib -limf
> >>> current:MPIRUN:mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_
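> >>>
> >>> (At run time the WIEN2k parallel scripts substitute _NP_, _HOSTS_ and
> >>> _EXEC_ with the process count, the generated machine file, and the
> >>> executable plus its .def file, so the effective command looks
> >>> something like "mpirun -np 4 -machinefile .machine1 lapw2_mpi
> >>> lapw2_1.def 1"; the exact file names depend on the setup.)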
> >>>
> >>> Open MPI 1.2.6
> >>> Intel compiler 10
> >>>
> >>> .machines
> >>> lapw0:compute-0-2:4
> >>> 1:compute-0-2:4
> >>> granularity:1
> >>> extrafine:1
> >>> lapw2_vector_split:1
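> >>>
> >>> (For reference: the lapw0 line runs lapw0_mpi on 4 cores of
> >>> compute-0-2; "1:compute-0-2:4" defines one k-point group running
> >>> lapw1/lapw2 as a 4-process MPI job; granularity and extrafine control
> >>> how the k-points are distributed; lapw2_vector_split:1 leaves the
> >>> lapw2 arrays unsplit.)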
> >>>
> >>
> >> --
> >>
> >>                                      P.Blaha
> >>
> --------------------------------------------------------------------------
> >> Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
> >> Phone: +43-1-58801-15671             FAX: +43-1-58801-15698
> >> Email: blaha at theochem.tuwien.ac.at    WWW:
> >> http://info.tuwien.ac.at/theochem/
> >>
> --------------------------------------------------------------------------
>
>
>
> --
> Laurence Marks
> Department of Materials Science and Engineering
> MSE Rm 2036 Cook Hall
> 2220 N Campus Drive
> Northwestern University
> Evanston, IL 60208, USA
> Tel: (847) 491-3996 Fax: (847) 491-7820
> email: L-marks at northwestern dot edu
> Web: www.numis.northwestern.edu
> Chair, Commission on Electron Crystallography of IUCR
> www.numis.northwestern.edu/
> Electron crystallography is the branch of science that uses electron
> scattering and imaging to study the structure of matter.
>



-- 
--------------------------------------------------
Duy Le
PhD Student
Department of Physics
University of Central Florida.