<br>Your comment sounds reasonable. However, our machines are fairly new and have 4 GB of RAM per core, and I can handle this job on a single core, so I am not sure the problem is really memory. I will look at the memory usage in more detail the next time the problem appears.<br>
<br>Anyway, it works now, and I understand that splitting lapw2 into that many parts was not a mistake.<br>Thank you all for the input.<br><br><br><div class="gmail_quote">On Wed, Sep 30, 2009 at 12:17 PM, Laurence Marks <span dir="ltr"><<a href="mailto:L-marks@northwestern.edu">L-marks@northwestern.edu</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">It sounds like you are memory limited (RAM). If you use<br>
lapw2_vector_split:N, the arrays used by lapw2 are split into N<br>
parts, so each process only has to hold about 1/N of them; if M is the<br>
size of the arrays, then with the setting you are currently using the<br>
requirement is always the full M. If you have 21 atoms (total) I am<br>
surprised that you have this problem; perhaps you need more memory. (If<br>
it is 21 unique atoms then it is possible, but still surprising.) Have a<br>
look at /proc/meminfo, and if you are running ganglia, look at your<br>
memory records. You need something like 2 GB per core, and more may be<br>
better for newer systems; with 1 GB or less per core you can easily run<br>
into this sort of problem.<br>
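(As a rough illustration of the scaling, with hypothetical numbers not taken from this calculation: if the lapw2 arrays total M = 8 GB, then lapw2_vector_split:4 leaves each of the 4 processes holding roughly M/4 = 2 GB, which fits within 2 to 4 GB per core, whereas an unsplit run, lapw2_vector_split:1, needs the full 8 GB in a single process.)<br>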
<br>
2009/9/30 Duy Le <<a href="mailto:ttduyle@gmail.com">ttduyle@gmail.com</a>>:<br>
<div><div></div><div class="h5">> Thank you for all your input.<br>
> I am running a test on a system of 21 atoms, with a spin-polarized calculation,<br>
> 2 k-points, and no inversion symmetry. Of course this is only a test on a<br>
> small system, so there should be no problem with the matrix size. The<br>
> .machines file was given in my previous email.<br>
> Good news: the problem has been solved. By using<br>
> lapw2_vector_split:$NCUS_per_MPI_JOB<br>
> I am able to finish the benchmark test with 1, 2, 4, 8, and 16 CPUs (on the same<br>
> nodes), either fully MPI or hybrid k-parallel & MPI.<br>
><br>
> I am really not sure whether what I did is correct<br>
> (lapw2_vector_split:$NCUS_per_MPI_JOB).<br>
> Could anyone explain this to me? I am pretty new to Wien2k.<br>
> Thank you.<br>
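> For illustration only, a minimal .machines sketch for one of these hybrid runs,<br>
> with lapw2_vector_split set equal to the number of cores per MPI job as described<br>
> above (assuming a single 16-core node, 2 k-points, and 8 cores per MPI job; the<br>
> hostname and core counts are placeholders, not the actual benchmark setup):<br>
> lapw0:compute-0-2:16<br>
> 1:compute-0-2:8<br>
> 1:compute-0-2:8<br>
> granularity:1<br>
> extrafine:1<br>
> lapw2_vector_split:8<br>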
> On Wed, Sep 30, 2009 at 3:12 AM, Peter Blaha <<a href="mailto:pblaha@theochem.tuwien.ac.at">pblaha@theochem.tuwien.ac.at</a>><br>
> wrote:<br>
>><br>
>> Very unusual; I cannot believe that 3 or 7 nodes run efficiently (lapw1)<br>
>> or are necessary.<br>
>> Maybe memory is an issue and you should try to set<br>
>><br>
>> lapw2_vector_split:2<br>
>><br>
>> (with an even number of processors!)<br>
>><br>
>>> I can run lapw0, lapw1, and lapw2 with MPI. However, lapw2 runs<br>
>>> without problems only with certain numbers of PROCESSORS PER MPI JOB (in both<br>
>>> cases: fully MPI and/or hybrid k-parallel + MPI). Those numbers are 3<br>
>>> and 7. If I try to run with other numbers of PROCESSORS PER MPI JOB, it<br>
>>> gives me a message like the one below. This problem does not occur with lapw0 or<br>
>>> lapw1. If any of you could give me some suggestions for fixing this problem,<br>
>>> it would be appreciated.<br>
>>><br>
>>> [compute-0-2.local:08162] *** An error occurred in MPI_Comm_split<br>
>>> [compute-0-2.local:08162] *** on communicator MPI_COMM_WORLD<br>
>>> [compute-0-2.local:08162] *** MPI_ERR_ARG: invalid argument of some other<br>
>>> kind<br>
>>> [compute-0-2.local:08162] *** MPI_ERRORS_ARE_FATAL (goodbye)<br>
>>> forrtl: error (78): process killed (SIGTERM)<br>
>>> Image             PC                Routine   Line      Source<br>
>>> libpthread.so.0   000000383440DE80  Unknown   Unknown   Unknown<br>
>>> ........... etc....<br>
>>><br>
>>><br>
>>> Reference:<br>
>>> OPTIONS file:<br>
>>> current:FOPT:-FR -mp1 -w -prec_div -pc80 -pad -align -DINTEL_VML -traceback<br>
>>> current:FPOPT:$(FOPT)<br>
>>> current:LDFLAGS:$(FOPT) -L/share/apps/fftw-3.2.1/lib/ -lfftw3 -L/share/apps/intel/mkl/10.0.011/lib/em64t -i-static -openmp<br>
>>> current:DPARALLEL:'-DParallel'<br>
>>> current:R_LIBS:-lmkl_lapack -lmkl_core -lmkl_em64t -lguide -lpthread<br>
>>> current:RP_LIBS:-lmkl_scalapack_lp64 -lmkl_solver_lp64_sequential -Wl,--start-group -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_openmpi_lp64 -Wl,--end-group -lpthread -lmkl_em64t -L/share/apps/intel/fce/10.1.008/lib -limf<br>
>>> current:MPIRUN:mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_<br>
>>><br>
>>> Openmpi 1.2.6<br>
>>> Intel compiler 10<br>
>>><br>
>>> .machines<br>
>>> lapw0:compute-0-2:4<br>
>>> 1:compute-0-2:4<br>
>>> granularity:1<br>
>>> extrafine:1<br>
>>> lapw2_vector_split:1<br>
>>><br>
>>> --------------------------------------------------<br>
>>> Duy Le<br>
>>> PhD Student<br>
>>> Department of Physics<br>
>>> University of Central Florida.<br>
>>><br>
>>><br>
>><br>
>> --<br>
>><br>
>> P.Blaha<br>
>> --------------------------------------------------------------------------<br>
>> Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna<br>
>> Phone: +43-1-58801-15671 FAX: +43-1-58801-15698<br>
>> Email: <a href="mailto:blaha@theochem.tuwien.ac.at">blaha@theochem.tuwien.ac.at</a> WWW:<br>
>> <a href="http://info.tuwien.ac.at/theochem/" target="_blank">http://info.tuwien.ac.at/theochem/</a><br>
>> --------------------------------------------------------------------------<br>
><br>
><br>
><br>
> --<br>
> --------------------------------------------------<br>
> Duy Le<br>
> PhD Student<br>
> Department of Physics<br>
> University of Central Florida.<br>
><br>
><br>
><br>
<br>
<br>
<br>
--<br>
</div></div>Laurence Marks<br>
Department of Materials Science and Engineering<br>
MSE Rm 2036 Cook Hall<br>
2220 N Campus Drive<br>
Northwestern University<br>
Evanston, IL 60208, USA<br>
Tel: (847) 491-3996 Fax: (847) 491-7820<br>
email: L-marks at northwestern dot edu<br>
Web: <a href="http://www.numis.northwestern.edu" target="_blank">www.numis.northwestern.edu</a><br>
Chair, Commission on Electron Crystallography of IUCR<br>
<a href="http://www.numis.northwestern.edu/" target="_blank">www.numis.northwestern.edu/</a><br>
Electron crystallography is the branch of science that uses electron<br>
scattering and imaging to study the structure of matter.<br>
<div><div></div><div class="h5">_______________________________________________<br>
Wien mailing list<br>
<a href="mailto:Wien@zeus.theochem.tuwien.ac.at">Wien@zeus.theochem.tuwien.ac.at</a><br>
<a href="http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien" target="_blank">http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien</a><br>
</div></div></blockquote></div><br><br clear="all"><br>-- <br>--------------------------------------------------<br>Duy Le<br>PhD Student<br>Department of Physics<br>University of Central Florida.<br>