<div dir="ltr">Dear Prof. Blaha, Marks, Rubel and Abo,<div><br></div><div> First of all, I would like to thank your attention concerning my mpiexec_mpt problem. It is now solved. The hint was in the documentation sent by Prof. Marks and Abo ( <a href="http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi?coll=linux&db=man&fname=/usr/share/catman/man1/mpiexec_mpt.1.html" target="_blank">http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi?coll=linux&db=man&fname=/usr/share/catman/man1/mpiexec_mpt.1.html</a> ). At its end it is written : </div>
<div><br></div><div><div>" The mpiexec_mpt command reads the node list from the $PBS_NODEFILE file. "</div></div><div><br><div class="gmail_extra">what means that the -machinefile option must be omitted in the " setenv WIEN_MPIRUN " line of parallel_options file when one is using mpiexec_mpt.</div>
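   For the record, my WIEN_MPIRUN line now looks roughly like the sketch below (the _NP_ and _EXEC_ placeholders are the usual ones from parallel_options; whether mpiexec_mpt prefers -np or -n is best checked with "mpiexec_mpt --help" on each system):

    # parallel_options: mpiexec_mpt takes the node list from $PBS_NODEFILE,
    # so the "-machinefile _HOSTS_" part is simply dropped
    setenv WIEN_MPIRUN "mpiexec_mpt -np _NP_ _EXEC_"

In other words, it is the same as the usual mpirun line, only without "-machinefile _HOSTS_".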
<div class="gmail_extra"><br></div><div class="gmail_extra"> I would like to ask another question: is it dangerous to use " extrafine " and " -it " simultaneously in a parallel calculation ?</div><div class="gmail_extra">
I have some indications (using the SGI cluster and a DELL workstation) that:</div><div class="gmail_extra"><br></div><div class="gmail_extra">1) " extrafine " WITHOUT " -it " is fine</div><div class="gmail_extra">
2) " -it " WITHOUT " extrafine " is fine</div><div class="gmail_extra">3) " extrafine " WITH " -it " does not succeed, giving rise to the following error message in the SGI cluster (scratch is the working directory) :</div>
<div class="gmail_extra"><br></div><div class="gmail_extra"><div class="gmail_extra">forrtl: severe (41): insufficient virtual memory</div><div class="gmail_extra">Image PC Routine Line Source </div>
<div class="gmail_extra">lapw1c 000000000052E04A Unknown Unknown Unknown</div><div class="gmail_extra">lapw1c 000000000052CB46 Unknown Unknown Unknown</div><div class="gmail_extra">
lapw1c 00000000004D6B50 Unknown Unknown Unknown</div><div class="gmail_extra">lapw1c 00000000004895CF Unknown Unknown Unknown</div><div class="gmail_extra">lapw1c 00000000004BA106 Unknown Unknown Unknown</div>
<div class="gmail_extra">lapw1c 0000000000478D9A jacdavblock_ 240 jacdavblock_tmp_.F</div><div class="gmail_extra">lapw1c 0000000000470690 seclr5_ 277 seclr5_tmp_.F</div>
<div class="gmail_extra">lapw1c 000000000040FA16 calkpt_ 241 calkpt_tmp_.F</div><div class="gmail_extra">lapw1c 0000000000449EB3 MAIN__ 61 lapw1_tmp_.F</div>
<div class="gmail_extra">lapw1c 000000000040515C Unknown Unknown Unknown</div><div class="gmail_extra">libc.so.6 00002ABBA7930BC6 Unknown Unknown Unknown</div><div class="gmail_extra">
lapw1c 0000000000405059 Unknown Unknown Unknown</div><div class="gmail_extra"><br></div><div class="gmail_extra"><br></div><div class="gmail_extra" style>only for some processors, not all of them. This is a little bit strange, remembering that all the nodes are equal in the cluster. May this have relation with the (number of k-points)/(number of processors) ratio ?</div>
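   To be concrete, the failing combination is roughly the one sketched below (a stripped-down .machines; the node names are just placeholders for our cluster nodes, and the real file lists more of them):

    # .machines (simplified sketch)
    granularity:1
    1:r1i0n8
    1:r1i0n9
    extrafine:1

together with iterative diagonalization switched on at the command line, e.g.

    run_lapw -p -it

whereas dropping either "extrafine:1" from .machines or "-it" from the command line makes the crash go away.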
<div class="gmail_extra" style><br></div><div class="gmail_extra" style> Well, many thanks again.</div><div class="gmail_extra" style> All the best,</div><div class="gmail_extra" style> Luis</div><div class="gmail_extra" style>
<br></div><div class="gmail_extra"><br></div><div class="gmail_extra"><br></div><div class="gmail_extra"><br></div><div class="gmail_extra"><br></div><br><div class="gmail_quote">2013/7/11 Laurence Marks <span dir="ltr"><<a href="mailto:L-marks@northwestern.edu" target="_blank">L-marks@northwestern.edu</a>></span><br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">I 99,9% agree with what Peter just said.<br>
<br>
According to the man page at<br>
<a href="http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi?coll=linux&db=man&fname=/usr/share/catman/man1/mpiexec_mpt.1.html" target="_blank">http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi?coll=linux&db=man&fname=/usr/share/catman/man1/mpiexec_mpt.1.html</a><br>
(which may be wrong for you), the same global options as mpirun<br>
accepts will work. Therefore just use " mpirun --help" and look for<br>
whatever is the option for file mapping procs to machines on your<br>
system, then change WIEN_MPIRUN in parallel_options.<br>
<br>
A word of other advice concerning talking to the sys_admins at your<br>
center. I have found without exception that they expect people to<br>
launch just one mpi task which runs for hours to days. All the<br>
schedulers that I have come across expect this. Wien2k is much smarter<br>
than this, and can exploit the cores much better. Therefore you will<br>
have to "filter" (i.e. in some cases ignore) what you are told if it<br>
is not appropriate. Sometimes this takes more time than anything else!<br>
<div class="im"><br>
On Thu, Jul 11, 2013 at 9:41 AM, Peter Blaha<br>
<<a href="mailto:pblaha@theochem.tuwien.ac.at">pblaha@theochem.tuwien.ac.at</a>> wrote:<br>
> But I'm afraid only YOU have access to the specific documentation of your system.
>
> As was mentioned before, I would ALWAYS recommend using mpirun,
> which should be a "standardized wrapper" around the specific mpi-scheduler.
>
> Only when your mpi does not have mpirun should you use the more specific calls.
>
> For your case it seems "trivial":
>
> > *mpiexec_mpt error: -machinefile option not supported.*
>
> The option -machinefile does not exist for mpiexec_mpt;
> sometimes it is called -hostfile,
> but you should easily find it out with
>
> man mpiexec_mpt
>
> or mpiexec_mpt --help
>
>
> On 07/11/2013 04:30 PM, Luis Ogando wrote:
</div><div class="im">>> Dear Oleg Rubel,<br>
>><br>
>> I agree with you ! This is the reason I asked for hints from someone<br>
>> that uses WIEN with mpiexec_mpt (to save efforts and time).<br>
>> Thank you again !<br>
>> All the best,<br>
>> Luis<br>
>><br>
>><br>
>><br>
</div>>> 2013/7/11 Oleg Rubel <<a href="mailto:orubel@lakeheadu.ca">orubel@lakeheadu.ca</a> <mailto:<a href="mailto:orubel@lakeheadu.ca">orubel@lakeheadu.ca</a>>><br>
<div class="im">>><br>
>> Dear Luis,<br>
>><br>
>> It looks like the problem is not in Wien2k. I would recommend to<br>
>> make sure that you can get a list of host names correctly before<br>
>> proceeding with wien. There are slight difference between various<br>
>> mpi implementation in a way of passing the host name list.<br>
>><br>
>> Oleg<br>
>><br>
>> On 2013-07-11 9:52 AM, "Luis Ogando" <<a href="mailto:lcodacal@gmail.com">lcodacal@gmail.com</a><br>
</div><div class="im">>> <mailto:<a href="mailto:lcodacal@gmail.com">lcodacal@gmail.com</a>>> wrote:<br>
>>
>> Dear Prof. Marks and Rubel,
>>
>> Many thanks for your kind responses.
>> I am forwarding your messages to the computation center. As soon as
>> I have any reply, I will contact you.
>>
>> I know that they have other wrappers (Intel MPI, for example), but
>> they argue that mpiexec_mpt is the optimized option. I really doubt
>> that this option will succeed, because I am getting the following
>> error message in case.dayfile (marked in bold):
>>
>> ================================================================================
>> Calculating InPwurt15InPzb3 in
>> /home/ice/proj/proj546/ogando/Wien/Calculos/InP/InPwurtInPzb/15camadasWZ+3ZB/InPwurt15InPzb3
>> on r1i0n8 with PID 6433
>> using WIEN2k_12.1 (Release 22/7/2012) in
>> /home/ice/proj/proj546/ogando/RICARDO2/wien/src
>>
>> start (Wed Jul 10 13:29:42 BRT 2013) with lapw0 (150/99 to go)
>>
>> cycle 1 (Wed Jul 10 13:29:42 BRT 2013) (150/99 to go)
>>
>> >   lapw0 -grr -p   (13:29:42) starting parallel lapw0 at Wed Jul 10 13:29:42 BRT 2013
>> -------- .machine0 : 12 processors
>> *mpiexec_mpt error: -machinefile option not supported.*
>> 0.016u 0.008s 0:00.40 2.5% 0+0k 0+176io 0pf+0w
>> error: command /home/ice/proj/proj546/ogando/RICARDO2/wien/src/lapw0para -c lapw0.def failed
>>
>> >   stop error
>> ================================================================================
>>
>> Regarding the -sgi option: I am using the -pbs option because PBS is
>> the queueing system. As I said, it works well for parallel execution
>> that uses just one node.
>> Many thanks again,
>> Luis
>>
>>
>> 2013/7/11 Oleg Rubel <orubel@lakeheadu.ca>:
<div><div class="h5">>><br>
>> Dear Luis,<br>
>><br>
>> Can you run other MPI codes under SGI scheduler on your<br>
>> cluster? In any case, I would suggest first to try the<br>
>> simplest check<br>
>><br>
>> mpiexec -n $NSLOTS hostname<br>
>><br>
>> this is what we use for Wien2k<br>
>><br>
>> mpiexec -machinefile _HOSTS_ -n _NP_ _EXEC_<br>
>><br>
>> the next line is also useful to ensure a proper CPU load<br>
>><br>
>> setenv MV2_ENABLE_AFFINITY 0<br>
>><br>
>><br>
>> I hope this will help<br>
>> Oleg<br>
>><br>
>><br>
>> On 13-07-11 8:32 AM, Luis Ogando wrote:<br>
>>
>> Dear WIEN2k community,
>>
>> I am trying to use WIEN2k 12.1 on an SGI cluster. When I perform
>> parallel calculations using just "one" node, I can use mpirun and
>> everything goes fine (many thanks to Prof. Marks and his SRC_mpiutil
>> directory).
>> On the other hand, when I want to use more than one node, I have to
>> use mpiexec_mpt and the calculation fails. I also tried mpirun for
>> more than one node, but this is not the proper way on an SGI system
>> and I did not succeed.
>> Well, I would like to know if anyone has experience in using WIEN2k
>> with mpiexec_mpt and could give me any hint.
>> I can give more information; this is only an initial request for help.
>> All the best,
>> Luis
>>
>>
</div></div><div class="im">>> _________________________________________________<br>
>> Wien mailing list<br>
>> Wien@zeus.theochem.tuwien.ac.__at<br>
>> <mailto:<a href="mailto:Wien@zeus.theochem.tuwien.ac.at">Wien@zeus.theochem.tuwien.ac.at</a>><br>
</div>>> <a href="http://zeus.theochem.tuwien." target="_blank">http://zeus.theochem.tuwien.</a>__<a href="http://ac.at/mailman/listinfo/wien" target="_blank">ac.at/mailman/listinfo/wien</a> <<a href="http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien" target="_blank">http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien</a>><br>
>> SEARCH the MAILING-LIST at:<br>
<div class="im">>> <a href="http://www.mail-archive.com/__wien@zeus.theochem.tuwien.ac.__at/index.html" target="_blank">http://www.mail-archive.com/__wien@zeus.theochem.tuwien.ac.__at/index.html</a><br>
>> <<a href="http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html" target="_blank">http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html</a>><br>
>><br>
>> _________________________________________________<br>
>> Wien mailing list<br>
>> Wien@zeus.theochem.tuwien.ac.__at<br>
>> <mailto:<a href="mailto:Wien@zeus.theochem.tuwien.ac.at">Wien@zeus.theochem.tuwien.ac.at</a>><br>
>> <a href="http://zeus.theochem.tuwien." target="_blank">http://zeus.theochem.tuwien.</a>__<a href="http://ac.at/mailman/listinfo/wien" target="_blank">ac.at/mailman/listinfo/wien</a><br>
</div><div class="im">>> <<a href="http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien" target="_blank">http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien</a>><br>
>> SEARCH the MAILING-LIST at:<br>
</div>>> <a href="http://www.mail-archive.com/__wien@zeus.theochem.tuwien.ac.__at/index.html" target="_blank">http://www.mail-archive.com/__wien@zeus.theochem.tuwien.ac.__at/index.html</a><br>
<div class="im">>> <<a href="http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html" target="_blank">http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html</a>><br>
>><br>
>><br>
>><br>
>> _______________________________________________<br>
>> Wien mailing list<br>
>> <a href="mailto:Wien@zeus.theochem.tuwien.ac.at">Wien@zeus.theochem.tuwien.ac.at</a><br>
</div>>> <mailto:<a href="mailto:Wien@zeus.theochem.tuwien.ac.at">Wien@zeus.theochem.tuwien.ac.at</a>><br>
<div class="im">>> <a href="http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien" target="_blank">http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien</a><br>
>> SEARCH the MAILING-LIST at:<br>
>> <a href="http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html" target="_blank">http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html</a><br>
>><br>
>><br>
>> _______________________________________________<br>
>> Wien mailing list<br>
</div>>> <a href="mailto:Wien@zeus.theochem.tuwien.ac.at">Wien@zeus.theochem.tuwien.ac.at</a> <mailto:<a href="mailto:Wien@zeus.theochem.tuwien.ac.at">Wien@zeus.theochem.tuwien.ac.at</a>><br>
<div class="im">>> <a href="http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien" target="_blank">http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien</a><br>
>> SEARCH the MAILING-LIST at:<br>
>> <a href="http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html" target="_blank">http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html</a><br>
>><br>
>><br>
>><br>
>><br>
>> _______________________________________________<br>
>> Wien mailing list<br>
>> <a href="mailto:Wien@zeus.theochem.tuwien.ac.at">Wien@zeus.theochem.tuwien.ac.at</a><br>
>> <a href="http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien" target="_blank">http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien</a><br>
>> SEARCH the MAILING-LIST at: <a href="http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html" target="_blank">http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html</a><br>
>><br>
><br>
</div><div class="im">> --<br>
><br>
> P.Blaha<br>
> --------------------------------------------------------------------------<br>
> Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna<br>
> Phone: <a href="tel:%2B43-1-58801-165300" value="+43158801165300">+43-1-58801-165300</a> FAX: <a href="tel:%2B43-1-58801-165982" value="+43158801165982">+43-1-58801-165982</a><br>
> Email: <a href="mailto:blaha@theochem.tuwien.ac.at">blaha@theochem.tuwien.ac.at</a> WWW:<br>
> <a href="http://info.tuwien.ac.at/theochem/" target="_blank">http://info.tuwien.ac.at/theochem/</a><br>
> --------------------------------------------------------------------------<br>
</div><div class="im">> _______________________________________________<br>
> Wien mailing list<br>
> <a href="mailto:Wien@zeus.theochem.tuwien.ac.at">Wien@zeus.theochem.tuwien.ac.at</a><br>
> <a href="http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien" target="_blank">http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien</a><br>
> SEARCH the MAILING-LIST at: <a href="http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html" target="_blank">http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html</a><br>
<br>
<br>
<br>
--
Professor Laurence Marks
Department of Materials Science and Engineering
Northwestern University
www.numis.northwestern.edu   1-847-491-3996
"Research is to see what everybody else has seen, and to think what
nobody else has thought"
Albert Szent-Gyorgyi
</div><div class=""><div class="h5">_______________________________________________<br>
Wien mailing list<br>
<a href="mailto:Wien@zeus.theochem.tuwien.ac.at">Wien@zeus.theochem.tuwien.ac.at</a><br>
<a href="http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien" target="_blank">http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien</a><br>
SEARCH the MAILING-LIST at: <a href="http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html" target="_blank">http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html</a><br>
</div></div></blockquote></div><br></div></div></div>