<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#ffffff" text="#000000">
    Dear Wei,<br>
    <br>
    Maybe -machinefile is fine for your mpirun after all. Which options
    does it accept? What does its help output say?<br>
    <br>
    Try restoring your MPIRUN variable with -machinefile and rerun the
    calculation. Then check what is in the .machine0/1/2 files and let
    us know. The contents should be 8 lines for the r1i0n0 node and 8
    lines for the r1i0n1 node.<br>
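    <br>
    In case it helps, here is a minimal Python sketch of that check. It
    assumes each .machineN file lists one host name per line, which I
    have not verified for your setup; the function name count_hosts and
    the sample text are only illustrative.<br>
    <pre>
```python
from collections import Counter


def count_hosts(machine_file_text):
    """Count how many non-blank lines name each host."""
    hosts = [line.strip() for line in machine_file_text.splitlines() if line.strip()]
    return Counter(hosts)


# Hypothetical file contents, built from the expectation stated above
# (8 lines per node); this is not output copied from a real run.
sample = "\n".join(["r1i0n0"] * 8 + ["r1i0n1"] * 8)
counts = count_hosts(sample)
print(dict(counts))
```
</pre>
    To run it on a real file, read the file's text first (for example
    with open(".machine0").read()) and pass that to count_hosts.<br>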
    <br>
    One more thing you should check is the $WIENROOT/parallel_options
    file. What does it contain?<br>
    <pre class="moz-signature" cols="72">Best regards,
   Maxim Rakitin
   email: <a class="moz-txt-link-abbreviated" href="mailto:rms85@physics.susu.ac.ru">rms85@physics.susu.ac.ru</a>
   web: <a class="moz-txt-link-freetext" href="http://www.susu.ac.ru">http://www.susu.ac.ru</a></pre>
    <br>
    01.11.2010 9:06, Wei Xie wrote:
    <blockquote cite="mid:524CB9BF-DC7E-4688-B113-89C81F6272B1@wisc.edu"
      type="cite">Hi Maxim,
      <div><br>
      </div>
      <div>Thanks for your reply! </div>
      <div>We tried MPIRUN=mpirun -np _NP_ -hostfile _HOSTS_ _EXEC_, but
        the problem persists. The only difference is that stdout now
        reads "… MPI: invalid option -hostfile …".</div>
      <div><br>
      </div>
      <div>Thanks,</div>
      <div>Wei</div>
      <div><br>
      </div>
      <div><br>
        <div>
          <div>On Oct 31, 2010, at 10:40 PM, Maxim Rakitin wrote:</div>
          <br class="Apple-interchange-newline">
          <blockquote type="cite">
            <div bgcolor="#ffffff" text="#000000"> Hi,<br>
              <br>
              It looks like Intel's mpirun doesn't have a '-machinefile'
              option. Instead, it has a '-hostfile' option (from
              here: <a moz-do-not-send="true"
                class="moz-txt-link-freetext"
                href="http://downloadmirror.intel.com/18462/eng/nes_release_notes.txt">http://downloadmirror.intel.com/18462/eng/nes_release_notes.txt</a>).<br>
              <br>
              Try 'mpirun -h' for information about the available
              options and apply the appropriate one.<br>
              <pre class="moz-signature" cols="72">Best regards,
   Maxim Rakitin
   email: <a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:rms85@physics.susu.ac.ru">rms85@physics.susu.ac.ru</a>
   web: <a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://www.susu.ac.ru/">http://www.susu.ac.ru</a></pre>
              <br>
              01.11.2010 4:56, Wei Xie wrote:
              <blockquote
                cite="mid:2C0098E9-D05E-46B8-9BED-983152FB7772@wisc.edu"
                type="cite">
                <div>Dear all WIEN2k community members:</div>
                <div><br>
                </div>
                <div>We encountered a problem when running in
                  parallel (k-point, MPI, or both): the calculations
                  crashed at LAPW2. Note that we had no problem running
                  in serial. We have tried to diagnose the problem,
                  recompiled the code with different options, and tested
                  different cases and parameters based on similar
                  problems reported on the mailing list, but the problem
                  persists. So we are writing here in the hope that
                  someone can offer some suggestions. We have attached
                  the related files below for your reference. Thank you
                  in advance for your replies! </div>
                <div><br>
                </div>
                <div>This is a TiC example running with both k-point and
                  MPI parallelization on two nodes <i>r1i0n0</i> and <i>r1i0n1</i> (8 cores/node):</div>
                <div><br>
                </div>
                <div><b>1. </b><b>stdout </b><b>(abridged) </b></div>
                <div>MPI: invalid option -machinefile</div>
                <div>real<span class="Apple-tab-span"
                    style="white-space: pre;"> </span>0m0.004s</div>
                <div>user<span class="Apple-tab-span"
                    style="white-space: pre;"> </span>0m0.000s</div>
                <div>sys<span class="Apple-tab-span" style="white-space:
                    pre;"> </span>0m0.000s</div>
                <div>...</div>
                <div>MPI: invalid option -machinefile</div>
                <div>real<span class="Apple-tab-span"
                    style="white-space: pre;"> </span>0m0.003s</div>
                <div>user<span class="Apple-tab-span"
                    style="white-space: pre;"> </span>0m0.000s</div>
                <div>sys<span class="Apple-tab-span" style="white-space:
                    pre;"> </span>0m0.004s</div>
                <div>TiC.scf1up_1: No such file or directory.</div>
                <div><br>
                </div>
                <div>LAPW2 - Error. Check file lapw2.error</div>
                <div>cp: cannot stat `.in.tmp': No such file or
                  directory</div>
                <div>rm: cannot remove `.in.tmp': No such file or
                  directory</div>
                <div><b><span class="Apple-style-span"
                      style="font-weight: normal;">rm: cannot remove
                      `.in.tmp1': No such file or directory</span></b></div>
                <div><b><br>
                  </b></div>
                <div><b><span class="Apple-style-span"
                      style="font-weight: normal;"></span>2. TiC.dayfile
                    (abridged) </b></div>
                <div>...</div>
                <div>    start <span class="Apple-tab-span"
                    style="white-space: pre;"> </span>(Sun Oct 31
                  16:25:06 MDT 2010) with lapw0 (40/99 to go)</div>
                <div>    cycle 1 <span class="Apple-tab-span"
                    style="white-space: pre;"> </span>(Sun Oct 31
                  16:25:06 MDT 2010) <span class="Apple-tab-span"
                    style="white-space: pre;"> </span>(40/99 to go)</div>
                <div><br>
                </div>
                <div>&gt;   lapw0 -p<span class="Apple-tab-span"
                    style="white-space: pre;"> </span>(16:25:06)
                  starting parallel lapw0 at Sun Oct 31 16:25:07 MDT
                  2010</div>
                <div>-------- .machine0 : 16 processors</div>
                <div>invalid "local" arg: -machinefile</div>
                <div><br>
                </div>
                <div>0.436u 0.412s 0:04.63 18.1%<span
                    class="Apple-tab-span" style="white-space: pre;"> </span>0+0k
                  2600+0io 1pf+0w</div>
                <div>&gt;   lapw1  -up -p   <span class="Apple-tab-span"
                    style="white-space: pre;"> </span>(16:25:12)
                  starting parallel lapw1 at Sun Oct 31 16:25:12 MDT
                  2010</div>
                <div>-&gt;  starting parallel LAPW1 jobs at Sun Oct 31
                  16:25:12 MDT 2010</div>
                <div>running LAPW1 in parallel mode (using .machines)</div>
                <div>2 number_of_parallel_jobs</div>
                <div>     r1i0n0 r1i0n0 r1i0n0 r1i0n0 r1i0n0 r1i0n0
                  r1i0n0 r1i0n0(1)      r1i0n1 r1i0n1 r1i0n1 r1i0n1
                  r1i0n1 r1i0n1 r1i0n1 r1i0n1(1)      r1i0n0 r1i0n0
                  r1i0n0 r1i0n0 r1i0n0 r1i0n0 r1i0n0 r1i0n0(1)  
                   Summary of lapw1para:</div>
                <div>   r1i0n0<span class="Apple-tab-span"
                    style="white-space: pre;"> </span> k=0<span
                    class="Apple-tab-span" style="white-space: pre;"> </span> user=0<span
                    class="Apple-tab-span" style="white-space: pre;"> </span> wallclock=0</div>
                <div>   r1i0n1<span class="Apple-tab-span"
                    style="white-space: pre;"> </span> k=0<span
                    class="Apple-tab-span" style="white-space: pre;"> </span> user=0<span
                    class="Apple-tab-span" style="white-space: pre;"> </span> wallclock=0</div>
                <div>...</div>
                <div>0.116u 0.316s 0:10.48 4.0%<span
                    class="Apple-tab-span" style="white-space: pre;"> </span>0+0k
                  0+0io 0pf+0w</div>
                <div>&gt;   lapw2 -up -p  <span class="Apple-tab-span"
                    style="white-space: pre;"> </span>(16:25:34)
                  running LAPW2 in parallel mode</div>
                <div>**  LAPW2 crashed!</div>
                <div>0.032u 0.104s 0:01.13 11.5%<span
                    class="Apple-tab-span" style="white-space: pre;"> </span>0+0k
                  82304+0io 8pf+0w</div>
                <div>error: command   /home/xiew/WIEN2k_10/lapw2para -up
                  uplapw2.def   failed</div>
                <div><br>
                </div>
                <div><b>3. uplapw2.error </b></div>
                <div>Error in LAPW2</div>
                <div> 'LAPW2' - can't open unit: 18                    
                                             </div>
                <div> 'LAPW2' -        filename: TiC.vspup              
                                            </div>
                <div> 'LAPW2' -          status: old          form:
                  formatted                      </div>
                <div>**  testerror: Error in Parallel LAPW2</div>
                <div><br>
                </div>
                <div>
                  <div>
                    <div><b>4. .machines</b></div>
                    <div>#</div>
                    <div>1:r1i0n0:8</div>
                    <div>1:r1i0n1:8</div>
                    <div>lapw0:r1i0n0:8 r1i0n1:8 </div>
                    <div>granularity:1</div>
                    <div>extrafine:1</div>
                  </div>
                </div>
                <div><br>
                </div>
                <div>
                  <div><b>5. compilers, MPI and options</b></div>
                  <div>Intel Compilers  and MKL 11.1.046</div>
                  <div>Intel MPI 3.2.0.011</div>
                  <div><br>
                  </div>
                  <div>current:FOPT:-FR -mp1 -w -prec_div -pc80 -pad -ip
                    -DINTEL_VML -traceback</div>
                  <div>current:FPOPT:-FR -mp1 -w -prec_div -pc80 -pad
                    -ip -DINTEL_VML -traceback</div>
                  <div>current:LDFLAGS:$(FOPT)
                    -L/usr/local/intel/Compiler/11.1/046/mkl/lib/em64t
                    -pthread</div>
                  <div>current:DPARALLEL:'-DParallel'</div>
                  <div>current:R_LIBS:-lmkl_lapack -lmkl_intel_lp64
                    -lmkl_intel_thread -lmkl_core -openmp -lpthread
                    -lguide</div>
                  <div>current:RP_LIBS:-L/usr/local/intel/Compiler/11.1/046/mkl/lib/em64t

                    -lmkl_scalapack_lp64
                    /usr/local/intel/Compiler/11.1/046/mkl/lib/em64t/libmkl_solver_lp64.a
                    -Wl,--start-group -lmkl_intel_lp64
                    -lmkl_intel_thread -lmkl_core
                    -lmkl_blacs_intelmpi_lp64 -Wl,--end-group -openmp
                    -lpthread -L/home/xiew/fftw-2.1.5/lib -lfftw_mpi
                    -lfftw $(R_LIBS)</div>
                  <div>current:MPIRUN:mpirun -np _NP_ -machinefile
                    _HOSTS_ _EXEC_</div>
                </div>
                <div><br>
                </div>
                <div>Best regards,</div>
                <div>Wei Xie</div>
                <div>Computational Materials Group</div>
                <div>University of Wisconsin-Madison</div>
                <div><br>
                </div>
                <pre wrap=""><fieldset class="mimeAttachmentHeader"></fieldset>
_______________________________________________
Wien mailing list
<a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:Wien@zeus.theochem.tuwien.ac.at">Wien@zeus.theochem.tuwien.ac.at</a>
<a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien">http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien</a>
</pre>
              </blockquote>
            </div>
          </blockquote>
        </div>
        <br>
      </div>
      <pre wrap="">
<fieldset class="mimeAttachmentHeader"></fieldset>
_______________________________________________
Wien mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Wien@zeus.theochem.tuwien.ac.at">Wien@zeus.theochem.tuwien.ac.at</a>
<a class="moz-txt-link-freetext" href="http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien">http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien</a>
</pre>
    </blockquote>
  </body>
</html>