<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Dear Peter and Gavin,<div class=""><br class=""></div><div class="">Thank you for your help. Of course I went over the UG, but your explanations cleared things up. I will eventually be doing supercell calculations on TiH and Ti surfaces, so I will look into the MPI errors in detail then. The PBS works fine now, with the #PBS -V command too. Many thanks again.</div><div class=""><br class=""></div><div class="">Yoji</div><div class=""><br class=""></div><div class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Jun 29, 2017, at 14:49, Yoji Kobayashi &lt;<a href="mailto:yojik@scl.kyoto-u.ac.jp" class="">yojik@scl.kyoto-u.ac.jp</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><meta http-equiv="Content-Type" content="text/html; charset=utf-8" class=""><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Dear Users,<div class=""><br class=""></div><div class="">I have some questions/problems regarding parallelization and PBS. </div><div class="">I’m not sure if I’m really running parallel vs. serial, and my PBS script isn’t working.</div><div class=""><br class=""></div><div class="">===</div><div class="">My system info:</div><div class="">Intel Xeon CPU E5-2630 v2 @2.6 GHz, 24 CPUs</div><div class="">Memory: 32GB</div><div class="">Running Wien2k_13, on Ubuntu 14.04.03</div><div class="">File system: ext4</div><div class="">(This is considered a single node with 24 processors?)</div><div class="">===</div><div class="">My first question is, am I really running a parallel calculation in a meaningful way?</div><div class=""><br class=""></div><div class="">What I try:</div><div class="">In w2web, a serial calculation (SCF only) for the TiC example (500 k-points) takes about 25 sec.
to converge.</div><div class="">I then run the same calculation (starting from a new case) with parallelization enabled in w2web, using a slightly different <font face="Courier" class="">.machines</font> file for each case:</div><div class=""><br class=""></div><div class="">Case 1:</div><div class=""><font face="Courier" class="">1:localhost</font></div><div class=""><br class=""></div><div class="">Case 2 (i.e. 20 lines like the ones below):</div><div class=""><div class=""><font face="Courier" class="">1:localhost</font></div><div class=""><font face="Courier" class="">1:localhost</font></div><div class=""><font face="Courier" class="">…</font></div><div class=""><font face="Courier" class="">1:localhost</font></div><div class=""><font face="Courier" class="">1:localhost</font></div></div><div class=""><font face="Courier" class=""><br class=""></font></div><div class="">Case 3:</div><div class=""><font face="Courier" class="">1:localhost:20</font></div><div class=""><br class=""></div><div class="">(no lines referring to granularity, etc. for now)</div><div class=""><br class=""></div><div class="">What I get:</div><div class="">Case 1 takes about 54 sec;</div><div class="">Case 2 takes 1 min 23 sec;</div><div class="">Case 3 gives an error while running <font face="Courier" class="">lapw2</font>; see the <font face="Courier" class="">dayfile</font> below:</div><div class=""><span style="font-family: Courier, fixed;" class="">-----</span></div><div class=""><span style="font-family: Courier, fixed;" class="">Calculating YK-016-TiC in /home/milkbar/Yoji/YK-016-TiC</span></div><div class=""><pre style="font-family: Courier, fixed;" class="">on milkbar-computer with PID 18077
using WIEN2k_13.1 (Release 17/6/2013) in /home/milkbar/WIEN2k_13
start (2017年 6月 29日 木曜日 14:23:39 JST) with lapw0 (40/99 to go)
cycle 1 (2017年 6月 29日 木曜日 14:23:39 JST) (40/99 to go)
> lapw0 -p (14:23:39) starting parallel lapw0 at 2017年 6月 29日 木曜日 14:23:39 JST
-------- .machine0 : processors
running lapw0 in single mode
1.7u 0.0s 0:01.84 98.3% 0+0k 16+440io 0pf+0w
> lapw1 -p (14:23:41) starting parallel lapw1 at 2017年 6月 29日 木曜日 14:23:41 JST
-> starting parallel LAPW1 jobs at 2017年 6月 29日 木曜日 14:23:41 JST
running LAPW1 in parallel mode (using .machines)
1 number_of_parallel_jobs
localhost localhost localhost localhost localhost localhost localhost localhost localhost localhost localhost localhost localhost localhost localhost localhost localhost localhost localhost localhost(20) 20 total processes failed to start
0.0u 0.0s 0:00.20 10.0% 0+0k 8080+8io 23pf+0w
Summary of lapw1para:
localhost k=0 user=0 wallclock=0
0.0u 0.0s 0:02.10 0.9% 0+0k 8208+216io 24pf+0w
> lapw2 -p (14:23:43) running LAPW2 in parallel mode
** LAPW2 crashed!
0.0u 0.0s 0:00.07 28.5% 0+0k 32+104io 0pf+0w
error: command /home/milkbar/WIEN2k_13/lapw2para lapw2.def failed
> stop error</pre><pre style="font-family: Courier, fixed;" class="">------</pre><pre style="font-family: Courier, fixed;" class="">Is my “serial” calculation already being distributed over all 24 CPUs, and is that why it is faster than Case 2? Or am I doing something wrong? Why does Case 3 crash? </pre></div><div class=""><br class=""></div><div class="">====</div><div class="">My second question is about PBS.</div><div class="">I installed Torque PBS and created a queue:</div><span class=""><div class=""><font face="Courier" class=""><br class=""></font></div><div class=""><font face="Courier" class=""># create default queue</font></div><div class=""><font face="Courier" class=""> qmgr -c 'create queue batch'</font></div><div class=""><font face="Courier" class=""> qmgr -c 'set queue batch queue_type = execution'</font></div><div class=""><font face="Courier" class=""> qmgr -c 'set queue batch started = true'</font></div><div class=""><font face="Courier" class=""> qmgr -c 'set queue batch enabled = true'</font></div><div class=""><font face="Courier" class=""> qmgr -c 'set queue batch resources_default.walltime = 1:00:00'</font></div><div class=""><font face="Courier" class=""> qmgr -c 'set queue batch </font><span style="font-family: Courier;" class="">resources_default.nodes</span><span style="font-family: Courier;" class=""> = 1'</span></div><div class=""><font face="Courier" class=""> qmgr -c 'set server default_queue = batch'</font></div><div class=""><span class=""><br class=""></span></div>and followed the other instructions on</span><div class=""><span class=""><a href="https://jabriffa.wordpress.com/2015/02/11/installing-torquepbs-job-scheduler-on-ubuntu-14-04-lts/" class="">https://jabriffa.wordpress.com/2015/02/11/installing-torquepbs-job-scheduler-on-ubuntu-14-04-lts/</a><br class=""><br class=""></span>The PBS system seems to work, since I can submit very simple scripts and see them in qstat.
My problem is that when I try to submit a serial wien2k job via PBS, it gives me an error (ultimately of course I’d like to submit them as parallel, but because of the ambiguity above I’ve kept it to serial) . Here's the PBS script and error message:<span class=""><br class=""></span><div class=""><br class=""></div><div class=""><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><div style="margin: 0px; line-height: normal;" class=""><font face="Courier" class=""> #!/bin/tcsh</font></div><div style="margin: 0px; line-height: normal;" class=""><font face="Courier" class=""> ##PBS -A your_allocation</font></div><div style="margin: 0px; line-height: normal;" class=""><font face="Courier" class=""> # specify the allocation. Change it to your allocation</font></div><div style="margin: 0px; line-height: normal;" class=""><font face="Courier" class=""> #PBS -q batch</font></div><div style="margin: 0px; line-height: normal;" class=""><font face="Courier" class=""> #PBS -l nodes=1:ppn=20</font></div><div style="margin: 0px; line-height: normal;" class=""><font face="Courier" class=""> #PBS -l walltime=1:00:00</font></div><div style="margin: 0px; line-height: normal;" class=""><font face="Courier" class=""> #PBS -o wien2k_output</font></div><div style="margin: 0px; line-height: normal;" class=""><font face="Courier" class=""> #PBS -j oe</font></div><div style="margin: 0px; line-height: normal;" class=""><font face="Courier" class=""> #PBS -N wien2k_test</font></div><div style="margin: 0px; line-height: normal;" class=""><font face="Courier" class=""> cd $PBS_O_WORKDIR</font></div><div style="margin: 0px; line-height: normal;" class=""><font face="Courier" class=""> echo hello</font></div><div style="margin: 0px; line-height: normal;" class=""><font face="Courier" class=""> run_lapw -i 40 -ec .0001 -I</font></div></div></div><div class=""><br class=""></div><div class="">Error message (contents of <font face="Courier" 
class="">wien2k_output</font>):</div><div class=""><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures;" class=""><font face="Courier" class="">hello</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class="">/var/spool/torque/mom_priv/jobs/44.milkbar-computer.kage.SC: line 12: run_lapw: command not found</font></span></div></div><div style="margin: 0px; font-size: 11px; line-height: normal; font-family: Menlo; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><br class=""></span></div><div class=""><span style="background-color: rgb(255, 255, 255);" class="">The job is listed as complete in <font face="Courier" class="">qstat</font>, and the “hello” is written into the <font face="Courier" class="">wien2k_output</font> file. Replacing <font face="Courier" class="">cd $PBS_O_WORKDIR</font> with the explicit path of the current case hasn’t changed anything. </span>I can run <span style="background-color: rgb(255, 255, 255);" class=""><font face="Courier" class="">run_lapw</font></span><span style="font-family: Menlo; font-size: 11px; background-color: rgb(255, 255, 255);" class=""> </span>from the command line fine, though. Also, what do I write for allocation? (I commented it out, as I see other PBS scripts don’t always have this.)</div><div class=""><br class=""></div><div class="">I’ve also tried the parallel case, with the following PBS script. I set up the <font face="Courier" class="">.structure</font> file and do the initialization with w2web.
I leave the “parallel calculation” option unchecked when setting up the case file in w2web.</div><div class=""><font face="Courier" class=""><br class=""></font></div><div class=""><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> #!/bin/tcsh</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> ##PBS -A your_allocation</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> #PBS -q batch</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> #PBS -l nodes=1:ppn=20</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> #PBS -l walltime=1:00:00</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> #</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> #PBS -o wien2k_output</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> #PBS -j 
oe</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> #PBS -N wien2k_test</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> cd $PBS_O_WORKDIR</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> #</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> #cat $PBS_NODEFILE |cut -c1-6 >.machines_currentdd</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> #set aa=`wc .machines_current`</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> #echo '#' > .machines</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> #</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> ##example for k-point parallel lapw1/2</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span 
style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> set i=1</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> <span class="Apple-tab-span" style="white-space:pre"> </span>while ($i <= $aa[1] )</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> <span class="Apple-tab-span" style="white-space:pre"> </span>echo -n '1:' >>.machines</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> <span class="Apple-tab-span" style="white-space:pre"> </span>head -$i .machines_current |tail -1 >> .machines</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> <span class="Apple-tab-span" style="white-space:pre"> </span>@ i ++</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> end</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class="">echo 'granularity:1' >>.machines</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class="">echo 'extrafine:1' 
>>.machines</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> #</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> #define here your Wien2k command</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures" class=""><font face="Courier" class=""> run_lapw -p -i 40 -ec .0001 -I</font></span></div></div><div class=""><font face="Courier" class=""><br class=""></font></div><div class="">When I submit this job via <font face="Courier" class="">qsub</font>, again the job is immediately listed as complete in <font face="Courier" class="">qstat</font>, and I get the following error message in <font face="Courier" class="">wien2k_output</font>:</div><div class=""><br class=""></div><div class=""><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures;" class=""><font face="Courier" class="">milkbar@milkbar-computer:~/Yoji/YK-017-TiC$ cat wien2k_output</font></span></div><div style="margin: 0px; line-height: normal; background-color: rgb(255, 255, 255);" class=""><span style="font-variant-ligatures: no-common-ligatures;" class=""><font face="Courier" class="">/var/spool/torque/mom_priv/jobs/45.milkbar-computer.kage.SC: line 28: syntax error: unexpected end of file</font></span></div></div><div class=""><br class=""></div><div class="">No <font face="Courier" class="">.machines</font> file has been created in the case folder. </div><div class=""> How can I successfully submit serial/parallel PBS jobs? 
Thanks in advance for your help.</div><div class=""><br class=""></div><div class="">Yoji Kobayashi</div><div class=""><br class=""></div><div class=""><div class="">
<div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">==========================================================<br class="">Yoji Kobayashi, Junior Assoc. Prof. <a href="mailto:yojik@scl.kyoto-u.ac.jp" class="">yojik@scl.kyoto-u.ac.jp</a><br class=""><a href="http://www.scl.kyoto-u.ac.jp/~yojik/index.htm" class="">http://www.scl.kyoto-u.ac.jp/~yojik/index.htm</a><br class=""><br class="">Kageyama Group, Dept. of Energy and Hydrocarbon Chemistry<br class="">Graduate School of Engineering, Kyoto University<br class="">Nishikyo-ku, Kyoto 615-8510, Japan<br class=""><br class="">Tel.: +81-75-383-2509 Fax: +81-75-383-2510<br class=""><a href="http://www.ehcc.kyoto-u.ac.jp/eh10/kageyama.html" class="">http://www.ehcc.kyoto-u.ac.jp/eh10/kageyama.html</a><br class="">==========================================================</div>
</div>
<br class=""></div></div></div></div></blockquote></div><br class=""><div class="">
<div style="color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px;">==========================================================<br class="">Yoji Kobayashi <a href="mailto:yojik@scl.kyoto-u.ac.jp" class="">yojik@scl.kyoto-u.ac.jp</a><br class=""><a href="http://www.scl.kyoto-u.ac.jp/~yojik/index.htm" class="">http://www.scl.kyoto-u.ac.jp/~yojik/index.htm</a><br class=""><br class="">Kyoto University Katsura, Nishikyo-ku, Kyoto 615-8510, Japan<br class="">Department of Energy and Hydrocarbon Chemistry, Graduate School of Engineering, Kyoto University<br class="">Kageyama Laboratory, Lecturer<br class=""><br class="">Tel.: 075-383-2509 Fax: 075-383-2510<br class="">http://www.ehcc.kyoto-u.ac.jp/eh10/kageyama.html<br class="">==========================================================</div>
</div>
<br class=""></div></body></html>