<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=utf-8"><meta name=Generator content="Microsoft Word 15 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
{font-family:Wingdings;
panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
{mso-style-priority:34;
margin-top:0cm;
margin-right:0cm;
margin-bottom:0cm;
margin-left:36.0pt;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
span.EmailStyle17
{mso-style-type:personal;
font-family:"Calibri",sans-serif;
color:windowtext;}
span.EmailStyle19
{mso-style-type:personal-compose;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri",sans-serif;
mso-fareast-language:EN-US;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:70.85pt 70.85pt 70.85pt 70.85pt;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:857235053;
mso-list-type:hybrid;
mso-list-template-ids:2103465384 269025281 269025283 269025285 269025281 269025283 269025285 269025281 269025283 269025285;}
@list l0:level1
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Symbol;}
@list l0:level2
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l0:level3
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
@list l0:level4
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Symbol;}
@list l0:level5
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l0:level6
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
@list l0:level7
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Symbol;}
@list l0:level8
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l0:level9
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
@list l1
{mso-list-id:1170948199;
mso-list-type:hybrid;
mso-list-template-ids:1742382192 269025297 269025305 269025307 269025295 269025305 269025307 269025295 269025305 269025307;}
@list l1:level1
{mso-level-text:"%1\)";
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l1:level2
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l1:level3
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l1:level4
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l1:level5
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l1:level6
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l1:level7
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l1:level8
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l1:level9
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
ol
{margin-bottom:0cm;}
ul
{margin-bottom:0cm;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-CA link=blue vlink=purple><div class=WordSection1><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>Dear Rajiv,<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>It is impossible to answer your question, since we have no idea what kinds of jobs you are trying to run (# of k-points, matrix size, memory requirements, etc) or the capabilities of your cluster (e.g. I guess your nodes do not have 56 cores). It is best to work with your cluster administrator to answer these kinds of performance questions. You will have to test the performance and scaling of WIEN2k of your cluster yourself, since we do not have access to it.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>A few hints:<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'><o:p> </o:p></span></p><p class=MsoListParagraph style='text-indent:-18.0pt;mso-list:l0 level1 lfo2'><![if !supportLists]><span style='font-size:11.0pt;font-family:Symbol'><span style='mso-list:Ignore'>·<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>The case.dayfile will contain output from the unix ‘time’ command between the various steps in the SCF cycle, which can help you answer the question of which job options you choose are faster. For example, after lapw0 runs:<o:p></o:p></span></p><p class=MsoNormal style='margin-left:18.0pt'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'><o:p> </o:p></span></p><p class=MsoNormal style='text-indent:18.0pt'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>-------- .machine0 : 12 processors<o:p></o:p></span></p><p class=MsoNormal style='text-indent:18.0pt'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>3.254u 0.122s 0:02.36 142.7% 0+0k 0+120io 0pf+0w<o:p></o:p></span></p><p class=MsoNormal style='text-indent:36.0pt'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'><o:p> </o:p></span></p><p class=MsoNormal style='text-indent:36.0pt'><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>The :log file also gives you the time each command is started.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'><o:p> </o:p></span></p><p class=MsoListParagraph style='text-indent:-18.0pt;mso-list:l0 level1 lfo2'><![if !supportLists]><span style='font-size:11.0pt;font-family:Symbol'><span style='mso-list:Ignore'>·<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>WIEN2k can look bad on cluster CPU efficiency metrics (depending on the job) because it is I/O bandwidth intensive, i.e. reading and writing large vectors to disk takes walltime but not cputime.<o:p></o:p></span></p><p class=MsoListParagraph style='text-indent:-18.0pt;mso-list:l0 level1 lfo2'><![if !supportLists]><span style='font-size:11.0pt;font-family:Symbol'><span style='mso-list:Ignore'>·<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>A good question to ask is how your job is scaling with the resources you give it. If you give a job 32 cores and it completes an SCF cycle in 5 minutes, but 64 cores of the same job takes 4 minutes, you probably are hitting a scaling limit (usually node I/O) and wasting resources.<o:p></o:p></span></p><p class=MsoListParagraph style='text-indent:-18.0pt;mso-list:l0 level1 lfo2'><![if !supportLists]><span style='font-size:11.0pt;font-family:Symbol'><span style='mso-list:Ignore'>·<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>From your screenshot, it looks like you are running more than one job simultaneously on node001, since lapw1 and lapw2c are running at the same time. Running a more controlled test will help you better measure job performance/scaling.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'>Beyond that I can’t be much help. Good luck and remember to search the mailing list!<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;mso-fareast-language:EN-US'>--<b><o:p></o:p></b></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;mso-fareast-language:EN-US'>Dr. Eamon McDermott<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;mso-fareast-language:EN-US'>CEA Grenoble<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;mso-fareast-language:EN-US'>DRT/LETI/DTSI/SCMC<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;mso-fareast-language:EN-US'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri",sans-serif;mso-fareast-language:EN-US'><o:p> </o:p></span></p><p class=MsoNormal><b><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif'>From:</span></b><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri",sans-serif'> Wien [mailto:wien-bounces@zeus.theochem.tuwien.ac.at] <b>On Behalf Of </b>Rajiv Chouhan<br><b>Sent:</b> Saturday, December 03, 2016 19:59<br><b>To:</b> A Mailing list for WIEN2k users <wien@zeus.theochem.tuwien.ac.at><br><b>Subject:</b> Re: [Wien] Parallel execution in clusters<o:p></o:p></span></p><p class=MsoNormal><o:p> </o:p></p><div><div><div><div><p class=MsoNormal>Hi All,<o:p></o:p></p></div><p class=MsoNormal>Please reply to my previous email. It is very important to understand the code running in parallel mode in the cluster. <o:p></o:p></p></div><p class=MsoNormal>Thank you,<o:p></o:p></p></div><p class=MsoNormal>Rajiv<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p><div><p class=MsoNormal>On Wed, Nov 30, 2016 at 10:53 AM, Rajiv Chouhan <<a href="mailto:chouhanrajiv14@gmail.com" target="_blank">chouhanrajiv14@gmail.com</a>> wrote:<o:p></o:p></p><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-right:0cm'><div><p class=MsoNormal><o:p> </o:p></p><div><div><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-right:0cm'><div><div><div><div><p class=MsoNormal>Hi,<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><p class=MsoNormal style='margin-bottom:12.0pt'>I am using WIEN2k in cluster with mpi parallelization. I have written the script to run the job in machine using pbs script in the slurm platform of the same cluster. I have attached the link to the snapshot of the jobs running in the cluster in two nodes001 and node004. In both the nodes the allocations are 52 and 56 respectively. The top command shows the utilization shown in snapshot attached. Can anyone tell me which of the node is utilizing full resources and which execution is faster. Again jobs of "raji..+" (in node001) are submitted with pbs and other users (node004) are submitted with slurm script.<o:p></o:p></p></div></div></div></blockquote><div><p class=MsoNormal> <a href="https://www.dropbox.com/s/a8ikape71f43cva/Capture-02.jpg?dl=0" target="_blank">https://www.dropbox.com/s/a8ikape71f43cva/Capture-02.jpg?dl=0</a><o:p></o:p></p></div><div><p class=MsoNormal> <o:p></o:p></p></div><blockquote style='border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-right:0cm'><div><div><p class=MsoNormal>Thank you,<o:p></o:p></p></div><p class=MsoNormal>Rajiv<o:p></o:p></p></div></blockquote></div><p class=MsoNormal><o:p> </o:p></p></div></div></blockquote></div><p class=MsoNormal><o:p> </o:p></p></div></div></body></html>