<p>Please have a look at the end of case.outputup_*, which gives the real CPU and wall times, and post those. The times being reported may be misleading.</p>
<p>In addition, I do not understand why you are seeing an error and the script is continuing - it should not. Maybe some of the tasks are not working or there are bugs in the csh. It may be useful to post the dayfile.</p>
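<p>As a rough consistency check on the figures quoted below, one can compare user + sys against real for each run. This is only a sketch: it assumes the real/user/sys triples come from a shell-style time report for the whole run, and the helper names (<code>to_seconds</code>, <code>accounted_fraction</code>) are made up for illustration. A ratio far below 1 only says that most of the wall time is not accounted for as CPU time by that timer (waiting, I/O, or work in processes it does not see); it does not say which.</p>

```python
import re


def to_seconds(stamp):
    """Convert a stamp such as '6m43.33s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time stamp: {stamp!r}")
    return 60 * int(match.group(1)) + float(match.group(2))


# (real, user, sys) copied from the quoted message; labels as given there.
RUNS = {
    "serial": ("6m43.33s", "6m19.18s", "0m13.59s"),
    "2proc": ("10m36.03s", "1m4.68s", "0m47.79s"),
    "4proc": ("11m11.25s", "1m5.24s", "0m52.17s"),
    "8proc": ("11m39.17s", "1m6.18s", "1m10.65s"),
    "16proc": ("14m31.16s", "1m7.95s", "2m7.63s"),
}


def accounted_fraction(real, user, sys_time):
    """Fraction of wall time covered by user + sys CPU time."""
    return (to_seconds(user) + to_seconds(sys_time)) / to_seconds(real)


if __name__ == "__main__":
    for label, stamps in RUNS.items():
        print(f"{label:7s} {accounted_fraction(*stamps):.0%}")
```

<p>With the quoted numbers, the serial run accounts for roughly 97 percent of its wall time, while the parallel runs account for only about 17 to 22 percent, which is why the per-task times at the end of case.outputup_* and the dayfile are worth checking.</p>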
<p>---------------------------<br>
Professor Laurence Marks<br>
Department of Materials Science and Engineering<br>
Northwestern University<br>
<a href="http://www.numis.northwestern.edu">www.numis.northwestern.edu</a> 1-847-491-3996<br>
"Research is to see what everybody else has seen, and to think what nobody else has thought"<br>
Albert Szent-Gyorgi</p>
<div class="gmail_quote">On May 3, 2013 6:47 PM, "Oliver Albertini" &lt;<a href="mailto:ora@georgetown.edu" target="_blank">ora@georgetown.edu</a>&gt; wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>
<div dir="ltr">Thanks to you both for the suggestions. The OS was recently updated beyond those versions mentioned in the link (now 6100-08).
<div><br>
</div>
<div>Adding the iostat statement to all the errclr.f files prevents the program from stopping altogether, although error messages still appear in the output:</div>
<div><br>
</div>
<div>
<div>STOP LAPW0 END</div>
<div>STOP LAPW0 END</div>
<div>STOP LAPW0 END</div>
<div>STOP LAPW0 END</div>
<div>STOP LAPW0 END</div>
<div>STOP LAPW1 - Error</div>
<div>STOP LAPW1 END</div>
<div>STOP LAPW1 END</div>
<div>STOP LAPW1 END</div>
<div>STOP LAPW1 END</div>
<div>STOP LAPW1 - Error</div>
<div>STOP LAPW1 END</div>
<div>STOP LAPW1 END</div>
<div>STOP LAPW1 END</div>
<div>STOP LAPW1 END</div>
<div>STOP LAPW2 - FERMI; weighs written</div>
<div>STOP LAPW2 END</div>
<div>STOP LAPW2 END</div>
<div>STOP LAPW2 END</div>
<div>STOP LAPW2 END</div>
<div>STOP LAPW2 END</div>
<div>STOP SUMPARA END</div>
<div>STOP LAPW2 - FERMI; weighs written</div>
<div>STOP LAPW2 END</div>
<div>STOP LAPW2 END</div>
<div>STOP LAPW2 END</div>
<div>STOP LAPW2 END</div>
<div>STOP LAPW2 END</div>
<div>STOP SUMPARA END</div>
<div>STOP CORE END</div>
<div>STOP CORE END</div>
<div>STOP MIXER END</div>
<div><br>
</div>
<div><br>
</div>
<div>which are more prevalent when using higher processor counts. After completing a few runs with more processors, the times have continually increased:</div>
<div><br>
</div>
<div>
<div>real 6m43.33s </div>
<div>user 6m19.18s serial </div>
<div>sys 0m13.59s </div>
<div> </div>
<div>real 10m36.03s </div>
<div>user 1m4.68s 2proc </div>
<div>sys 0m47.79s </div>
<div> </div>
<div>real 11m11.25s </div>
<div>user 1m5.24s 4proc </div>
<div>sys 0m52.17s </div>
<div> </div>
<div>real 11m39.17s </div>
<div>user 1m6.18s 8proc </div>
<div>sys 1m10.65s </div>
<div> </div>
<div>real 14m31.16s </div>
<div>user 1m7.95s 16proc </div>
<div>sys 2m7.63s </div>
<div><br>
</div>
<div>I have looked into various IBM Parallel Operating Environment (poe) environment variables (MP_SHARED_MEMORY, MP_IO_BUFFER_SIZE, MP_EAGER_LIMIT), but none of them seems to improve performance. Any ideas why this is getting slower?<br>
</div>
</div>
</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Thu, May 2, 2013 at 8:49 PM, Gavin Abo <span dir="ltr">
&lt;<a href="mailto:gsabo@crimson.ua.edu" target="_blank">gsabo@crimson.ua.edu</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
STOP LAPW0 END<br>
"inilpw.f", line 233: 1525-142 The CLOSE statement on unit 200 cannot be completed because an errno value of 2 (A file or directory in the path name does not exist.) was received while closing the file. The program will stop.<br>
STOP LAPW1 END<br>
</blockquote>
</div>
If this is on operating system AIX 6.1 [<a href="http://zeus.theochem.tuwien.ac.at/pipermail/wien/2013-March/018560.html" target="_blank">http://zeus.theochem.tuwien.ac.at/pipermail/wien/2013-March/018560.html</a>], the following link mentions
that a fix might be needed for some release levels:<br>
<br>
<a href="http://www-01.ibm.com/support/docview.wss?uid=isg1IZ23555" target="_blank">http://www-01.ibm.com/support/docview.wss?uid=isg1IZ23555</a><br>
_______________________________________________<br>
Wien mailing list<br>
<a href="mailto:Wien@zeus.theochem.tuwien.ac.at" target="_blank">Wien@zeus.theochem.tuwien.ac.at</a><br>
<a href="http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien" target="_blank">http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien</a><br>
SEARCH the MAILING-LIST at: <a href="http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html" target="_blank">http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html</a><br>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote></div>