<div dir="ltr"><div dir="ltr"><div><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><span style="font-size:12.8px"></span></div><div dir="ltr"><div>Dear Gavin,</div><div>(updated)<br></div><div>I am writing on behalf of Ms. Bushra, as she
is not able to reply at the moment, after running some tests on the same cluster with
WIEN2k versions 17.1 and 18.2.</div><div><br></div><div>The actual error that we both see is "/usr/common/nsg/bin/mpirun: Permission denied", which can probably be solved only by the cluster admin.</div><div><br></div><div><div><div dir="ltr" class="gmail-m_4067254242100521687gmail-m_-2636521733486260728gmail-m_-1885400293695771074gmail_signature"><div dir="ltr"><div><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr">For Wien2k_17.1, mpirun was defined as "mpirun -n _NP_ -machinefile _HOSTS_ _EXEC_".</div><div dir="ltr"><br><div><span style="font-size:12.8px">In one of the threads, Prof. Peter suggested that we u</span>se "ifort + slurm". <br></div><div><br></div><div>Yes, I just installed Wien2k_18.2 at NERSC with the ifort+slurm system environment.</div><div><br></div><div>The mpirun command is now "srun -K -N_nodes_ -n_NP_ -r_offset_ _PINNING_ _EXEC_".</div><div><br></div><div>But I still face the same error.</div><div><br></div><div>The error is the same, and it doesn't matter whether we use mpirun or srun [1]. Only the word srun or mpirun changes in the error message.<br></div><div><br></div><div><br></div>In
the past I faced the same error and only the cluster admin could solve it, so let
us first write to the cluster admin; we will post the final outcome here.</div><div dir="ltr"><br></div><div>If you have any advice that can help resolve this issue, please let us know.</div><div><br></div>[1]</div><div dir="ltr">srun: error: No hardware architecture specified (-C)!<br>srun: error: Unable to allocate resources: Unspecified error<br>srun: fatal: --relative option invalid for job allocation request<br>srun: error: No hardware architecture specified (-C)!<br>srun: error: Unable to allocate resources: Unspecified error<br>LAO.scf1up_1: No such file or directory.<br>grep: No match.<br>srun: fatal: --relative option invalid for job allocation request<br>srun: error: No hardware architecture specified (-C)!<br>srun: error: Unable to allocate resources: Unspecified error<br>LAO.scf1dn_1: No such file or directory.<br>grep: No match.<br>LAPW2 - Error. Check file lapw2.error<br>cp: cannot stat '.in.tmp': No such file or directory<br>grep: No match.<br>grep: No match.<br>grep: No match.<br><br>> stop error<br><br></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div><br></div><br><div class="gmail_quote"><div dir="ltr">On Sat, Oct 20, 2018 at 8:01 PM Gavin Abo <<a href="mailto:gsabo@crimson.ua.edu">gsabo@crimson.ua.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<p><font face="Times New Roman">1. It looks like you are using
WIEN2k 17.1. Some serious bugs were found in that version [
<a class="m_-2947625588828193011moz-txt-link-freetext" href="http://susi.theochem.tuwien.ac.at/reg_user/updates/" target="_blank">http://susi.theochem.tuwien.ac.at/reg_user/updates/</a> ]. Consider
        installing and using WIEN2k 18.2, which includes fixes for them.
Also, WIEN2k 18.2 can be patched according to previous mailing
list posts [
<a class="m_-2947625588828193011moz-txt-link-freetext" href="https://github.com/gsabo/WIEN2k-Patches/tree/master/18.2" target="_blank">https://github.com/gsabo/WIEN2k-Patches/tree/master/18.2</a> ].<br>
</font></p>
<p>2. Regarding your "file LAO.vspup is missing, i think it
automatically generated during parallel lapw2", the case.vspup
file should have been generated by lapw0. See Table 4.3 on page
36 of the WIEN2k 18.2 usersguide [
<a class="m_-2947625588828193011moz-txt-link-freetext" href="http://susi.theochem.tuwien.ac.at/reg_user/textbooks/usersguide.pdf" target="_blank">http://susi.theochem.tuwien.ac.at/reg_user/textbooks/usersguide.pdf</a>
      ], which states that the program LAPW0 generates the necessary case.vsp(up/dn).</p>
<p>3. I suggest you investigate why the LAO.vspup "can't open unit:
18" error happens with lapw2 but not with lapw1. For example, did
<span>LAO.vspup exist with a non-zero file size after lapw0
completed, did it exist with a non-zero file size for lapw1, and
        did it get deleted, become zero in file size, or lose node
connection(s) just before lapw2?</span></p>
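    <p>A minimal sketch of the file check described above, assuming the
      case name LAO from this thread and that it is run from the case
      directory. The helper name describe_file is only illustrative and
      is not part of WIEN2k. Calling it after lapw0, after lapw1, and
      just before lapw2 would show at which stage the file disappears or
      becomes empty.</p>
<pre>
```python
from pathlib import Path


def describe_file(path):
    """Return 'missing', 'empty', or 'ok' for the given file path."""
    p = Path(path)
    if not p.is_file():
        return "missing"
    if p.stat().st_size == 0:
        return "empty"
    return "ok"


if __name__ == "__main__":
    # LAO.vspup and LAO.vspdn are the case.vsp(up/dn) files for case name LAO.
    for name in ("LAO.vspup", "LAO.vspdn"):
        print(name, describe_file(name))
```
</pre>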
    <p><span>Is your .machines set up to run k-point parallel, mpi
        parallel, or a mix of both? The job script that creates the
        .machines file on the fly, which would show that, does not
        appear to have been provided.<br>
</span></p>
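    <p>As a rough aid for answering that question, the sketch below
      classifies the text of a .machines file. It assumes the usual
      layout in which each job line has the form "weight:host" (one core,
      k-point parallel) or "weight:host:N" / "weight:host1 host2 ..."
      (several cores, mpi parallel), and it ignores lines such as lapw0:,
      granularity: and extrafine:. The function names are hypothetical
      and the rules are a simplification, so the result is only a
      hint.</p>
<pre>
```python
def cores_on_line(rest):
    """Count the cores named after the weight on one job line."""
    total = 0
    for entry in rest.split():
        host, sep, count = entry.partition(":")
        total += int(count) if sep and count.isdigit() else 1
    return total


def classify_machines(text):
    """Return 'k-point', 'mpi', 'mixed', or 'none' for .machines text."""
    cores = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        key, sep, rest = line.partition(":")
        # Only lines that start with a numeric weight are job lines.
        if not sep or not key.strip().isdigit():
            continue
        n = cores_on_line(rest)
        if n != 0:
            cores.append(n)
    if not cores:
        return "none"
    if all(n == 1 for n in cores):
        return "k-point"
    return "mpi" if len(cores) == 1 else "mixed"


if __name__ == "__main__":
    with open(".machines") as handle:
        print(classify_machines(handle.read()))
```
</pre>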
<p>If mpi parallel, using WIEN2k 18.2:</p>
<p>1. Run: ./siteconfig<br>
2. Select Compiling Options, Selection: O<br>
3. Select Parallel options, Selection: PO<br>
4. What is MPIRUN set to?<br>
</p>
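    <p>The MPIRUN value is a template whose placeholders (for example
      _NP_, _HOSTS_ and _EXEC_) are filled in when a parallel job is
      started. The sketch below assumes plain textual substitution and
      uses made-up values, so it only illustrates how to see the command
      line that results from a given template; that command can then be
      tried by hand or shown to the cluster administrator.</p>
<pre>
```python
def expand_template(template, values):
    """Replace each placeholder in the template with its value."""
    command = template
    for placeholder, value in values.items():
        command = command.replace(placeholder, str(value))
    return command


if __name__ == "__main__":
    # An MPIRUN template of the kind shown by siteconfig.
    template = "mpirun -n _NP_ -machinefile _HOSTS_ _EXEC_"
    # Hypothetical values, chosen only for illustration.
    values = {
        "_NP_": 4,
        "_HOSTS_": ".machine1",
        "_EXEC_": "lapw1_mpi lapw1_1.def",
    }
    print(expand_template(template, values))
```
</pre>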
<p>You also might check your mpirun command and talk with your
cluster administrator to see if a supported mpi run command is
being used for the system [
<a class="m_-2947625588828193011moz-txt-link-freetext" href="https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg17628.html" target="_blank">https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg17628.html</a>
].</p>
<p>Have you checked the standard output/error file? This file name
can vary from one system to another. So you have to check your
scheduling/queue system documentation to see what the default
      file(s) are called, or use an option to name it yourself [ for
example,
<a class="m_-2947625588828193011moz-txt-link-freetext" href="https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg18080.html" target="_blank">https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg18080.html</a>
      ]. If there is an mpi run error, it usually shows up in that file.</p>
<p>You also might have to check the hidden dot files [
<a class="m_-2947625588828193011moz-txt-link-freetext" href="https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg17317.html" target="_blank">https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg17317.html</a>
] and output files (like case.output0, case.output1, etc.).<br>
</p>
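    <p>Alongside those files, the *.error files in the case directory
      (the ones listed with cat *.error in the original report) can be
      scanned in one go. This is a small sketch, assuming it is run from
      the case directory and that a non-empty *.error file marks the step
      that failed; the function name is illustrative only.</p>
<pre>
```python
from pathlib import Path


def nonempty_error_files(directory="."):
    """Return {file name: text} for every non-empty *.error file."""
    found = {}
    for path in sorted(Path(directory).glob("*.error")):
        text = path.read_text(errors="replace").strip()
        if text:
            found[path.name] = text
    return found


if __name__ == "__main__":
    for name, text in nonempty_error_files().items():
        print("==", name)
        print(text)
```
</pre>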
<p><br>
</p>
<div class="m_-2947625588828193011moz-cite-prefix">On 10/20/2018 1:58 AM, BUSHRA SABIR
wrote:<br>
</div>
<blockquote type="cite">
<div class="m_-2947625588828193011ydp6f68bc8dyahoo-style-wrap" style="font-family:Helvetica Neue,Helvetica,Arial,sans-serif;font-size:16px">
<div id="m_-2947625588828193011ydp6f68bc8dyiv0529548225">
<div>
<div class="m_-2947625588828193011ydp6f68bc8dyiv0529548225ydp2f991150yahoo-style-wrap" style="font-family:Helvetica Neue,Helvetica,Arial,sans-serif;font-size:16px">
<div>
<div>Dear Peter Blaha and wien2k users</div>
<div><br>
</div>
                <div>I am facing a problem with the parallel execution of
                  a job script. I am working on LaXO3 materials.
                  Initialization is OK, but when I submit the job file to
                  the cluster for parallel execution with the command line
                  runsp_lapw -cc 0.001 -ec 0.0001 -i 40 -p <br>
</div>
<div><br>
</div>
                <div>the following error appears (cat *.error):<br>
</div>
<div><br>
</div>
<div><span>'LAPW2' - can't open unit:
18 <br>
'LAPW2' - filename:
LAO.vspup <br>
'LAPW2' - status: old form:
formatted <br>
</span>
<div><span>** testerror: Error in Parallel LAPW2<br>
</span></div>
<div><span><br>
</span></div>
<div><span>file LAO.vspup is missing, i think it
automatically generated during parallel lapw2 <br>
</span></div>
<div>
<div><br>
</div>
                    <div>I checked testpara1_lapw:</div>
<div><span>#####################################################<br>
#
TESTPARA1 #<br>
#####################################################<br>
<br>
Sat Oct 20 00:22:39 PDT 2018<br>
<br>
</span>
<div><span> lapw1para has finished</span></div>
<div><span></span><br>
</div>
                      <div>For testpara2_lapw:</div>
<div><span>#####################################################<br>
#
TESTPARA1 #<br>
#####################################################<br>
<br>
Sat Oct 20 00:22:39 PDT 2018<br>
<br>
lapw1para has finished<br>
<br>
</span>
                        <div>At the end of the dayfile, the following error is
                          shown:</div>
<div><br>
</div>
</div>
<span><span>0.088u 0.060s 0:05.14 2.7% 0+0k
0+288io 0pf+0w<br>
> lapw2 -up -p (23:56:15)
running LAPW2 in parallel mode<br>
** LAPW2 crashed!<br>
0.048u 0.312s 0:00.72 48.6% 0+0k 11386+96io
36pf+0w<br>
error: command
/global/common/sw/cray/cnl6/haswell/wien2k/17.1/intel/17.0.2.174/wkteycp/lapw2para
-up uplapw2.def failed<br>
<br>
</span></span>
                        <div><span>I went through the mailing list but could not
                            find a solution.</span></div>
<div><span></span><br>
</div>
<div><br>
</div>
<div>Bushra</div>
<div>PhD student</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
_______________________________________________<br>
Wien mailing list<br>
<a href="mailto:Wien@zeus.theochem.tuwien.ac.at" target="_blank">Wien@zeus.theochem.tuwien.ac.at</a><br>
<a href="http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien" rel="noreferrer" target="_blank">http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien</a><br>
SEARCH the MAILING-LIST at: <a href="http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html" rel="noreferrer" target="_blank">http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html</a><br>
</blockquote></div></div>