<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">So you are using the ifort version with
the unformatted file read bug. Following the Intel page linked in
the previous post below, did you try recompiling lapwso_mpi with
-O0, or reverting to one of the ifort versions that Intel
mentioned, to see whether that fixes the problem?<br>
<br>
On 11/14/2016 8:34 AM, Md. Fhokrul Islam wrote:<br>
</div>
<blockquote
cite="mid:SN1PR13MB0286639C5F90DB91B64929F7D3BC0@SN1PR13MB0286.namprd13.prod.outlook.com"
type="cite">
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
<style type="text/css" style="display:none;"><!-- P {margin-top:0;margin-bottom:0;} --></style>
<div id="divtagdefaultwrapper"
style="font-size:12pt;color:#000000;font-family:Calibri,Arial,Helvetica,sans-serif;"
dir="ltr">
<p>Hi Gavin,</p>
<p><br>
</p>
<p>    Thanks for your suggestion. Yes, I am using version
16.0.3.210 of ifort. Debugging such a</p>
<p>big file with 'od' seems difficult, but I will try with
a smaller system and see if I get the </p>
<p>same <span style="font-size: 12pt;">error.</span></p>
<p><br>
</p>
<p><br>
</p>
<p>Fhokrul</p>
<br>
<br>
<div style="color: rgb(0, 0, 0);">
<hr tabindex="-1" style="display:inline-block; width:98%">
<div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt"
face="Calibri, sans-serif" color="#000000"><b>From:</b>
Wien <a class="moz-txt-link-rfc2396E" href="mailto:wien-bounces@zeus.theochem.tuwien.ac.at">&lt;wien-bounces@zeus.theochem.tuwien.ac.at&gt;</a> on
behalf of Gavin Abo <a class="moz-txt-link-rfc2396E" href="mailto:gsabo@crimson.ua.edu">&lt;gsabo@crimson.ua.edu&gt;</a><br>
<b>Sent:</b> Sunday, November 13, 2016 11:40 PM<br>
<b>To:</b> A Mailing list for WIEN2k users<br>
<b>Subject:</b> Re: [Wien] lapwso_mpi error</font>
<div> </div>
</div>
<div>
<div class="moz-cite-prefix">Ok, I agree that it is
likely not due to the setup of the scratch directory.<br>
<br>
What version of ifort was used? If you happened to use
16.0.3.210, maybe it is caused by an ifort bug [
<a moz-do-not-send="true" class="moz-txt-link-freetext"
href="https://software.intel.com/en-us/articles/read-failure-unformatted-file-io-psxe-16-update-3"
id="LPlnk339647" previewremoved="true">
https://software.intel.com/en-us/articles/read-failure-unformatted-file-io-psxe-16-update-3</a>
].<br>
<br>
<br>
Perhaps you can use the Linux "od" command to troubleshoot
and identify where the data written to the 3Mn.vectordn_1
file differs from the data read back,
similar to what is described on the web pages at:<br>
<br>
<a moz-do-not-send="true" class="moz-txt-link-freetext"
href="https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/269993"
id="LPlnk293742" previewremoved="true">https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/269993</a><br>
<br>
<a moz-do-not-send="true" class="moz-txt-link-freetext"
href="https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/270436"
id="LPlnk562524" previewremoved="true">https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/270436</a><br>
<br>
<a moz-do-not-send="true" class="moz-txt-link-freetext"
href="https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/268503"
id="LPlnk877377" previewremoved="true">https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/268503</a><br>
<br>
<br>
Though, it might be harder to diagnose with the large
3Mn.vectordn_1, which looks to be about 12 GB. So you may
want to set up an MPI SO calculation that produces a smaller
case.vectordn_1 for that.<br>
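<br>
As a rough alternative to reading "od" output by eye, the record
framing of a small case.vectordn_1 could be checked with a short
script. The Python sketch below assumes the file is a Fortran
unformatted sequential file in which each record is wrapped in
matching 4-byte little-endian length markers, and it does not handle
segmented records (a negative marker is simply reported). The names
scan_records and make_record are hypothetical helpers, not part of
WIEN2k or ifort.<br>
<pre>
```python
import io
import os


def scan_records(stream, marker_bytes=4, byteorder="little"):
    """Return the payload length of every record in a binary stream.

    Assumes each record is framed as: length marker, payload, same marker.
    Raises ValueError at the first record whose framing is inconsistent.
    """
    lengths = []
    while True:
        offset = stream.tell()
        head = stream.read(marker_bytes)
        if not head:
            return lengths  # clean end of file
        if len(head) != marker_bytes:
            raise ValueError(f"truncated leading marker at byte {offset}")
        n = int.from_bytes(head, byteorder, signed=True)
        if 0 > n:
            # May indicate a segmented record; not handled in this sketch.
            raise ValueError(f"negative marker {n} at byte {offset}")
        stream.seek(n, os.SEEK_CUR)  # skip the payload without reading it
        tail = stream.read(marker_bytes)
        if len(tail) != marker_bytes:
            raise ValueError(f"record at byte {offset} is truncated")
        if tail != head:
            raise ValueError(f"marker mismatch for record at byte {offset}")
        lengths.append(n)


def make_record(payload, marker_bytes=4, byteorder="little"):
    """Build one framed record in memory (used only for the demo)."""
    marker = len(payload).to_bytes(marker_bytes, byteorder, signed=True)
    return marker + payload + marker


if __name__ == "__main__":
    demo = make_record(b"abcd") + make_record(b"\x00" * 16)
    print(scan_records(io.BytesIO(demo)))
```
</pre>
To check a real file, open it in binary mode with open(path, "rb")
and pass the file object to scan_records. The byte offset in the
error message then shows where to look more closely with "od".<br>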
<br>
On 11/13/2016 7:30 AM, Md. Fhokrul Islam wrote:<br>
</div>
<blockquote type="cite">
<div id="divtagdefaultwrapper" dir="ltr"
style="font-size:12pt; color:#000000;
font-family:Calibri,Arial,Helvetica,sans-serif">
<p>Hi Gavin,</p>
<p><br>
</p>
<p> In my .bashrc scratch is defined as <span
style="">$SCRATCH = ./ so if I use the command</span></p>
<p><span style=""><span style="">echo $SCRATCH, it
always returns ./ </span><br>
</span></p>
<p><span style=""><span style=""><br>
</span></span></p>
<p><span style=""><span style="">For large jobs, I use
a local temporary directory that is associated with
each node </span></span></p>
<p><span style=""><span style="">in our system and
is given by $SNIC_TMP. This temporary directory
is created </span></span></p>
<p><span style=""><span style="">on the fly, so I set <span
style="">$SCRATCH = <span style="">$SNIC_TMP in
my job submission script.</span> As I said,</span></span></span></p>
<p><span style=""><span style=""><span style="">this setup
works fine if I do MPI calculations without
spin-orbit, and I get converged</span></span></span></p>
<p><span style=""><span style=""><span style="">results.
But if I submit the job after initializing with
spin-orbit, it crashes at lapwso.</span></span></span></p>
<p><span style=""><span style=""><span style="">So I
think the problem is probably not due to the setup
of the scratch directory; it has</span></span></span></p>
<p><span style=""><span style=""><span style="">something
to do with the MPI version of LAPWSO.</span></span></span></p>
<p><span style=""><span style=""><span style=""><br>
</span></span></span></p>
<p><span style=""><span style=""><span style=""><br>
</span></span></span></p>
<p><span style=""><span style=""><span style="">Thanks
for your comment.</span></span></span></p>
<p><span style=""><span style=""><span style=""><br>
</span></span></span></p>
<p><span style=""><span style=""><span style="">Fhokrul</span></span></span></p>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
</body>
</html>