<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">An additional comment: see the post at:<br>
<br>
<a class="moz-txt-link-freetext" href="https://arc.liv.ac.uk/pipermail/gridengine-users/2010-October/032729.html">https://arc.liv.ac.uk/pipermail/gridengine-users/2010-October/032729.html</a><br>
<br>
It reports errors of the same form:<br>
<br>
<div>error: commlib error: got select error (Connection reset by
peer)</div>
<div>error: executing task of job x failed: failed sending task to
execd@hostname: can't find connection</div>
<br>
They appear to have traced the problem to the master daemon
(qmaster), as described in the post at:<br>
<br>
<a class="moz-txt-link-freetext" href="https://arc.liv.ac.uk/pipermail/gridengine-users/2010-October/032758.html">https://arc.liv.ac.uk/pipermail/gridengine-users/2010-October/032758.html</a><br>
<br>
So the error in your case may likewise be caused by a problem with
a Grid Engine daemon (possibly on the tachyon1478 node).<br>
<br>
On 7/10/2015 5:01 AM, Laurence Marks wrote:<br>
</div>
<blockquote
cite="mid:CANkSMZCiey8T5T6+g_2nJwZuCBFU_Ay-B3DHB9H37jpXD90PmQ@mail.gmail.com"
type="cite">
<p dir="ltr">From a brief Google search, this is an MPI error.</p>
<p dir="ltr">How did you compile? It is easy to use the wrong BLACS
combination.</p>
<p dir="ltr">Have you run simpler cases such as TiC first?</p>
<p dir="ltr">---<br>
Professor Laurence Marks<br>
Department of Materials Science and Engineering<br>
Northwestern University<br>
<a moz-do-not-send="true"
href="http://www.numis.northwestern.edu">http://www.numis.northwestern.edu</a><br>
Corrosion in 4D <a moz-do-not-send="true"
href="http://MURI4D.numis.northwestern.edu">http://MURI4D.numis.northwestern.edu</a><br>
Co-Editor, Acta Cryst A<br>
"Research is to see what everybody else has seen, and to think
what nobody else has thought"<br>
Albert Szent-Gy&ouml;rgyi</p>
<div class="gmail_quote">On Jul 10, 2015 03:05, "Imran Khan" &lt;<a
moz-do-not-send="true"
href="mailto:imrankhanswati80@gmail.com">imrankhanswati80@gmail.com</a>&gt;
wrote:<br type="attribution">
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>
<div dir="ltr"><span style="font-size:12.8000001907349px">Dear
WIEN2k experts and users,</span>
<div style="font-size:12.8000001907349px">I am using
WIEN2k version 14.2 on a cluster with the SGE queuing system,
with the Intel compiler 11.1, the MPI library mpi/openmpi-1.6.3,
and the math library fftw-3.3.4. With these options I installed
WIEN2k without any compile-time errors.</div>
<div style="font-size:12.8000001907349px">The purpose of
my calculation is to find the stable site for different
substituents in NdFeB intermetallics. </div>
<div style="font-size:12.8000001907349px">I am running the
case.struct given in the attachment with 200 k-points (a
6 6 4 mesh). My RKmax value is 7 and Gmax is 12, and I am
using the LDA+U method.</div>
<div style="font-size:12.8000001907349px">I am using the
following command: runsp_lapw -p -orb -i 80 -ec 0.0001
-cc 0.001</div>
<div style="font-size:12.8000001907349px">Every time I
submit my job, it is terminated after a few SCF cycles
with the following errors in the error tag file.</div>
<div style="font-size:12.8000001907349px">
<div><br>
</div>
<div>error: commlib error: got select error (Connection
reset by peer)</div>
<div>error: executing task of job 2424636 failed: failed
sending task to execd@tachyon1478: can't find
connection</div>
</div>
<div style="font-size:12.8000001907349px"> .</div>
<div style="font-size:12.8000001907349px"> .</div>
<div style="font-size:12.8000001907349px"> .</div>
<div style="font-size:12.8000001907349px">
<div> LAPW2 END</div>
<div> LAPW2 END</div>
<div> LAPW2 END</div>
<div> LAPW2 END</div>
<div>real 0m53.638s</div>
<div>forrtl: No such file or directory</div>
<div>forrtl: severe (29): file not found, unit 21, file
/home01/x1030imr/khan/Wien2K/Neomagnet/Pr-doped/f-site/AFM/Pr-Af/Pr-Af.scf2up_31</div>
<div>Image PC Routine
Line Source</div>
<div>sumpara 00000000004A671D Unknown
Unknown Unknown</div>
<div>sumpara 00000000004A5225 Unknown
Unknown Unknown</div>
<div>sumpara 0000000000456259 Unknown
Unknown Unknown</div>
<div>sumpara 0000000000416A5A Unknown
Unknown Unknown</div>
<div>sumpara 0000000000416250 Unknown
Unknown Unknown</div>
<div>sumpara 0000000000421E3D Unknown
Unknown Unknown</div>
<div>sumpara 0000000000410771 scfsum_
126 scfsum.f</div>
<div>sumpara 000000000040EE82 MAIN__
219 sumpara.f</div>
<div>sumpara 00000000004033DC Unknown
Unknown Unknown</div>
<div>libc.so.6 00000035AA81D974 Unknown
Unknown Unknown</div>
<div>sumpara 00000000004032E9 Unknown
Unknown Unknown</div>
<div>cp: cannot stat `.in.tmp': No such file or
directory</div>
</div>
<div style="font-size:12.8000001907349px"><br>
</div>
<div style="font-size:12.8000001907349px">I have discussed
this error with the engineers of that queuing system
(tachyon), and I have searched the mailing list as well,
but could not find a solution.</div>
<div style="font-size:12.8000001907349px">Your guidance in
solving this issue will be greatly appreciated.</div>
<div style="font-size:12.8000001907349px">Best regards</div>
<div style="font-size:12.8000001907349px">Imran.</div>
</div>
</div>
</blockquote>
</div>
</blockquote>
</body>
</html>