<div dir="ltr"><div>
<b style="font-size:small;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial">Dear Dr. <span style="color:rgb(32,33,36);font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;font-size:0.875rem;letter-spacing:0.2px;white-space:nowrap">Gavin Abo</span></b>
<br></div><div>Actually, the problem was solved by adding the hostname to the hosts file on all the nodes, not only on the master node.</div><div><br></div><div>Now the calculation works very well, but at each execution of LAPW0 in the SCF cycle I get this error, which does not affect the calculations:</div><div><i>"calcul.23539PSM2 no hfi units are available (err=23)"</i><br></div><div><br></div><div>I would be grateful if you could help me solve this problem, even though it does not affect the calculations.<br></div></div><br><div class="gmail_quote"><div dir="ltr">On Thu, 19 Jul 2018 at 14:06, Gavin Abo <<a href="mailto:gsabo@crimson.ua.edu">gsabo@crimson.ua.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<p><font face="Times New Roman">A response off the mailing list:</font></p>
<p><font face="Times New Roman">I currently do not know. Your
.machines file seems fine.<br>
</font></p>
<p><font face="Times New Roman">I don't know for sure, but the
calcul.local appears to be coming from your mpi program and
maybe not WIEN2k. You do not say what mpi program package
(openmpi, intelmpi, MPICH, or other) you are using. Though,
from the error message, it looks like you may be using MPICH.
If you are using MPICH, I have limited experience with it. So,
you may have to ask the MPICH experts about the "unable to get
host address" and "unable to connect to server" errors [
<a class="m_308653021920269363moz-txt-link-freetext" href="https://www.mpich.org/support/" target="_blank">https://www.mpich.org/support/</a> ].</font></p>
<p><font face="Times New Roman">Since you did not mention it, I assume
you're using one of the latest WIEN2k versions (WIEN2k 18.1 or
18.2). There may have been some WIEN2k mpi bugs in previous
versions. So, if you are using an older version, you may want to
try the latest WIEN2k 18.2 version to see if it resolves
the problem.<br>
</font></p>
<p><font face="Times New Roman">You might try resolving the hostnames
and checking the IP addresses.</font></p>
<p><font face="Times New Roman">Check whether the IP address set
in the hosts file for calcul.local is the same as or different
from those of master, node1, and node2.</font></p>
<p><font face="Times New Roman">For example, I think you can resolve
a hostname to an IP address by running the following terminal
commands on the cluster:</font></p>
<font face="Times New Roman">ping -c 1 calcul.local</font><br>
<font face="Times New Roman"><font face="Times New Roman">ping -c 1
master</font></font><br>
<font face="Times New Roman"><font face="Times New Roman">ping -c 1
node1</font></font><br>
<font face="Times New Roman"><font face="Times New Roman"><font face="Times New Roman"><font face="Times New Roman">ping -c 1
node2<br>
<br>
After running the above ping commands on the master node, you
may want to repeat them on each of the subnodes, such as
node1, after first logging in with, for example:<br>
<br>
ssh node1<br>
<br>
For example, maybe on the master, it can resolve the IP
address from </font></font></font></font><font face="Times
New Roman"><font face="Times New Roman"><font face="Times New
Roman"><font face="Times New Roman"><font face="Times New
Roman">calcul.local. However, if you log in to node2
(ssh node2), maybe node2 cannot resolve the IP address
from calcul.local. That may be another possible cause of
the problem.<br>
<br>
Unfortunately, since I don't have access to a system
showing that exact error, it is hard to see why those
errors are happening, as there seem to be many possible
causes of that problem rather than a single one.</font> <br>
<br>
Kind Regards,<br>
<br>
Gavin<br>
<br>
</font></font></font></font>
<div class="m_308653021920269363moz-cite-prefix">On 7/19/2018 5:05 AM, karima Physique
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr"><b>Thank you, Dr. <span style="color:rgb(32,33,36);font-family:Roboto,RobotoDraft,Helvetica,Arial,sans-serif;font-size:0.875rem;letter-spacing:0.2px;white-space:nowrap">Gavin
Abo</span></b>
<div>
<div><font face="Roboto, RobotoDraft, Helvetica, Arial,
sans-serif" color="#202124"><span style="font-size:14px;letter-spacing:0.2px;white-space:nowrap"><b>I
checked the /etc/hosts file and it is OK,</b></span></font></div>
<div><font face="Roboto, RobotoDraft, Helvetica, Arial,
sans-serif" color="#202124"><span style="font-size:14px;letter-spacing:0.2px;white-space:nowrap"><b>but
why does lapw1_mpi work fine on all the nodes while
dstart_mpi and lapw0_mpi do not work on the nodes?</b></span></font></div>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr">On Thu, 19 Jul 2018 at 04:23, Gavin Abo <<a href="mailto:gsabo@crimson.ua.edu" target="_blank">gsabo@crimson.ua.edu</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<p>As the error message says, one possible cause is the
connection being blocked by a firewall.<br>
</p>
<p>Another possible cause is a passwordless SSH access
problem:<br>
</p>
<p><a class="m_308653021920269363m_6907484685154782269moz-txt-link-freetext" href="https://stackoverflow.com/questions/19565795/unable-to-execute-mpich2-on-multiple-machines-on-ubuntu-12-04-hydu-sock-connect" target="_blank">https://stackoverflow.com/questions/19565795/unable-to-execute-mpich2-on-multiple-machines-on-ubuntu-12-04-hydu-sock-connect</a></p>
<p>Yet another possible cause is a problem resolving the
DNS hostname:</p>
<a class="m_308653021920269363m_6907484685154782269moz-txt-link-freetext" href="https://forums.suse.com/archive/index.php/t-6057.html" target="_blank">https://forums.suse.com/archive/index.php/t-6057.html</a><br>
<a class="m_308653021920269363m_6907484685154782269moz-txt-link-freetext" href="https://www.slothparadise.com/running-mpi-common-mpi-troubleshooting-problems/" target="_blank">https://www.slothparadise.com/running-mpi-common-mpi-troubleshooting-problems/</a><br>
<p>Since /etc/hosts usually cannot be edited by a user, the
cluster administrator would have to fix the hosts file if
that happens to be the source of the problem.</p>
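<p>To check whether the hosts file is the culprit, one can look at what each name resolves to on every node. Here is a minimal sketch, assuming a Linux system where <code>getent</code> and <code>awk</code> are available; the helper names <code>resolve_host</code> and <code>report_hosts</code> are hypothetical, and the hostnames in the usage comment are the ones mentioned in this thread.</p>

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the first address that a name resolves to
# through the system resolver (which normally consults /etc/hosts),
# or the word UNRESOLVED if the lookup returns nothing.
resolve_host() {
    local addr
    addr=$(getent hosts "$1" | awk '{print $1; exit}')
    echo "${addr:-UNRESOLVED}"
}

# Hypothetical helper: print one "name address" line per argument.
report_hosts() {
    local name
    for name in "$@"; do
        printf '%s %s\n' "$name" "$(resolve_host "$name")"
    done
}

# Example usage (hostnames from this thread; adjust to your cluster):
#   report_hosts calcul.local master node1 node2
# A single name can also be checked remotely, for example:
#   ssh node1 getent hosts calcul.local
```

<p>If a node prints UNRESOLVED, or a different address than the other nodes for the same name, its hosts file probably differs from the others.</p>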
<div class="m_308653021920269363m_6907484685154782269moz-cite-prefix">On
7/18/2018 6:07 PM, karima Physique wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Dear WIEN2k users:
<div><br>
</div>
<div>Using the following .machines file: </div>
<div>lapw0:master:12</div>
<div><span style="font-size:small;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">dstart:master:12</span></div>
<div> 1:master:12</div>
<div>1:node1:12</div>
<div> <span style="font-size:small;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">1:node2:12</span>
<br>
</div>
<div>......</div>
<div>the calculation works very well, but using the
following machines file:<br>
</div>
<div>
<div style="font-size:small;text-decoration-style:initial;text-decoration-color:initial">lapw0:master:12
<span style="background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">node1:12
<span style="font-size:small;text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">node2:12</span>
</span></div>
<div style="font-size:small;text-decoration-style:initial;text-decoration-color:initial"><span style="font-size:small;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">dstart:master:12
<span style="text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">node1:12
<span style="font-size:small;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">node2:12</span>
</span></span></div>
<div style="font-size:small;text-decoration-style:initial;text-decoration-color:initial">1:master:12</div>
<div style="font-size:small;text-decoration-style:initial;text-decoration-color:initial">1:node1:12</div>
<div style="font-size:small;text-decoration-style:initial;text-decoration-color:initial"><span style="font-size:small;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">1:node2:12</span><span> </span></div>
.......</div>
<div>I got the following error:<br>
</div>
<div><br>
</div>
<div>unable to get host address calcul.local for (1)</div>
<div> <span style="font-size:small;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">unable
to<span> </span></span> connect to server <span style="font-size:small;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">calcul.local</span>
at port 44295 (check for firewalls!)<br>
</div>
<div>Note that <span style="font-size:small;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">calcul.local
is the host used to connect to w2web.</span></div>
<div><span style="background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">I
would appreciate any suggestions to solve this problem.<br>
</span></div>
</div>
</blockquote>
</div>
_______________________________________________<br>
Wien mailing list<br>
<a href="mailto:Wien@zeus.theochem.tuwien.ac.at" target="_blank">Wien@zeus.theochem.tuwien.ac.at</a><br>
<a href="http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien" rel="noreferrer" target="_blank">http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien</a><br>
SEARCH the MAILING-LIST at: <a href="http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html" rel="noreferrer" target="_blank">http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html</a><br>
</blockquote>
</div>
<br>
</blockquote>
<br>
</div>
</blockquote></div>