[Wien] parallel ssh error

Indranil mal indranil.mal at gmail.com
Sat Sep 28 14:53:04 CEST 2019


Respected Sir, on my Linux (Ubuntu 18.04 LTS) the lines "SendEnv LANG LC_*" and
"AcceptEnv LANG LC_*" are already present in ssh_config and sshd_config,
respectively. However, ssh vlsi1 'echo $WIENROOT' prints nothing (blank).
The command ssh vlsi1 'pwd $WIENROOT' prints "/home/vlsi", the common home
directory (pwd ignores its argument, so this only reflects the remote login
directory, not $WIENROOT), and ssh vlsi1 "env" gives:
SSH_CONNECTION=172.27.46.251 44138 172.27.46.233 22
LANG=en_IN
XDG_SESSION_ID=47
USER=niel
PWD=/home/niel
HOME=/home/niel
SSH_CLIENT=172.27.46.251 44138 22
LC_NUMERIC=POSIX
MAIL=/var/mail/niel
SHELL=/bin/bash
SHLVL=1
LANGUAGE=en_IN:en
LOGNAME=niel
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
XDG_RUNTIME_DIR=/run/user/1000
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
_=/usr/bin/env
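
Looking at this env output, neither WIENROOT nor the WIEN2k entries in PATH
reach the remote non-interactive shell. As a minimal sketch of what I now plan
to try (my own guess, assuming $WIENROOT is /servernode1 on every node as
described below), I will export the variables near the top of ~/.bashrc on
each node, above Ubuntu's default "If not running interactively, don't do
anything" guard, since ssh vlsi1 'command' apparently never reaches lines
placed after that guard:

# in ~/.bashrc on every node, ABOVE the "case $- in *i*) ;; *) return;; esac" block
export WIENROOT=/servernode1
export PATH=$WIENROOT:$PATH
export SCRATCH=./

Alternatively, with root access, SendEnv/AcceptEnv could be widened beyond
LANG LC_* (for example "SendEnv *" in ssh_config and "AcceptEnv *" in
sshd_config, as suggested below), followed by restarting the ssh service
(sudo systemctl restart ssh on Ubuntu), so that a locally exported WIENROOT
is forwarded to the nodes.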

The output is similar on the server and on the other nodes.


Sir, after changing the parallel options file in $WIENROOT on the server to the
following (TASKSET changed from "no" to "yes"):

setenv TASKSET "yes"
if ( ! $?USE_REMOTE ) setenv USE_REMOTE 1
if ( ! $?MPI_REMOTE ) setenv MPI_REMOTE 1
setenv WIEN_GRANULARITY 1
setenv DELAY 0.1
setenv SLEEPY 1
setenv WIEN_MPIRUN "mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_"
setenv CORES_PER_NODE 1

the error no longer appears, but the calculation does not advance past lapw0;
it gets stuck in lapw1.
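
As a quick self-check (my own sketch, not taken from the list), I will verify
from the server that the shells started by ssh actually find the WIEN2k
executables before retrying the parallel run, and look at the error files in
the working directory when lapw1 hangs (the case seems to be called
"Parallel" here):

ssh vlsi1 'echo $WIENROOT'               # should print /servernode1
ssh vlsi1 'which lapw1c fixerror_lapw'   # should print paths under /servernode1
cat *.error                              # per-program error files in the case directory
cat Parallel.dayfile                     # shows how far the parallel cycle got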


What should the parallel options file contain on the server and on all the
client nodes?
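
For completeness, a sketch of what I currently plan to keep as the parallel
options file on the server (the same lines as above, only with TASKSET put
back to its previous value "no"); I am not sure whether any of the other
values need to change. As far as I understand, with only k-point lines such
as 1:vlsi1 in the .machines file the mpirun template is not used at all:

setenv TASKSET "no"
if ( ! $?USE_REMOTE ) setenv USE_REMOTE 1
if ( ! $?MPI_REMOTE ) setenv MPI_REMOTE 1
setenv WIEN_GRANULARITY 1
setenv DELAY 0.1
setenv SLEEPY 1
setenv WIEN_MPIRUN "mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_"
setenv CORES_PER_NODE 1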



On Fri, Sep 27, 2019 at 12:05 PM Peter Blaha <pblaha at theochem.tuwien.ac.at>
wrote:

> Ok. So the problem seems to be that in your linux the ssh does not
> send/accept the "environment".
>
> What do you get with:
>
> ssh vsli2 'echo $WIENROOT'
>
> If you have root permissions, I suggest to do the following:
>
> At least on my Linux (Suse) there is a  /etc/ssh   directory, with files
>
> ssh_config and sshd_config.
>
> Edit these files and add lines:
> SendEnv *      # in ssh_config
> AcceptEnv *    # in sshd_config
>
>
>
> On 9/27/19 11:20 AM, Indranil mal wrote:
> > Respected Sir, as per your suggestion I have successfully run a single
> > process with one iteration and encountered no issue on any of the nodes.
> > However, when running in parallel I face the same error:
> >
> > grep: *scf1*: No such file or directory
> > cp: cannot stat '.in.tmp': No such file or directory
> > FERMI - Error
> > grep: *scf1*: No such file or directory
> > Parallel.scf1_1: No such file or directory.
> > bash: fixerror_lapw: command not found
> > bash: lapw1c: command not found
> > bash: fixerror_lapw: command not found
> > bash: lapw1c: command not found
> > bash: fixerror_lapw: command not found
> > bash: lapw1c: command not found
> > bash: fixerror_lapw: command not found
> > bash: lapw1c: command not found
> > bash: fixerror_lapw: command not found
> > bash: lapw1c: command not found
> > bash: fixerror_lapw: command not found
> > bash: lapw1c: command not found
> > bash: fixerror_lapw: command not found
> > bash: lapw1c: command not found
> > bash: fixerror_lapw: command not found
> > bash: lapw1c: command not found
> >   LAPW0 END
> > hup: Command not found.
> >
> > Previously I made a mistake with the user name and home directory; now
> > on all the PCs the user name and the home directory are the same
> > (/home/vlsi), and the working directory is accessible from every node.
> >
> >   (ls -l $WIENROOT/lapw1c
> > -rwxr-xr-x 1 vlsi vlsi 2151824 Sep 26 02:41 /servernode01/lapw1c)
> > This output is the same on all the PCs.
> >
> >
> >
> >
> > On Thu, Sep 26, 2019 at 1:27 PM Peter Blaha
> > <pblaha at theochem.tuwien.ac.at> wrote:
> >
> >     First of all, one of the errors was: lapw1c: command not found
> >
> >     You showed us only the existence of "lapw1", not "lapw1c" with the ls
> >     commands.
> >
> >     However, since you also have:  fixerror_lapw: command not found
> >
> >     I don't think that this is the problem.
> >
> >     -------------
> >     I'm more concerned about the different usernames/owners of lapw1 on
> >     different computers.
> >     It is not important who owns $WIENROOT/*, as long as everybody has
> >     r-x permissions.
> >
> >     However, what is your username and your home-directory on the different
> >     machines ? It must be the same ! And do you have access to the actual
> >     working directory ?
> >     In what directory did you start the calculations?
> >     Is it a directory called "Parallel" ? What is the full path of that on
> >     every computer (/casenode1/Parallel ?)
> >     ----------------------
> >
> >     First check would be:
> >
> >     On vlsi1 change into the working directory (Parallel ?) and run one
> >     iteration without parallelization:   run -i 1
> >
> >     then login to   ssh vsli2 (passwordless), cd into "Parallel" and do
> >     another non-parallel cycle.  Does it work ?
> >     -----------
> >
> >
> >     On 9/26/19 11:48 AM, Indranil mal wrote:
> >      > Dear developers and users,
> >      > I have 5 individual Linux (Ubuntu) PCs, each with an Intel i7
> >      > octa-core processor and 16 GB RAM, connected via a 1 Gbps LAN.
> >      > Passwordless ssh works properly. I have installed WIEN2k 19 on one
> >      > machine (M1, the server) in the directory "/servernode1", and the
> >      > case directory is "/casenode1". Through NFS I have mounted
> >      > "servernode1" and "casenode1" on the other four PCs under local
> >      > folders with the same names. I have installed the Intel compilers,
> >      > libxc, fftw, and elpa on all the nodes individually. I have manually
> >      > edited the bash file ($WIENROOT path and case directory) and the
> >      > WIEN2k options file, keeping all the values on the client nodes the
> >      > same as on the server node.
> >      >
> >      > alias cdw="cd /casenode1"
> >      > export OMP_NUM_THREADS=4
> >      > #export LD_LIBRARY_PATH=.....
> >      > export EDITOR="emacs"
> >      > export SCRATCH=./
> >      > export WIENROOT=/servernode1
> >      > export W2WEB_CASE_BASEDIR=/casenode1
> >      > export STRUCTEDIT_PATH=$WIENROOT/SRC_structeditor/bin
> >      >
> >      > Now, when I am doing parallel calculations with all the client
> >      > nodes in the .machines file:
> >      > # k-points are left, they will be distributed to the
> >      > residual-machine_name.
> >      > #
> >      > 1:vlsi1
> >      > 1:vlsi2
> >      > 1:vlsi3
> >      > 1:vlsi4
> >      >
> >      > granularity:1
> >      > extrafine:1
> >      > #
> >      >
> >      >
> >      > and I get the following error:
> >      >
> >      > grep: *scf1*: No such file or directory
> >      > cp: cannot stat '.in.tmp': No such file or directory
> >      > FERMI - Error
> >      > grep: *scf1*: No such file or directory
> >      > Parallel.scf1_1: No such file or directory.
> >      > bash: fixerror_lapw: command not found
> >      > bash: lapw1c: command not found
> >      > bash: fixerror_lapw: command not found
> >      > bash: lapw1c: command not found
> >      >   LAPW0 END
> >      > hup: Command not found.
> >      >
> >      > ################### lapw2 error file
> >      >   'LAPW2' - can't open unit: 30
> >      >   'LAPW2' -        filename: Parallel.energy_1
> >      > **  testerror: Error in Parallel LAPW2
> >      >
> >      > I have checked with "ls -l $WIENROOT/lapw1" as suggested in the
> >      > previous mailing list posts and got:
> >      > -rwxr-xr-x 1 vlsi vlsi 2139552 Sep 26 02:41 /servernode1/lapw1
> >      > on the server (vlsi is the user name on the server)
> >      > -rwxr-xr-x 1 vlsi1 vlsi1 2139552 Sep 26 02:41 /servernode1/lapw1
> >      > (on node1 the user name is vlsi1)
> >      > -rwxr-xr-x 1 vlsi2 vlsi2 2139552 Sep 26 02:41 /servernode1/lapw1
> >      > (on node2 the user name is vlsi2)
> >      > Please help.
> >      >
> >      >
> >      > thanking you
> >      > Indranil
> >      >
> >      >
> >      >
> >      >
> >
> >     --
> >
> >                                             P.Blaha
> >     --------------------------------------------------------------------------
> >     Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
> >     Phone: +43-1-58801-165300             FAX: +43-1-58801-165982
> >     Email: blaha at theochem.tuwien.ac.at    WIEN2k: http://www.wien2k.at
> >     WWW: http://www.imc.tuwien.ac.at/TC_Blaha
> >     --------------------------------------------------------------------------
> >
>
> --
>
>                                        P.Blaha
> --------------------------------------------------------------------------
> Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
> Phone: +43-1-58801-165300             FAX: +43-1-58801-165982
> Email: blaha at theochem.tuwien.ac.at    WIEN2k: http://www.wien2k.at
> WWW:   http://www.imc.tuwien.ac.at/TC_Blaha
> --------------------------------------------------------------------------