[Wien] parallel ssh error
Peter Blaha
pblaha at theochem.tuwien.ac.at
Mon Sep 30 08:59:07 CEST 2019
So there is progress: the environment now seems to be accepted in the
remote shell.
lapw1para (called by x_lapw, which is called by run_lapw -p) creates the
split klist files (case.klist_1,...) and the def files (lapw1_1.def,...).
It uses the $cwd variable and basically executes:
ssh vlsi1 "cd $cwd; lapw1c lapw1_1.def "
Does this work on your computers?
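
A minimal sketch of how this test could be run by hand is given here. The
function names remote_cmd and check_on_path are made up for illustration and
are not part of WIEN2k; the host name vlsi1 and the file names are taken from
this thread, and $PWD is assumed to stand in for the csh variable $cwd. Note
that "bash: lapw1c: command not found" in the quoted output means the remote
shell did not find lapw1c in its PATH, which is a separate question from
whether $WIENROOT expands there.

```shell
# Sketch only: helper names are hypothetical, not WIEN2k commands.

# Build the command string that is handed to ssh, in the form quoted above.
#   $1 = working directory, $2 = executable, $3 = def file
remote_cmd() {
    printf 'cd %s; %s %s' "$1" "$2" "$3"
}

# Report whether each named program is found on PATH in the current shell.
# Returns 0 if all were found, 1 if at least one is missing.
check_on_path() {
    rc=0
    for prog in "$@"; do
        if command -v "$prog" >/dev/null 2>&1; then
            echo "found: $prog"
        else
            echo "missing: $prog"
            rc=1
        fi
    done
    return "$rc"
}

# Example use (not executed here; needs a reachable host):
#   ssh vlsi1 "$(remote_cmd "$PWD" lapw1c lapw1_1.def)"
#   ssh vlsi1 'echo $PATH; which lapw1c fixerror_lapw'
```

If the second ssh line shows a PATH without the WIEN2k directory, that would
be consistent with the "command not found" messages quoted below.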
On 9/29/19 7:16 PM, Indranil mal wrote:
> Now echo $WIENROOT is giving the $WIENROOT location.
>
> echo $WIENROOT/lapw*
>
> /home/username/WIEN2K/lapw0 /home/username/WIEN2K/lapw0_mpi
> /home/username/WIEN2K/lapw0para /home/username/WIEN2K/lapw0para_lapw
> /home/username/WIEN2K/lapw1 /home/username/WIEN2K/lapw1c
> /home/username/WIEN2K/lapw1c_mpi /home/username/WIEN2K/lapw1cpara
> /home/username/WIEN2K/lapw1_mpi /home/username/WIEN2K/lapw1para
> /home/username/WIEN2K/lapw1para_lapw /home/username/WIEN2K/lapw2
> /home/username/WIEN2K/lapw2c /home/username/WIEN2K/lapw2c_mpi
> /home/username/WIEN2K/lapw2cpara /home/username/WIEN2K/lapw2_mpi
> /home/username/WIEN2K/lapw2para /home/username/WIEN2K/lapw2para_lapw
> /home/username/WIEN2K/lapw3 /home/username/WIEN2K/lapw3c
> /home/username/WIEN2K/lapw5 /home/username/WIEN2K/lapw5c
> /home/username/WIEN2K/lapw7 /home/username/WIEN2K/lapw7c
> /home/username/WIEN2K/lapwdm /home/username/WIEN2K/lapwdmc
> /home/username/WIEN2K/lapwdmcpara /home/username/WIEN2K/lapwdmpara
> /home/username/WIEN2K/lapwdmpara_lapw /home/username/WIEN2K/lapwso
> /home/username/WIEN2K/lapwsocpara /home/username/WIEN2K/lapwso_mpi
> /home/username/WIEN2K/lapwsopara /home/username/WIEN2K/lapwsopara_lapw
>
> ssh vlsi1 'echo $WIENROOT/lapw*'
>
> /home/username/WIEN2K/lapw0 /home/username/WIEN2K/lapw0_mpi
> /home/username/WIEN2K/lapw0para /home/username/WIEN2K/lapw0para_lapw
> /home/username/WIEN2K/lapw1 /home/username/WIEN2K/lapw1c
> /home/username/WIEN2K/lapw1c_mpi /home/username/WIEN2K/lapw1cpara
> /home/username/WIEN2K/lapw1_mpi /home/username/WIEN2K/lapw1para
> /home/username/WIEN2K/lapw1para_lapw /home/username/WIEN2K/lapw2
> /home/username/WIEN2K/lapw2c /home/username/WIEN2K/lapw2c_mpi
> /home/username/WIEN2K/lapw2cpara /home/username/WIEN2K/lapw2_mpi
> /home/username/WIEN2K/lapw2para /home/username/WIEN2K/lapw2para_lapw
> /home/username/WIEN2K/lapw3 /home/username/WIEN2K/lapw3c
> /home/username/WIEN2K/lapw5 /home/username/WIEN2K/lapw5c
> /home/username/WIEN2K/lapw7 /home/username/WIEN2K/lapw7c
> /home/username/WIEN2K/lapwdm /home/username/WIEN2K/lapwdmc
> /home/username/WIEN2K/lapwdmcpara /home/username/WIEN2K/lapwdmpara
> /home/username/WIEN2K/lapwdmpara_lapw /home/username/WIEN2K/lapwso
> /home/username/WIEN2K/lapwsocpara /home/username/WIEN2K/lapwso_mpi
> /home/username/WIEN2K/lapwsopara /home/username/WIEN2K/lapwsopara_lapw
>
>
> However, I am still getting the same error:
>
>
>
>
>> stop error
>
> grep: *scf1*: No such file or directory
> cp: cannot stat '.in.tmp': No such file or directory
> FERMI - Error
> grep: *scf1*: No such file or directory
> Parallel.scf1_1: No such file or directory.
> bash: fixerror_lapw: command not found
> bash: lapw1c: command not found
> bash: fixerror_lapw: command not found
> bash: lapw1c: command not found
> bash: fixerror_lapw: command not found
> bash: lapw1c: command not found
> bash: fixerror_lapw: command not found
> bash: lapw1c: command not found
> bash: fixerror_lapw: command not found
> bash: lapw1c: command not found
> bash: fixerror_lapw: command not found
> bash: lapw1c: command not found
> LAPW0 END
> hup: Command not found.
>
>
> and the lapw2 error file contains:
>
> 'LAPW2' - can't open unit: 30
> 'LAPW2' - filename: Parallel.energy_1
> ** testerror: Error in Parallel LAPW2
>
>
>
> On Sat, Sep 28, 2019 at 11:58 PM Gavin Abo <gsabo at crimson.ua.edu> wrote:
>
> The "sudo service sshd restart" step, which I forgot to copy and
> paste, was missing; it is added below.
>
> On 9/28/2019 12:18 PM, Gavin Abo wrote:
>>
>> After you set both "SendEnv *" and "AcceptEnv *", did you restart
>> the sshd service [1]? The following illustrates steps that might
>> help you verify that WIENROOT appears on a remote vlsi node:
>>
>> username at computername:~$ echo $WIENROOT
>>
>> username at computername:~$ export WIENROOT=/servernode1
>> username at computername:~$ echo $WIENROOT
>> /servernode1
>> username at computername:~$ ssh vlsi
>> Welcome to Ubuntu 18.04.3 LTS (GNU/Linux 4.15.0-64-generic x86_64)
>> ...
>> Last login: Sat Sep 28 12:04:07 2019 from xxx.x.x.x
>> username at computername:~$ echo $WIENROOT
>>
>> username at computername:~$ exit
>> logout
>> Connection to vlsi closed.
>> username at computername:~$ sudo gedit /etc/ssh/ssh_config
>> [sudo] password for username:
>>
>> username at computername:~$ sudo gedit /etc/ssh/sshd_config
>>
>> username at computername:~$ grep SendEnv /etc/ssh/ssh_config
>> SendEnv LANG LC_* WIENROOT
>> username at computername:~$ grep AcceptEnv /etc/ssh/sshd_config
>> AcceptEnv LANG LC_* WIENROOT
>>
> username at computername:~$ sudo service sshd restart
>>
>> username at computername:~$ ssh vlsi
>> ...
>> username at computername:~$ echo $WIENROOT
>> /servernode1
>> username at computername:~$ exit
>>
>> [1]
>> https://askubuntu.com/questions/462968/take-changes-in-file-sshd-config-file-without-server-reboot
>>
>> On 9/28/2019 11:22 AM, Indranil mal wrote:
>>> Sir, I have tried with "SetEnv *". Still nothing comes back from the
>>> echo command. I posted the wrong user name by mistake; otherwise there
>>> is no issue with the user name. In the parallel options file I have
>>> set TASKSET to "no", and the remote options are 1 1 on the server and
>>> client machines.
>>>
>>>
>>> On Sat, 28 Sep 2019 11:36 Gavin Abo, <gsabo at crimson.ua.edu> wrote:
>>>
>>>> Respected Sir, on my Linux (Ubuntu 18.04 LTS), ssh_config and
>>>> sshd_config already contain the lines "SendEnv LANG LC_*" and
>>>> "AcceptEnv LANG LC_*", respectively.
>>>
>>> The "LANG LC_*" setting probably passes only the local language
>>> variables to the remote environment. Did you follow the
>>> previous advice [1] of trying "*" to pass all variables
>>> from the local environment?
>>>
>>> [1]
>>> https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg19049.html
>>>
>>>> However, ssh vlsi1 'echo $WIENROOT' gives nothing (blank).
>>>
>>> That seems to be the main cause of the problem as it should
>>> not return (blank) but needs to return "/servernode1" as you
>>> previously mentioned [2].
>>>
>>> [2]
>>> https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg19036.html
>>>
>>> Perhaps the message below is a clue. If you set the
>>> WIENROOT variable in .bashrc of your /home/vlsi accounts on
>>> each system, you likely have to log in with that same
>>> /home/vlsi account on the head node, as the output below seems
>>> to indicate a login to a different /home/niel account.
>>> Alternatively, setting the WIENROOT variable in .bashrc of
>>> all /home/niel accounts on each node might work too.
>>>
>>>> The command ssh vlsi1 'pwd $WIENROOT' prints "/home/vlsi",
>>>> the common home directory, and
>>>> ssh vlsi1 "env"
>>>> ...
>>>> USER=niel
>>>> PWD=/home/niel
>>>> HOME=/home/niel
>>>> ...
>>>> this is similar on the server and the other nodes.
>>>>
>>>> Sir, after changing the parallel options file in $WIENROOT on the
>>>> server to the following (TASKSET changed from "no" to "yes"):
>>>>
>>>> setenv TASKSET "yes"
>>>> if ( ! $?USE_REMOTE ) setenv USE_REMOTE 1
>>>> if ( ! $?MPI_REMOTE ) setenv MPI_REMOTE 1
>>>> setenv WIEN_GRANULARITY 1
>>>> setenv DELAY 0.1
>>>> setenv SLEEPY 1
>>>> setenv WIEN_MPIRUN "mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_"
>>>> setenv CORES_PER_NODE 1
>>>>
>>>> the error no longer appears, but the run does not advance
>>>> past lapw0; it is stuck in lapw1.
>>>
>>> Since with TASKSET set to "no" it at least threw an appropriate
>>> error message, unlike when set to "yes", you should probably
>>> change it back to "no".
>>>
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at <mailto:Wien at zeus.theochem.tuwien.ac.at>
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
> SEARCH the MAILING-LIST at:
> http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
>
>
--
P.Blaha
--------------------------------------------------------------------------
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300 FAX: +43-1-58801-165982
Email: blaha at theochem.tuwien.ac.at WIEN2k: http://www.wien2k.at
WWW: http://www.imc.tuwien.ac.at/TC_Blaha
--------------------------------------------------------------------------