[Wien] Fail to parallel calculation of lapw1 and lapw2 (testpara1 and testpara2)
Gavin Abo
gsabo at crimson.ua.edu
Sun Oct 28 14:14:47 CET 2018
What does "ls -al ~/.ssh/config" give you?
That error is reproducible with Ubuntu 18.04.1 LTS:
username@computername:~$ cat ~/.ssh/config
Host *
HostName 127.0.0.1
User username
ForwardX11Trusted yes
GatewayPorts yes
GSSAPIAuthentication yes
username@computername:~$ chmod 666 ~/.ssh/config
username@computername:~$ ls -al ~/.ssh/config
-rw-rw-rw- 1 username username 131 Oct 28 06:54 /home/username/.ssh/config
username@computername:~$ ssh localhost
Bad owner or permissions on /home/username/.ssh/config
Setting proper file permissions with chmod (and ownership with chown, if
needed) does indeed fix the problem [
https://serverfault.com/questions/253313/ssh-returns-bad-owner-or-permissions-on-ssh-config
]:
username@computername:~$ chmod 644 ~/.ssh/config
username@computername:~$ ls -al ~/.ssh/config
-rw-r--r-- 1 username username 131 Oct 28 06:54 /home/username/.ssh/config
username@computername:~$ ssh localhost
...
Last login: Sun Oct 28 06:54:48 2018 from 127.0.0.1
username@computername:~$
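The permission fix can also be wrapped in a small shell helper. This is only a sketch: the function name fix_ssh_config_perms is hypothetical (it is not part of WIEN2k or OpenSSH), and it assumes the file is already owned by you, so it only clears the group/other write bits that ssh objects to.

```shell
#!/bin/sh
# Hypothetical helper, not part of WIEN2k or OpenSSH.
# Clears group/other write permission on an ssh client config file,
# e.g. turning mode 666 into 644.
fix_ssh_config_perms() {
    cfg="${1:-$HOME/.ssh/config}"   # default: the per-user ssh config
    if [ ! -f "$cfg" ]; then
        echo "no such file: $cfg" >&2
        return 1
    fi
    chmod go-w "$cfg" || return 1
    ls -l "$cfg"                    # show the resulting mode and owner
}
```

Calling fix_ssh_config_perms with no argument acts on ~/.ssh/config. If the owner is wrong as well, a separate chown would still be needed.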
Also, you might have to change "User localhost" to "User username", where
username has to be replaced by your actual user name, and HostName may
need to be changed from 0.0.0.0 to the loopback address 127.0.0.1
[ https://en.wikipedia.org/wiki/Localhost ] in your config file.
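As a sketch of that edit, the following shell function prints a copy of a config file with the HostName and User lines rewritten. The function name rewrite_ssh_config and its arguments are hypothetical, and it assumes one keyword per line as in the config shown in this thread; review the output before replacing your real ~/.ssh/config.

```shell
#!/bin/sh
# Hypothetical helper: print config file $1 with every HostName line set to
# 127.0.0.1 and every User line set to the user name given as $2.
# Assumes the user name contains no "/", "&" or "\" (special to sed).
rewrite_ssh_config() {
    src="$1"
    user="$2"
    sed -e "s/^\([[:space:]]*\)HostName[[:space:]].*/\1HostName 127.0.0.1/" \
        -e "s/^\([[:space:]]*\)User[[:space:]].*/\1User $user/" \
        "$src"
}
```

For example, rewrite_ssh_config ~/.ssh/config "$USER" > ~/.ssh/config.new writes the result to a new file that can be inspected first. The file permissions must still be set as described above.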
On 10/28/2018 4:04 AM, Woohyeon Baek wrote:
>
> Dear administrators or technicians of WIEN2k,
>
>
>
> Hello. I am a user of WIEN2k v17.1 and have now upgraded to 18.2.
>
>
>
> (The specification of my nodes is 2 CPUs with 56 threads in total
> (Xeon intel E5-2696 series) and CentOS 17.)
>
>
> (I had no installation problems with ./siteconfig when I
> compiled everything with the Intel compilers with MPI, FFTW, ScaLAPACK,
> MKL and the libxc library.)
>
>
>
> I have a problem with the parallel calculation of the lapw1 and lapw2
> modules through w2web, tunneled via PuTTY.
>
>
> (The input and results are given below.)
>
>
> When I tried to calculate my system, it repeatedly showed an error about
> *bad owner or permissions* on the config file.
>
>
> When I searched the mailing list archives and Google for a solution,
> they said that the problem is in the permissions. So
>
>
> 1. I already ran ssh-keygen and appended the key to authorized_keys,
> but it did not make any difference.
>
>
> 2. I tried changing the permissions and owner of the config file with
> the chmod and chown commands, but it did not work. (I could not find any
> other solution.)
>
>
> 3. I checked the *.error files after running testpara1 and testpara2,
> and they show nothing but "Error" without any further message.
>
>
>
> When I tried without parallelization for a small system (only 1 job),
> the calculation worked without problems.
>
>
>
> I also ran testpara for each lapw module, and lapw1 and lapw2 showed
> errors.
>
>
> It seems lapw1 runs without parallelization and lapw2 does not work.
>
>
>
> I would really appreciate it if there is a way to solve this problem.
>
>
> Thank you very much for your help in advance.
>
>
>
> (I used just 4 threads for the test to keep the output short. Of course
> I also tried using all threads, but it did not work.)
>
>
>
> *.machines file*
>
> -----------------------------
>
> granularity:1
> 1:localhost:4 (I tried my username but it did not work. I also
> tried 1:localhost, 1:localhost localhost:1 and 1:localhost 1:localhost)
> lapw0:localhost:2 localhost:2
> dstart:localhost:2 localhost:2
> nlvdw:localhost:2 localhost:2
>
> ------------------------------
>
>
> *~/.ssh/config*
>
> -------------------------
>
> Host *
>
> HostName 0.0.0.0 (I also tried my fixed IP but it did not work)
>
> User localhost
>
> ForwardX11Trusted yes
>
> GatewayPorts yes
>
> GSSAPIAuthentication yes
>
> -------------------------
>
>
> *SCF results*
>
> -----------------------------------------------------------------------------------------------------------------------------------------------------
>
> changing 1.in2c
> changing 1.in2_ls
> changing 1.in2_st
> changing 1.in2_sy
> LAPW0 END
> [1] Done mpirun -np 4 -machinefile .machine0 /home/User/software/WIEN2K/lapw0_mpi lapw0.def >> .time00
> DFTD3 END
> *Bad owner or permissions on /home/User/.ssh/config*
> [1] + Exit 255 ( $remote $remotemachine "cd $PWD;$t $ttt;rm -f .lock_$lockfile[$p]" ) >> .time1_$loop
> cat: .time1_1: No such file or directory
> cat: .time1_1: No such file or directory
> 1.scf1up_1: No such file or directory.
> cat: No match.
> grep: No match.
> grep: No match.
> grep: No match.
> > stop error
>
> ------------------------------------------------------------------------------------------------------------------------------
>
> *testpara*
>
> ------------------------------------------------------------
>
> #####################################################
> # TESTPARA #
> #####################################################
>
> Test: LAPW1 in parallel mode (using .machines)
> Granularity set to 1
> Extrafine unset
> weights: 1
> sumw: 1
> k-points: 30
>
> klist: 30
> machines: localhost
> procs: 1
> weigh(old): 1
> sumw: 1
> granularity: 1
> weigh(new): 30
>
> Distribution of k-point (under ideal conditions)
> will be:
>
> 1 : localhost(30) 30k
>
> -------------------------------------------------------
>
>
> *testpara1*
>
> -------------------------------------------------------------
>
> #####################################################
> # TESTPARA1 #
> #####################################################
>
> Sun Oct 28 18:12:33 KST 2018
>
> lapw1para is running
> 30 of 30 (100%) k-points distributed
> localhost: running
> localhost: not running
> localhost: not running
> localhost: not running
> ------------------------------------------------------
>
> *testpara2*
>
> --------------------------------------------------------------
>
> #####################################################
> # TESTPARA2 #
> #####################################################
>
> Sun Oct 28 18:12:47 KST 2018
>
> lapw2para exited due to an ERROR
> Check *.error files
>
> ---------------------------------------------------------------
>
>
>
> Sincerely,
>
>
> Woohyeon Baek
>