[Wien] Fail to parallel calculation of lapw1 and lapw2 (testpara1 and testpara2)
Peter Blaha
pblaha at theochem.tuwien.ac.at
Sun Oct 28 16:29:54 CET 2018
Do you really have "User localhost" in your config file?
localhost should be your hostname, not a user name.
I'd mv config to config_save (usually one does not need a config file).
Try:
ssh localhost
Can you log in without being asked for a username/password?
If not, either the hostname is not supported, or your authorized_keys
file (why key_authorized?) is wrong.
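
The steps above could be sketched roughly as follows. This is only a minimal sketch: fix_ssh_dir is a hypothetical helper name (not part of WIEN2k or OpenSSH), and it assumes the ssh directory is ~/.ssh unless another directory is passed as the first argument. It moves the config file aside and tightens the permissions that OpenSSH usually insists on.

```shell
#!/bin/sh
# Hypothetical helper (the name is an assumption, not a WIEN2k or OpenSSH tool).
# Usage: fix_ssh_dir [DIR]    DIR defaults to ~/.ssh
fix_ssh_dir() {
    dir="${1:-$HOME/.ssh}"

    # Move the client config out of the way; it is usually not needed.
    if [ -f "$dir/config" ]; then
        mv "$dir/config" "$dir/config_save"
    fi

    # Only the owner should have access to the directory and the key list.
    chmod 700 "$dir"
    if [ -f "$dir/authorized_keys" ]; then
        chmod 600 "$dir/authorized_keys"
    fi
}

# Afterwards, test the login by hand. BatchMode=yes makes ssh fail
# instead of prompting for a password:
#   fix_ssh_dir
#   ssh -o BatchMode=yes localhost true && echo "passwordless login works"
```

If you do want to keep a config file, the "Bad owner or permissions" message normally goes away once the file is owned by you and not writable by anyone else (for example chmod 600 ~/.ssh/config).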
On 28.10.2018 at 11:04, Woohyeon Baek wrote:
> Dear administrators or technicians of WIEN2k,
>
>
>
> Hello. I am a user of WIEN2k v17.1 and have now upgraded to 18.2.
>
>
>
> (The specification of my nodes is 2 CPUs with 56 threads in total (Xeon
> intel E5-2696 series) and CentOS 17.)
>
>
> (I had no installation problems for ./siteconfig when I
> compiled all with intel compilers with mpi, fftw, scalapack, mkl and
> libxc library.)
>
>
>
> I have a problem with the parallel calculation of the lapw1 and lapw2
> modules through w2web, tunneled via PuTTY.
>
>
> (The input text and results are below.)
>
>
> When I tried to calculate my system, it constantly showed an error about
> *bad owner or permissions* on the config file.
>
>
> When I searched the archives and Google for a solution, they said that
> the problem is in the authorizations. So
>
>
> 1. I already ran the ssh-keygen command and appended the key to
> key_authorized, but it did not make any difference.
>
>
> 2. I tried changing the permissions and ownership of the config file
> with the chmod and chown commands, but it did not work. (I could not
> find any other solutions.)
>
>
> 3. I checked the *.error files of the testpara1 and testpara2 results,
> and they show nothing but Error without any comments.
>
>
>
> When I tried without parallelization for a small system (only 1 job),
> the calculation worked without problems.
>
>
>
> I also ran testpara for each lapw module, and lapw1 and lapw2 showed errors.
>
>
> It seems lapw1 runs without parallelization and lapw2 does not work.
>
>
>
> I would really appreciate it if there is a way to solve these problems.
>
>
> Thank you very much in advance for your help.
>
>
>
> (I used just 4 threads for the test to keep the output short. Of course
> I also tried using all threads, but it did not work.)
>
>
>
> *.machines file*
>
> -----------------------------
>
> granularity:1
> 1:localhost:4 (I tried my username but it did not work. I also
> tried 1:localhost, 1:localhost localhost:1 and 1:localhost 1:localhost)
> lapw0:localhost:2 localhost:2
> dstart:localhost:2 localhost:2
> nlvdw:localhost:2 localhost:2
>
> ------------------------------
>
>
> *~/.ssh/config*
>
> -------------------------
>
> Host *
>
> HostName 0.0.0.0 (I also tried my fixed IP but it did not work)
>
> User localhost
>
> ForwardX11Trusted yes
>
> GatewayPorts yes
>
> GSSAPIAuthentication yes
>
> -------------------------
>
>
> *SCF results*
>
> -----------------------------------------------------------------------------------------------------------------------------------------------------
>
> changing 1.in2c
> changing 1.in2_ls
> changing 1.in2_st
> changing 1.in2_sy
> LAPW0 END
> [1] Done mpirun -np 4 -machinefile .machine0 /home/User/software/WIEN2K/lapw0_mpi lapw0.def >> .time00
> DFTD3 END
> *Bad owner or permissions on /home/User/.ssh/config*
> [1] + Exit 255 ( $remote $remotemachine "cd $PWD;$t $ttt;rm -f .lock_$lockfile[$p]" ) >> .time1_$loop
> cat: .time1_1: No such file or directory
> cat: .time1_1: No such file or directory
> 1.scf1up_1: No such file or directory.
> cat: No match.
> grep: No match.
> grep: No match.
> grep: No match.
> > stop error
>
> ------------------------------------------------------------------------------------------------------------------------------
>
> *testpara*
>
> ------------------------------------------------------------
>
> #####################################################
> # TESTPARA #
> #####################################################
>
> Test: LAPW1 in parallel mode (using .machines)
> Granularity set to 1
> Extrafine unset
> weights: 1
> sumw: 1
> k-points: 30
>
> klist: 30
> machines: localhost
> procs: 1
> weigh(old): 1
> sumw: 1
> granularity: 1
> weigh(new): 30
>
> Distribution of k-point (under ideal conditions)
> will be:
>
> 1 : localhost(30) 30k
>
> -------------------------------------------------------
>
>
> *testpara1*
>
> -------------------------------------------------------------
>
> #####################################################
> # TESTPARA1 #
> #####################################################
>
> Sun Oct 28 18:12:33 KST 2018
>
> lapw1para is running
> 30 of 30 (100%) k-points distributed
> localhost: running
> localhost: not running
> localhost: not running
> localhost: not running
> ------------------------------------------------------
>
> *testpara2*
>
> --------------------------------------------------------------
>
> #####################################################
> # TESTPARA2 #
> #####################################################
>
> Sun Oct 28 18:12:47 KST 2018
>
> lapw2para exited due to an ERROR
> Check *.error files
>
> ---------------------------------------------------------------
>
>
>
> Sincerely,
>
>
> Woohyeon Baek
>
>
>
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
> SEARCH the MAILING-LIST at: http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
>
--
--------------------------------------------------------------------------
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300 FAX: +43-1-58801-165982
Email: blaha at theochem.tuwien.ac.at WIEN2k: http://www.wien2k.at
WWW: http://www.imc.tuwien.ac.at/tc_blaha
--------------------------------------------------------------------------