[Wien] ** testerror: Error in Parallel LAPW

Gavin Abo gabo13279 at gmail.com
Wed Jun 21 07:27:24 CEST 2023


"Host key verification failed" is an error message from ssh [1].


So it seems you need to fix your ssh setup so that WIEN2k can connect 
to your remote node (in your case below, the remote node appears to be 
lxbk1177).
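
One way to check this outside of WIEN2k is to try a non-interactive 
login to every host named in the .machines file. The following is only a 
rough sketch and not part of WIEN2k: the function names machines_hosts 
and check_hosts and the SSH_CMD variable are made up for illustration, 
and it assumes k-point lines of the form "N:host" like the ones in your 
file. The ssh option BatchMode=yes makes ssh fail instead of prompting, 
which is close to what happens inside a batch job.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not WIEN2k scripts.

# Print the unique host names found in a .machines file.
# Assumption: k-point lines look like "N:host" (e.g. "1:lxbk1177").
# Other lines (granularity:, lapw0:, ...) are ignored.
machines_hosts() {
    sed -n 's/^[0-9][0-9]*:\([^: ]*\).*$/\1/p' "$1" | sort -u
}

# Try a non-interactive ssh login to each host and report the result.
# Set SSH_CMD="echo ssh" for a dry run that only prints the commands.
check_hosts() {
    for host in $(machines_hosts "$1"); do
        if ${SSH_CMD:-ssh} -o BatchMode=yes "$host" hostname; then
            echo "ok: $host"
        else
            echo "FAILED: $host"
        fi
    done
}
```

After loading these functions into a Bourne-type shell, running 
"check_hosts .machines" in the case directory should report FAILED for 
any host that ssh cannot reach without asking a question, which is the 
situation that has to be fixed before run_lapw -p can work.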


It looks like there is an ssh example on slide 10 of the WIEN2k 
presentation at [2].


I believe it is common to use ssh-keygen to create a key pair (private 
and public key) on the head node and then use ssh-copy-id to put the 
public key on each of the remote nodes.  However, ssh can be configured 
differently on each computer system.  So, you might want to search 
online for examples of how ssh has been configured.  One example that 
might be helpful is at [3].
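
The key setup described above could look roughly like the sketch below. 
Please treat it as an assumption about a typical OpenSSH installation 
and not as the procedure for your cluster: the helper names run and 
setup_ssh_key, the DRY_RUN variable, and the default key file 
$HOME/.ssh/id_ed25519 are my own choices for illustration. The last step 
is a first interactive login, during which ssh normally asks whether to 
accept the host key of the remote node.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; adapt to your site's ssh policy.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Set up key-based login from the head node to one remote host.
# $1 = remote host, $2 = private key file (assumed default below).
setup_ssh_key() {
    host=$1
    key=${2:-$HOME/.ssh/id_ed25519}
    # Create a key pair only if the private key does not exist yet.
    [ -f "$key" ] || run ssh-keygen -t ed25519 -f "$key"
    # Append the public key to authorized_keys on the remote node.
    run ssh-copy-id -i "$key.pub" "$host"
    # Log in once interactively so the host key can be accepted.
    run ssh "$host" hostname
}
```

With DRY_RUN=1 set, "setup_ssh_key lxbk1177" only prints the three 
commands, so you can check them before running anything for real.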


[1] 
https://askubuntu.com/questions/45679/ssh-connection-problem-with-host-key-verification-failed-error

[2] 
https://www.bc.edu/content/dam/bc1/schools/mcas/physics/pdf/wien2k/PB-installation.pdf

[3] 
https://www.digitalocean.com/community/tutorials/ssh-essentials-working-with-ssh-servers-clients-and-keys


Kind Regards,

Gavin

WIEN2k user


On 6/20/2023 3:25 PM, Ilias Miroslav, doc. RNDr., PhD. wrote:
> Dear Professor Blaha,
>
> thanks, I extended the PATH variable instead of linking;
>
> it crashed with the message "Host key verification failed."
>
> Here is the content of the file 
> /lustre/ukt/milias/scratch/Wien2k_23.2_job.main.N1.n4.jid3009460/LvO2onQg/.machines:
> 1:lxbk1177
> 1:lxbk1177
> 1:lxbk1177
> 1:lxbk1177
> 1:lxbk1177
> 1:lxbk1177
> 1:lxbk1177
> 1:lxbk1177
>
> The job is running on lxbk1177, with 8 CPUs allocated;
>
> and this is from the log:
>
> running x dstart :
> starting parallel dstart at Tue 20 Jun 2023 05:16:21 PM CEST
> -------- .machine0 : processors
> running dstart in single mode
> STOP DSTART ENDS
> 10.249u 0.322s 0:11.19 94.3%    0+0k 158496+101160io 437pf+0w
>
> running 'run_lapw -p -ec 0.0001 -NI'
> STOP  LAPW0 END
> Host key verification failed.
> [1]  + Done                          ( ( $remote $machine[$p] "cd 
> $PWD;$set_OMP_NUM_THREADS;$t $taskset0 $exe ${def}_$loop.def 
> ;fixerror_lapw ${def}_$loop"; rm -f .lock_$lockfile[$p] ) >& 
> .stdout1_$loop; if ( -f .stdout1_$loop ) bashtime2csh.pl_lapw 
> .stdout1_$loop > .temp1_$loop; grep \% .temp1_$loop >> .time1_$loop; 
> grep -v \% .temp1_$loop | perl -e "print stderr <STDIN>" )
> Host key verification failed.
> [1]  + Done                          ( ( $remote $machine[$p] "cd 
> $PWD;$set_OMP_NUM_THREADS;$t $taskset0 $exe ${def}_$loop.def 
> ;fixerror_lapw ${def}_$loop"; rm -f .lock_$lockfile[$p] ) >& 
> .stdout1_$loop; if ( -f .stdout1_$loop ) bashtime2csh.pl_lapw 
> .stdout1_$loop > .temp1_$loop; grep \% .temp1_$loop >> .time1_$loop; 
> grep -v \% .temp1_$loop | perl -e "print stderr <STDIN>" )
> Host key verification failed.
> [1]  + Done                          ( ( $remote $machine[$p] "cd 
> $PWD;$set_OMP_NUM_THREADS;$t $taskset0 $exe ${def}_$loop.def 
> ;fixerror_lapw ${def}_$loop"; rm -f .lock_$lockfile[$p] ) >& 
> .stdout1_$loop; if ( -f .stdout1_$loop ) bashtime2csh.pl_lapw 
> .stdout1_$loop > .temp1_$loop; grep \% .temp1_$loop >> .time1_$loop; 
> grep -v \% .temp1_$loop | perl -e "print stderr <STDIN>" )
> Host key verification failed.
> [1]  + Done                          ( ( $remote $machine[$p] "cd 
> $PWD;$set_OMP_NUM_THREADS;$t $taskset0 $exe ${def}_$loop.def 
> ;fixerror_lapw ${def}_$loop"; rm -f .lock_$lockfile[$p] ) >& 
> .stdout1_$loop; if ( -f .stdout1_$loop ) bashtime2csh.pl_lapw 
> .stdout1_$loop > .temp1_$loop; grep \% .temp1_$loop >> .time1_$loop; 
> grep -v \% .temp1_$loop | perl -e "print stderr <STDIN>" )
> Host key verification failed.
> [1]  + Done                          ( ( $remote $machine[$p] "cd 
> $PWD;$set_OMP_NUM_THREADS;$t $taskset0 $exe ${def}_$loop.def 
> ;fixerror_lapw ${def}_$loop"; rm -f .lock_$lockfile[$p] ) >& 
> .stdout1_$loop; if ( -f .stdout1_$loop ) bashtime2csh.pl_lapw 
> .stdout1_$loop > .temp1_$loop; grep \% .temp1_$loop >> .time1_$loop; 
> grep -v \% .temp1_$loop | perl -e "print stderr <STDIN>" )
> Host key verification failed.
> [1]  + Done                          ( ( $remote $machine[$p] "cd 
> $PWD;$set_OMP_NUM_THREADS;$t $taskset0 $exe ${def}_$loop.def 
> ;fixerror_lapw ${def}_$loop"; rm -f .lock_$lockfile[$p] ) >& 
> .stdout1_$loop; if ( -f .stdout1_$loop ) bashtime2csh.pl_lapw 
> .stdout1_$loop > .temp1_$loop; grep \% .temp1_$loop >> .time1_$loop; 
> grep -v \% .temp1_$loop | perl -e "print stderr <STDIN>" )
> Host key verification failed.
> [1]  + Done                          ( ( $remote $machine[$p] "cd 
> $PWD;$set_OMP_NUM_THREADS;$t $taskset0 $exe ${def}_$loop.def 
> ;fixerror_lapw ${def}_$loop"; rm -f .lock_$lockfile[$p] ) >& 
> .stdout1_$loop; if ( -f .stdout1_$loop ) bashtime2csh.pl_lapw 
> .stdout1_$loop > .temp1_$loop; grep \% .temp1_$loop >> .time1_$loop; 
> grep -v \% .temp1_$loop | perl -e "print stderr <STDIN>" )
> Host key verification failed.
> [1]    Done                          ( ( $remote $machine[$p] "cd 
> $PWD;$set_OMP_NUM_THREADS;$t $taskset0 $exe ${def}_$loop.def 
> ;fixerror_lapw ${def}_$loop"; rm -f .lock_$lockfile[$p] ) >& 
> .stdout1_$loop; if ( -f .stdout1_$loop ) bashtime2csh.pl_lapw 
> .stdout1_$loop > .temp1_$loop; grep \% .temp1_$loop >> .time1_$loop; 
> grep -v \% .temp1_$loop | perl -e "print stderr <STDIN>" )
> LvO2onQg.scf1_1: No such file or directory.
> grep: *scf1*: No such file or directory
> STOP FERMI - Error
> cp: cannot stat '.in.tmp': No such file or directory
> grep: *scf1*: No such file or directory
>
> >   stop error
>
>
>
> file ":parallel"
>
> starting parallel lapw1 at Tue 20 Jun 2023 05:17:08 PM CEST
>     lxbk1177(4)      lxbk1177(3)      lxbk1177(3)      lxbk1177(3) 
>      lxbk1177(3)      lxbk1177(3)      lxbk1177(3)      lxbk1177(3)
>    Summary of lapw1para:
>   lxbk1177      k=25    user=0  wallclock=0
> <-  done at Tue 20 Jun 2023 05:17:14 PM CEST
> -----------------------------------------------------------------
> ->  starting Fermi on lxbk1177 at Tue 20 Jun 2023 05:17:15 PM CEST
> **  LAPW2 crashed at Tue 20 Jun 2023 05:17:16 PM CEST
> **  check ERROR FILES!
> -----------------------------------------------------------------