[Wien] ** testerror: Error in Parallel LAPW
Gavin Abo
gabo13279 at gmail.com
Wed Jun 21 07:27:24 CEST 2023
The "Host key verification failed" message is an error from ssh [1].
So it seems you need to fix your ssh setup so that WIEN2k can connect
to your remote node (in your case below, the remote node appears to be
lxbk1177).
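
One way to narrow this down might be to test ssh by hand for every host named in the .machines file quoted further down. The following is only a rough sketch, not part of WIEN2k: the function names machines_hosts and print_ssh_checks are made up for illustration, and it assumes k-point lines of the form "weight:hostname". It only prints the test commands. BatchMode=yes makes ssh fail instead of prompting, so an unknown or changed host key should show up as the same error message as in the log.

```shell
#!/bin/sh
# Hypothetical helper (not a WIEN2k script): list the unique host names
# found in a .machines file. Assumes lines like "1:lxbk1177"; other
# lines (e.g. "granularity:1") are skipped.
machines_hosts() {
    grep -E '^[0-9]+:' "$1" | cut -d: -f2 | awk '{print $1}' | sort -u
}

# Print (do not run) one non-interactive ssh test per host.
print_ssh_checks() {
    for host in $(machines_hosts "$1"); do
        echo "ssh -o BatchMode=yes -o ConnectTimeout=5 $host hostname"
    done
}

# Only do something if a .machines file exists in the current directory.
if [ -f .machines ]; then
    print_ssh_checks .machines
fi
```

If one of the printed commands fails when run by hand from the node where the job starts, the same failure is to be expected inside run_lapw -p.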
It looks like there is an ssh example on slide 10 of the WIEN2k
presentation at [2].
I believe it is common to use ssh-keygen to create a key pair (private
and public key) on the head node and then use ssh-copy-id to put the
public key on each of the remote nodes. However, ssh can be configured
differently on each computer system, so you might want to search online
for examples of how ssh has been set up. One example that might be
helpful is at [3].
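
As a rough sketch of the ssh-keygen / ssh-copy-id steps mentioned above, the following only prints the commands so they can be checked before running them. The function name print_key_setup, the key type (ed25519) and the key file name are assumptions for illustration; your cluster may need something different.

```shell
#!/bin/sh
# Dry run: print the commands for key-based ssh login to each host
# given as an argument. Nothing is executed.
# Key type and file name are illustrative assumptions.
print_key_setup() {
    key="$HOME/.ssh/id_ed25519"
    # Create the key pair on the head node (only needed once).
    echo "ssh-keygen -t ed25519 -f $key"
    for host in "$@"; do
        # Copy the public key to the remote node ...
        echo "ssh-copy-id -i $key.pub $host"
        # ... then check that login works without any prompt.
        echo "ssh -o BatchMode=yes $host hostname"
    done
}

print_key_setup lxbk1177
```

The first interactive login to a node normally also asks whether to accept its host key. If that question was never answered, a non-interactive ssh call can still fail with "Host key verification failed" even when the keys are in place.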
[1] https://askubuntu.com/questions/45679/ssh-connection-problem-with-host-key-verification-failed-error
[2] https://www.bc.edu/content/dam/bc1/schools/mcas/physics/pdf/wien2k/PB-installation.pdf
[3] https://www.digitalocean.com/community/tutorials/ssh-essentials-working-with-ssh-servers-clients-and-keys
Kind Regards,
Gavin
WIEN2k user
On 6/20/2023 3:25 PM, Ilias Miroslav, doc. RNDr., PhD. wrote:
> Dear Professor Blaha,
>
> thanks, I used PATH variable extension instead of linking;
>
> it crashed with the message "Host key verification failed. "
>
> Here is the content of the file
> /lustre/ukt/milias/scratch/Wien2k_23.2_job.main.N1.n4.jid3009460/LvO2onQg/.machines:
> 1:lxbk1177
> 1:lxbk1177
> 1:lxbk1177
> 1:lxbk1177
> 1:lxbk1177
> 1:lxbk1177
> 1:lxbk1177
> 1:lxbk1177
>
> The job is running on lxbk1177, with 8 CPUs allocated,
>
> and this is from the log:
>
> running x dstart :
> starting parallel dstart at Tue 20 Jun 2023 05:16:21 PM CEST
> -------- .machine0 : processors
> running dstart in single mode
> STOP DSTART ENDS
> 10.249u 0.322s 0:11.19 94.3% 0+0k 158496+101160io 437pf+0w
>
> running 'run_lapw -p -ec 0.0001 -NI'
> STOP LAPW0 END
> Host key verification failed.
> [1] + Done ( ( $remote $machine[$p] "cd
> $PWD;$set_OMP_NUM_THREADS;$t $taskset0 $exe ${def}_$loop.def
> ;fixerror_lapw ${def}_$loop"; rm -f .lock_$lockfile[$p] ) >&
> .stdout1_$loop; if ( -f .stdout1_$loop ) bashtime2csh.pl_lapw
> .stdout1_$loop > .temp1_$loop; grep \% .temp1_$loop >> .time1_$loop;
> grep -v \% .temp1_$loop | perl -e "print stderr <STDIN>" )
> Host key verification failed.
> [1] + Done ( ( $remote $machine[$p] "cd
> $PWD;$set_OMP_NUM_THREADS;$t $taskset0 $exe ${def}_$loop.def
> ;fixerror_lapw ${def}_$loop"; rm -f .lock_$lockfile[$p] ) >&
> .stdout1_$loop; if ( -f .stdout1_$loop ) bashtime2csh.pl_lapw
> .stdout1_$loop > .temp1_$loop; grep \% .temp1_$loop >> .time1_$loop;
> grep -v \% .temp1_$loop | perl -e "print stderr <STDIN>" )
> Host key verification failed.
> [1] + Done ( ( $remote $machine[$p] "cd
> $PWD;$set_OMP_NUM_THREADS;$t $taskset0 $exe ${def}_$loop.def
> ;fixerror_lapw ${def}_$loop"; rm -f .lock_$lockfile[$p] ) >&
> .stdout1_$loop; if ( -f .stdout1_$loop ) bashtime2csh.pl_lapw
> .stdout1_$loop > .temp1_$loop; grep \% .temp1_$loop >> .time1_$loop;
> grep -v \% .temp1_$loop | perl -e "print stderr <STDIN>" )
> Host key verification failed.
> [1] + Done ( ( $remote $machine[$p] "cd
> $PWD;$set_OMP_NUM_THREADS;$t $taskset0 $exe ${def}_$loop.def
> ;fixerror_lapw ${def}_$loop"; rm -f .lock_$lockfile[$p] ) >&
> .stdout1_$loop; if ( -f .stdout1_$loop ) bashtime2csh.pl_lapw
> .stdout1_$loop > .temp1_$loop; grep \% .temp1_$loop >> .time1_$loop;
> grep -v \% .temp1_$loop | perl -e "print stderr <STDIN>" )
> Host key verification failed.
> [1] + Done ( ( $remote $machine[$p] "cd
> $PWD;$set_OMP_NUM_THREADS;$t $taskset0 $exe ${def}_$loop.def
> ;fixerror_lapw ${def}_$loop"; rm -f .lock_$lockfile[$p] ) >&
> .stdout1_$loop; if ( -f .stdout1_$loop ) bashtime2csh.pl_lapw
> .stdout1_$loop > .temp1_$loop; grep \% .temp1_$loop >> .time1_$loop;
> grep -v \% .temp1_$loop | perl -e "print stderr <STDIN>" )
> Host key verification failed.
> [1] + Done ( ( $remote $machine[$p] "cd
> $PWD;$set_OMP_NUM_THREADS;$t $taskset0 $exe ${def}_$loop.def
> ;fixerror_lapw ${def}_$loop"; rm -f .lock_$lockfile[$p] ) >&
> .stdout1_$loop; if ( -f .stdout1_$loop ) bashtime2csh.pl_lapw
> .stdout1_$loop > .temp1_$loop; grep \% .temp1_$loop >> .time1_$loop;
> grep -v \% .temp1_$loop | perl -e "print stderr <STDIN>" )
> Host key verification failed.
> [1] + Done ( ( $remote $machine[$p] "cd
> $PWD;$set_OMP_NUM_THREADS;$t $taskset0 $exe ${def}_$loop.def
> ;fixerror_lapw ${def}_$loop"; rm -f .lock_$lockfile[$p] ) >&
> .stdout1_$loop; if ( -f .stdout1_$loop ) bashtime2csh.pl_lapw
> .stdout1_$loop > .temp1_$loop; grep \% .temp1_$loop >> .time1_$loop;
> grep -v \% .temp1_$loop | perl -e "print stderr <STDIN>" )
> Host key verification failed.
> [1] Done ( ( $remote $machine[$p] "cd
> $PWD;$set_OMP_NUM_THREADS;$t $taskset0 $exe ${def}_$loop.def
> ;fixerror_lapw ${def}_$loop"; rm -f .lock_$lockfile[$p] ) >&
> .stdout1_$loop; if ( -f .stdout1_$loop ) bashtime2csh.pl_lapw
> .stdout1_$loop > .temp1_$loop; grep \% .temp1_$loop >> .time1_$loop;
> grep -v \% .temp1_$loop | perl -e "print stderr <STDIN>" )
> LvO2onQg.scf1_1: No such file or directory.
> grep: *scf1*: No such file or directory
> STOP FERMI - Error
> cp: cannot stat '.in.tmp': No such file or directory
> grep: *scf1*: No such file or directory
>
> > stop error
>
>
>
> file ":parallel"
>
> starting parallel lapw1 at Tue 20 Jun 2023 05:17:08 PM CEST
> lxbk1177(4) lxbk1177(3) lxbk1177(3) lxbk1177(3)
> lxbk1177(3) lxbk1177(3) lxbk1177(3) lxbk1177(3)
> Summary of lapw1para:
> lxbk1177 k=25 user=0 wallclock=0
> <- done at Tue 20 Jun 2023 05:17:14 PM CEST
> -----------------------------------------------------------------
> -> starting Fermi on lxbk1177 at Tue 20 Jun 2023 05:17:15 PM CEST
> ** LAPW2 crashed at Tue 20 Jun 2023 05:17:16 PM CEST
> ** check ERROR FILES!
> -----------------------------------------------------------------