[Wien] optic program crashed

Gavin Abo gsabo at crimson.ua.edu
Fri Feb 19 03:35:18 CET 2016


At 
http://www.nersc.gov/users/software/applications/materials-science/wien2k/ 
, the last line in the job file has a star (*) after .machine.  It seems 
to be missing in the last line of your job file.  Without it, the old 
.machines file is not removed, and maybe that prevents the new .machines 
file from being created.
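
To illustrate the difference, here is a minimal sketch that runs in a 
scratch directory rather than a real WIEN2k case directory.  The file 
names .machine0 and .machine1 are only assumed examples of leftover files 
that the star would also catch:

```shell
#!/bin/bash
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Assumed leftover files from an earlier parallel run (illustrative names).
touch .machines .machine0 .machine1

# Without the star: no file is named exactly ".machine", so nothing is removed.
rm -fr .machine
before=$(ls -A | wc -l | tr -d ' ')

# With the star: the glob matches .machines, .machine0 and .machine1.
rm -fr .machine*
after=$(ls -A | wc -l | tr -d ' ')

echo "files left without star: $before, with star: $after"
```

Because of the -f flag, rm stays silent when .machine does not exist, so 
the job file gives no hint that the cleanup did nothing.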

Also, I suggest you talk to the consultant who administers the 
cluster.  They should be able to tell you better why you are getting the 
error "ssh: connect to host nid01855 port 204: Connection refused".  
They might have a firewall set up to block port 204 or might have 
disabled ssh access to node nid01855.

On 2/18/2016 8:31 AM, Dr. K. C. Bhamu wrote:
> Dear Users and developers
>
> I ran my job via a slurm job file on a remote server (2 nodes/64 cores). 
> Everything went fine up to DOSS, but when I ran "x optic -p" through the 
> job file, the following message appeared:
>
> [1] 1371
> ssh: connect to host nid01855 port 204: Connection refused^M
> [1]  + Exit 255                      ( $remote $machine[$p] "cd 
> $PWD;$t $taskset0 $exe ${def}_${loop}.def;rm -f .lock_$lockfile[$p]" ) 
> >> .timeop_$loop
> [1] 1375
> ssh: connect to host nid01855 port 204: Connection refused^M
> [1]  + Exit 255                      ( $remote $machine[$p] "cd 
> $PWD;$t $taskset0 $exe ${def}_${loop}.def;rm -f .lock_$lockfile[$p]" ) 
> >> .timeop_$loop
> [1] 1379
> ssh: connect to host nid01855 port 204: Connection refused^M
> [1]  + Exit 255                      ( $remote $machine[$p] "cd 
> $PWD;$t $taskset0 $exe ${def}_${loop}.def;rm -f .lock_$lockfile[$p]" ) 
> >> .timeop_$loop
> ***  OPTIC crashed!*
> 0.840u 1.800s 1:50.21 2.3%      0+0k 82495+1135io 4pf+0w
> error: command /usr/common/software/wien2k-ccm/14.2/opticpara 
> optic.def failed
> ...............
>
> I went through the list and found a couple of threads, but the error is 
> not solved.
>
> Please look into this.
>
> The job was successfully completed on a local two-CPU cluster 
> (4 GB RAM each).
>
> The job file was:
> --------------------------------------------------------
> #!/bin/bash -l
> #SBATCH -N 2
> #SBATCH -n 64
> #SBATCH -t 00:20:00
> #SBATCH -p regular
> #SBATCH -J orthorhombic_1
> #SBATCH --ccm
>
> #module load wien2k-ccm
> #generating .machines file for k-point and mpi parallel lapw1/2
> let ntasks_per_kgroup=1
> gen.machines -m $ntasks_per_kgroup
>
> #need to disable SLURM envs hereafter
> unset `env|grep SLURM_|awk -F= '{print $1}'`
>
> #put your Wien2k command here
> x optic -p
> #remove leftover .machines file
> rm -fr .machine
> ---------------------------------------------------------------------------
>
> regards
> Bhamu