[Wien] parallel task stops at the end of LAPW1

Pawel Lesniak lesniak at ifmpan.poznan.pl
Tue Jul 21 18:08:36 CEST 2009


Lyudmila Dobysheva wrote:
> Dear WIEN-users,
>
> On a new computer I have met a problem that I cannot solve.
> Non-parallel calculation works well, but the parallel run over k-points stops
> in lapw1.
> More precisely, in lapw1para_lapw the execution does not return from the line:
> (cd $PWD;$t $exe ${def}_$loop.def;rm -f .lock_$lockfile[$p]) >>.time1_$loop &
>
> It runs lapw1_N to completion (up to      STOP ' LAPW1 END'),
> cleans the uplapw1_N.error files,
> and stops.
> top shows that there are 8 lapw1 processes,
> 1+8 lapw1para processes,
> and one reappearing sleep. All with zero CPU %.
>
> .lockfiles exist in the directory.
>   
If the .lock files still exist, then

rm -f .lock_$lockfile[$p]

has not been executed. It comes after lapw1 in the same subshell, so the
lapw1 command is not returning correctly.
Try running
lapw1 uplapw1_1.def
from the case directory.
Which C shell are you using? If it is tcsh 6.15.01 or newer, you have to
apply a small patch to your WIEN2k package.
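
A quick way to check the shell version is sketched below. It is written
in plain sh rather than csh, and the helper name version_ge is made up
for this example. It assumes that "tcsh --version" prints the version
number as the second word of its output, which is what recent builds do;
if yours prints something different, read the version by eye instead.

```shell
#!/bin/sh
# version_ge A B: succeeds if dotted version A >= B (hypothetical helper).
# Fields are compared numerically from left to right; missing fields count as 0.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = x[i] + 0; yi = y[i] + 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

if command -v tcsh >/dev/null 2>&1; then
    # Assumption: output looks like "tcsh 6.xx.yy (Astron) ..."
    ver=$(tcsh --version | awk '{print $2}')
    if version_ge "$ver" "6.15.01"; then
        echo "tcsh $ver: 6.15.01 or newer, the patch applies"
    else
        echo "tcsh $ver: older than 6.15.01"
    fi
else
    echo "tcsh not found in PATH"
fi
```

This only reports the version; it does not apply or verify the patch.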

> .time1_ contains only "localhost(18)"
>
> I think the problem lies in the command &
> Could you please advise me?
>   
The & at the end means that the command is executed in the background, so
the script can carry on without waiting for the command to finish.
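
To illustrate the pattern, here is a minimal sketch in plain sh (the real
lapw1para script is csh). It is not the WIEN2k code: "sleep 1" stands in
for the lapw1 call, and the file names .lock_demo and .time_demo are
invented for the example. It shows that the lock file is still there
right after the launch and is removed only once the background job has
returned.

```shell
#!/bin/sh
# Sketch only: mimics the "subshell in background plus lock file" pattern.
# "sleep 1" is a stand-in for the real job; all names here are hypothetical.
workdir=$(mktemp -d)
lock="$workdir/.lock_demo"
touch "$lock"

# The subshell runs in the background; rm is reached only after the job returns.
(cd "$workdir"; sleep 1; rm -f .lock_demo) >>"$workdir/.time_demo" &

# The script continues immediately, so the lock is still present here.
if [ -e "$lock" ]; then early="present"; else early="gone"; fi

# Wait for the background job to finish, then look again.
wait
if [ -e "$lock" ]; then late="present"; else late="gone"; fi

echo "lock right after launch: $early"
echo "lock after job finished: $late"
rm -rf "$workdir"
```

If the job in the subshell never returns, the lock file stays, which
matches the symptom described above.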


Regards,
Pawel Lesniak


