[Wien] lapw1cpara script WIEN2k_08.1
Peter Blaha
pblaha at theochem.tuwien.ac.at
Thu Feb 28 00:09:52 CET 2008
Yes, I agree with your analysis.
Square brackets must be used only when indexing a variable, like
$a[$i]
but when building e.g. a filename it must be written .machine$p
This bug causes problems only with more than 9 k-points, because for a single-digit $p the pattern .machine[$p] still matches exactly one file.
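
The difference between the two forms can be reproduced outside WIEN2k. The following is only a minimal sketch, assuming a POSIX sh rather than the csh that lapw1cpara is written in (the bracket pattern is expanded the same way in the case reported here). The scratch directory, the node names and the helper functions buggy_name and fixed_name are invented for the illustration and are not part of the script.

```shell
#!/bin/sh
# Sketch: reproduce the .machine[$p] globbing problem in a scratch directory.
# Directory, file contents and function names are made up for this demo.
tmp=$(mktemp -d) || exit 1
trap 'rm -rf "$tmp"' EXIT
cd "$tmp" || exit 1

# One machine file per parallel job, .machine1 ... .machine12.
i=1
while [ "$i" -le 12 ]; do
    echo "node$i" > ".machine$i"
    i=$((i + 1))
done

# Buggy form: [$1] becomes a character class, so every digit of the
# index is matched as a separate single character.
buggy_name() { echo .machine[$1]; }

# Fixed form: plain concatenation of prefix and index, no globbing.
fixed_name() { echo ".machine$1"; }

for p in 5 10 12; do
    echo "p=$p  buggy: $(buggy_name "$p")  fixed: $(fixed_name "$p")"
done
```

With these twelve files present, p=5 gives .machine5 in both forms, p=10 gives .machine1 in the buggy form and .machine10 in the fixed one, and p=12 gives ".machine1 .machine2" in the buggy form and .machine12 in the fixed one.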
Thanks for the report and the analysis of the problem.
aurelio wrote:
> Hello,
> I am using WIEN2k_08.1 to run a case with 23 k-points in parallel, using 23 (k-points) * 4 (MPI fine grain) processors on a
> SUSE Linux Enterprise Server 10 (ia64) cluster. The program fails in lapw1cpara: it is unable to start mpirun correctly,
> complaining about the machinefile in some cases and running multiple k-points on the same node in others.
> Checking the script lapw1cpara, I have noticed the following:
>
> In lines 492-495:
>
> set helpout=`cat .machine[$p]`
> echo -n "$helpout(${kpl[$loop]}) " >.time1_$loop
> set ttt=(`echo $mpirun | sed -e "s^_NP_^$number_per_job[$p]^" -e "s^_EXEC_^$WIENROOT/${exe}_mpi ${def}_$loop.def^" -e
> "s^_HOSTS_^.machine[$p]^"`)
>
> Here the machine file name is obtained from the pattern .machine[$p], which the shell expands as a filename glob with
> a character class. That means that, for example, for the 10th k-point calculation .machine[$p] is expanded by the shell
> to ".machine1", and for the 12th k-point to ".machine1 .machine2".
> This is easy to test in the run folder:
>
> $ ls .machine[10]
> .machine1
> $ ls .machine[12]
> .machine1 .machine2
>
> The final result is a wrong machinefile for the calculation. I think the solution is to replace .machine[$p] with .machine$p.
>
> What do you think?
>
> Thanks in advance for your help
>
> Aurelio Rodríguez
>
>
--
-----------------------------------------
Peter Blaha
Inst. Materials Chemistry, TU Vienna
Getreidemarkt 9, A-1060 Vienna, Austria
Tel: +43-1-5880115671
Fax: +43-1-5880115698
email: pblaha at theochem.tuwien.ac.at
-----------------------------------------