[Wien] "if: Expression Syntax." in fine-grained parallel lapw2
Peter Blaha
pblaha at theochem.tuwien.ac.at
Tue Sep 25 11:08:29 CEST 2007
Thanks again for the report.
You are testing (and finding) all possible errors in our scripts (and providing a
very clear analysis!). Yes, I never tested lapw2_vector_split together with
k-point parallelization over more than one job.
I think a possible fix (not completely safe, since one could have the idea of
having a different vector_split for each k-point...) is to replace the lines you
list below with:
if ( $vector_split != '' ) then
  set loop=0
  while ($loop < $maxproc)
    @ loop ++
    if ($vector_split > $number_per_job2[$loop] ) set vector_split=$number_per_job2[$loop]
    @ number_per_job2[$loop] = ( $number_per_job2[$loop] / $vector_split ) * $vector_split
  end
endif
(Sorry for the line breaks due to my stupid email system).
Please test it!
Regards
PS: lapw2_vector_split is an optional line in .machines and can be omitted if
it would be set to 1 anyway. It is only necessary if you see that the memory of
lapw2mpi grows larger than what you actually have on your machines and they
start paging.
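
For anyone who wants to check the arithmetic of the loop above by hand, here is
a minimal sketch in Python (not csh, and not part of the WIEN2k scripts). The
function name is hypothetical; it only assumes that number_per_job2 holds one
positive processor count per k-point job and that vector_split is a positive
integer.

```python
def round_jobs_to_vector_split(number_per_job2, vector_split):
    """Model of the per-job loop: clamp vector_split to the job size, then
    round each job's processor count down to a multiple of vector_split."""
    rounded = []
    for n in number_per_job2:
        # As in the csh loop, a reduced vector_split carries over to later jobs.
        if vector_split > n:
            vector_split = n
        # Integer division, matching csh's @ arithmetic for positive values.
        rounded.append((n // vector_split) * vector_split)
    return rounded, vector_split


# The two-job case from the report: number_per_job2 = 2 2, vector_split = 1
print(round_jobs_to_vector_split([2, 2], 1))
# A hypothetical case where the counts are not multiples of vector_split
print(round_jobs_to_vector_split([4, 3], 2))
```

Note that, as in the proposed fix, a single vector_split is shared by all jobs,
so a small first job lowers it for the jobs that follow.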
Steven Hahn wrote:
> I am receiving another error message during the scf cycle, this time from
> lapw2. If I use just fine-grained parallelization (one line in my
> .machines file), everything runs fine. If I use both fine- and
> coarse-grained parallelization (nodes on two separate lines) I receive
> the error message "if: Expression Syntax." I traced it back to lines
> 143-148 of lapw2para_lapw
>
> set number_per_job2 = `cut -f4 -d: $tmp2`
>
> if ( $vector_split != '' ) then
> if ($vector_split > $number_per_job2 ) set vector_split=$number_per_job2
> @ number_per_job2 = ( $number_per_job2 / $vector_split ) * $vector_split
> endif
>
> If I run with this .machines file
>
> 1:node020 node018 node018 node018
> granularity:1
> extrafine:1
> lapw2_vector_split:1
>
> I get the following values
>
> tmp2 = 1 : node020 : 888 : 4 : 1
> number_per_job2 = 4
> vector_split = 1
>
> If I run with this .machines file
>
> 1:node020 node018
> 1:node018 node018
> granularity:1
> extrafine:1
> lapw2_vector_split:1
>
> I get this:
>
> tmp2=
> 1 : node020 : 444 : 2 : 1
> 2 : node018 : 444 : 2 : 2
> number_per_job2 = 2 2
> vector_split = 1
>
> The script crashes because the if statement cannot compare $vector_split
> (whose value is 1) to $number_per_job2 (whose value is 2 2), because the
> second is a list of two values and not a single number.
--
P.Blaha
--------------------------------------------------------------------------
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-15671 FAX: +43-1-58801-15698
Email: blaha at theochem.tuwien.ac.at WWW: http://info.tuwien.ac.at/theochem/
--------------------------------------------------------------------------