[Wien] Error in lapw1para_lapw script causing errors when running parallel lapw2
Peter Blaha
pblaha at theochem.tuwien.ac.at
Wed Jun 17 22:34:14 CEST 2009
Of course I do have the "official" lapw1para_lapw.
And it has a line 444 with minus signs:
@ tail = $klist - $kold - 1 # here
If I put an echo $tail,$klist,$kold
after that line, it gives on my tcsh (and apparently on that of all other
WIEN2k users so far):
3,45,45
So, obviously the tcsh evaluates this "wrong" (in terms of mathematics),
but "correct" (in terms of what it should do).
One can verify this on the commandline:
set a=10
set b=5
@ c = $a - $b
echo $c # will give 5
@ c = $a - $b - 1
echo $c # gives 6 !!!!
What is your Linux distribution and your tcsh (csh?) version?
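
The two results in the command-line test above differ only in how a chain of subtractions is grouped. The following is a minimal sketch in Python (not tcsh), assuming that the shell in question groups "a - b - c" from right to left, as suggested later in this thread. The function names are hypothetical and exist only for this illustration; nothing here is taken from lapw1para_lapw.

```python
from functools import reduce


def eval_left_to_right(terms):
    """Group a - b - c as (a - b) - c, the usual mathematical convention."""
    return reduce(lambda acc, x: acc - x, terms)


def eval_right_to_left(terms):
    """Group a - b - c as a - (b - c).

    Assumption: this models the shell behaviour reported in the thread.
    """
    return reduce(lambda acc, x: x - acc, reversed(terms))


if __name__ == "__main__":
    # The command-line test: a=10, b=5, then "$a - $b - 1".
    print(eval_left_to_right([10, 5, 1]), eval_right_to_left([10, 5, 1]))
    # Line 444 with klist=47, kold=45: "$klist - $kold - 1".
    print(eval_left_to_right([47, 45, 1]), eval_right_to_left([47, 45, 1]))
```

Under this assumption, right-to-left grouping turns 47 - 45 - 1 into 47 - (45 - 1) = 3, which is the tail reported above, while left-to-right grouping gives 1, which is the value found on the other systems.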
> It depends on what you have in line 444th of lapw1para_lapw.
>
> 430 set kold = $kbegin
> 431 if ($loop > $multi && $?extrafine) then
> 432 @ head = $kbegin
> 433 set tail = 1
> 434 @ kbegin = $kbegin + 1
> 435 else
> 436 @ head = $kbegin + $weigh[$p] - 1
> 437 set tail = $weigh[$p]
> 438 @ kbegin = $kbegin + $weigh[$p]
> 439 endif
> 440
> 441
> 442 if ($head >= $klist) then
> 443 set head = $klist
> 444 @ tail = $klist - $kold - 1 # here
> 445 endif
>
> Generation of 5-th part of klist follows:
> $klist = 47, $kbegin = 45, $kold = 45 before line 430th
> in line 436th, head = 45 + 11 - 1 = 55
> in line 437th, tail = 11
> in line 438th, kbegin = 45 + 11 = 56
>
> What is the problem, we can see in lines 442-445:
> 442th: head = 55 >= klist = 47, so we follow lines 443-445.
> 443rd: head = 47 -> this is fine
> 444th: tail = 47 - 45 -1 = 1 (while it should be 47 - 45 + 1 = 3)
> Of course this will produce only last line from klist, and this will be
> k-point number 47. We are missing k-points number 45 and 46.
> This error is quite obvious to me, and it really amazes me that
> you are getting correct results.
>
> So if you are getting correct split of k-points in lapw1para_lapw, then
> 1) you have @ tail = $klist - $kold + 1 in line 444th (and also
> corrected "same thing" in line 248th of testpara_lapw).
> or
> 2) your (t)csh evaluates expressions from right to left.
> or
> 3) code which I've downloaded from wien2k site is different from the one
> you are testing on
>> It produces 5 !!! klists, the last one with the remaining 3 k-points,
>> and the "fastest" cpu will get this chunk.
>>
>> So from my point of view it works perfectly well.
> It's very strange. Could you download a copy of code from wien2k site
> and check it on freshly downloaded TiC (or other) test case in k-parallel.
>
> We have here 3 different environments (different distributions) of Linux
> on x86_64, all producing the same errors.
>> > Indeed I am using the $SCRATCH variable. I've also checked the -it switch
>> > and it works with k-points split 2/2/2/2/1 on 4 cpus.
>>
>> No, you cannot use iterative diag, because with 4 lines in .machines, but
>> actually 5 !! chunks, you don't know on which computer the 5th chunk
>> will be executed (it will be the fastest, but that can change from
>> iteration to iteration) and you will not have an old vector file.
> Unless $SCRATCH is on distributed filesystem, when each node can see all
> parts of vector file ($case.vector_*), I guess.
>
>
> I'd be very grateful if you could check what I'm writing about on
> freshly downloaded code/data (available to download for registered users).
>
>
> Pawel Lesniak
>
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien