[Wien] Fixed! - Error when parsing klist in parallel
wiener at arcscluster.caltech.edu
Thu Oct 7 02:31:02 CEST 2004
Hello again,
Thank you Torsten for pointing out that those case.klist_N files are
created by lapw1para. After looking into this script, it turns out that
the first line in the following excerpt (in the "kloop" loop) is causing
our problem:
head -$head $caseklist.tmp | tail -$tail > ${caseklist}_$loop
echo "END" >>${caseklist}_$loop
cut -c-35 ${caseklist}_$loop |sed "1r head.diff" >$tmp
sed -f script $tmp >${caseklist}_$loop
The "-N" syntax used here for "head" and "tail" is being deprecated in head
version 5.2.1 and tail 5.0.91, currently available with Gentoo Linux. To be
more precise, it is not YET deprecated, but both commands print a warning
saying it will be before returning their proper output. This extra warning
line keeps the first line of the excerpt above from working properly.
They should be replaced by the newer syntax "head -n $head" and "tail -n
$tail".
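For anyone who wants to check the replacement syntax outside of lapw1para, here is a minimal sketch. It is written for a POSIX sh rather than csh, and the file name demo.klist and its contents are made up for illustration. Only the head/tail line mirrors the excerpt above.

```shell
# Hypothetical stand-in for case.klist (contents invented for illustration).
tmpdir=$(mktemp -d)
klist="$tmpdir/demo.klist"
printf 'k1\nk2\nk3\nk4\nEND\n' > "$klist"

# Select lines 2..3: keep the first 3 lines, then the last 2 of those.
head=3
tail=2
head -n "$head" "$klist" | tail -n "$tail" > "${klist}_1"

# Terminate the slice with the END flag, as the script excerpt does.
echo "END" >> "${klist}_1"

cat "${klist}_1"
```

With the "-n" form, nothing but the selected lines and the END flag should end up in the slice file.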
Hope this helps.
Regards,
Olivier.
~~~~~~~~~~~~~~~~~~~
>Hello,
>probably something is wrong with your installation. Is your csh corrupt?
>The division is done by a script, lapw1para. You need functioning unix
>commands to do that - head, tail, csh, awk, grep...
>It works for me on a dual-Opteron cluster running
>Red Hat Enterprise Linux AS release 3 (Taroon Update 1)
>also with PGI-5.1. However, I get problems (of another kind) if I use
>-O5 with the PGI compiler.
>Best regards,
>Torsten Andersen.
>wiener at arcscluster.caltech.edu wrote:
> Hello,
>
> We are trying to use the k-point parallel version of wien2k on a dual
> Opteron cluster running Gentoo Linux.
> We compiled wien2k with PGI-5.1 and linked it against the acml-2.0
> libraries. The serial code works on all individual nodes. However,
> when we try to run a "k-point parallel" (using ssh) calculation, the
> various case.klist_N files are messed up and start with the END flag.
> For example, below is a case.klist and the case.klist_1 (the others look
> the same) that results from trying a parallel run. We are using 6 nodes
> in this case and the .machines file is also appended below.
> This produces a READ error on all the called nodes like so :
>
> PGFIO-F-231/formatted read/unit=4/error on data conversion.
> File name = bccV_1e3k.klist_1 formatted, sequential access record = 1
> In source file inilpw.f, at line number 405
>
> Has anyone had the same problem? Any suggestions?
> Thanks,
> Olivier.
>