[Wien] Fixed! - Error when parsing klist in parallel

wiener at arcscluster.caltech.edu
Thu Oct 7 02:31:02 CEST 2004


Hello again,
 
Thank you Torsten for pointing out that those case.klist_N files are 
created by lapw1para. After looking into this script, it turns out that 
the first line in the following excerpt (in the "kloop" loop) is causing 
our problem:
 
head -$head $caseklist.tmp | tail -$tail > ${caseklist}_$loop
echo "END" >>${caseklist}_$loop
cut -c-35 ${caseklist}_$loop |sed "1r head.diff" >$tmp
sed -f script $tmp >${caseklist}_$loop
 
The "head -$head" and "tail -$tail" syntax used here is on its way to 
being deprecated in head 5.2.1 and tail 5.0.91, the versions currently 
available with Gentoo Linux. To be more precise, it is not deprecated 
YET: both commands print a warning saying that it will be, and only then 
return the proper output. This extra warning line keeps the first line 
of the excerpt above from working properly, which is what messes up the 
case.klist_N files.
The two calls should be replaced by the newer syntax "head -n $head" and 
"tail -n $tail".
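
For illustration, here is a minimal sketch of the first two lines of that 
excerpt with the "-n" syntax. It is written for plain sh rather than as a 
patch to lapw1para, and the file name demo.klist, its five dummy lines and 
the values of $head and $tail are made up for the example:

```shell
# Hypothetical stand-in for case.klist.tmp: five dummy k-point lines.
caseklist=demo.klist
loop=1
printf 'k1\nk2\nk3\nk4\nk5\n' > "$caseklist.tmp"

head=4   # illustrative: last line belonging to this chunk
tail=2   # illustrative: number of lines in this chunk

# Same pipeline as in the excerpt, but with the explicit -n option.
head -n "$head" "$caseklist.tmp" | tail -n "$tail" > "${caseklist}_$loop"
echo "END" >> "${caseklist}_$loop"

cat "${caseklist}_$loop"
```

With these values demo.klist_1 contains lines k3 and k4 followed by END, 
so the END flag sits at the bottom of the file again.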
Hope this helps.
Regards,
 
Olivier.
 
~~~~~~~~~~~~~~~~~~~
>Hello,
>probably something is wrong with your installation. Is your csh corrupt?

>The division is done by a script, lapw1para. You need functioning unix 
>commands to do that - head, tail, csh, awk, grep...

>It works for me on a dual-Opteron cluster running

>Red Hat Enterprise Linux AS release 3 (Taroon Update 1)

>also with PGI-5.1. However, I get problems (of another kind) if I use 
>-O5 with the PGI compiler.

>Best regards,
>Torsten Andersen.


>wiener at arcscluster.caltech.edu wrote:
> Hello,
> 
> We are trying to use the k-point parallel version of wien2k on a 
> dual-Opteron cluster running Gentoo Linux.
> We compiled wien2k with PGI-5.1 and linked it against the acml-2.0 
> libraries. The serial code works on all individual nodes. However, 
> when we try to run a "k-point parallel" calculation (using ssh), the 
> various case.klist_N files are messed up and start with the END flag.
> For example, below is a case.klist and the case.klist_1 (the others look
> the same) that results from trying a parallel run. We are using 6 nodes
> in this case and the .machines file is also appended below. 
> This produces a READ error on all the called nodes like so :
> 
> PGFIO-F-231/formatted read/unit=4/error on data conversion.
>  File name = bccV_1e3k.klist_1    formatted, sequential access   record = 1
>  In source file inilpw.f, at line number 405
> 
> Has anyone had the same problem? Any suggestions?
> Thanks,
> Olivier.
> 




