[Wien] Is it possible to use a local scratch directory when (# of k-points)/(# of nodes) != integer?

Steven Hahn shahn at iastate.edu
Wed Oct 31 16:47:23 CET 2007


Dear Professor Blaha,

Thank you for the suggestion to change the weights assigned to each  
process. I have since modified my PBS script to set the weights  
automatically. If anyone is interested, I copied the necessary  
changes to the example script below.

Steve

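#example for k-point parallel lapw1/2
# $aa and .machines_current are assumed to be set earlier in the example
# PBS script (the node list and its line count).
# Count the k-points: copy the klist up to and including the END line,
# then subtract one for the END line itself.
sed "/END/q" <*.klist > klist.tmp
set kpts=`cat klist.tmp | wc -l`
@ kpts --
# One process per line in $PBS_NODEFILE.
set procs=`cat $PBS_NODEFILE | wc -l`
# Base weight per process, plus the number of k-points left over.
@ weight = $kpts / $procs
@ remainder = $kpts - $weight * $procs

set i=1
while ($i <= $aa[1] )
# The first $remainder processes get one extra k-point.
@ tmp = $weight + ( $remainder > 0 )
echo -n $tmp >>.machines
echo -n ':' >>.machines
@ remainder --
# Append the i-th host name, completing the line "weight:host".
head -$i .machines_current |tail -1 >> .machines
@ i ++
end
echo 'granularity:1' >>.machines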
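
As a worked check of the arithmetic in the loop above, assume 246 k-points
(the case from the original question) and 13 lines in $PBS_NODEFILE. Then
weight = 246 / 13 = 18 and remainder = 246 - 18 * 13 = 12, so the first 12
processes get 19 k-points and the last one gets 18 (12 * 19 + 18 = 246).
The host names node01 to node13 are hypothetical; the lines appended to
.machines should look roughly like this:

```text
19:node01
19:node02
19:node03
19:node04
19:node05
19:node06
19:node07
19:node08
19:node09
19:node10
19:node11
19:node12
18:node13
granularity:1
```

testpara_lapw, mentioned in the reply quoted below, can be used to check
that the resulting distribution leaves no k-points over.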


On Oct 25, 2007, at 1:51 AM, Peter Blaha wrote:

> To use $SCRATCH you must have a "fixed" distribution of the k-points
> to the different nodes, otherwise it could happen that lapw2 for
> chunk X is done on a different processor than lapw1 (because of load
> balancing).
>
> The only way out is to specify a machines file such that all k-points
> will be distributed at once. If the number of nodes does not fit the
> k-points, you can play around with the "weights" of the machines until
> the distribution fits.
> 3:node1
> 2:node2
> granularity:1
> will distribute 5 k-points on 2 processors without a remainder
> (check with testpara_lapw).
>
> Steven Hahn schrieb:
>> Dear WIEN2k users and developers,
>>
>> I am having trouble running WIEN2k calculations on our cluster when
>> the number of k-points produced by "x kgen" does not have a
>> reasonable factor to use for the number of processes. For example,
>> with 246 k-points I'd like to be able to use more than 6 cores, but
>> the calculation isn't large enough to efficiently use 41 cores. Even
>> if the load balancing is no longer perfect, the calculation could
>> still be completed much faster with 13 or 19 cores. Also, I prefer
>> running with four processes per node so that my lapw1 and lapw2
>> processes are not scattered amongst the nodes in the cluster. The
>> home directory is too slow to consider using for scratch. Everything
>> I've tried so far has produced intermittent errors like "forrtl:
>> severe (24): end-of-file during read, unit 10, file
>> /var/scratch/shahn/case_1/case_1.vector_16" in lapw2. Once I realized
>> the problem arises because (# of k-points)/(# of nodes) != integer, I
>> have tried:
>>
>> 1) removing extrafine:1 from my PBS script. The calculation still
>> crashes occasionally because the residual k-points are not always
>> running on the same node. I got the misconception that extrafine:1 is
>> compatible with granularity:1 and a local scratch disk from the
>> example PBS script (http://www.wien2k.at/reg_user/faq/pbs.job).
>> Should the extrafine:1 line be removed from the example?
>>
>> 2) replacing one line in my .machines file with residue:(name of
>> node). I'm not sure why this is failing. Testpara_lapw shows only one
>> list of k-points for each core.
>>
>> Before I start modifying scripts, I was wondering if it is possible
>> to still use the existing scripts, a local scratch directory, and
>> overcome the (# of k-points)/(# of nodes) = integer requirement? Is
>> this limitation just in the lapwpara_lapw1 and lapwpara_lapw2
>> scripts, or is there a more fundamental concern? Would others be
>> interested in being able to run with any number of processes?
>>
>> Sincerely,
>> Steven Hahn
>>
>>
>> _______________________________________________
>> Wien mailing list
>> Wien at zeus.theochem.tuwien.ac.at
>> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>
> -- 
>
>                                        P.Blaha
> --------------------------------------------------------------------------
> Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
> Phone: +43-1-58801-15671             FAX: +43-1-58801-15698
> Email: blaha at theochem.tuwien.ac.at    WWW: http://info.tuwien.ac.at/theochem/
> --------------------------------------------------------------------------


