[Wien] k-point parallel job in distributed file system
Ravindran Ponniah
ravindran.ponniah at kjemi.uio.no
Thu Aug 17 11:15:42 CEST 2006
Hello,
We are trying to run k-point parallel WIEN2k jobs on a Linux cluster
that has a distributed file system. Although the k-point parallel
calculation itself works, we have a problem with the common work space
($SCRATCH) in which all input/output files are read and written: if we
run, for example, a 10 k-point calculation on 10 nodes, all 10 nodes
have to communicate with this common working area through ssh to
read/write their files. This slows down both the calculation and the
network.
So far we have done k-point parallel calculations on supercomputers
with shared memory, so we never had this problem. Is it possible to do
k-point parallel calculations on a distributed file system without any
common working area?
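For concreteness, here is a minimal sketch of the kind of setup we
currently use (the host names and the /work path are placeholders, not
our actual configuration). Every entry in the .machines file works
against the same network-mounted $SCRATCH:

    # fragment of a bash job script (illustrative paths and hosts)
    export SCRATCH=/work/shared/$USER/case   # common area, reached over the network
    cd $SCRATCH

    # .machines for a 10 k-point run on 10 nodes, one k-point per node
    granularity:1
    1:node01
    1:node02
    1:node03
    # ... one such line per node, up to node10

With this layout every lapw1/lapw2 instance on node01..node10 must go
through the shared $SCRATCH for its case.vector_* and case.energy_*
files, and this traffic is what slows everything down.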
I have received the following from the system expert here.
###
Hmm, I've been looking through the jungle of scripts which constitutes
wien2k, and it is clear to me that this way of parallelizing isn't
meant for distributed filesystems (local disks on the nodes). Unless
the wien2k people have a solution, I don't think we will get around
this without some major reprogramming. At least it seems so to me, but
I must admit that I don't have the complete overview of the todo tasks.
Also, a quick google of the problem did not provide a solution.
This approach is very efficient for SMP types of machines, but is a bit
ad hoc for cluster-type computers.
On the bright side, it doesn't seem that the program does a lot of disk
read/write in the long run, only 10-20 minute bursts of about 10 MB/s.
####
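To make the question concrete, what we are hoping for is something
along the lines of the sketch below, where each node keeps its working
files on its own local disk rather than in a network-mounted $SCRATCH
(again, the path is made up, and we do not know whether the wien2k
scripts actually support this):

    # hypothetical per-node setting in the job script
    export SCRATCH=/scratch/local/$USER   # node-local disk, not network-mounted

    # the hope: lapw1 on a given node writes its case.vector_* files to
    # that node's local $SCRATCH, and lapw2 on the same node reads them
    # back, so only small summary files would cross the network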
Looking forward to your responses on how to do the computation more efficiently.
Best regards
Ravi