[Wien] k-point parallelization in WIEN2K_09.1

Peter Blaha pblaha at theochem.tuwien.ac.at
Sun Jun 13 20:39:56 CEST 2010


Can you run    ssh node120 ps
without supplying a password?
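
The same check can be repeated for every node in one go. A minimal sketch, assuming the OpenSSH client (BatchMode and ConnectTimeout are OpenSSH options) and a hypothetical script name check_nodes.sh; the node names are passed as arguments:

```shell
#!/bin/sh
# check_nodes.sh (hypothetical name): test passwordless ssh to each node.
# Prints "OK   <node>" if "ssh <node> ps" works without a prompt,
# and "FAIL <node>" otherwise.
check_nodes() {
    for node in "$@"; do
        # BatchMode=yes makes ssh fail instead of asking for a password.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$node" ps >/dev/null 2>&1; then
            echo "OK   $node"
        else
            echo "FAIL $node"
        fi
    done
}

# Nodes are given on the command line.
check_nodes "$@"
```

Usage, with the nodes from the quoted .machines file:
    sh check_nodes.sh node119 node122 node130 node131 node127 node135 node120 node138
A node reported as FAIL most likely needs passwordless login set up before a parallel job can be started on it.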

Try x lapw1 -p on the command line.
What exactly is the "error"?

How many k-points do you have? (4?)

What are the contents of .machine1 and .processes?

While x lapw1 -p is running, run    ps -ef | grep lapw
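
To see at a glance which nodes are busy and which are idle, the ps check can be run on all nodes. A rough sketch, again assuming OpenSSH and passwordless login; the script name count_lapw.sh is hypothetical:

```shell
#!/bin/sh
# count_lapw.sh (hypothetical name): count lapw processes on each node.
count_lapw() {
    for node in "$@"; do
        # The [l] keeps grep from counting its own command line.
        # grep -c exits non-zero when the count is 0, hence "|| true".
        n=$(ssh -o BatchMode=yes "$node" "ps -ef | grep -c '[l]apw'" 2>/dev/null) || true
        # An empty result means the ssh call itself produced nothing.
        echo "$node ${n:-unreachable}"
    done
}

# Nodes are given on the command line.
count_lapw "$@"
```

Run it while x lapw1 -p is active, e.g.
    sh count_lapw.sh node119 node122 node130 node131
A count of 0 marks a node that received no job.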

It would probably take 10 seconds to find out what the problem is,
but without this information ..... ??

PS:
Your .machines file is most likely a rather "useless" one. The mpi-lapw1
diagonalization (SCALAPACK) is almost a factor of 2 slower than the serial
version, so the speedup from using 2 processors in mpi-mode will be very small.
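
For comparison, a pure k-point-parallel .machines file lists one host per line, so that each k-point job runs the serial lapw1 on a single node. A sketch only, reusing the node names and the lapw0 line from the quoted file; the choice of one line per node is an assumption, and with only 4 k-points no more than 4 of these lines can receive a job:

```text
granularity:1
1:node119
1:node122
1:node130
1:node131
1:node127
1:node135
1:node120
1:node138
lapw0:node119:1 node122:1 node130:1 node131:1 node127:1 node135:1 node120:1 node138:1
```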

> My .machines file:
> 
> granularity:1
> 1:node119 node122
> 1:node130 node131
> 1:node127 node135
> 1:node120 node138
> lapw0:node119:1 node122:1 node130:1 node131:1 node127:1 node135:1 node120:1 node138:1 
> 
> node119 is a master node.
> 
> In lapw1, four processes (for different k-points) run on node119;
> node122, node131, node135 and node138 each run a single process;
> node130, node127 and node120 remain idle.
> 
> I can't understand:
> 
> 1) Why lapw1 works in combined parallelization mode, while it crashes in k-point parallelization mode.
> 2) Why node130, node127, node120 remain idle.
> 
> I will be thankful for any pointers.
> 
> Kakhaber Jandieri
> 
> 

-- 
-----------------------------------------
Peter Blaha
Inst. Materials Chemistry, TU Vienna
Getreidemarkt 9, A-1060 Vienna, Austria
Tel: +43-1-5880115671
Fax: +43-1-5880115698
email: pblaha at theochem.tuwien.ac.at
-----------------------------------------

