[Wien] k-point parallelization in WIEN2K_09.1
Peter Blaha
pblaha at theochem.tuwien.ac.at
Mon Jun 14 07:03:28 CEST 2010
I do NOT believe that k-point parallelization was possible with an older WIEN2k
(unless you set it up with "rsh" instead of "ssh" and defined a .rhosts file).
Anyway, k-point parallelization does not use MPI at all, so a working MPI run
tells you nothing about it. You have to read the requirements specified in the
UG (User's Guide).
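
The passwordless-login requirement can be checked for every node before starting a k-point-parallel run. The following is only a minimal sketch and not part of WIEN2k: the function names machines_hosts and check_hosts are hypothetical, and it assumes a simple .machines file with lines of the form "weight:host" like the one quoted below. It relies on the standard OpenSSH option BatchMode=yes, which makes ssh fail instead of prompting for a password.

```shell
#!/bin/sh
# Sketch (hypothetical helper, not shipped with WIEN2k): test whether each
# host named in a k-point-parallel .machines file accepts a passwordless
# ssh login.

# Print the unique host names from lines of the form "weight:host".
# Lines such as "granularity:1" are skipped because their first field
# is not a number.
machines_hosts() {
    awk -F: '$1 ~ /^[0-9]+$/ { split($2, h, " "); print h[1] }' "$1" | sort -u
}

# Try a non-interactive login on every host and report the result.
# BatchMode=yes disables password prompts, so a host that would ask
# for a password is reported as failed.
check_hosts() {
    for host in $(machines_hosts "$1"); do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "$host: ok"
        else
            echo "$host: passwordless login FAILED"
        fi
    done
}
```

After sourcing the file, run "check_hosts .machines" in the case directory. Any host reported as failed would be expected to produce the "Permission denied" messages shown in the output quoted below.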
Kakhaber Jandieri wrote:
> Dear Prof. Blaha,
>
> Thank you for your reply.
>
>> Can you ssh node120 ps
>> without supplying a password ?
>
> No, I can't ssh to the nodes without supplying a password, but in my
> parallel_options I have setenv MPI_REMOTE 0. I thought that our cluster
> has a shared memory architecture, since the MPI-parallelization works
> without any problem for 1 k-point. I checked the corresponding nodes.
> They were all loaded. Maybe I misunderstood something. Are the
> requirements for MPI-parallelization different from those for k-point
> parallelization?
>
>> Try x lapw1 -p on the commandline.
>> What exactly is the "error" ?
>
> Just now, to try your suggestions, I ran a new task with k-point
> parallelization. The .machines file is:
> granularity:1
> 1:node120
> 1:node127
> 1:node121
> 1:node123
>
> with node120 as a master node.
>
> The output of x lapw1 -p is:
> starting parallel lapw1 at Sun Jun 13 22:44:08 CEST 2010
> -> starting parallel LAPW1 jobs at Sun Jun 13 22:44:08 CEST 2010
> running LAPW1 in parallel mode (using .machines)
> 4 number_of_parallel_jobs
> [1] 31314
> [2] 31341
> [3] 31357
> [4] 31373
> Permission denied, please try again.
> Permission denied, please try again.
> Received disconnect from 172.26.6.120: 2: Too many authentication
> failures for kakhaber
> [1] Done ( ( $remote $machine[$p] ...
> Permission denied, please try again.
> Permission denied, please try again.
> Received disconnect from 172.26.6.127: 2: Too many authentication
> failures for kakhaber
> Permission denied, please try again.
> Permission denied, please try again.
> Received disconnect from 172.26.6.121: 2: Too many authentication
> failures for kakhaber
> [3] - Done ( ( $remote $machine[$p] ...
> [2] - Done ( ( $remote $machine[$p] ...
> Permission denied, please try again.
> Permission denied, please try again.
> Received disconnect from 172.26.6.123: 2: Too many authentication
> failures for kakhaber
> [4] Done ( ( $remote $machine[$p] ...
> node120(1) node127(1) node121(1) node123(1) **
> LAPW1 crashed!
> cat: No match.
> 0.116u 0.324s 0:11.88 3.6% 0+0k 0+864io 0pf+0w
> error: command /home/kakhaber/WIEN2K_09/lapw1cpara -c lapw1.def failed
>
>> How many k-points do you have ? ( 4 ?)
>
> Yes, I have 4 k-points.
>
>> Content of .machine1 and .processes
>
> marc-hn:~/wien_work/GaAsB> cat .machine1
> node120
> marc-hn:~/wien_work/GaAsB> cat .machine2
> node127
> marc-hn:~/wien_work/GaAsB> cat .machine3
> node121
> marc-hn:~/wien_work/GaAsB> cat .machine4
> node123
>
> marc-hn:~/wien_work/GaAsB> cat .processes
> init:node120
> init:node127
> init:node121
> init:node123
> 1 : node120 : 1 : 1 : 1
> 2 : node127 : 1 : 1 : 2
> 3 : node121 : 1 : 1 : 3
> 4 : node123 : 1 : 1 : 4
>
>> While x lapw1 -p is running, do a ps -ef |grep lapw
>
> I did not have enough time to do it; the program crashed first.
>
>> Your .machines file is most likely a rather "useless" one. The mpi-lapw1
>> diagonalization (SCALAPACK) is almost a factor of 2 slower than the
>> serial version, thus your speedup by using 2 processors in mpi-mode
>> will be very small.
>
> Yes, I know, but I am simply trying to set up the calculations using
> WIEN2k. For "real" calculations I will use many more processors.
>
> And finally, some additional information. As I wrote in my previous
> letters, in WIEN2k_08.1 k-point parallelization works, but all processes
> run on the master node and all other reserved nodes are idle. I forgot to
> mention: this is true for lapw1 only. Lapw2 is distributed among all
> reserved nodes.
>
> Thank you once again. I am looking forward to your further advice.
>
>
> Dr. Kakhaber Jandieri
> Department of Physics
> Philipps University Marburg
> Tel:+49 6421 2824159 (2825704)
--
-----------------------------------------
Peter Blaha
Inst. Materials Chemistry, TU Vienna
Getreidemarkt 9, A-1060 Vienna, Austria
Tel: +43-1-5880115671
Fax: +43-1-5880115698
email: pblaha at theochem.tuwien.ac.at
-----------------------------------------