[Wien] running k-point parallel across nodes
Laurence Marks
L-marks at northwestern.edu
Wed Oct 16 20:00:29 CEST 2013
You have to make sure that the path to the executables is known on the
other nodes. It looks like you are using ksh (I am not very familiar with
it), so you need appropriate lines in its initialization file (.kshrc, as
a guess) to set this up. The userconfig_lapw script does this for
bash/csh; I am not sure about ksh (never tried).
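
As a rough sketch of the kind of lines meant above, something like the following could go into the ksh initialization file on every node. The install directory $HOME/WIEN2k is only a placeholder, and which file ksh actually reads for a non-interactive ssh command (.kshrc, .profile, or whatever $ENV points to) depends on the ksh version and the sshd setup, so treat the file name as an assumption too.

```shell
# Hypothetical install location: replace with the real WIEN2k directory.
WIENROOT="$HOME/WIEN2k"
export WIENROOT

# Prepend WIENROOT to PATH, but only if it is not already there,
# so that sourcing this file repeatedly does not grow PATH.
case ":$PATH:" in
  *":$WIENROOT:"*) ;;                 # already present, nothing to do
  *) PATH="$WIENROOT:$PATH" ;;        # add the WIEN2k executables
esac
export PATH
```

To check whether it took effect, run a command on the remote node through ssh from the first node, for example: ssh stblade02 'echo $WIENROOT; which lapw1c'. If that prints nothing or reports "not found", the file is not being read for remote commands and the settings need to go somewhere else.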
On Wed, Oct 16, 2013 at 12:51 PM, Oliver Albertini <ora at georgetown.edu> wrote:
> Hello,
>
> To run k-point parallel across different nodes, is it enough to simply have
> w2k installed on all the nodes along with pw-less ssh? I have pw-less ssh
> working among the nodes, but when I try to run another kpoint on another
> node, the shell cannot find the executables:
>
> $ x lapw1 -p
> starting parallel lapw1 at Wed Oct 16 10:48:53 PDT 2013
> -> starting parallel LAPW1 jobs at Wed Oct 16 10:48:53 PDT 2013
> running LAPW1 in parallel mode (using .machines)
> 2 number_of_parallel_jobs
> [1] 4653272
> [2] 3605022
> ksh: lapw1c: not found.
> ksh: fixerror_lapw: not found.
> ksh: /home/oliver/data/wiendir/benchmark/test_case: not found.
> ksh: lapw1c: not found.
> ksh: fixerror_lapw: not found.
> [2] - Done ( ( $remote $machine[$p] ...
> [1] + Done ( ( $remote $machine[$p] ...
> stblade01(1) 0.000u 0.000s 0.1 0.00% 0+0k 0+0io 0pf+0w
> stblade02(1) 0.000u 0.000s 0 0.00% 0+0k 0+0io 0pf+0w
> test_case.scf1_1: A file or directory in the path name does not exist.
> Summary of lapw1para:
> stblade01 k=1 user=0 wallclock=6
> stblade02 k=1 user=0 wallclock=0
> 0.1u 0.1s 0:02 8% 0+0k 0+0io 0pf+0w
>
>
> .machines:
> 1:stblade01
> 1:stblade02
>
> Sincerely,
>
> Oliver
>
>
--
Professor Laurence Marks
Department of Materials Science and Engineering
Northwestern University
www.numis.northwestern.edu 1-847-491-3996
"Research is to see what everybody else has seen, and to think what
nobody else has thought"
Albert Szent-Györgyi