[Wien] did parallel calculation take effect?

Peter Blaha pblaha at theochem.tuwien.ac.at
Fri Oct 10 16:20:22 CEST 2008


You specified:

Shared Memory Architecture? yes

With this option it is assumed that your machine as a whole is a shared-memory machine
(e.g. a single machine with two quad-core CPUs would qualify as 8-core shared memory),
and thus "ssh" is not used to start the parallel jobs; all of them are forked on the local host.
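
Schematically (a simplified sketch, not the actual lapw1para script), the two
startup modes differ roughly like this, using c0101 from your .machines file:

    # Shared Memory Architecture = yes: every k-point job is forked locally
    (lapw1 lapw1_1.def) >> .time1_1 &

    # Shared Memory Architecture = no: jobs are started on remote hosts via ssh
    ssh c0101 "cd $PWD; lapw1 lapw1_1.def" &

This is why all your lapw1_*.def jobs show up on the master node only.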

Configure with siteconfig that you do NOT have a shared-memory machine; then
ssh is used to connect to the different machines.
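
Note that this mode requires passwordless ssh from the master to every compute
node. A quick check (using the node names from the .machines file below):

    # must print "c0101" without asking for a password
    ssh c0101 hostname

    # during a running scf cycle, lapw1 should now appear on the compute nodes
    ssh c0101 ps -ef | grep lapw1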

vinct8954 wrote:
> Dear WIEN users:
> I ran into some confusion with a parallel calculation.
> I compiled the WIEN2k code on my cluster without any errors, and the
> serial calculation works well. But when I run programs in parallel
> mode, I hit a puzzling problem. The .machines file is as follows:
> 1:console
> 1:c0101
> 1:c0102
> 1:c0103
> 1:c0104
> 1:c0105
> 1:c0106
> 1:c0107
> ...............
> granularity:1
> extrafine:1
> The cycles seem to be normal; the dayfile is:
> running lapw0 in single mode
> 15.949u 1.807s 0:19.06 93.0%    0+0k 0+0io 29pf+0w
>  >   lapw1  -p   (22:31:08) starting parallel lapw1 at Wed Sep 17 22:31:08 CST 2008
> ->  starting parallel LAPW1 jobs at Wed Sep 17 22:31:08 CST 2008
> running LAPW1 in parallel mode (using .machines)
> 8 number_of_parallel_jobs
>      console(79) 254.184u 16.657s 9:07.02 49.5% 0+0k 0+0io 38pf+0w
>      c0101(79) 250.019u 16.482s 8:59.24 49.4%   0+0k 0+0io 27pf+0w
>      c0102(79) 253.406u 16.350s 9:04.43 49.5%   0+0k 0+0io 9pf+0w
>      c0103(79) 254.532u 17.161s 9:06.57 49.7%   0+0k 0+0io 0pf+0w
>      c0104(79) 252.878u 15.813s 9:00.49 49.7%   0+0k 0+0io 0pf+0w
>      c0105(79) 254.152u 15.739s 9:03.59 49.6%   0+0k 0+0io 0pf+0w
>      c0106(79) 254.164u 15.906s 9:01.19 49.9%   0+0k 0+0io 0pf+0w
>      c0107(79) 254.787u 16.461s 9:04.39 49.8%   0+0k 0+0io 0pf+0w
>      c0101(1) 3.607u 0.272s 0:05.37 72.0%       0+0k 0+0io 0pf+0w
>      c0102(1) 3.650u 0.242s 0:04.21 92.3%       0+0k 0+0io 0pf+0w
>      c0104(1) 3.270u 0.225s 0:03.64 95.8%       0+0k 0+0io 0pf+0w
>    Summary of lapw1para:
>    console       k=79    user=254.184    wallclock=547.02
>    c0101         k=80    user=253.626    wallclock=544.61
>    c0102         k=80    user=257.056    wallclock=548.64
>    c0103         k=79    user=254.532    wallclock=546.57
>    c0104         k=80    user=256.148    wallclock=544.13
>    c0105         k=79    user=254.152    wallclock=543.59
>    c0106         k=79    user=254.164    wallclock=541.19
>    c0107         k=79    user=254.787    wallclock=544.39
> 2039.101u 132.877s 9:14.54 391.6%       0+0k 0+0io 76pf+0w
> ............................................
>  
> But when I check the processes (ps) on every compute node, there is no lapw1
> on the other nodes at all; all the lapw1 (or lapw2) _1.def, _2.def, ... jobs
> are running on the master node. Did the parallel calculation take effect?
> Of course, I used "runsp_lapw -p" for the parallel run.
>  
> I would appreciate any suggestions.
>  
> Appendix:
> The details of my cluster's environment are as follows:
>  
> System: Fedora 8 with Intel ifort 9.1. The compile options are:
> Current settings:
>  O   Compiler options:        -O3 -FR -w -mp1 -prec_div -pad -ip -xP
>  L   Linker Flags:            -L/export/mathlib/cmkl81/lib/em64t -lguide -lpthread -lsvml
>  P   Preprocessor flags       '-DParallel'
>  R   R_LIB (LAPACK+BLAS):     -L/export/mathlib/cmkl81/lib/em64t -lmkl_lapack64 -lmkl_em64t -lguide -lpthread
>  
> Shared Memory Architecture? yes
>  
> the  MPI and Scalapack options:
>  RP  RP_LIB(SCALAPACK+PBLAS): -L/export/mathlib/cmkl81/lib/em64t -lmkl_scalapack -lmkl_blacs_intelmpi20 -lmkl_lapack -lmkl_em64t -lguide -lpthread
>  FP  FPOPT(par.comp.options): -O3 -FR -w -mp1 -prec_div -pad -ip -xP
>  MP  MPIRUN commando        : mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_
> 

-- 

                                      P.Blaha
--------------------------------------------------------------------------
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-15671             FAX: +43-1-58801-15698
Email: blaha at theochem.tuwien.ac.at    WWW: http://info.tuwien.ac.at/theochem/
--------------------------------------------------------------------------


