[Wien] I still have problem with wienk in parallel mode

Laurence Marks L-marks at northwestern.edu
Tue Dec 27 20:02:09 CET 2011


It is hard to know, as you have not provided us with enough
information, so we can only guess. Most likely you have set up the
problem wrong, for instance bad RMTs, a bad case.in1c or something
similar. Read the file lapw1.error to see if it contains anything, and
also check the various output files. Beyond this:

a) Did you compile the mpi versions? If not, then what you are using
will not work. There are two ways to run Wien2k in parallel: one uses
mpi and is needed for big jobs; the other does not use mpi and is
often simpler for small jobs.
b) Edit parallel_options and add the line "setenv debug 1" (remove it
later), then run "x lapw1 -p" from the terminal (with -c as well for
your complex case, as in your dayfile). This will give you more
output.
c) Check that you can ssh to the compute nodes without being asked
for a password (I don't think you need the .local at the end of the
host names).
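
As a rough sketch of the lapw1.error check and the ssh test in c),
something like the following POSIX shell functions could be used. The
names nonempty_error_files, machines_hosts and check_ssh are made up
for illustration and are not part of Wien2k, and the parser assumes
.machines job lines of the form weight:host:ncpu with one host per
line, as in your file.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Wien2k.

# Print every non-empty *.error file in a case directory (default: cwd).
nonempty_error_files() {
    dir=${1:-.}
    for f in "$dir"/*.error; do
        # -s is true only for files that exist and have size > 0
        if [ -s "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}

# Print the unique host names found on the job lines of a .machines file.
# Assumes lines like "1:compute-0-0.local:4" (weight:host:ncpu).
machines_hosts() {
    awk -F: '/^[0-9]+:/ { print $2 }' "$1" | LC_ALL=C sort -u
}

# Try a non-interactive ssh to each host. BatchMode makes ssh fail
# instead of prompting for a password.
check_ssh() {
    for h in $(machines_hosts "$1"); do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$h" true 2>/dev/null; then
            echo "ok     $h"
        else
            echo "FAILED $h"
        fi
    done
}
```

From the case directory one would then run "nonempty_error_files ."
and "check_ssh .machines". Any host reported as FAILED is worth
testing by hand with a plain ssh to that host.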

A comment: you have set up your .machines file to run 5 tasks for
lapw1, each using 4 CPUs. Some mpi versions are not smart, and with
what you have they will run both tasks on compute-0-0 (and likewise
both on compute-0-1) using the same cores.
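
To see how many lapw1 jobs and processes a .machines file puts on each
host, a small awk sketch like the one below may help. jobs_per_host is
a hypothetical name, not a Wien2k tool, and it assumes one host per job
line (weight:host:ncpu), as in the file you posted. Whether a node is
really oversubscribed depends on how many cores it has, which cannot be
told from the file alone.

```shell
#!/bin/sh
# Hypothetical helper for illustration; not part of Wien2k.
# For each host, print: host, number of job lines, total processes requested.
# Assumes job lines of the form "weight:host:ncpu" with one host per line.
jobs_per_host() {
    awk -F: '
        /^[0-9]+:/ {
            jobs[$2]++                        # one more job line for this host
            procs[$2] += ($3 == "" ? 1 : $3)  # ncpu counts as 1 if omitted
        }
        END {
            for (h in jobs) print h, jobs[h], procs[h]
        }
    ' "$1" | LC_ALL=C sort
}
```

For the .machines file quoted below, "jobs_per_host .machines" prints

    bodesking.uefs.br 1 4
    compute-0-0.local 2 8
    compute-0-1.local 2 8

so two 4-process jobs land on each compute node. If that turns out to
be a problem with your mpi, one thing to try is a single job line per
node.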

2011/12/27 Nilton <nilton.dantas at gmail.com>:
> Dear P. Blaha,
> sorry, but I still have a problem with wien in parallel mode.
> I tried to set up the parallel options as you suggested and I got the
> error message:
>
> ------------------------ message from console ------------------------
> [nilton at bodesking case]$ run_lapw -p
>  LAPW0 END
> cat: No match.
> ----------------------------------------------------------------------------------------------------
>
> this is the content of my parallel_options file:
>  --------------------------------------------------------------------------------------------------------------------
> setenv USE_REMOTE 1
> setenv WIEN_GRANULARITY 3
> setenv WIEN_MPIRUN "mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_"
> --------------------------------------------------------------------------------------------------------------------
>
> You can see below that after lapw0, lapw1cpara is running. After this,
> run_lapw stops.
>
> ------------------------------ from top ------------------------------
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  3957 root      15   0  393m  23m 7820 S  1.3  0.2   9:23.18 Xorg
> 11099 nilton    21   0 66468 1344  884 S  0.7  0.0   0:00.02 lapw1cpara
> 11038 nilton    15   0 12772 1184  828 R  0.3  0.0   0:00.04 top
> 11039 nilton    19   0 68684 1448  968 S  0.3  0.0   0:00.02 run_lapw
> 11090
> --------------------------------------------------------------------------------------------------------------------------
>
> ------------------------- from case.dayfile --------------------------
> Calculating case in /home/nilton/pesquisa/dftCalc/calWien/gaxtl1-xas/050/case/case
> on bodesking.uefs.br with PID 25425
>
>     start     (Tue Dec 27 12:29:12 BRT 2011) with lapw0 (100/20 to go)
>
>     cycle 1     (Tue Dec 27 12:29:12 BRT 2011)     (100/20 to go)
>
>>   lapw0 -p    (12:29:12) starting parallel lapw0 at Tue Dec 27 12:29:12 BRT 2011
> --------
> running lapw0 in single mode
> 9.775u 0.276s 0:10.06 99.8%    0+0k 0+0io 0pf+0w
>>   lapw1  -c -p     (12:29:22) starting parallel lapw1 at Tue Dec 27 12:29:22 BRT 2011
> ->  starting parallel LAPW1 jobs at Tue Dec 27 12:29:22 BRT 2011
> running LAPW1 in parallel mode (using .machines)
> 5 number_of_parallel_jobs
> **  LAPW1 crashed!
> 0.643u 0.849s 0:22.63 6.5%    0+0k 0+0io 0pf+0w
> error: command   /home/nilton/wien2k/lapw1cpara -c lapw1.def   failed
>
>>   stop error
> ------------------------------------------------------------------------------------------------------------------
>
> ------------------------ from .machines file -------------------------
> 1:bodesking.uefs.br:4
> 1:compute-0-0.local:4
> 1:compute-0-0.local:4
> 1:compute-0-1.local:4
> 1:compute-0-1.local:4
>
> Thanks in advance.
> Nilton
> --
> Nilton S. Dantas
> Universidade Estadual de Feira de Santana
> Departamento de Ciências Exatas
> Área de Informática
> Av. Transnordestina, S/N, Bairro Novo Horizonte
> CEP 44036900 - Feira de Santana, Bahia, Brasil
> Tel./Fax +55 75 31618086
> http://www2.ecomp.uefs.br/
>
>
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>



-- 
Professor Laurence Marks
Department of Materials Science and Engineering
Northwestern University
www.numis.northwestern.edu 1-847-491-3996
"Research is to see what everybody else has seen, and to think what
nobody else has thought"
Albert Szent-Györgyi

