[Wien] I still have problem with wienk in parallel mode
Laurence Marks
L-marks at northwestern.edu
Tue Dec 27 20:02:09 CET 2011
It is hard to know, as you have not provided us with enough
information, so we can only guess. Most likely you have set up
the problem incorrectly, for instance bad RMTs, a bad case.in1c or something
else. Check whether the file lapw1.error contains anything, and also read
the various output files. Beyond this:
a) Did you compile the mpi versions? If not, then what you are using
will not work. There are two ways to run Wien2k in parallel: one uses
mpi and is needed for big jobs; the other does not use mpi and is
often simpler for small jobs.
b) Edit parallel_options and add the line "setenv debug 1" (remove it
later), then run "x lapw1 -p" from the terminal. This will give you more
output.
c) Check that you can ssh to the compute nodes (I don't think
you need the .local at the end of the host names).
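
For point c), a minimal sketch of a non-interactive ssh test, assuming
Python 3 and the OpenSSH client are available on the head node. The helper
names ssh_check_command and can_ssh are hypothetical and not part of
Wien2k. BatchMode=yes makes ssh fail instead of asking for a password,
which is how a remote launch without a terminal would behave.

```python
import subprocess


def ssh_check_command(host):
    """Build an ssh command that fails rather than prompting for a password."""
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, "true"]


def can_ssh(host):
    """Return True if a non-interactive ssh login to `host` succeeds."""
    try:
        result = subprocess.run(
            ssh_check_command(host), capture_output=True, timeout=15
        )
    except (OSError, subprocess.TimeoutExpired):
        # ssh is missing, or the connection hung: treat both as failure.
        return False
    return result.returncode == 0
```

Calling can_ssh("compute-0-0.local") and can_ssh("compute-0-0") from the
head node would show whether either form of the host name is reachable.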
One comment: you have set up your .machines file to run 5 tasks for
lapw1, each using 4 CPUs. Some mpi versions are not smart, and with
what you have they will run both tasks on compute-0-0 using the same cores.
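
The overlap can be spotted mechanically. The following is a minimal
sketch, not a Wien2k tool: it assumes job lines of the form
weight:host:ncores as in the quoted .machines file, treats "#" as a
comment marker, and skips keyword lines such as granularity: or lapw0:.
The function names jobs_per_host and shared_hosts are hypothetical.

```python
from collections import defaultdict

# The .machines content quoted in this thread.
MACHINES = """\
1:bodesking.uefs.br:4
1:compute-0-0.local:4
1:compute-0-0.local:4
1:compute-0-1.local:4
1:compute-0-1.local:4
"""


def jobs_per_host(text):
    """Return {host: [cores requested by each job line that names the host]}."""
    jobs = defaultdict(list)
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line or ":" not in line:
            continue
        weight, rest = line.split(":", 1)
        if not weight.isdigit():  # keyword line, not a weighted job line
            continue
        # A job line may name several "host[:ncores]" entries.
        for entry in rest.split():
            host, _, ncores = entry.partition(":")
            jobs[host].append(int(ncores) if ncores else 1)
    return dict(jobs)


def shared_hosts(text):
    """Hosts that appear in more than one job line."""
    return {h: c for h, c in jobs_per_host(text).items() if len(c) > 1}


if __name__ == "__main__":
    for host, cores in sorted(shared_hosts(MACHINES).items()):
        print(f"{host}: {len(cores)} jobs requesting {sum(cores)} cores in total")
```

For the file above this flags compute-0-0.local and compute-0-1.local,
each named by two job lines. Whether that is a real problem depends on how
many cores each node has and on how the mpi launcher places processes,
neither of which is stated in this thread.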
2011/12/27 Nilton <nilton.dantas at gmail.com>:
> Dear P. Blaha,
> sorry, but I still have a problem with wien in parallel mode.
> I tried to set up the parallel options as you suggested and I got this error
> message:
>
> ------------------------------ message from console ------------------------------
> [nilton at bodesking case]$ run_lapw -p
> LAPW0 END
> cat: No match.
> ----------------------------------------------------------------------------------------------------
>
> this is the content of my parallel_options file:
> --------------------------------------------------------------------------------------------------------------------
> setenv USE_REMOTE 1
> setenv WIEN_GRANULARITY 3
> setenv WIEN_MPIRUN "mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_"
> --------------------------------------------------------------------------------------------------------------------
>
> You can see below that after lapw0, lapw1cpara is running. After this,
> run_lapw stops.
>
> ------------------------------ from top ------------------------------
>   PID USER    PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
>  3957 root    15  0  393m  23m 7820 S  1.3  0.2 9:23.18 Xorg
> 11099 nilton  21  0 66468 1344  884 S  0.7  0.0 0:00.02 lapw1cpara
> 11038 nilton  15  0 12772 1184  828 R  0.3  0.0 0:00.04 top
> 11039 nilton  19  0 68684 1448  968 S  0.3  0.0 0:00.02 run_lapw
> 11090
> ----------------------------------------------------------------------
>
> ------------------------------ from case.dayfile ------------------------------
> Calculating case in /home/nilton/pesquisa/dftCalc/calWien/gaxtl1-xas/050/case/case
> on bodesking.uefs.br with PID 25425
>
> start (Tue Dec 27 12:29:12 BRT 2011) with lapw0 (100/20 to go)
>
> cycle 1 (Tue Dec 27 12:29:12 BRT 2011) (100/20 to go)
>
>> lapw0 -p (12:29:12) starting parallel lapw0 at Tue Dec 27 12:29:12 BRT 2011
> --------
> running lapw0 in single mode
> 9.775u 0.276s 0:10.06 99.8% 0+0k 0+0io 0pf+0w
>> lapw1 -c -p (12:29:22) starting parallel lapw1 at Tue Dec 27 12:29:22 BRT 2011
> -> starting parallel LAPW1 jobs at Tue Dec 27 12:29:22 BRT 2011
> running LAPW1 in parallel mode (using .machines)
> 5 number_of_parallel_jobs
> ** LAPW1 crashed!
> 0.643u 0.849s 0:22.63 6.5% 0+0k 0+0io 0pf+0w
> error: command /home/nilton/wien2k/lapw1cpara -c lapw1.def failed
>
>> stop error
> -------------------------------------------------------------------------------
>
> ------------------------------ from .machines file ------------------------------
> 1:bodesking.uefs.br:4
> 1:compute-0-0.local:4
> 1:compute-0-0.local:4
> 1:compute-0-1.local:4
> 1:compute-0-1.local:4
>
> Thanks in advance.
> Nilton
> --
> Nilton S. Dantas
> Universidade Estadual de Feira de Santana
> Departamento de Ciências Exatas
> Área de Informática
> Av. Transnordestina, S/N, Bairro Novo Horizonte
> CEP 44036900 - Feira de Santana, Bahia, Brasil
> Tel./Fax +55 75 31618086
> http://www2.ecomp.uefs.br/
--
Professor Laurence Marks
Department of Materials Science and Engineering
Northwestern University
www.numis.northwestern.edu 1-847-491-3996
"Research is to see what everybody else has seen, and to think what
nobody else has thought"
Albert Szent-Györgyi