[Wien] Parallel run problems with version 19.1
Peter Blaha
pblaha at theochem.tuwien.ac.at
Mon Jul 22 20:54:37 CEST 2019
Please:
1) Does "x lapw0" work?
2) List your .machines file. In particular, for TiC use only 2 cores
(because of the 2 atoms).
3) Show the output of: ls -als *output00*
4) What is at the end of *.output0000? Please check for any errors.

Is your fftw-mpi compiled with the same compiler as WIEN2k?
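
The checks requested in points 2 to 4 can be gathered in one pass. What follows is a minimal sketch, assuming a POSIX shell; the function name collect_lapw0_diagnostics is hypothetical and not part of WIEN2k. It only uses standard tools (cat, ls, tail) on the files named in the list.

```shell
#!/bin/sh
# Hypothetical helper (not part of WIEN2k): print the information asked for
# in points 2-4 for the case directory given as $1 (default: current directory).
collect_lapw0_diagnostics() {
    dir="${1:-.}"

    echo "== .machines =="
    if [ -f "$dir/.machines" ]; then
        cat "$dir/.machines"
    else
        echo "(no .machines file found)"
    fi

    echo "== ls -als *output00* =="
    # ls fails if the glob matches nothing; report that instead of an error.
    ( cd "$dir" && ls -als *output00* 2>/dev/null ) \
        || echo "(no *output00* files found)"

    echo "== tail of *.output0000 =="
    for f in "$dir"/*.output0000; do
        # If nothing matched, the unexpanded pattern is left in $f.
        [ -f "$f" ] || { echo "(no *.output0000 file found)"; continue; }
        echo "-- $f --"
        tail -n 20 "$f"
    done
}
```

Running "x lapw0" first (point 1) and then collect_lapw0_diagnostics in the case directory gives the text to paste into a reply.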
On 22.07.2019 at 20:45, Ricardo Moreira wrote:
> I had it at 4 as per the default value suggested during configuration
> but I changed it to 1 now. In spite of that, "x lapw0 -p" still did not
> generate case.vspup or case.vspdn.
>
> On Mon, 22 Jul 2019 at 19:01, <tran at theochem.tuwien.ac.at> wrote:
>
> Do you have the variable OMP_NUM_THREADS set in your .bashrc or .cshrc
> file? If yes and the value is greater than 1, then set it to 1 and
> execute "x lapw0 -p" again.
>
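
A quick way to see which value the shell will pass on to lapw0 is to inspect the variable directly. This is a minimal sketch, assuming a Bourne-compatible shell (csh/tcsh users would use "setenv OMP_NUM_THREADS 1" instead of export); the function name check_omp_threads is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report whether OMP_NUM_THREADS is unset, 1, or larger.
check_omp_threads() {
    case "${OMP_NUM_THREADS:-unset}" in
        unset) echo "OMP_NUM_THREADS is not set" ;;
        1)     echo "OMP_NUM_THREADS=1" ;;
        *)     echo "OMP_NUM_THREADS=$OMP_NUM_THREADS (set it to 1 and rerun)" ;;
    esac
}
```

If the function reports a value greater than 1, "export OMP_NUM_THREADS=1" followed by "x lapw0 -p" in the same shell repeats the test suggested in the message.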
> On Monday 2019-07-22 18:39, Ricardo Moreira wrote:
>
> >Date: Mon, 22 Jul 2019 18:39:45
> >From: Ricardo Moreira <ricardopachecomoreira at gmail.com>
> >Reply-To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
> >To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
> >Subject: Re: [Wien] Parallel run problems with version 19.1
> >
> >That is indeed the case: neither case.vspup nor case.vspdn was
> >generated after running "x lapw0 -p".
> >
> >On Mon, 22 Jul 2019 at 17:09, <tran at theochem.tuwien.ac.at> wrote:
> > It seems that lapw0 does not generate case.vspup and
> > case.vspdn (and case.vsp for a non-spin-polarized calculation).
> > Can you confirm that by executing "x lapw0 -p" on the command
> > line?
> >
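
Whether lapw0 produced the potential files can be checked right after "x lapw0 -p". This is a minimal sketch, assuming a POSIX shell and a spin-polarized case (a non-spin-polarized case would have case.vsp instead); the function name check_vsp_files is hypothetical and not part of WIEN2k.

```shell
#!/bin/sh
# Hypothetical helper: report whether case.vspup and case.vspdn exist and are
# non-empty for the case named $1 in directory $2 (default: current directory).
# Returns 0 only if both files are present and non-empty.
check_vsp_files() {
    case_name="$1"
    dir="${2:-.}"
    missing=0
    for ext in vspup vspdn; do
        f="$dir/$case_name.$ext"
        if [ -s "$f" ]; then
            echo "ok: $f"
        elif [ -e "$f" ]; then
            echo "empty: $f"
            missing=1
        else
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}
```

For the case discussed in this thread the call would be "check_vsp_files TiC", run in the case directory.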
> > On Monday 2019-07-22 17:45, Ricardo Moreira wrote:
> >
> > >Date: Mon, 22 Jul 2019 17:45:51
> > >From: Ricardo Moreira <ricardopachecomoreira at gmail.com>
> > >Reply-To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
> > >To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
> > >Subject: Re: [Wien] Parallel run problems with version 19.1
> > >
> > >The command "ls *vsp*" returns only the files "TiC.vspdn_st" and
> > >"TiC.vsp_st", so it would appear that the file is not created at all
> > >when using the -p switch to runsp_lapw.
> > >
> > >On Mon, 22 Jul 2019 at 16:29, <tran at theochem.tuwien.ac.at> wrote:
> > > Is the file TiC.vspup empty?
> > >
> > > On Monday 2019-07-22 17:24, Ricardo Moreira wrote:
> > >
> > > >Date: Mon, 22 Jul 2019 17:24:42
> > > >From: Ricardo Moreira <ricardopachecomoreira at gmail.com>
> > > >Reply-To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
> > > >To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
> > > >Subject: Re: [Wien] Parallel run problems with version 19.1
> > > >
> > > >Hi and thanks for the reply,
> > > >Regarding serial calculations: yes, in both non-spin-polarized and
> > > >spin-polarized everything runs properly in the cases you described.
> > > >As for parallel, it fails in both cases, with the error I indicated
> > > >in my previous email.
> > > >
> > > >Best Regards,
> > > >Ricardo Moreira
> > > >
> > > >On Mon, 22 Jul 2019 at 16:09, <tran at theochem.tuwien.ac.at> wrote:
> > > > Hi,
> > > >
> > > > What you should never do is mix spin-polarized and
> > > > non-spin-polarized calculations in the same directory.
> > > >
> > > > Since your explanations about spin-polarized/non-spin-polarized
> > > > are a bit confusing, the question is:
> > > >
> > > > Does the calculation run properly (in parallel and serial) if
> > > > everything (init_lapw and run_lapw) in a directory is done from
> > > > the beginning in non-spin-polarized? Same question with
> > > > spin-polarized.
> > > >
> > > > F. Tran
> > > >
> > > > On Monday 2019-07-22 16:37, Ricardo Moreira wrote:
> > > >
> > > > >Date: Mon, 22 Jul 2019 16:37:30
> > > > >From: Ricardo Moreira <ricardopachecomoreira at gmail.com>
> > > > >Reply-To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
> > > > >To: wien at zeus.theochem.tuwien.ac.at
> > > > >Subject: [Wien] Parallel run problems with version 19.1
> > > > >
> > > > >Dear Wien2k users,
> > > > >I am running Wien2k on a computer cluster, compiled with the GNU
> > > > >compilers version 7.2.3 and OpenMPI, with the operating system
> > > > >Scientific Linux release 7.4. Since changing from version 18.2 to
> > > > >19.1 I've been unable to run Wien2k in parallel (neither mpi nor
> > > > >simple k-parallel seems to work), with calculations aborting with
> > > > >the following message:
> > > > >
> > > > > start (Mon Jul 22 14:49:31 WEST 2019) with lapw0 (40/99 to go)
> > > > >
> > > > > cycle 1 (Mon Jul 22 14:49:31 WEST 2019) (40/99 to go)
> > > > >
> > > > >> lapw0 -p (14:49:31) starting parallel lapw0 at Mon Jul 22 14:49:31 WEST 2019
> > > > >-------- .machine0 : 8 processors
> > > > >0.058u 0.160s 0:03.50 6.0% 0+0k 48+344io 5pf+0w
> > > > >> lapw1 -up -p (14:49:35) starting parallel lapw1 at Mon Jul 22 14:49:35 WEST 2019
> > > > >-> starting parallel LAPW1 jobs at Mon Jul 22 14:49:35 WEST 2019
> > > > >running LAPW1 in parallel mode (using .machines)
> > > > >2 number_of_parallel_jobs
> > > > > ava01 ava01 ava01 ava01(8) ava21 ava21 ava21 ava21(8) Summary of lapw1para:
> > > > > ava01 k=8 user=0 wallclock=0
> > > > > ava21 k=16 user=0 wallclock=0
> > > > >** LAPW1 crashed!
> > > > >0.164u 0.306s 0:03.82 12.0% 0+0k 112+648io 1pf+0w
> > > > >error: command /homes/fc-up201202493/WIEN2k_19.1/lapw1para -up uplapw1.def failed
> > > > >
> > > > >> stop error
> > > > >
> > > > >Inspecting the error files I find that the error printed to
> > > > >uplapw1.error is:
> > > > >
> > > > >** Error in Parallel LAPW1
> > > > >** LAPW1 STOPPED at Mon Jul 22 14:49:39 WEST 2019
> > > > >** check ERROR FILES!
> > > > > 'INILPW' - can't open unit: 18
> > > > > 'INILPW' - filename: TiC.vspup
> > > > > 'INILPW' - status: old form: formatted
> > > > > 'LAPW1' - INILPW aborted unsuccessfully.
> > > > > 'INILPW' - can't open unit: 18
> > > > > 'INILPW' - filename: TiC.vspup
> > > > > 'INILPW' - status: old form: formatted
> > > > > 'LAPW1' - INILPW aborted unsuccessfully.
> > > > >
> > > > >In previous posts to the mailing list, this error message is often
> > > > >attributed to running init_lapw for a non-spin-polarized case and
> > > > >then using runsp_lapw. I should clarify that it also occurs when
> > > > >attempting to run a non-spin-polarized case; the error message then
> > > > >names TiC.vsp instead of TiC.vspup.
> > > > >I should also point out, since it may be related, that serial runs
> > > > >have a similar problem: if I first run a spin-polarized case in a
> > > > >folder, then do another init_lapw for non-spin-polarized and attempt
> > > > >run_lapw, I get the same "can't open unit: 18" errors (this also
> > > > >occurs if I first run a non-spin-polarized simulation and then
> > > > >attempt a spin-polarized one in the same folder). The workaround I
> > > > >found was to make a new folder, but since the error message also
> > > > >involves TiC.vsp/vspup I thought I would mention it.
> > > > >Lastly, I should mention that I deleted the line
> > > > >"15,'$file.tmp$updn', 'scratch','unformatted',0" from x_lapw, as I
> > > > >previously had an error in lapw2, reported elsewhere on the mailing
> > > > >list, that Professor Blaha indicated was solved by deleting that
> > > > >line (and indeed it was). Whether this could be related to the
> > > > >issues I'm having now, I have no idea, so I felt it right to point
> > > > >it out.
> > > > >Thanks in advance for any assistance that might be provided.
> > > > >
> > > > >Best Regards,
> > > > >Ricardo Moreira
> > > > >
> > > > >_______________________________________________
> > > > Wien mailing list
> > > > Wien at zeus.theochem.tuwien.ac.at
> <mailto:Wien at zeus.theochem.tuwien.ac.at>
> > > > http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
> > > > SEARCH the MAILING-LIST at:
> > >
> http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
--
--------------------------------------------------------------------------
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300 FAX: +43-1-58801-165982
Email: blaha at theochem.tuwien.ac.at WIEN2k: http://www.wien2k.at
WWW: http://www.imc.tuwien.ac.at/tc_blaha
--------------------------------------------------------------------------