[Wien] Parallel run problems with version 19.1
tran at theochem.tuwien.ac.at
Mon Jul 22 20:50:19 CEST 2019
More questions:
Was the calculation really initialized with spin-polarization?
If not, then only case.vsp is generated.
What is the message on the screen when "x lapw0 -p" is executed?
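Both questions can be checked quickly from the command line. The following is a hedged sketch (not a WIEN2k tool) of a helper that reports which potential files exist after "x lapw0 -p"; the case name "TiC" is the one used later in this thread:

```shell
# Hedged sketch (not part of WIEN2k): after running "x lapw0 -p", check
# whether the potential files that lapw1 expects are present and non-empty.
check_vsp() {
    case_name=$1
    # a spin-polarized run needs case.vspup and case.vspdn;
    # a non-spin-polarized run needs only case.vsp
    for f in "$case_name.vspup" "$case_name.vspdn" "$case_name.vsp"; do
        if [ -s "$f" ]; then
            echo "$f: ok"
        else
            echo "$f: missing or empty"
        fi
    done
}

check_vsp TiC
```

If init_lapw was run without spin polarization, only case.vsp is written, so the spin files will rightly be reported missing.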
On Monday 2019-07-22 20:45, Ricardo Moreira wrote:
>Date: Mon, 22 Jul 2019 20:45:22
>From: Ricardo Moreira <ricardopachecomoreira at gmail.com>
>Reply-To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>Subject: Re: [Wien] Parallel run problems with version 19.1
>
>I had it at 4 as per the default value suggested during configuration but I changed it to 1 now. In spite of that, "x lapw0 -p" still did not generate
>case.vspup or case.vspdn.
>
>On Mon, 22 Jul 2019 at 19:01, <tran at theochem.tuwien.ac.at> wrote:
> Do you have the variable OMP_NUM_THREADS set in your .bashrc or .cshrc
> file? If yes and the value is greater than 1, then set it to 1 and
> execute "x lapw0 -p" again.
>
> On Monday 2019-07-22 18:39, Ricardo Moreira wrote:
>
> >Date: Mon, 22 Jul 2019 18:39:45
> >From: Ricardo Moreira <ricardopachecomoreira at gmail.com>
> >Reply-To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
> >To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
> >Subject: Re: [Wien] Parallel run problems with version 19.1
> >
> >That is indeed the case: neither case.vspup nor case.vspdn was generated after running "x lapw0 -p".
> >
> >On Mon, 22 Jul 2019 at 17:09, <tran at theochem.tuwien.ac.at> wrote:
> > It seems that lapw0 does not generate case.vspup and
> > case.vspdn (and case.vsp for non-spin-polarized calculation).
> > Can you confirm that by executing "x lapw0 -p" on the command
> > line?
> >
> > On Monday 2019-07-22 17:45, Ricardo Moreira wrote:
> >
> > >Date: Mon, 22 Jul 2019 17:45:51
> > >From: Ricardo Moreira <ricardopachecomoreira at gmail.com>
> > >Reply-To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
> > >To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
> > >Subject: Re: [Wien] Parallel run problems with version 19.1
> > >
> > >The command "ls *vsp*" returns only the files "TiC.vspdn_st" and
> > >"TiC.vsp_st", so it would appear that the files are not created at all
> > >when using the -p switch to runsp_lapw.
> > >
> > >On Mon, 22 Jul 2019 at 16:29, <tran at theochem.tuwien.ac.at> wrote:
> > > Is the file TiC.vspup empty?
> > >
> > > On Monday 2019-07-22 17:24, Ricardo Moreira wrote:
> > >
> > > >Date: Mon, 22 Jul 2019 17:24:42
> > > >From: Ricardo Moreira <ricardopachecomoreira at gmail.com>
> > > >Reply-To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
> > > >To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
> > > >Subject: Re: [Wien] Parallel run problems with version 19.1
> > > >
> > > >Hi and thanks for the reply,
> > > >Regarding serial calculations: yes, both non-spin-polarized and
> > > >spin-polarized runs work properly in the cases you described. As for
> > > >parallel, it fails in both cases, with the error I indicated in my
> > > >previous email.
> > > >
> > > >Best Regards,
> > > >Ricardo Moreira
> > > >
> > > >On Mon, 22 Jul 2019 at 16:09, <tran at theochem.tuwien.ac.at> wrote:
> > > > Hi,
> > > >
> > > > What you should never do is mix spin-polarized and
> > > > non-spin-polarized calculations in the same directory.
> > > >
> > > > Since your explanations about spin-polarized/non-spin-polarized
> > > > are a bit confusing, the question is:
> > > >
> > > > Does the calculation run properly (in parallel and serial) if
> > > > everything (init_lapw and run_lapw) in a directory is done from
> > > > the beginning in non-spin-polarized mode? Same question with
> > > > spin-polarized.
> > > >
> > > > F. Tran
> > > >
> > > > On Monday 2019-07-22 16:37, Ricardo Moreira wrote:
> > > >
> > > > >Date: Mon, 22 Jul 2019 16:37:30
> > > > >From: Ricardo Moreira <ricardopachecomoreira at gmail.com>
> > > > >Reply-To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
> > > > >To: wien at zeus.theochem.tuwien.ac.at
> > > > >Subject: [Wien] Parallel run problems with version 19.1
> > > > >
> > > > >Dear Wien2k users,
> > > > >I am running Wien2k on a computer cluster with the operating system
> > > > >Scientific Linux release 7.4, compiled with the GNU compilers
> > > > >version 7.2.3 and OpenMPI. Since changing from version 18.2 to 19.1
> > > > >I've been unable to run Wien2k in parallel (neither MPI nor simple
> > > > >k-point parallelism seems to work), with calculations aborting with
> > > > >the following message:
> > > > >
> > > > > start (Mon Jul 22 14:49:31 WEST 2019) with lapw0 (40/99 to go)
> > > > >
> > > > > cycle 1 (Mon Jul 22 14:49:31 WEST 2019) (40/99 to go)
> > > > >
> > > > >> lapw0 -p (14:49:31) starting parallel lapw0 at Mon Jul 22 14:49:31 WEST 2019
> > > > >-------- .machine0 : 8 processors
> > > > >0.058u 0.160s 0:03.50 6.0% 0+0k 48+344io 5pf+0w
> > > > >> lapw1 -up -p (14:49:35) starting parallel lapw1 at Mon Jul 22 14:49:35 WEST 2019
> > > > >-> starting parallel LAPW1 jobs at Mon Jul 22 14:49:35 WEST 2019
> > > > >running LAPW1 in parallel mode (using .machines)
> > > > >2 number_of_parallel_jobs
> > > > > ava01 ava01 ava01 ava01(8) ava21 ava21 ava21 ava21(8)
> > > > >Summary of lapw1para:
> > > > > ava01 k=8 user=0 wallclock=0
> > > > > ava21 k=16 user=0 wallclock=0
> > > > >** LAPW1 crashed!
> > > > >0.164u 0.306s 0:03.82 12.0% 0+0k 112+648io 1pf+0w
> > > > >error: command /homes/fc-up201202493/WIEN2k_19.1/lapw1para -up uplapw1.def failed
> > > > >
> > > > >> stop error
> > > > >
> > > > >Inspecting the error files I find that the error printed to uplapw1.error is:
> > > > >
> > > > >** Error in Parallel LAPW1
> > > > >** LAPW1 STOPPED at Mon Jul 22 14:49:39 WEST 2019
> > > > >** check ERROR FILES!
> > > > > 'INILPW' - can't open unit: 18
> > > > > 'INILPW' -        filename: TiC.vspup
> > > > > 'INILPW' -          status: old          form: formatted
> > > > > 'LAPW1' - INILPW aborted unsuccessfully.
> > > > > 'INILPW' - can't open unit: 18
> > > > > 'INILPW' -        filename: TiC.vspup
> > > > > 'INILPW' -          status: old          form: formatted
> > > > > 'LAPW1' - INILPW aborted unsuccessfully.
> > > > >
> > > > >As this error message in previous posts to the mailing list is
> > > > >often attributed to running init_lapw for a non-spin-polarized case
> > > > >and then using runsp_lapw, I should clarify that the error also
> > > > >occurs when attempting to run a non-spin-polarized case; the error
> > > > >message then names TiC.vsp instead of TiC.vspup.
> > > > >I should also point out, as it may be related to this issue, that
> > > > >serial runs have a similar problem: after the first calculation in
> > > > >a directory, if I start with a spin-polarized case, then do another
> > > > >init_lapw for a non-spin-polarized one and attempt run_lapw, I get
> > > > >the same "can't open unit: 18" errors as before (this also occurs
> > > > >if I first run a non-spin-polarized calculation and then attempt a
> > > > >spin-polarized one in the same directory). The workaround I found
> > > > >was to create a new directory, but since that error message also
> > > > >involves TiC.vsp/vspup I thought I would mention it.
> > > > >Lastly, I should mention that I deleted the line
> > > > >"15,'$file.tmp$updn', 'scratch','unformatted',0" from x_lapw, as I
> > > > >previously had an error in lapw2, reported elsewhere on the mailing
> > > > >list, that Professor Blaha indicated was solved by deleting that
> > > > >line (and indeed it was). Whether or not this could be related to
> > > > >the issues I'm having now, I have no idea, so I felt it right to
> > > > >point it out.
> > > > >Thanks in advance for any assistance that might be provided.
> > > > >
> > > > >Best Regards,
> > > > >Ricardo Moreira
> > > > >
> > > > >_______________________________________________
> > > > Wien mailing list
> > > > Wien at zeus.theochem.tuwien.ac.at
> > > > http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
> > > > SEARCH the MAILING-LIST at:
> > > http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
More information about the Wien mailing list