[Wien] Parallel run problems with version 19.1

Peter Blaha pblaha at theochem.tuwien.ac.at
Mon Jul 22 20:54:37 CEST 2019


Please:
1) does   x lapw0   work ???
2) list your .machines file. In particular: for TiC use only 2 cores 
(because of 2 atoms)
3) ls -als *output00*
4) what is at the end of *.output0000  ??? Please check for any errors.
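
A minimal sketch of collecting items 2) to 4) in one go, assuming a POSIX shell. The function name collect_lapw0_diagnostics is hypothetical and not part of WIEN2k; it only wraps the commands listed above (cat, ls -als, tail) and takes the case directory as an optional argument.

```shell
#!/bin/sh
# Hypothetical helper (NOT part of WIEN2k): prints the information asked for
# in points 2) to 4) for the case directory given as $1 (default: current dir).
collect_lapw0_diagnostics() {
    dir=${1:-.}
    (
        cd "$dir" || exit 1

        # 2) the .machines file
        echo "== .machines =="
        if [ -f .machines ]; then
            cat .machines
        else
            echo "(no .machines file found)"
        fi

        # 3) sizes and time stamps of the lapw0 output files
        echo "== ls -als *output00* =="
        ls -als ./*output00* 2>/dev/null || echo "(no *output00* files found)"

        # 4) the last lines of each *.output0000, where errors usually show up
        echo "== end of *.output0000 =="
        for f in ./*.output0000; do
            [ -f "$f" ] || continue   # the glob matched nothing
            echo "--- $f ---"
            tail -n 20 "$f"
        done
    )
}
```

Calling collect_lapw0_diagnostics in the TiC directory right after "x lapw0 -p" and pasting the result into the reply would cover the three items; the 20-line tail is an arbitrary choice.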

Is your fftw-mpi compiled with the same compiler as WIEN2k ??


On 22.07.2019 at 20:45, Ricardo Moreira wrote:
> I had it at 4 as per the default value suggested during configuration 
> but I changed it to 1 now. In spite of that, "x lapw0 -p" still did not 
> generate case.vspup or case.vspdn.
> 
> On Mon, 22 Jul 2019 at 19:01, <tran at theochem.tuwien.ac.at> wrote:
> 
>     Do you have the variable OMP_NUM_THREADS set in your .bashrc or .cshrc
>     file? If yes and the value is greater than 1, then set it to 1 and
>     execute again "x lapw0 -p".
> 
>     On Monday 2019-07-22 18:39, Ricardo Moreira wrote:
> 
>      >Date: Mon, 22 Jul 2019 18:39:45
>      >From: Ricardo Moreira <ricardopachecomoreira at gmail.com>
>      >Reply-To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>      >To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>      >Subject: Re: [Wien] Parallel run problems with version 19.1
>      >
>      >That is indeed the case, neither case.vspup nor case.vspdn was
>      >generated after running "x lapw0 -p".
>      >
>      >On Mon, 22 Jul 2019 at 17:09, <tran at theochem.tuwien.ac.at> wrote:
>      >      It seems that lapw0 does not generate case.vspup and
>      >      case.vspdn (and case.vsp for non-spin-polarized calculation).
>      >      Can you confirm that by executing "x lapw0 -p" on the command
>      >      line?
>      >
>      >      On Monday 2019-07-22 17:45, Ricardo Moreira wrote:
>      >
>      >      >Date: Mon, 22 Jul 2019 17:45:51
>      >      >From: Ricardo Moreira <ricardopachecomoreira at gmail.com>
>      >      >Reply-To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>      >      >To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>      >      >Subject: Re: [Wien] Parallel run problems with version 19.1
>      >      >
>      >      >The command "ls *vsp*" returns only the files "TiC.vspdn_st" and
>      >      >"TiC.vsp_st", so it would appear that the file is not created at all
>      >      >when using the -p switch to runsp_lapw.
>      >      >
>      >      >On Mon, 22 Jul 2019 at 16:29, <tran at theochem.tuwien.ac.at> wrote:
>      >      >      Is the file TiC.vspup empty?
>      >      >
>      >      >      On Monday 2019-07-22 17:24, Ricardo Moreira wrote:
>      >      >
>      >      >      >Date: Mon, 22 Jul 2019 17:24:42
>      >      >      >From: Ricardo Moreira <ricardopachecomoreira at gmail.com>
>      >      >      >Reply-To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>      >      >      >To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>      >      >      >Subject: Re: [Wien] Parallel run problems with version 19.1
>      >      >      >
>      >      >      >Hi and thanks for the reply,
>      >      >      >Regarding serial calculations: yes, both non-spin-polarized and
>      >      >      >spin-polarized runs work properly in the cases you described. As
>      >      >      >for parallel, it fails in both cases, with the error I indicated
>      >      >      >in my previous email.
>      >      >      >
>      >      >      >Best Regards,
>      >      >      >Ricardo Moreira
>      >      >      >
>      >      >      >On Mon, 22 Jul 2019 at 16:09, <tran at theochem.tuwien.ac.at> wrote:
>      >      >      >      Hi,
>      >      >      >
>      >      >      >      What you should never do is to mix spin-polarized and
>      >      >      >      non-spin-polarized calculations in the same directory.
>      >      >      >
>      >      >      >      Since your explanations about spin-polarized/non-spin-polarized
>      >      >      >      are a bit confusing, the question is:
>      >      >      >
>      >      >      >      Does the calculation run properly (in parallel and serial) if
>      >      >      >      everything (init_lapw and run_lapw) in a directory is done from
>      >      >      >      the beginning in non-spin-polarized? Same question with
>      >      >      >      spin-polarized.
>      >      >      >
>      >      >      >      F. Tran
>      >      >      >
>      >      >      >      On Monday 2019-07-22 16:37, Ricardo Moreira wrote:
>      >      >      >
>      >      >      >      >Date: Mon, 22 Jul 2019 16:37:30
>      >      >      >      >From: Ricardo Moreira <ricardopachecomoreira at gmail.com>
>      >      >      >      >Reply-To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>      >      >      >      >To: wien at zeus.theochem.tuwien.ac.at
>      >      >      >      >Subject: [Wien] Parallel run problems with version 19.1
>      >      >      >      >
>      >      >      >      >Dear Wien2k users,
>      >      >      >      >I am running Wien2k on a computer cluster, compiled with the GNU
>      >      >      >      >compilers version 7.2.3 and OpenMPI, on the operating system
>      >      >      >      >Scientific Linux release 7.4. Since changing from version 18.2 to
>      >      >      >      >19.1 I've been unable to run Wien2k in parallel (neither mpi nor
>      >      >      >      >simple k-parallel seems to work), with calculations aborting with
>      >      >      >      >the following message:
>      >      >      >      >
>      >      >      >      >    start       (Mon Jul 22 14:49:31 WEST 2019) with lapw0 (40/99 to go)
>      >      >      >      >
>      >      >      >      >    cycle 1     (Mon Jul 22 14:49:31 WEST 2019) (40/99 to go)
>      >      >      >      >
>      >      >      >      >>   lapw0   -p  (14:49:31) starting parallel lapw0 at Mon Jul 22 14:49:31 WEST 2019
>      >      >      >      >-------- .machine0 : 8 processors
>      >      >      >      >0.058u 0.160s 0:03.50 6.0%      0+0k 48+344io 5pf+0w
>      >      >      >      >>   lapw1  -up -p       (14:49:35) starting parallel lapw1 at Mon Jul 22 14:49:35 WEST 2019
>      >      >      >      >->  starting parallel LAPW1 jobs at Mon Jul 22 14:49:35 WEST 2019
>      >      >      >      >running LAPW1 in parallel mode (using .machines)
>      >      >      >      >2 number_of_parallel_jobs
>      >      >      >      >     ava01 ava01 ava01 ava01(8)      ava21 ava21 ava21 ava21(8)    Summary of lapw1para:
>      >      >      >      >   ava01         k=8     user=0  wallclock=0
>      >      >      >      >   ava21         k=16    user=0  wallclock=0
>      >      >      >      >**  LAPW1 crashed!
>      >      >      >      >0.164u 0.306s 0:03.82 12.0%     0+0k 112+648io 1pf+0w
>      >      >      >      >error: command /homes/fc-up201202493/WIEN2k_19.1/lapw1para -up uplapw1.def failed
>      >      >      >      >
>      >      >      >      >>   stop error
>      >      >      >      >
>      >      >      >      >Inspecting the error files I find that the error printed to
>      >      >      >      >uplapw1.error is:
>      >      >      >      >
>      >      >      >      >**  Error in Parallel LAPW1
>      >      >      >      >**  LAPW1 STOPPED at Mon Jul 22 14:49:39 WEST 2019
>      >      >      >      >**  check ERROR FILES!
>      >      >      >      > 'INILPW' - can't open unit:  18
>      >      >      >      > 'INILPW' -        filename: TiC.vspup
>      >      >      >      > 'INILPW' -          status: old          form: formatted
>      >      >      >      > 'LAPW1' - INILPW aborted unsuccessfully.
>      >      >      >      > 'INILPW' - can't open unit:  18
>      >      >      >      > 'INILPW' -        filename: TiC.vspup
>      >      >      >      > 'INILPW' -          status: old          form: formatted
>      >      >      >      > 'LAPW1' - INILPW aborted unsuccessfully.
>      >      >      >      >
>      >      >      >      >Previous posts to the mailing list often attribute this error
>      >      >      >      >message to running init_lapw for a non-spin-polarized case and
>      >      >      >      >then using runsp_lapw, so I should clarify that it also occurs
>      >      >      >      >when attempting to run a non-spin-polarized case; the error
>      >      >      >      >message then names TiC.vsp instead of TiC.vspup.
>      >      >      >      >I should point out, since it may be related to this issue, that
>      >      >      >      >serial runs have a similar problem: if I first run a
>      >      >      >      >spin-polarized case in a folder, then do another init_lapw for
>      >      >      >      >non-spin-polarized and attempt run_lapw, I get the same "can't
>      >      >      >      >open unit: 18" errors as before (this also occurs if I first run
>      >      >      >      >a non-spin-polarized simulation and then attempt a spin-polarized
>      >      >      >      >one in the same folder). The workaround I found for this was
>      >      >      >      >making a new folder, but since the error message is also related
>      >      >      >      >to TiC.vsp/vspup I thought I would point it out still.
>      >      >      >      >Lastly, I should mention that I deleted the line
>      >      >      >      >"15,'$file.tmp$updn',       'scratch','unformatted',0"
>      >      >      >      >from x_lapw, as I previously had an error in lapw2, reported
>      >      >      >      >elsewhere on the mailing list, that Professor Blaha indicated was
>      >      >      >      >solved by deleting the aforementioned line (and indeed it was).
>      >      >      >      >Whether or not this could possibly be related to the issues I'm
>      >      >      >      >having now, I have no idea, so I felt it right to point out.
>      >      >      >      >Thanks in advance for any assistance that might be provided.
>      >      >      >      >
>      >      >      >      >Best Regards,
>      >      >      >      >Ricardo Moreira
>      >      >      >      >
>      >      >      >      >_______________________________________________
>      >      >      >      >Wien mailing list
>      >      >      >      >Wien at zeus.theochem.tuwien.ac.at
>      >      >      >      >http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>      >      >      >      >SEARCH the MAILING-LIST at:  http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html

-- 
--------------------------------------------------------------------------
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300             FAX: +43-1-58801-165982
Email: blaha at theochem.tuwien.ac.at    WIEN2k: http://www.wien2k.at
WWW:   http://www.imc.tuwien.ac.at/tc_blaha
--------------------------------------------------------------------------


