[Wien] Parallel run problems with version 19.1

Ricardo Moreira ricardopachecomoreira at gmail.com
Mon Jul 22 16:37:30 CEST 2019


Dear Wien2k users,

I am running Wien2k on a computer cluster under Scientific Linux release
7.4, compiled with the GNU compilers (version 7.2.3) and OpenMPI. Since
upgrading from version 18.2 to 19.1 I have been unable to run Wien2k in
parallel (neither MPI nor simple k-point parallelization works);
calculations abort with the following message:

    start       (Mon Jul 22 14:49:31 WEST 2019) with lapw0 (40/99 to go)

    cycle 1     (Mon Jul 22 14:49:31 WEST 2019)         (40/99 to go)

>   lapw0   -p  (14:49:31) starting parallel lapw0 at Mon Jul 22 14:49:31 WEST 2019
-------- .machine0 : 8 processors
0.058u 0.160s 0:03.50 6.0%      0+0k 48+344io 5pf+0w
>   lapw1  -up -p       (14:49:35) starting parallel lapw1 at Mon Jul 22 14:49:35 WEST 2019
->  starting parallel LAPW1 jobs at Mon Jul 22 14:49:35 WEST 2019
running LAPW1 in parallel mode (using .machines)
2 number_of_parallel_jobs
     ava01 ava01 ava01 ava01(8)      ava21 ava21 ava21 ava21(8)    Summary of lapw1para:
   ava01         k=8     user=0  wallclock=0
   ava21         k=16    user=0  wallclock=0
**  LAPW1 crashed!
0.164u 0.306s 0:03.82 12.0%     0+0k 112+648io 1pf+0w
error: command   /homes/fc-up201202493/WIEN2k_19.1/lapw1para -up uplapw1.def   failed

>   stop error

Inspecting the error files, I find that the error printed to uplapw1.error
is:

**  Error in Parallel LAPW1
**  LAPW1 STOPPED at Mon Jul 22 14:49:39 WEST 2019
**  check ERROR FILES!
 'INILPW' - can't open unit:  18

 'INILPW' -        filename: TiC.vspup

 'INILPW' -          status: old          form: formatted

 'LAPW1' - INILPW aborted unsuccessfully.
 'INILPW' - can't open unit:  18

 'INILPW' -        filename: TiC.vspup

 'INILPW' -          status: old          form: formatted

 'LAPW1' - INILPW aborted unsuccessfully.
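
For reference, the error says that lapw1 cannot open TiC.vspup (TiC.vsp in
a non-spin-polarized run). A check along the following lines would show
whether those files are actually present in the case directory once lapw0
has finished. This is only a minimal sketch: the function name
missing_potential_files is hypothetical and not part of Wien2k, and
treating an empty file as missing is my own assumption.

```python
import os


def missing_potential_files(case_dir, case, spin_polarized):
    """Return the names of potential files that are absent or empty.

    Hypothetical helper, not part of Wien2k. It only looks at the files
    named in the lapw1 error message: case.vspup/case.vspdn for a
    spin-polarized run, case.vsp otherwise.
    """
    suffixes = ["vspup", "vspdn"] if spin_polarized else ["vsp"]
    missing = []
    for suffix in suffixes:
        name = f"{case}.{suffix}"
        path = os.path.join(case_dir, name)
        # Assumption: an empty file is reported the same way as a missing one.
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(name)
    return missing
```

Called from inside the case directory as
missing_potential_files(".", "TiC", spin_polarized=True), it returns an
empty list when both spin-polarized potential files exist and are non-empty.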

In previous posts to the mailing list, this error message is often
attributed to running init_lapw for a non-spin-polarized case and then
using runsp_lapw. I should clarify that it also occurs when I attempt to
run a non-spin-polarized case; the only difference is that the file named
in the error message changes from TiC.vspup to TiC.vsp.

I should also point out, since it may be related, that serial runs have a
similar problem. If I first run a spin-polarized calculation in a folder,
then run init_lapw again for a non-spin-polarized case and start run_lapw,
I get the same "can't open unit: 18" error as above. The same happens in
the opposite order, i.e. a non-spin-polarized calculation followed by a
spin-polarized one in the same folder. The workaround I found was to
create a new folder, but since the error message also refers to
TiC.vsp/vspup I thought it worth mentioning.

Lastly, I should mention that I deleted the line
"15,'$file.tmp$updn', 'scratch','unformatted',0" from x_lapw. I previously
had an error in lapw2, reported elsewhere on the mailing list, which
Professor Blaha indicated could be solved by deleting that line (and
indeed it was). I have no idea whether this could be related to the issues
I am having now, but I felt it right to point it out.

Thanks in advance for any assistance that might be provided.

Best Regards,
Ricardo Moreira