[Wien] Parallel run problems with version 19.1

tran at theochem.tuwien.ac.at tran at theochem.tuwien.ac.at
Tue Jul 23 15:37:28 CEST 2019


Are you sure that libfftw3_mpi.so.3 is really there?
The directory where it is expected is given in the Makefile of SRC_lapw0
(the path is FFTWROOT combined with FFTW_LIB).
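
One way to check this from the command line is to ask the dynamic loader which libraries of lapw0_mpi it cannot resolve. The lines below are only a sketch for sh/bash: missing_libs is a hypothetical helper name, the binary path is copied from the quoted error message, and WIENROOT is assumed to point at the WIEN2k installation.

```shell
# Hypothetical helper: print the shared libraries of a binary that the
# dynamic loader cannot resolve (ldd marks such entries "not found").
missing_libs() {
    ldd "$1" 2>/dev/null | awk '/not found/ { print $1 }'
}

# Path copied from the quoted error message; adjust to the local installation.
bin=/homes/fc-up201202493/WIEN2k_19.1/lapw0_mpi
if [ -x "$bin" ]; then
    missing_libs "$bin"
fi

# Show the FFTW-related lines of the lapw0 Makefile (assumes WIENROOT is set).
if [ -f "${WIENROOT:-}/SRC_lapw0/Makefile" ]; then
    grep -n FFTW "$WIENROOT/SRC_lapw0/Makefile"
fi
```

If libfftw3_mpi.so.3 appears in the output of missing_libs, the loader does not see it, even if the file exists in the directory named in the Makefile.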

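If the file exists but is still not found at run time, a common cause is that its directory is not in the loader search path (LD_LIBRARY_PATH) of the shell that mpirun starts. The following is a minimal sketch for sh/bash, assuming the library sits in a hypothetical $HOME/fftw/lib; prepend_ld_path is an illustrative helper, not part of WIEN2k. With csh/tcsh the equivalent would be a setenv LD_LIBRARY_PATH line in .cshrc.

```shell
# Hypothetical helper: prepend a directory to LD_LIBRARY_PATH exactly once.
prepend_ld_path() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) ;;  # already listed, nothing to do
        *) LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

# Assumed location; replace with FFTWROOT combined with FFTW_LIB from the Makefile.
prepend_ld_path "$HOME/fftw/lib"
```

Putting such a line in .bashrc (rather than typing it interactively) matters because the MPI processes are started in new shells.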

On Tuesday 2019-07-23 15:24, Ricardo Moreira wrote:

>Date: Tue, 23 Jul 2019 15:24:25
>From: Ricardo Moreira <ricardopachecomoreira at gmail.com>
>Reply-To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>Subject: Re: [Wien] Parallel run problems with version 19.1
>
>Yes, the calculation was initialized with spin-polarization, x lapw0 generates case.vspup and case.vspdn and runsp_lapw runs without issue until
>convergence is reached. Regarding the message that is shown, it is as follows:
>
>starting parallel lapw0 at Tue Jul 23 14:06:25 WEST 2019
>-------- .machine0 : 2 processors
>[1] 18397
>/homes/fc-up201202493/WIEN2k_19.1/lapw0_mpi: error while loading shared libraries: libfftw3_mpi.so.3: cannot open shared object file: No such file or
>directory
>--------------------------------------------------------------------------
>Primary job  terminated normally, but 1 process returned
>a non-zero exit code. Per user-direction, the job has been aborted.
>--------------------------------------------------------------------------
>/homes/fc-up201202493/WIEN2k_19.1/lapw0_mpi: error while loading shared libraries: libfftw3_mpi.so.3: cannot open shared object file: No such file or
>directory
>[1]    Exit 127                      mpirun -np 2 -machinefile .machine0 /homes/fc-up201202493/WIEN2k_19.1/lapw0_mpi lapw0.def >> .time00
>0.059u 0.133s 0:03.36 5.3%      0+0k 1312+240io 6pf+0w
>
>I looked at the lib folder for fftw and the file is definitely there, so I'm not sure what the cause of this would be.
>
>As for Professor Blaha's questions, I shall attempt to answer in order:
>1) Yes it does work
>2)The .machines file is:
>#
>1:ava18:1
>1:ava18:1
>lapw0:ava18:2
>granularity:1
>extrafine:1
>3) ls -als *output00* returns that there is no such file or directory but there is a file called TiC.output0 so I'll assume this is the file of
>interest here. The output of ls -als TiC.output0 is "68 -rw-r--r-- 1 fc-up201202493 cfp 65791 Jul 22 19:59 TiC.output0."
>4) The end of TiC.output0 has the following:
>
   =====>>> CPU-TIME SUMMARY
            TOTAL CPU/WALL-TIME USED :   2.8   100. PERCENT   2.8   100. PERCENT
            TIME MULTIPOLMOMENTS:        0.0     1. PERCENT   0.0     1. PERCENT
            TIME COULOMB POT INT:        0.0     0. PERCENT   0.0     0. PERCENT
            TIME COULOMB POT RMT:        0.0     0. PERCENT   0.0     0. PERCENT
            TIME COULOMB POT SPH:        0.0     1. PERCENT   0.0     1. PERCENT
            TIME XCPOT SPHERES  :        1.8    64. PERCENT   1.8    63. PERCENT
            TIME XCPOT INTERST  :        0.8    29. PERCENT   0.8    29. PERCENT
            TIME TOTAL ENERGY   :        0.1     2. PERCENT   0.1     2. PERCENT
            TIME REAN0, REAN3   :        0.1     0. PERCENT   0.1     0. PERCENT
            TIME REANALYSE      :        0.0     2. PERCENT   0.1     2. PERCENT
>
>(the spacings are a bit off compared to what shows up on the actual file).
>
>Lastly regarding fftw-mpi. I had to update the GNU compilers I was previously using for version 18.2 as they were deemed to be too old a version by
>./siteconfig_lapw. As such I compiled a new version of OpenMPI with the newer version of the compilers. I wasn't sure if I had done the same for fftw
>so I went and recompiled fftw and then recompiled Wien2k version 19.1 afterwards, but the error persists, so this does not seem to be the cause of
>it.
> 
>      On Mon, 22 Jul 2019 at 19:54, Peter Blaha <pblaha at theochem.tuwien.ac.at> wrote:
>      Please:
>      1) does   x lapw0   work ???
>      2) list your .machines file. In particular: for TiC use only 2 cores
>      (because of 2 atoms)
>      3) ls -als *output00*
>      4) what is at the end of *.output0000  ??? Please check for any errors.
>
>      Is your fftw-mpi compiled with the same compiler as wien2k ??
>
>
>      On 22.07.2019 at 20:45, Ricardo Moreira wrote:
>      > I had it at 4 as per the default value suggested during configuration
>      > but I changed it to 1 now. In spite of that, "x lapw0 -p" still did not
>      > generate case.vspup or case.vspdn.
>      >
>      > On Mon, 22 Jul 2019 at 19:01, <tran at theochem.tuwien.ac.at> wrote:
>      >
>      >     Do you have the variable OMP_NUM_THREADS set in your .bashrc or .cshrc
>      >     file? If yes and the value is greater than 1, then set it to 1 and
>      >     execute again "x lapw0 -p".
>      >
>      >     On Monday 2019-07-22 18:39, Ricardo Moreira wrote:
>      >
>      >      >Date: Mon, 22 Jul 2019 18:39:45
>      >      >From: Ricardo Moreira <ricardopachecomoreira at gmail.com>
>      >      >Reply-To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>      >      >To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>      >      >Subject: Re: [Wien] Parallel run problems with version 19.1
>      >      >
>      >      >That is indeed the case, neither case.vspup nor case.vspdn was
>      >      >generated after running "x lapw0 -p".
>      >      >
>      >      >On Mon, 22 Jul 2019 at 17:09, <tran at theochem.tuwien.ac.at> wrote:
>      >      >      It seems that lapw0 does not generate case.vspup and
>      >      >      case.vspdn (and case.vsp for non-spin-polarized calculation).
>      >      >      Can you confirm that by executing "x lapw0 -p" on the command
>      >      >      line?
>      >      >
>      >      >      On Monday 2019-07-22 17:45, Ricardo Moreira wrote:
>      >      >
>      >      >      >Date: Mon, 22 Jul 2019 17:45:51
>      >      >      >From: Ricardo Moreira <ricardopachecomoreira at gmail.com>
>      >      >      >Reply-To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>      >      >      >To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>      >      >      >Subject: Re: [Wien] Parallel run problems with version 19.1
>      >      >      >
>      >      >      >The command "ls *vsp*" returns only the files "TiC.vspdn_st" and
>      >      >      >"TiC.vsp_st", so it would appear that the file is not created at all
>      >      >      >when using the -p switch to runsp_lapw.
>      >      >      >
>      >      >      >On Mon, 22 Jul 2019 at 16:29, <tran at theochem.tuwien.ac.at> wrote:
>      >      >      >      Is the file TiC.vspup empty?
>      >      >      >
>      >      >      >      On Monday 2019-07-22 17:24, Ricardo Moreira wrote:
>      >      >      >
>      >      >      >      >Date: Mon, 22 Jul 2019 17:24:42
>      >      >      >      >From: Ricardo Moreira <ricardopachecomoreira at gmail.com>
>      >      >      >      >Reply-To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>      >      >      >      >To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>      >      >      >      >Subject: Re: [Wien] Parallel run problems with version 19.1
>      >      >      >      >
>      >      >      >      >Hi and thanks for the reply,
>      >      >      >      >Regarding serial calculations, yes, in both non-spin-polarized
>      >      >      >      >and spin-polarized everything runs properly in the cases you
>      >      >      >      >described. As for parallel, it fails in both cases, with the
>      >      >      >      >error I indicated in my previous email.
>      >      >      >      >
>      >      >      >      >Best Regards,
>      >      >      >      >Ricardo Moreira
>
>
>

