[Wien] Parallel run problems with version 19.1

Ricardo Moreira ricardopachecomoreira at gmail.com
Tue Jul 23 15:24:25 CEST 2019


Yes, the calculation was initialized with spin-polarization: x lapw0
generates case.vspup and case.vspdn, and runsp_lapw runs without issue until
convergence is reached. The message shown is as follows:

starting parallel lapw0 at Tue Jul 23 14:06:25 WEST 2019
-------- .machine0 : 2 processors
[1] 18397
/homes/fc-up201202493/WIEN2k_19.1/lapw0_mpi: error while loading shared libraries: libfftw3_mpi.so.3: cannot open shared object file: No such file or directory
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
/homes/fc-up201202493/WIEN2k_19.1/lapw0_mpi: error while loading shared libraries: libfftw3_mpi.so.3: cannot open shared object file: No such file or directory
[1]    Exit 127                      mpirun -np 2 -machinefile .machine0 /homes/fc-up201202493/WIEN2k_19.1/lapw0_mpi lapw0.def >> .time00
0.059u 0.133s 0:03.36 5.3%      0+0k 1312+240io 6pf+0w

I looked at the lib folder for fftw and the file is definitely there, so
I'm not sure what the cause of this would be.
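
For reference, this loader error generally means that the runtime linker could not find the library on its search path, which can happen even when the file exists on disk. Below is a minimal sketch of how this could be checked, assuming Python 3.7+ and the standard ldd tool are available on the node. The function names are hypothetical helpers written for illustration and are not part of WIEN2k.

```python
import os
import subprocess


def missing_libraries(ldd_output):
    """Return the names of shared libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=>" in line and "not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def dirs_containing(lib_name, search_path):
    """Return the directories of a colon-separated path that contain lib_name."""
    return [d for d in search_path.split(":")
            if d and os.path.isfile(os.path.join(d, lib_name))]


def report(binary):
    """Print, for each unresolved library of `binary`, where it was found (if anywhere)."""
    # ldd lists the shared libraries a binary needs and how each one resolves.
    out = subprocess.run(["ldd", binary], capture_output=True, text=True).stdout
    search_path = os.environ.get("LD_LIBRARY_PATH", "")
    for lib in missing_libraries(out):
        hits = dirs_containing(lib, search_path)
        where = ", ".join(hits) if hits else "no LD_LIBRARY_PATH directory"
        print(lib + ": found in " + where)
```

Calling report() with the path to lapw0_mpi would list any unresolved libraries. If the directory holding libfftw3_mpi.so.3 is not in LD_LIBRARY_PATH, adding it in the shell startup file (.bashrc or .cshrc) is one thing to try, since that is the environment the processes started by mpirun are likely to see.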

As for Professor Blaha's questions, I shall attempt to answer in order:
1) Yes, it does work.
2) The .machines file is:
#
1:ava18:1
1:ava18:1
lapw0:ava18:2
granularity:1
extrafine:1
3) ls -als *output00* returns that there is no such file or directory, but
there is a file called TiC.output0, so I'll assume this is the file of
interest here. The output of ls -als TiC.output0 is "68 -rw-r--r-- 1
fc-up201202493 cfp 65791 Jul 22 19:59 TiC.output0".
4) The end of TiC.output0 has the following:

   =====>>> CPU-TIME SUMMARY
            TOTAL CPU/WALL-TIME USED :     2.8     100. PERCENT    2.8     100. PERCENT
            TIME MULTIPOLMOMENTS     :     0.0       1. PERCENT    0.0       1. PERCENT
            TIME COULOMB POT INT     :     0.0       0. PERCENT    0.0       0. PERCENT
            TIME COULOMB POT RMT     :     0.0       0. PERCENT    0.0       0. PERCENT
            TIME COULOMB POT SPH     :     0.0       1. PERCENT    0.0       1. PERCENT
            TIME XCPOT SPHERES       :     1.8      64. PERCENT    1.8      63. PERCENT
            TIME XCPOT INTERST       :     0.8      29. PERCENT    0.8      29. PERCENT
            TIME TOTAL ENERGY        :     0.1       2. PERCENT    0.1       2. PERCENT
            TIME REAN0, REAN3        :     0.1       0. PERCENT    0.1       0. PERCENT
            TIME REANALYSE           :     0.0       2. PERCENT    0.1       2. PERCENT

(the spacings are a bit off compared to what shows up on the actual file).

Lastly, regarding fftw-mpi: I had to update the GNU compilers I was
previously using for version 18.2, as ./siteconfig_lapw deemed them too old.
As such, I compiled a new version of OpenMPI with the newer compilers. I
wasn't sure whether I had done the same for fftw, so I recompiled fftw and
then recompiled WIEN2k version 19.1 afterwards, but the error persists, so
this does not seem to be the cause of it.


> On Mon, 22 Jul 2019 at 19:54, Peter Blaha <pblaha at theochem.tuwien.ac.at>
> wrote:
>
>> Please:
>> 1) does   x lapw0   work ???
>> 2) list your .machines file. In particular: for TiC use only 2 cores
>> (because of 2 atoms)
>> 3) ls -als *output00*
>> 4) what is at the end of *.output0000  ??? Please check for any errors.
>>
>> Is your fftw-mpi compiled with the same compiler as wien2k ??
>>
>>
>> Am 22.07.2019 um 20:45 schrieb Ricardo Moreira:
>> > I had it at 4 as per the default value suggested during configuration
>> > but I changed it to 1 now. In spite of that, "x lapw0 -p" still did not
>> > generate case.vspup or case.vspdn.
>> >
>> > On Mon, 22 Jul 2019 at 19:01, <tran at theochem.tuwien.ac.at
>> > <mailto:tran at theochem.tuwien.ac.at>> wrote:
>> >
>> >     Do you have the variable OMP_NUM_THREADS set in your .bashrc or
>> >     .cshrc file? If yes and the value is greater than 1, then set it to 1
>> >     and execute again "x lapw0 -p".
>> >
>> >     On Monday 2019-07-22 18:39, Ricardo Moreira wrote:
>> >
>> >      >Date: Mon, 22 Jul 2019 18:39:45
>> >      >From: Ricardo Moreira <ricardopachecomoreira at gmail.com
>> >     <mailto:ricardopachecomoreira at gmail.com>>
>> >      >Reply-To: A Mailing list for WIEN2k users
>> >     <wien at zeus.theochem.tuwien.ac.at
>> >     <mailto:wien at zeus.theochem.tuwien.ac.at>>
>> >      >To: A Mailing list for WIEN2k users
>> >     <wien at zeus.theochem.tuwien.ac.at
>> >     <mailto:wien at zeus.theochem.tuwien.ac.at>>
>> >      >Subject: Re: [Wien] Parallel run problems with version 19.1
>> >      >
>> >      >That is indeed the case, neither case.vspup nor case.vspdn was
>> >      >generated after running "x lapw0 -p".
>> >      >
>> >      >On Mon, 22 Jul 2019 at 17:09, <tran at theochem.tuwien.ac.at
>> >     <mailto:tran at theochem.tuwien.ac.at>> wrote:
>> >      >      It seems that lapw0 does not generate case.vspup and
>> >      >      case.vspdn (and case.vsp for non-spin-polarized calculation).
>> >      >      Can you confirm that by executing "x lapw0 -p" on the command
>> >      >      line?
>> >      >
>> >      >      On Monday 2019-07-22 17:45, Ricardo Moreira wrote:
>> >      >
>> >      >      >Date: Mon, 22 Jul 2019 17:45:51
>> >      >      >From: Ricardo Moreira <ricardopachecomoreira at gmail.com
>> >     <mailto:ricardopachecomoreira at gmail.com>>
>> >      >      >Reply-To: A Mailing list for WIEN2k users
>> >     <wien at zeus.theochem.tuwien.ac.at
>> >     <mailto:wien at zeus.theochem.tuwien.ac.at>>
>> >      >      >To: A Mailing list for WIEN2k users
>> >     <wien at zeus.theochem.tuwien.ac.at
>> >     <mailto:wien at zeus.theochem.tuwien.ac.at>>
>> >      >      >Subject: Re: [Wien] Parallel run problems with version 19.1
>> >      >      >
>> >      >      >The command "ls *vsp*" returns only the files "TiC.vspdn_st" and
>> >      >      >"TiC.vsp_st", so it would appear that the file is not created at
>> >      >      >all when using the -p switch to runsp_lapw.
>> >      >      >
>> >      >      >On Mon, 22 Jul 2019 at 16:29, <tran at theochem.tuwien.ac.at
>> >     <mailto:tran at theochem.tuwien.ac.at>> wrote:
>> >      >      >      Is the file TiC.vspup empty?
>> >      >      >
>> >      >      >      On Monday 2019-07-22 17:24, Ricardo Moreira wrote:
>> >      >      >
>> >      >      >      >Date: Mon, 22 Jul 2019 17:24:42
>> >      >      >      >From: Ricardo Moreira
>> >     <ricardopachecomoreira at gmail.com
>> >     <mailto:ricardopachecomoreira at gmail.com>>
>> >      >      >      >Reply-To: A Mailing list for WIEN2k users
>> >      >      >      <wien at zeus.theochem.tuwien.ac.at
>> >     <mailto:wien at zeus.theochem.tuwien.ac.at>>
>> >      >      >      >To: A Mailing list for WIEN2k users
>> >      >      >      <wien at zeus.theochem.tuwien.ac.at
>> >     <mailto:wien at zeus.theochem.tuwien.ac.at>>
>> >      >      >      >Subject: Re: [Wien] Parallel run problems with
>> >     version 19.1
>> >      >      >      >
>> >      >      >      >Hi and thanks for the reply,
>> >      >      >      >Regarding serial calculations, yes in both non spin-polarized
>> >      >      >      >and spin-polarized everything runs properly in the cases you
>> >      >      >      >described. As for parallel, it fails in both cases, with the
>> >      >      >      >error I indicated in my previous email.
>> >      >      >      >
>> >      >      >      >Best Regards,
>> >      >      >      >Ricardo Moreira
>>
>>