[Wien] Parallel run problems with version 19.1
Ricardo Moreira
ricardopachecomoreira at gmail.com
Tue Jul 23 15:24:25 CEST 2019
Yes, the calculation was initialized with spin polarization. Running
"x lapw0" generates case.vspup and case.vspdn, and runsp_lapw runs without
issue until convergence is reached. The message that is shown is as
follows:
starting parallel lapw0 at Tue Jul 23 14:06:25 WEST 2019
-------- .machine0 : 2 processors
[1] 18397
/homes/fc-up201202493/WIEN2k_19.1/lapw0_mpi: error while loading shared libraries: libfftw3_mpi.so.3: cannot open shared object file: No such file or directory
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
/homes/fc-up201202493/WIEN2k_19.1/lapw0_mpi: error while loading shared libraries: libfftw3_mpi.so.3: cannot open shared object file: No such file or directory
[1] Exit 127 mpirun -np 2 -machinefile .machine0 /homes/fc-up201202493/WIEN2k_19.1/lapw0_mpi lapw0.def >> .time00
0.059u 0.133s 0:03.36 5.3% 0+0k 1312+240io 6pf+0w
I looked at the lib folder for fftw and the file is definitely there, so
I'm not sure what the cause of this would be.
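A file being present in the fftw lib folder does not by itself mean the
dynamic loader searches that folder when lapw0_mpi starts. The sketch below
checks one possible explanation, under the assumption (not confirmed anywhere
in this thread) that the fftw lib directory is missing from LD_LIBRARY_PATH
in the environment where mpirun starts lapw0_mpi. The function name
find_in_ld_path is hypothetical and not part of WIEN2k; it is written for a
POSIX sh or bash session.

```shell
# Hypothetical helper (not part of WIEN2k): print the first location on
# LD_LIBRARY_PATH that contains the named shared library; fail if none does.
find_in_ld_path() {
    lib="$1"
    old_ifs="$IFS"
    IFS=:                                  # split the path list on colons
    for dir in ${LD_LIBRARY_PATH:-}; do
        if [ -n "$dir" ] && [ -e "$dir/$lib" ]; then
            IFS="$old_ifs"
            echo "$dir/$lib"
            return 0
        fi
    done
    IFS="$old_ifs"
    return 1                               # library not visible via LD_LIBRARY_PATH
}

# Example call, commented out because the result depends on the machine:
# find_in_ld_path libfftw3_mpi.so.3 || echo "not on LD_LIBRARY_PATH"
```

On the compute node itself, running ldd on
/homes/fc-up201202493/WIEN2k_19.1/lapw0_mpi lists every shared library the
binary needs and marks any the loader cannot resolve as "not found", which
would confirm or rule out this assumption.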
As for Professor Blaha's questions, I shall attempt to answer in order:
1) Yes it does work
2) The .machines file is:
#
1:ava18:1
1:ava18:1
lapw0:ava18:2
granularity:1
extrafine:1
3) "ls -als *output00*" reports that there is no such file or directory, but
there is a file called TiC.output0, so I'll assume this is the file of
interest here. The output of "ls -als TiC.output0" is "68 -rw-r--r-- 1
fc-up201202493 cfp 65791 Jul 22 19:59 TiC.output0".
4) The end of TiC.output0 has the following:
=====>>> CPU-TIME SUMMARY
TOTAL CPU/WALL-TIME USED :  2.8  100. PERCENT  2.8  100. PERCENT
TIME MULTIPOLMOMENTS     :  0.0    1. PERCENT  0.0    1. PERCENT
TIME COULOMB POT INT     :  0.0    0. PERCENT  0.0    0. PERCENT
TIME COULOMB POT RMT     :  0.0    0. PERCENT  0.0    0. PERCENT
TIME COULOMB POT SPH     :  0.0    1. PERCENT  0.0    1. PERCENT
TIME XCPOT SPHERES       :  1.8   64. PERCENT  1.8   63. PERCENT
TIME XCPOT INTERST       :  0.8   29. PERCENT  0.8   29. PERCENT
TIME TOTAL ENERGY        :  0.1    2. PERCENT  0.1    2. PERCENT
TIME REAN0, REAN3        :  0.1    0. PERCENT  0.1    0. PERCENT
TIME REANALYSE           :  0.0    2. PERCENT  0.1    2. PERCENT
(the spacings are a bit off compared to what shows up on the actual file).
Lastly, regarding fftw-mpi: I had to update the GNU compilers I was
previously using for version 18.2, as ./siteconfig_lapw deemed them too
old. I therefore compiled a new version of OpenMPI with the newer
compilers. I wasn't sure whether I had done the same for fftw, so I
recompiled fftw and then recompiled WIEN2k version 19.1 afterwards, but the
error persists, so this does not seem to be the cause of it.
> On Mon, 22 Jul 2019 at 19:54, Peter Blaha <pblaha at theochem.tuwien.ac.at>
> wrote:
>
>> Please:
>> 1) does x lapw0 work ???
>> 2) list your .machines file. In particular: for TiC use only 2 cores
>> (because of 2 atoms)
>> 3) ls -als *output00*
>> 4) what is at the end of *.output0000 ??? Please check for any errors.
>>
>> Is your fftw-mpi compiled with the same compiler as wien2k ??
>>
>>
>> Am 22.07.2019 um 20:45 schrieb Ricardo Moreira:
>> > I had it at 4 as per the default value suggested during configuration
>> > but I changed it to 1 now. In spite of that, "x lapw0 -p" still did not
>> > generate case.vspup or case.vspdn.
>> >
>> > On Mon, 22 Jul 2019 at 19:01, <tran at theochem.tuwien.ac.at> wrote:
>> >
>> > Do you have the variable OMP_NUM_THREADS set in your .bashrc or
>> > .cshrc file? If yes and the value is greater than 1, then set it to
>> > 1 and execute "x lapw0 -p" again.
>> >
>> > On Monday 2019-07-22 18:39, Ricardo Moreira wrote:
>> >
>> > >Date: Mon, 22 Jul 2019 18:39:45
>> > >From: Ricardo Moreira <ricardopachecomoreira at gmail.com>
>> > >Reply-To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>> > >To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>> > >Subject: Re: [Wien] Parallel run problems with version 19.1
>> > >
>> > >That is indeed the case, neither case.vspup nor case.vspdn were
>> > >generated after running "x lapw0 -p".
>> > >
>> > >On Mon, 22 Jul 2019 at 17:09, <tran at theochem.tuwien.ac.at> wrote:
>> > > It seems that lapw0 does not generate case.vspup and
>> > > case.vspdn (and case.vsp for non-spin-polarized calculation).
>> > > Can you confirm that by executing "x lapw0 -p" on the command
>> > > line?
>> > >
>> > > On Monday 2019-07-22 17:45, Ricardo Moreira wrote:
>> > >
>> > > >Date: Mon, 22 Jul 2019 17:45:51
>> > > >From: Ricardo Moreira <ricardopachecomoreira at gmail.com>
>> > > >Reply-To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>> > > >To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>> > > >Subject: Re: [Wien] Parallel run problems with version 19.1
>> > > >
>> > > >The command "ls *vsp*" returns only the files "TiC.vspdn_st" and
>> > > >"TiC.vsp_st", so it would appear that the file is not created at
>> > > >all when using the -p switch to runsp_lapw.
>> > > >
>> > > >On Mon, 22 Jul 2019 at 16:29, <tran at theochem.tuwien.ac.at> wrote:
>> > > > Is the file TiC.vspup empty?
>> > > >
>> > > > On Monday 2019-07-22 17:24, Ricardo Moreira wrote:
>> > > >
>> > > > >Date: Mon, 22 Jul 2019 17:24:42
>> > > > >From: Ricardo Moreira <ricardopachecomoreira at gmail.com>
>> > > > >Reply-To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>> > > > >To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
>> > > > >Subject: Re: [Wien] Parallel run problems with version 19.1
>> > > > >
>> > > > >Hi and thanks for the reply,
>> > > > >Regarding serial calculations, yes in both non spin-polarized
>> > > > >and spin-polarized everything runs properly in the cases you
>> > > > >described. As for parallel, it fails in both cases, with the
>> > > > >error I indicated in my previous email.
>> > > > >
>> > > > >Best Regards,
>> > > > >Ricardo Moreira
>>
>>