[Wien] Compilation issues for lapw0 v.13 on Mac OS X (request for code review)

Sat Aug 16 00:20:52 CEST 2014

Dear WIEN2k community,

I have identified some issues in v.13 of the SRC_lapw0 code that prevented
me from compiling a working Mac OS X version.  I am also submitting a
solution.  The details are below my signature.  After the code edits, I can
compile and run lapw0 correctly using the recommended compilation settings
(no additional flags like -O0 or -fp-model strict).

The issue is somewhat sloppy code, legal in Fortran but vulnerable to
compiler trip-ups, in the "drho" section of lapw0.  It is quite possible
that these issues do not occur on Linux, as memory is managed differently
on different operating systems (e.g. whether a variable ends up on the
stack or the heap, depending on how it is allocated).  Nevertheless it
seems useful to report them.  They occur for me with both gfortran and
ifort, indicating it's not just Intel goofiness.  I would be grateful for
feedback from the developers or the community.  I'm happy to send source
files; I didn't include them now as I'm not sure the mailing list accepts
attachments.

Finally, are total energies expected to be identical between wien2k_12 and
wien2k_13?  Or are small differences ~10^-4 or ~10^-5 normal?

Cheers,

Kevin Jorissen

Details:
OS X 10.9.4
ifort 14.0.1 or ifort 14.0.2 or gfortran 4.9.0

Test:
* Standard "run_lapw" for HOPG graphite.
* The test is passed if the calculation:  a/ does not crash ; b/ converges
to approximately the same total energy (grep :ENE case.scf) as a wien2k-12
calculation (compiled with the same compiler as wien2k-13 and with "safe"
options (no optimization)).  Initialization is done with that same
wien2k-12 version, i.e. I test only the "SCF" programs.

Outcome:
* The test did not pass.  The wien2k-13 calculation fails almost always
with a seclr4 error in lapw1, which follows NaN values in case.scf0 and
case.output0, so the problem is in lapw0.  case.scf0 points to the xcpot1
section, and wien2k.at reports updates to the "drho" code in v.13.  The
crash occurs at a different SCF iteration in each run, indicating an
"instability", i.e. numerical noise or memory issues.
* Trying different configurations didn't help.  E.g. "safe" compilation
options (-O0 -fp-model strict ...) ; e.g. swapping ifort for gfortran ;
trying different fftw libraries ...

Solution (after tedious debugging ...):
Several small code changes in drho.f and dergl.f to correct array size
issues.

DRHO.F
* The array "r" passed to dergl has size "nrad" (allocated elsewhere), while
dergl expects an array of size "n=jri(jatom)".  E.g. 881 vs. 781 for my
test.
Solution (restricting the argument in the call itself, i.e.
call dergl(...r(1:n)...), would also work):

      real*8 ::  rlocal(n)

      rlocal(:)=0.d0
      do i=1,min(size(r),n)
         rlocal(i) = r(i)
      enddo

!        call dergl(n,r,c,g,even,g0)
!        call dergl(n,r,g,g2,.not.even,g0)
        call dergl(n,rlocal,c,g,even,g0)
        call dergl(n,rlocal,g,g2,.not.even,g0)
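
The alternative mentioned above, passing only the section r(1:n), can be
sketched as follows.  This is a minimal standalone illustration, not the
WIEN2k code: "sum_mesh" is a hypothetical stand-in for dergl, and the mesh
values are made up.  Only the sizes 881 and 781 come from my test.

```fortran
program section_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: nrad = 881   ! allocated size of r (from my test)
  integer, parameter :: n    = 781   ! points actually used, n=jri(jatom)
  real(dp) :: r(nrad), total
  integer  :: i

  ! Made-up radial mesh, for illustration only.
  do i = 1, nrad
     r(i) = 1.0d-4 * i
  end do

  ! Pass exactly n elements, so the actual argument has the same extent
  ! as the explicit-shape dummy array in the callee.
  call sum_mesh(n, r(1:n), total)
  print *, 'sum over the first n mesh points =', total

contains

  ! Hypothetical stand-in for dergl: it declares its array argument
  ! with explicit shape r(n), as described for dergl above.
  subroutine sum_mesh(n, r, total)
    integer,  intent(in)  :: n
    real(dp), intent(in)  :: r(n)
    real(dp), intent(out) :: total
    total = sum(r)
  end subroutine sum_mesh

end program section_demo
```

Compared with the rlocal copy, the section avoids an extra local array, at
the cost of touching the call statements directly.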

DERGL.F
* In calls to Fornberg, the dummy array "A(0:Ninp)" is fed from the larger
array "RW" (or "R") by passing only a starting element.  Solution: pass
the appropriate subarray of size Ninp+1 instead:

!                call  Fornberg(Ninp,Minp,RW(J),RW(J-N1),D)
                call  Fornberg(Ninp,Minp,RW(J),RW(J-N1:J-N1+Ninp),D)

!                call  Fornberg(Ninp,Minp,R(J),R(JL),D)
                call  Fornberg(Ninp,Minp,R(J),R(JL:JL+Ninp),D)

!      call  Fornberg(Ninp,Minp,0.0D0,RW(-N1),D)
      call  Fornberg(Ninp,Minp,0.0D0,RW(-N1:-N1+Ninp),D)
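
To make the pattern concrete, here is a small standalone sketch of passing
an explicit window of Ninp+1 elements to a routine whose dummy array is
declared A(0:Ninp).  "show_window" is a hypothetical stand-in for Fornberg,
and the array bounds and the values of ninp, n1 and j are arbitrary choices
for illustration.

```fortran
program window_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: ninp = 4, n1 = 2   ! illustrative values only
  real(dp) :: rw(-5:10)                    ! assumed bounds, not from dergl.f
  integer  :: i, j

  ! Fill rw so that each element equals its own index.
  do i = lbound(rw, 1), ubound(rw, 1)
     rw(i) = real(i, dp)
  end do

  j = 3
  ! Explicit section: exactly ninp+1 elements starting at rw(j-n1),
  ! instead of passing the single element rw(j-n1) as a starting point.
  call show_window(ninp, rw(j-n1:j-n1+ninp))

contains

  ! Hypothetical stand-in for Fornberg: the dummy array is a(0:ninp).
  subroutine show_window(ninp, a)
    integer,  intent(in) :: ninp
    real(dp), intent(in) :: a(0:ninp)
    print *, 'elements received:', size(a)
    print *, 'a(0), a(ninp)    :', a(0), a(ninp)
  end subroutine show_window

end program window_demo
```

With these values the window is rw(1:5), so the routine receives 5 elements
with a(0)=1 and a(ninp)=5.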

* In subroutine dergl_spline, rw is not initialized correctly for the
"even" case: the lower point rw(-n+1) is uninitialized and could be
anything.  Solution:

        else
                do J=1,n-1
                        rw(-J+1)=-r(j)
                        w1(-J+1)= f(J)
                enddo
                rw(-n+1)=-r(n) ! initialize this field!
                w1(-n+1)=f(n)  ! Technically this is redundant (see w1=0 above).
        endif
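
The missed element can be reproduced in isolation.  In the sketch below the
array bounds rw(-n+1:n), the positive half of rw, the mesh values and the
sentinel are my assumptions for illustration; only the loop and the added
assignment mirror the dergl_spline excerpt above.

```fortran
program mirror_demo
  implicit none
  integer,  parameter :: dp = kind(1.0d0)
  integer,  parameter :: n  = 5                  ! small mesh for illustration
  real(dp), parameter :: sentinel = -999.0_dp    ! marks "never written"
  real(dp) :: r(n), rw(-n+1:n)                   ! assumed bounds
  integer  :: j

  ! Made-up mesh values.
  do j = 1, n
     r(j) = 0.1_dp * j
  end do

  rw = sentinel
  rw(1:n) = r            ! assumed positive half of the mirrored mesh

  ! Same loop as in the excerpt: it writes indices 0 down to -n+2 only.
  do j = 1, n-1
     rw(-j+1) = -r(j)
  end do
  print *, 'rw(-n+1) untouched after loop: ', rw(-n+1) == sentinel

  ! The added line from the fix fills the last mirrored point.
  rw(-n+1) = -r(n)
  print *, 'rw(-n+1) after the fix       : ', rw(-n+1)

end program mirror_demo
```

In the real code there is no sentinel, so rw(-n+1) simply holds whatever
was in memory, which would fit the run-to-run variation I saw.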

PS xcpot1.f has an unrelated issue with sigmamaxup and irsigmaup, which
appear to be used without being initialized.  I don't remember the details
now, and it doesn't seem to hinder the program, but it popped up during a
runtime check at some point.  I believe it was triggered by a WRITE
statement containing r(irsigmaup) when irsigmaup can be uninitialized.

TEST SCRIPT:

~/science/graphite13-opt-test% more ../MyWienTest
#!/bin/tcsh -f
echo Running test in `pwd`
echo WIENROOT is $WIENROOT
rm -f *scf* *clm* *broy* *tmp* *vec* *def *error :log
x dstart -d
~/science/wien2k-12-noopt/dstart dstart.def
run -ec 0.000001
exit 0