[Wien] Possible bugs from use of uninitialized variables

Laurence Marks L-marks at northwestern.edu
Mon Oct 14 18:00:57 CEST 2013


Thanks. The line
       PM1=min(1.D0,PM1)
should be commented out, it is a legacy of when I moved some code
around and does nothing.

N.B., A=0.D0 is the same as A=(0.D0,0.D0) -- there is an implicit
conversion at least for ifort and gfortran (and I believe always),
e.g. test

        complex a
        a=(0.d0,1.d0)
        write(6,*)a
        a=0.D0
        write(6,*)a
        end

On Mon, Oct 14, 2013 at 10:43 AM, Pavel Ondračka
<pavel.ondracka at email.cz> wrote:
> Laurence Marks píše v Po 14. 10. 2013 v 08:46 -0500:
>> I was travelling so did not have access to the source. You are correct
>> and it is technically a bug which apparently neither -ftrapuv nor
>> gfortran picked up. I have added
>>
>>                 ROKMIX = 0.D0 ; ROKSC = 0.D0 ; ROKVL = 0.D0 ; ROKOLD = 0.D0
> This is probably just a typo on your side, however ROKMIX = 0.D0 will
> initialize just the real part... ROKMIX = (0.D0, 0.D0) is needed (at
> lest on gfortran and xlf90)
>>                 KVECVL = 0 ; KVECSC = 0 ; KVCOLD = 0
>>
>> after the allocation, and at some stage will add comparable to other
>> allocations.
>>
>> That said, it should have no effect on the performance of mixer, and
>> if it does then there is a serious bug somewhere else. The test on
>> line 1504 of your version is just for some output of how the PW's are
>> changing for monitoring as Peter has noticed that converging these can
>> sometimes be tricky. There are a few other similar traps just for
>> cleaning things up.
> I didn't saw any actual problems except the crash in f7splt, I was just
> being extra cautious, since I imagine the AIX 5.3 build doesn't see much
> testing.
>>
>> Can you please send the lines for LimitDMIX8n.F in your version since
>> it is probably different from mine and I do not see anything relevant
>> at line 135.
> LimitDMIX8n.F:135    PM1=min(1.D0,PM1)
> attached is the stock LimitDMIX8n.F from WIEN2k_13.1 version
>
>> N.B., concerning valgrind and mixer (for instance) not deallocating
>> everything. There is a routine called "cleaner.f" which is in
>> SRC_mixer for this purpose. However, with a multiply branching code it
>> is very, very, very hard to trap everything and I have ended up hoping
>> that the compiler does what it is supposed to. It is also very hard to
>> protect for all possible systems and compilers particularly if I have
>> no access to them. And.....mpi is much worse.
> It wasn't actually leaking that much at all, just the small leak in
> lapw1 (not counting the allocated memory still accessible on exit)
>
> Basically even though my platform of interest is AIX 5.3, the testing
> with valgrind was done on linux with gfortran, because valgrind isn't
> working on AIX. As I've said before, my main motivation for this vagrind
> exercise was just that I was really concerned that clearly invalid code
> wasn't crashing at all neither with ifort (because it is initialized to
> 0 defaultly) nor with gfortran (it was probably optimized away, because
> with -O0 it crash in the same way) and I was wondering if there are more
> similar cases that are not severe enough to result in crash but rather
> some unwanted behavior.
>
> Also all the testing was done just with the TiC sample, so I probably
> haven't tested that many codepaths at all...
>
> Anyway thanks for looking into this and let me know if you need any more
> info.
>
>> On Thu, Oct 10, 2013 at 10:47 AM, Pavel Ondračka
>> <pavel.ondracka at email.cz> wrote:
>> > Laurence Marks píše v Čt 10. 10. 2013 v 09:40 -0500:
>> >> Sorry, but I agree with Peter & I am 99.9% certain that this is not a
>> >> bug in the mixer. The way to test this is (with ifort) to use
>> >> -ftrapuv ;
>> > That is not true. According to ifort docs:
>> > -ftrapuv
>> > The option sets any uninitialized local variables that are allocated on
>> > the _stack_ to a value that is typically interpreted as a very large
>> > integer or an invalid address.
>> >
>> > However the ROKMIX is allocated on the _heap_
>> >
>> > Just try the test I've described earlier, set the imaginary part by hand
>> > right after allocation and check the value right before the if check
>> > either with debugger or just write it to stdout and you will see the
>> > value doesn't change.
>> >>  the arrays are not set in mixer.F but elsewhere within some
>> >> complicated subroutines. This flag forces a fault if an undefined
>> >> variable is used.
>> >
>> >
>> >> ---------------------------
>> >> Professor Laurence Marks
>> >> Department of Materials Science and Engineering
>> >> Northwestern University
>> >> www.numis.northwestern.edu 1-847-491-3996
>> >> "Research is to see what everybody else has seen, and to think what
>> >> nobody else has thought"
>> >> Albert Szent-Gyorgi
>> >>
>> >>
>> >> On Oct 10, 2013 7:23 AM, "Pavel Ondračka" <pavel.ondracka at email.cz>
>> >> wrote:
>> >>         Peter Blaha píše v Čt 10. 10. 2013 v 12:22 +0200:
>> >>         > I'm not very familiar with valgrid, but all those reports
>> >>         seem not
>> >>         > relevant to me.
>> >>         > It seems to complain about all allocated arrays, which are
>> >>         not set to
>> >>         > zero globally. But this does NOT mean that one uses
>> >>         uninitialized
>> >>         > variables or assumes that the compiler sets them to zero.
>> >>         Valgrind doesn't complain about uninitialized variables in
>> >>         general. Just
>> >>         when they are used in such way that they can alter the flow of
>> >>         the
>> >>         program or generally alter the externally-visible behavior.
>> >>
>> >>         I'll take the problem from mixer as an example, because
>> >>         contrary to the
>> >>         rest of the problems, here I'm 99% sure it is a real bug.
>> >>
>> >>         The complex array ROKMIX in mixer.F, is only partially
>> >>         initialized. The
>> >>         real part of the complex number is initialized somewhere
>> >>         (don't know
>> >>         exactly where, since I haven't tracked it that much), however
>> >>         the
>> >>         imaginary part is not. You can check this by setting the
>> >>         imaginary part
>> >>         to some specific value right after the allocation, and then
>> >>         set a
>> >>         breakpoint or print its value at line
>> >>         mixer.F:1504  if(abs(ROKMIX(J,I)).gt.1D-12)then
>> >>         and you will see, it has the same value as set before, so it
>> >>         wasn't
>> >>         written anywhere.
>> >>         And now basically if real part of ROKMIX(J,I) is lower than
>> >>         1D-12 we
>> >>         would randomly fail or pass the if test, depending on what
>> >>         stuff is
>> >>         written in the corresponding uninitialized memory.
>> >>
>> >>         >
>> >>         > PS: I don't have a variable "WS" in dergl.f
>> >>         > WS=0.0D0 in dergl.f
>> >>         My mistake, this should have been RW
>> >>         >
>> >>         > We can verify this also with ifort by setting -C in the
>> >>         flags, and in
>> >>         > fact doing so would result in an error in f7splt due to the
>> >>         mistake you
>> >>         > mentioned before.
>> >>         >
>> >>         > So the only problem was in f7splt.f and that should be fixed
>> >>         as
>> >>         > previously mentioned.
>> >>         Well, this is not what valgrind does. Valgrind doesn't do
>> >>         static code
>> >>         analysis as compilers do during compilation, but it replaces
>> >>         calls to
>> >>         malloc and similar at runtime with his own versions and tracks
>> >>         the used
>> >>         memory at runtime.
>> >>
>> >>         To be more specific, I'll try to explain on my previous
>> >>         example why I
>> >>         think it can't be discovered by the ifort checking.
>> >>         Although ifort probably can track, that imaginary part of
>> >>         ROKMIX(J,I) is
>> >>         uninitialized at that point, it has no way of knowing if
>> >>         return value of
>> >>         abs(ROKMIX(J,I)) actually depends on the uninitialized value
>> >>         or not,
>> >>         because abs function has not been linked at that point.
>> >>
>> >>         >
>> >>         > (PS: I was wrong with my first reply that you need ISPLIT=15
>> >>         (this
>> >>         > relates to an older version). The present if statement is
>> >>         intentional,
>> >>         > but still, no result from f7splt will be used (except for
>> >>         ISPLIT=15).
>> >>         Thanks, this explains it.
>> >>
>> >>         I'm very grateful and I appreciate it very much, that you are
>> >>         looking
>> >>         into this, even though this is not a problem at your platform.
>> >>         Thanks again.
>> >>
>> >>         _______________________________________________
>> >>         Wien mailing list
>> >>         Wien at zeus.theochem.tuwien.ac.at
>> >>         http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>> >>         SEARCH the MAILING-LIST at:
>> >>          http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
>> >> _______________________________________________
>> >> Wien mailing list
>> >> Wien at zeus.theochem.tuwien.ac.at
>> >> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>> >> SEARCH the MAILING-LIST at:  http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
>> >
>> >
>> > _______________________________________________
>> > Wien mailing list
>> > Wien at zeus.theochem.tuwien.ac.at
>> > http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>> > SEARCH the MAILING-LIST at:  http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
>>
>>
>>
>



-- 
Professor Laurence Marks
Department of Materials Science and Engineering
Northwestern University
www.numis.northwestern.edu 1-847-491-3996
"Research is to see what everybody else has seen, and to think what
nobody else has thought"
Albert Szent-Gyorgi


More information about the Wien mailing list