[Wien] Single precision corruption ???

Tue Mar 8 02:38:30 CET 2005

Dear Peter and Torsten,

Thanks so much for your prompt replies.

Actually, I may have tried to put too much in my initial post and caused 
some confusion.

My observation is simply that the same code (WIEN2k_05.2) with the same
input (whatever the RKmax, k-points, etc.) on the *same* platform produces
different total energies, depending on compiler options (with -r8 vs.  
without). The information about other platforms/compilers was merely to
verify that the problem occurs on other platforms/compilers as well.
(Peter: my answer differs from yours due to the small k-mesh I used for
illustration; but the problem is independent of k-mesh.) So the code
produces different answers (~0.20 mRy / atom) depending on how you compile
it, and this difference is of an order (especially if random) that could
have consequences for the smoothness of E-V curves, accuracy of
derivatives thereof (more so), etc. This is the problem.

And since the particular change in compilation settings is simply to
promote any default constants/declarations which may occur to double
precision (8 bytes) [default reals are single precision (4 bytes) on the
standard platforms I discussed], one possibility for the different answers
is that there are inadvertent default or single precision
constants/declarations in the code. (According to the compiler
documentation, -r8 simply makes the default precision double rather than
single -- no other optimizations, etc.) But there are other possibilities
as well.

Also, I recompiled the code with -O0 to disable optimizations and found
the same thing -- different answers with and without -r8 (~0.20 mRy /
atom). So the problem does not appear to be optimization related and,
again, occurs also on other platforms and compilers, and so appears more
general.

So we have two different answers and both cannot be right.

Thanks so much for your help and, again, my apologies for any lack of
precision (;-)) in my initial post.

John

------------------------------------------------------------

> Hi,
> 
> If your information in your mail is correct (struct and input files), I 
> would not worry about a tenth of a mRy, but more about several mRy.
> 
> My total energy for bcc W is     -32332.2513
> (For the parameter you specified in your mail)
> I do not specify more digits, because these numbers depend on the k-mesh
> (which you did not specify), but these 3-4 digits are correct for any 
mesh
> >1000 k-points.
> 
> So first you should find out why your number is so different.
> 
> About numerical problems: I just crosschecked, that WIEN2k_05 with an
> Intel P4 with ifc7.0, mkl 7.0, -FR -mp1 -w -prec_div -pc80 -pad -ip
> -DINTEL_VML
> (I think these are the recommended options of siteconfig)
> 
> yields identical total energies (up to mycro Ry) than an old WIEN2k_03 
on
> an (old) IBM SP3, xlf90, essl,   -O3 -qarch=pwr3
> 
> WIEN2k (at least in the scf programs) is pretty consistent with double 
> precission (and -r8 should not make much of a difference unless -r8
> implies also something else, see below !!!) and also
> optimization due to loop rearrangement,... usually is not critical.
> I do not believe that it is a "single precission" issue.
> 
> However, many compiler have options (or defaults) which reduce the
> accuracy of some floating point operations (or trig.fuctions) (to gain 
speed). 
> I'm to lazy to look it up again, but as far as
> I can remember, you can "speed-up" the programs when you omitt or modify
> some of iforts  compiler options, and depending on your compiler 
installation,
> these options might even be the default!
> 
> Please check the compiler documentation and apply first the IEEE 
standard
> (or "strict" or -O0 or whatever applies to your compiler). Once you
> have a benchmark number, you may increase "optimization" and gain lots
> of speed, but always crosscheck and compare the results.
> 
> Regards
> 

----------------------------
On Mon, 7 Mar 2005, Torsten Andersen wrote:

> Dear John,
> 
> It is well known that rounding errors come into play when the 
> optimization is increased, and I would not expect to get the exact same 
> solution on different architectures before you have a "converged" 
> calculation.
> 
> By converged I do not mean that the scf cycle converged :-) It is 
> possible due to rounding errors to have a slightly different density in 
> the first cycles, and this affects the mixing, the next cycle, etc.
> 
> Is your basis big enough? I believe that the standard settings for 
> calculations with W are too small (RMT*kmax should be about 9). Is it 
> converged with respect to the k-points? Also, "-ec anynumber" might be 
> too soft a convergence criterion for the scf cycle - it just means that 
> two consecutive cycles should be less than "anynumber" away from each 
> other in the total energy. Is the charge converged?
> 
> BTW, default reals and complexes are 8-byte real and 16-byte complex 
> already, so it shouldn't really give you any difference unless the -r8 
> means something in addition in your compilers. I know for sure that both 
> Compaq (Digital) Fortran and IBM xlf do some tradeoffs in speed versus 
> accuracy at the higher optimization levels (loop unrolling, code 
> reordering, etc.), and you might need to add a "-qstrict" (IBM) or 
> something similar in order to keep the semantics of the program stable.
> 
> What is the difference with, say, RMT*kmax=9 and about 500 k-points? 
> Needless to say, probably, but the Gamma-point should be included in 
> your k-point sampling.
> 
> Best regards,
> Torsten Andersen.
> 
--------------------------------------
Dr. John E. Pask
Lawrence Livermore National Laboratory
University of California
P.O. Box 808, L-045
Livermore, CA 94551 USA
phone/fax: +1 (925) 422-8392/2851
pask1 at llnl.gov