[Wien] Mixer error

Laurence Marks L-marks at northwestern.edu
Thu Nov 22 15:53:57 CET 2007


The information you provided gives me a good idea (99% confidence level)
as to what the problem may be. I will "deposit" an explanation here,
which hopefully will be useful.

First, a little bit about what files are important. There are what we
call "history" files case.broyd1, case.broyd2 which should both be
small (case.broyd2 should probably be zero length). These contain a
little control information. In addition there are history files
case.broyd2001, 2002.... etc. These contain information about the
previous steps, the largest number being the most recent. In the
current version some of these will be zero length (to save disc
space); in the next release the ones of zero length will be deleted.
What you should have is some reasonable number of non-zero files. Even
though the code is limited-memory (only uses a certain number of prior
steps, default 8) a few extra are kept so one can increase the number
of histories used if desired. (Before someone asks, to my knowledge
nobody currently knows how to adequately determine the best number of
prior steps to use for this type of algorithm.)
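To check which history files a case directory actually contains, something
along the following lines may help. It is only a sketch in Python, not part
of WIEN2k: the function names are made up for illustration, and the
assumption that numbered history files always end in exactly four digits
(case.broyd2001, case.broyd2002, ...) is inferred from the names above.

```python
import re
from pathlib import Path

# Numbered history files look like case.broyd2001, case.broyd2002, ...
# (four-digit suffix is an assumption inferred from the file names above).
_NUMBERED = re.compile(r"\.broyd(\d{4})$")


def inventory_broyd_files(directory, case):
    """Return (control, history) lists of (name, size_in_bytes) tuples.

    control: case.broyd1 and case.broyd2, if present.
    history: numbered files, sorted so the most recent (largest number) is last.
    """
    directory = Path(directory)
    control = []
    for suffix in ("broyd1", "broyd2"):
        path = directory / f"{case}.{suffix}"
        if path.is_file():
            control.append((path.name, path.stat().st_size))

    history = []
    for path in directory.glob(f"{case}.broyd*"):
        match = _NUMBERED.search(path.name)
        if match and path.is_file():
            history.append((int(match.group(1)), path.name, path.stat().st_size))
    history.sort()
    return control, [(name, size) for _, name, size in history]


def empty_history_files(history):
    """Names of the numbered history files that are zero length."""
    return [name for name, size in history if size == 0]
```

Calling inventory_broyd_files(".", "GdCoO3") in the case directory and
passing the second result to empty_history_files lists the numbered files
that hold no history at all.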

When the transition from the old mixer to the new one took place,
three things may not have been done right (small bugs):

a) A parameter "riter" in many of the run??_lapw scripts was left at
20 (or 40). This was used in the past to reset the calculation after
some number of iterations but is now obsolete at least for MSEC1.
Unfortunately in the current scripts it only deletes case.broyd1 and
case.broyd2 which is correct for the older version, but not for the
new one. A patch is to set riter to a large number (9999) in the
scripts or use -r 9999 when running run??_lapw. (I am fairly sure
Peter will be releasing some patches soon.)

b) Some of the scripts which clean up files (e.g. clean_lapw), and the
cleanup after a minimization cycle, only removed case.broyd1 and
case.broyd2, not all the others.

c) There was a small inconsistency in the counting in one of the
Fortran files. This could have led the program to believe that (for
instance) it should have a file case.broyd2001 containing useful
history when this file was in fact empty and not for use. This sounds
exactly like what you have. I believe this is caused by
inconsistencies due to a) and/or b) above, although I cannot verify
this for certain. I think I sent Peter a patch for this, but I am not
sure if it is in the current version on the web. (I am currently using
a beta version of a slightly improved mixer which may be released
soon, provided it proves to be stable. It is undergoing tests at the
moment.)

I know there will be a release soon which I believe will cure a)-c)
(as well as some other minor things). It will either have a slightly
patched mixer, or a newer version. In either case I believe this
problem will go away. In the meantime, I strongly suggest adding -r
9999 to runXX_lapw. If you do run into this problem simply delete ALL
the history files (rm case.broyd*) and continue.
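For anyone who prefers to script the cleanup, here is a minimal sketch of
the same "rm case.broyd*" step in Python. The function name and the dry_run
switch are hypothetical conveniences, not anything provided by WIEN2k; by
default it only reports what it would delete.

```python
from pathlib import Path


def remove_broyd_history(directory, case, dry_run=True):
    """Delete every case.broyd* file (the equivalent of `rm case.broyd*`).

    With dry_run=True (the default) nothing is deleted; the function only
    returns the names of the files that would be removed.
    """
    directory = Path(directory)
    targets = sorted(p for p in directory.glob(f"{case}.broyd*") if p.is_file())
    if not dry_run:
        for path in targets:
            path.unlink()
    return [p.name for p in targets]
```

Run it once with the default dry_run=True to see the list, then again with
dry_run=False before restarting the SCF cycle.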

On Nov 22, 2007 5:59 AM, Ashley Harvey <ashley.harvey at mat.ethz.ch> wrote:
> You are certainly correct that I did not include all the information.
> As a new user working alone, I do not know where to find all the
> information that might be relevant.
> Thanks for the list of general things to check.
>
> For some of your points:
> The second time this mixer error happened, it was in a brand-new
> directory with no previous files.
> 0) The version was installed within a few days of the version 07.3
> release.
> I have re-installed from the complete source code again, and perhaps
> the error won't repeat itself.
> 1) All the files listed in mixer.def were present:
> .inm - 200 bytes
> .clmsum_old - 2.7 MB
> .clmval - 2.5 MB
> .clmsc - 0 bytes
> .clmcor - 61.6 MB
> .struct - 2.9 KB
> .scf - 406.3 KB
> .broyd1 - 24 bytes
> .broyd2 - 0 bytes (also .broyd2001 2002 2003 2004)
> 2) Running x mixer gave:
> forrtl: severe (24) end-of-file during read, unit 32, file /home/aharvey/Wien2k/GdCoO3/GdCoO3.broyd2001
> This is similar to what Bjoern and Stefaan reported earlier.
> 5) Last lines of .outputm under 19. Iteration:
> :DIS      :  CHARGE DISTANCE     ( 0.0157539 for atom   1 spin 1)  0.1347781
> Big check      0.525D-02   0.643D-02   0.674D-02
> Last lines of .scfm:
> :PLANE:   INTERSTITIAL TOTAL    61.41379        DISTAN  0.0140833
> :CHARG:  CLM CHARGE TOTAL       1.54997         DISTAN  0.0086475
> :REDuction and DMIX in Broyd:   1.0829          0.4000
>
> Hopefully with time I become a better user.
> Ashley
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
>
> On Nov 21, 2007, at 1:27 PM, Laurence Marks wrote:
>
> > Unfortunately it is hard to impossible to know what the source of the
> > problem is because you have not provided us with enough information.
> > It is not impossible that this has something to do with incorrect
> > deleting of history files, a relic of old mixing, but by no means
> > certain. Here are some general suggestions, partially for the mixer
> > but in general as well.
> >
> > 0) Check that you have the latest version of the particular code.
> > Sometimes patches are loaded onto the web page and not announced
> > immediately.
> > 1) Check the file mixer.def and see if all the input files it looks
> > for (e.g. case.clmsum, case.clmsum_old) exist. Document their size.
> > 2) When a task dies run it by hand at the terminal to look for
> > Fortran/System error information, e.g.
> > "x mixer" or "mixer mixer.def". Change these as appropriate for other
> > commands. Document any errors and information this provides.
> > 3) Make sure that for the intel compiler you have "-traceback" in the
> > options. This will often give a line number and routine if a
> > Fortran/system problem occurred. Other compilers should have a similar
> > command.
> > 4) Consider recompiling, temporarily, that particular command with
> > debug on (-g). This often gives more information. There are other
> > possible options for run-time checks (-check all for ifort). These
> > often provide more information. Remember to switch the compilation
> > back when the problem has been isolated.
> > 5) Look in the outputfiles (see mixer.def and in general XXXX.def for
> > what these might be), here case.scfm and case.outputm. What are the
> > last few lines? In addition to sometimes having information about the
> > error this will also tell someone else roughly where in the code the
> > problem occurred.
> > 6) THINK about how to document this so someone can have an idea what
> > is going on, and can reproduce the problem. Consider this as similar
> > to how you would explain the problem to your mother; what basic
> > information would she need to know. We cannot read your mind, we can
> > only guess at the source of the problem and without adequate
> > information we will not do this.
> > 7) DO NOT send massive attachments or files unless asked specifically
> > to do so. In general (provided that they are small enough) the output
> > for a specific command (case.scfm, case.outputm for mixer) is
> > reasonable to attach to an email.
> >
> > On Nov 21, 2007 5:05 AM, Ashley Harvey <ashley.harvey at mat.ethz.ch> wrote:
> >>  Dear Wien2k users,
> >>
> >> I have had this problem occur at least twice now, and I have not a clue
> >> about what is happening.
> >> I am trying to make a basic calculation for perovskite GdCoO3 in
> >> orthorhombic space group #62 Pnma.
> >> Everything was fine with the initialization and the case.inm files were
> >> (default):
> >>
> >> MSEC1 0.0 YES (BROYD/PRATT, extra charge (+1 for additional e), norm)
> >> 0.40 mixing FACTOR for BROYD/PRATT  scheme
> >> 1.00 1.00 PW and CLM-scaling factors
> >> 9999 8 idum, HISTORY
> >>
> >> The mixer.error file says only "Error in MIXER".  The other .error files are
> >> empty.
> >> The first time this occurred, I had selected charge convergence to 0.0001 e.
> >> I changed this to 0.0005 e and re-ran the SCF successfully.
> >> However, this most recent occurrence was first set to converge to 0.0005 e.
> >> Both cases stopped in the mixer of the 19th iteration (17 hours after start
> >> of SCF) with the message (w2web dayfile):
> >> error: command    /usr/local/wien2k/mixer    mixer.def failed
> >> The calculation has ca. 8900 PWs and ca. 35000 CLMs.  When the error first
> >> occurred and I re-ran the SCF, it converged in less than 30 iterations.
> >>
> >> Can anyone recommend to me what is happening?  Is it perhaps related to a
> >> previous discussion about mixer:
> >> http://zeus.theochem.tuwien.ac.at/pipermail/wien/2007-October/009924.html
> >>
> >> Thanks for the help,
> >> Ashley
> >>
> >>
> >>
> >> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >>
> >> Ashley Harvey, Ph.D.
> >>
> >> ETH Zürich
> >>
> >> Nonmetallic Inorganic Materials
> >>
> >> phone: ++41 44 632 36 34
> >>
> >> fax: ++41 44 632 11 32
> >>
> >> email: ashley.harvey at mat.ethz.ch
> >>
> >> Wolfgang-Pauli-Str. 10, HCI G 531
> >>
> >> 8093 Zürich
> >>
> >> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >>
> >> _______________________________________________
> >> Wien mailing list
> >> Wien at zeus.theochem.tuwien.ac.at
> >> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
> >>
> >>
> >
> >
> >
> > --
> > Laurence Marks
> > Department of Materials Science and Engineering
> > MSE Rm 2036 Cook Hall
> > 2220 N Campus Drive
> > Northwestern University
> > Evanston, IL 60208, USA
> > Tel: (847) 491-3996 Fax: (847) 491-7820
> > email: L-marks at northwestern dot edu
> > Web: www.numis.northwestern.edu
> > Commission on Electron Diffraction of IUCR
> > www.numis.northwestern.edu/IUCR_CED
>



-- 
Laurence Marks
Department of Materials Science and Engineering
MSE Rm 2036 Cook Hall
2220 N Campus Drive
Northwestern University
Evanston, IL 60208, USA
Tel: (847) 491-3996 Fax: (847) 491-7820
email: L-marks at northwestern dot edu
Web: www.numis.northwestern.edu
Commission on Electron Diffraction of IUCR
www.numis.northwestern.edu/IUCR_CED

