[Wien] Restarting HF with SO

Luis Ogando lcodacal at gmail.com
Thu May 18 19:50:55 CEST 2017


Dear Prof. Marks,

   Thank you very much for your help !
   However, I would still like to understand why the  -s  option, designed
to restart a calculation at the point where it crashed, does not work.
Without that, I am afraid that even your suggestion will not help.
   Thank you again,
                       Luis


2017-05-18 14:39 GMT-03:00 Laurence Marks <L-marks at northwestern.edu>:

> I don't have the answer, but you may want to contemplate in the future
> doing something like a set of shorter runs saving the interim results
>
> for i in 1 2 3 4 ... XX
> do
>   mkdir -p Safety
>   runsp_lapw -hf ... -i 3 -NI
>   rm -f Safety/*bro*
>   mv *bro* Safety
>   save_lapw -f -d Safety
>   cp Safety/*bro* ./ ; cp Safety/*.scf ./
> done
>
> (It would be easier if save_lapw had an option to not delete the *bro*
> files and retain case.scf -- a simple hack.)
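
A minimal sketch of the same idea, written so that the loop stops as soon as one short run fails. The names checkpoint_run and Safety.N are hypothetical and not part of WIEN2k; the SCF command is supplied by the caller. It copies only the *bro* and *.scf files named in the message above, which may not be everything a restart needs, and it does not call save_lapw.

```shell
#!/bin/sh
# Sketch only: checkpoint_run and the Safety.N directory names are
# hypothetical helpers, not WIEN2k tools.
# Usage: checkpoint_run <number_of_runs> <command> [args...]
checkpoint_run() {
    cycles=$1
    shift
    i=1
    while [ "$i" -le "$cycles" ]; do
        # Run one short block of SCF cycles; stop the loop if it fails.
        if ! "$@"; then
            echo "run $i failed; earlier Safety.* copies are untouched" >&2
            return 1
        fi
        dest="Safety.$i"
        mkdir -p "$dest"
        # Copy (not move) so the next run continues from the same files.
        # Assumption: *bro* and *.scf are the files worth keeping.
        for f in *bro* *.scf; do
            if [ -e "$f" ]; then
                cp -p "$f" "$dest"/
            fi
        done
        i=$((i + 1))
    done
}
```

It could be called, for example, as  checkpoint_run 10 runsp_lapw -hf ... -i 3 -NI  with the elided options filled in as in the original run.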
>
> On Thu, May 18, 2017 at 12:27 PM, Luis Ogando <lcodacal at gmail.com> wrote:
> > Dear Gavin,
> >
> >    Thank you very much for your answer.
> >    I am using Wien2k 14.2 and, unfortunately, that was the only message I
> > got from the standard output file (queuing system). The error files and
> > case.dayfile have no useful information.
> >    The interruption was during the  hf  execution, after lapw1, that
> > finished without a problem.
> >    It was not the first time I had to restart the calculation due to a
> > shutdown. In the other cases, I restarted the calculation from scratch,
> > but, with a non-parallel calculation, I have to solve this
> > reinitialization issue or the calculation will never end. So, I would be
> > glad if someone else could give me another hint.
> >    Thank you again.
> >    All the best,
> >                      Luis
> >
> >
> >
> >
> > 2017-05-18 11:35 GMT-03:00 Gavin Abo <gsabo at crimson.ua.edu>:
> >>
> >> Sorry, those code line numbers are for WIEN2k 16.1.  For example, if you
> >> are using WIEN2k 14.2, the line numbers should be 998 instead of 1354
> >> and 1006 instead of 1365 in SRC_hf/calc_h.F.
> >>
> >>
> >> On 5/18/2017 8:19 AM, Gavin Abo wrote:
> >>
> >> Unfortunately, I think that error message can tell you "why" the
> >> calculation stopped, but it might not tell you the initial "cause" of
> >> it.  That is likely because the issue that caused it happened earlier
> >> in the calculation (perhaps lapw1?).  The vector file size is smaller
> >> than that of vectorhf_old.  I'm not sure if they should be the same size
> >> or not.  If so, perhaps you need to restart the calculation at the lapw1
> >> step (-s lapw1) to regenerate the vector file instead of starting with
> >> the hf step (-s hf), which I believe comes after lapw1 in the
> >> calculation, or you might just have to start the calculation over from
> >> scratch.
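
A small sketch for the size comparison mentioned above: it prints the byte size of the three vector files for a given case name. vector_sizes is a hypothetical helper, not a WIEN2k tool, and whether case.vectorhf should reach the size of case.vectorhf_old is left open in this thread, so the output is only a starting point for deciding between -s hf and -s lapw1.

```shell
#!/bin/sh
# Sketch only: vector_sizes is a hypothetical helper, not part of WIEN2k.
# Usage: vector_sizes <case_name>
vector_sizes() {
    case_name=$1
    for ext in vector vectorhf vectorhf_old; do
        f="$case_name.$ext"
        if [ -f "$f" ]; then
            # wc -c gives the size in bytes; tr strips padding some
            # implementations add in front of the number.
            printf '%s %s\n' "$f" "$(wc -c < "$f" | tr -d ' ')"
        else
            printf '%s missing\n' "$f"
        fi
    done
}
```

For the calculation in this thread the call would be  vector_sizes GaPwurtHSE-DielSO-1  run inside the case directory.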
> >>
> >> In SRC_hf/calc_h_2.F, you should see:
> >>
> >> line 1354:
> >> !_COMPLEX call zheev('V','U',nbf,ham,nbf,enknew,workdiag,2*nbf-1,rworkdiag,info)
> >>
> >> line 1365:
> >>         if (info .ne. 0) then
> >>           print *, 'info=', info
> >>           stop 'error in calc_h_2: info not equal to 0'
> >>         endif
> >>
> >> From the code above, you can see that there likely should be a little
> >> more error information available from the "print *, 'info=', info"
> >> statement that you did not report.  I believe this should have been
> >> printed to the standard output (terminal or std output file if you are
> >> using a queuing system).
> >>
> >> Depending on the value of the info variable, the calculation seems to
> >> have stopped because it encountered an illegal value or there was a
> >> convergence problem [1]:
> >>
> >>         INFO is INTEGER
> >>           = 0:  successful exit
> >>           < 0:  if INFO = -i, the i-th argument had an illegal value
> >>           > 0:  if INFO = i, the algorithm failed to converge; i
> >>                 off-diagonal elements of an intermediate tridiagonal
> >>                 form did not converge to zero.
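
As a sketch of how the INFO cases above could be checked against a captured standard-output file: last_info extracts the last "info=" value and explain_info maps it to the three cases quoted from the LAPACK documentation. Both function names are hypothetical, and the sketch assumes the value appears on a line of the form "info= N", as the print statement shown earlier suggests.

```shell
#!/bin/sh
# Sketch only: explain_info and last_info are hypothetical helpers,
# not part of WIEN2k or LAPACK.

# Map a zheev INFO value to the meaning given in the LAPACK documentation.
explain_info() {
    info=$1
    if [ "$info" -eq 0 ]; then
        echo "successful exit"
    elif [ "$info" -lt 0 ]; then
        echo "argument $((-info)) had an illegal value"
    else
        echo "failed to converge: $info off-diagonal elements did not converge to zero"
    fi
}

# Print the last integer that follows "info=" in the given file.
# Assumption: the Fortran list-directed print pads the value with spaces.
last_info() {
    sed -n 's/.*info= *\(-\{0,1\}[0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}
```

With the queuing-system output saved in a file, the two can be combined as  explain_info "$(last_info job.out)"  where job.out is a placeholder file name.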
> >>
> >> Perhaps the software developers of the hf code have further insight
> >> than I currently do into what could resolve the problem.
> >>
> >> [1]
> >> http://www.netlib.org/lapack/explore-html/df/d9a/group__complex16_h_eeigen_ga70c041fd19635ff621cfd5d804bd7a30.html#ga70c041fd19635ff621cfd5d804bd7a30
> >>
> >> On 5/18/2017 5:52 AM, Luis Ogando wrote:
> >>
> >>    I do not know if it is relevant, but my calculation is complex (-c).
> >>    Thank you again,
> >>                     Luis
> >>
> >>
> >> 2017-05-18 8:29 GMT-03:00 Luis Ogando <lcodacal at gmail.com>:
> >>>
> >>> Dear Wien2k community,
> >>>
> >>>    I am trying to calculate the dielectric function for wurtzite GaP
> >>> using -hf and -so as previously discussed (
> >>> http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg14603.html
> >>> ).
> >>>    There was a shutdown of the machine during the  hf  execution in
> >>> the first step of the calculation  (  run_lapw -hf ...  ). When the
> >>> machine came back, I removed the case.vectorhf (case.vectorhf_old is
> >>> still there) and case.energyhf.  Then, I executed
> >>>
> >>> run_lapw -hf -NI -s hf -ec 0.0001 -cc 0.0001 -i 200
> >>>
> >>> trying to restart the calculation (non-parallel execution due to the
> >>> HF x SO issue discussed in the previous messages above).
> >>>    The calculation restarted without a problem, but when the
> >>> case.vectorhf reached 187MB (less than half of the expected size, see
> >>> below) I got an error.
> >>>
> >>> -rw-r--r-- 1 luisoda luisoda 187M Mai 18 03:51 GaPwurtHSE-DielSO-1.vector
> >>> -rw-r--r-- 1 luisoda luisoda 187M Mai 18 00:14 GaPwurtHSE-DielSO-1.vectorhf
> >>> -rw-r--r-- 1 luisoda luisoda 565M Abr 23 21:33 GaPwurtHSE-DielSO-1.vectorhf_old
> >>>
> >>>    The only related error message I found was:
> >>>
> >>> error in calc_h: info not equal to 0
> >>>
> >>>    I am probably making a mistake when restarting the calculation and I
> >>> would really appreciate any help with this issue.
> >>>    Many thanks in advance.
> >>>    All the best,
> >>>              Luis
> >>
> >>
> >>
> >> _______________________________________________
> >> Wien mailing list
> >> Wien at zeus.theochem.tuwien.ac.at
> >> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
> >> SEARCH the MAILING-LIST at:
> >> http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
> >>
> >
>
>
>
> --
> Professor Laurence Marks
> "Research is to see what everybody else has seen, and to think what
> nobody else has thought", Albert Szent-Gyorgi
> www.numis.northwestern.edu ; Corrosion in 4D:
> MURI4D.numis.northwestern.edu
> Partner of the CFW 100% program for gender equity, www.cfw.org/100-percent
> Co-Editor, Acta Cryst A