[Wien] Restarting HF with SO
Laurence Marks
L-marks at northwestern.edu
Thu May 18 19:39:49 CEST 2017
I don't have the answer, but in future you may want to do a set of
shorter runs, saving the interim results after each one, something like:
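mkdir -p Safety                  # create once; -p is harmless if it exists
for i in 1 2 3 4 ... XX
do
    runsp_lapw -hf ... -i 3 -NI  # short run of at most 3 iterations
    rm -f Safety/*bro*           # -f: no error when nothing is there yet
    mv *bro* Safety              # move them out of the way before saving
    save_lapw -f -d Safety
    cp Safety/*bro* ./ ; cp Safety/*.scf ./
done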
(It would be easier if save_lapw had an option to not delete the *bro*
files and retain case.scf -- a simple hack.)
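
One way to approximate that hack without touching save_lapw is to copy the files aside rather than move them. The sketch below is only an illustration: `keep_copy` is a hypothetical helper name (not part of WIEN2k), it uses nothing but `mkdir` and `cp`, and it assumes the files of interest match `*bro*` and `*.scf` in the current directory, as in the loop above.

```shell
#!/bin/sh
# keep_copy DIR: copy (not move) the *bro* and *.scf files found in the
# current directory into DIR, so the working directory keeps its own
# copies for the next short run.
# Hypothetical helper for illustration; not a WIEN2k command.
keep_copy() {
    dest=$1
    mkdir -p "$dest" || return 1
    for f in *bro* *.scf; do
        # Skip unmatched globs and anything that is not a regular file
        # (for example a directory whose name happens to contain "bro").
        [ -f "$f" ] || continue
        cp -p "$f" "$dest"/ || return 1
    done
}
```

Calling `keep_copy Safety` after each short run would leave a second copy of those files in Safety while the originals stay in place.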
On Thu, May 18, 2017 at 12:27 PM, Luis Ogando <lcodacal at gmail.com> wrote:
> Dear Gavin,
>
> Thank you very much for your answer.
> I am using Wien2k 14.2 and, unfortunately, that was the only message I
> got from the standard output file (queuing system). The error files and
> case.dayfile have no useful information.
> The interruption was during the hf execution, after lapw1, which had
> finished without a problem.
> It was not the first time I had to restart the calculation due to a
> shutdown. In the other cases, I restarted the calculation from scratch, but,
> with a non-parallel calculation, I have to solve this restart issue
> or the calculation will never end. So, I would be glad if someone else could
> give me another hint.
> Thank you again.
> All the best,
> Luis
>
>
>
>
> 2017-05-18 11:35 GMT-03:00 Gavin Abo <gsabo at crimson.ua.edu>:
>>
>> Sorry, those code line numbers are for WIEN2k 16.1. For example, if you
>> are using WIEN2k 14.2, the line numbers should be 998 instead of 1354 and
>> 1006 instead of 1365 in SRC_hf/calc_h.F.
>>
>>
>> On 5/18/2017 8:19 AM, Gavin Abo wrote:
>>
>> Unfortunately, I think that error message can tell you "why" the
>> calculation stopped, but it might not tell you the initial "cause" of it.
>> That is likely because the issue that caused it happened earlier in the
>> calculation (perhaps lapw1?). The vector file size is smaller than the
>> vectorhf_old. I'm not sure if they should be the same size or not. If so,
>> perhaps you need to restart the calculation at the lapw1 step (-s lapw1) to
>> regenerate the vector file instead of starting at the hf step (-s hf),
>> which I believe comes after lapw1 in the calculation, or you
>> might just have to start the calculation over from scratch.
>>
>> In SRC_hf/calc_h.F, you should see:
>>
>> line 1354:
>> !_COMPLEX call zheev('V','U',nbf,ham,nbf,enknew,workdiag,2*nbf-1,rworkdiag,info)
>>
>> line 1365:
>> if (info .ne. 0) then
>>   print *, 'info=', info
>>   stop 'error in calc_h: info not equal to 0'
>> endif
>>
>> From the code above, you can see that there likely should be a little more
>> error information available from the "print *, 'info=', info" statement that
>> you did not report. I believe this should have been printed to the standard
>> output (terminal or std output file if you are using a queuing system).
>>
>> Depending on the value of the info variable, the calculation seems to have
>> stopped because it encountered an illegal value or there was a convergence
>> problem [1]:
>>
>> INFO is INTEGER
>> = 0: successful exit
>> < 0: if INFO = -i, the i-th argument had an illegal value
>> > 0: if INFO = i, the algorithm failed to converge; i
>> off-diagonal elements of an intermediate tridiagonal
>> form did not converge to zero.
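
To make the three cases concrete, here is a small shell sketch that pulls the number out of an `info=` line and reports which case applies. `classify_info` is a hypothetical helper, and the exact spacing of the Fortran list-directed output is an assumption, so the pattern tolerates any amount of whitespace after the equals sign.

```shell
#!/bin/sh
# classify_info: read text on stdin, take the first line containing
# "info=", and report which LAPACK INFO case the value falls into.
# Hypothetical helper for illustration only.
classify_info() {
    # Extract the (possibly negative) integer that follows "info=".
    n=$(sed -n 's/.*info=[[:space:]]*\(-\{0,1\}[0-9][0-9]*\).*/\1/p' | head -n 1)
    if [ -z "$n" ]; then
        echo "no info line found"
    elif [ "$n" -eq 0 ]; then
        echo "successful exit"
    elif [ "$n" -lt 0 ]; then
        # ${n#-} strips the leading minus sign to give the argument index.
        echo "argument ${n#-} had an illegal value"
    else
        echo "failed to converge ($n off-diagonal elements)"
    fi
}
```

It could be fed the standard output file of the job, for example `classify_info < job.out`, where `job.out` is a placeholder name for whatever file the queuing system writes.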
>>
>> Perhaps the developers of the hf code have more insight than
>> I currently do into what could resolve the problem.
>>
>> [1]
>> http://www.netlib.org/lapack/explore-html/df/d9a/group__complex16_h_eeigen_ga70c041fd19635ff621cfd5d804bd7a30.html#ga70c041fd19635ff621cfd5d804bd7a30
>>
>> On 5/18/2017 5:52 AM, Luis Ogando wrote:
>>
>> I do not know if it is relevant, but my calculation is complex (-c).
>> Thank you again,
>> Luis
>>
>>
>> 2017-05-18 8:29 GMT-03:00 Luis Ogando <lcodacal at gmail.com>:
>>>
>>> Dear Wien2k community,
>>>
>>> I am trying to calculate the dielectric function for wurtzite GaP
>>> using -hf and -so as previously discussed (
>>> http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg14603.html
>>> ).
>>> There was a shut down of the machine during the hf execution in the
>>> first step of the calculation ( run_lapw -hf ... ). When the machine came
>>> back, I removed the case.vectorhf (case.vectorhf_old is still there) and
>>> case.energyhf. Then, I executed
>>>
>>> run_lapw -hf -NI -s hf -ec 0.0001 -cc 0.0001 -i 200
>>>
>>> trying to restart the calculation (non-parallel execution due to the HF x
>>> SO issue discussed in the previous messages above).
>>> The calculation restarted without a problem, but when
>>> case.vectorhf reached 187MB (less than half of the expected size, see
>>> below) I got an error.
>>>
>>> -rw-r--r-- 1 luisoda luisoda 187M Mai 18 03:51 GaPwurtHSE-DielSO-1.vector
>>> -rw-r--r-- 1 luisoda luisoda 187M Mai 18 00:14 GaPwurtHSE-DielSO-1.vectorhf
>>> -rw-r--r-- 1 luisoda luisoda 565M Abr 23 21:33 GaPwurtHSE-DielSO-1.vectorhf_old
>>>
>>> The only related error message I found was:
>>>
>>> error in calc_h: info not equal to 0
>>>
>>> I am probably making a mistake when restarting the calculation and I
>>> would really appreciate any help with this issue.
>>> Many thanks in advance.
>>> All the best,
>>> Luis
>>
>>
>>
>> _______________________________________________
>> Wien mailing list
>> Wien at zeus.theochem.tuwien.ac.at
>> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>> SEARCH the MAILING-LIST at:
>> http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
>>
>
--
Professor Laurence Marks
"Research is to see what everybody else has seen, and to think what
nobody else has thought", Albert Szent-Gyorgyi
www.numis.northwestern.edu ; Corrosion in 4D: MURI4D.numis.northwestern.edu
Partner of the CFW 100% program for gender equity, www.cfw.org/100-percent
Co-Editor, Acta Cryst A