[Wien] Problem with k-parallel

Guo-ping Zhang gpzhang at femto.indstate.edu
Sat Mar 12 02:12:28 CET 2016


Dear Maciej,

I think Peter's suggestion is correct.

I ran into the same problem last year, and it took me several days to 
figure out. The problem is that the WIEN2k script deletes its temporary 
files after each run. If you have two or more jobs running on the same 
node, one job can delete another job's tmp files. You can see at the 
end of lapw2para ... that the script removes tmp files which may still 
be needed by other jobs. My solution is to let tmp_dir point to your 
local working directory, not /tmp.
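
As a rough illustration of the idea, here is a minimal sketch in POSIX sh 
(lapw2para itself is csh, where the equivalent would be a "set tmp_dir = ..." 
line). It assumes tmp_dir is the variable read by the patched script; the 
subdirectory name .wien_tmp is only a hypothetical choice, not something 
WIEN2k defines.

```sh
# Assumption: tmp_dir is the variable used by the patched lapw2para.
# The ".wien_tmp" directory name is hypothetical, chosen for illustration.
tmp_dir="$PWD/.wien_tmp"
mkdir -p "$tmp_dir"

# Temporary files now live under the case directory instead of /tmp,
# so cleanup done by a job in a different directory cannot remove them.
user="${USER:-$(id -un)}"
tmpfile="$tmp_dir/.tmp.$user.$$"
: > "$tmpfile"
echo "$tmpfile"
```

Each case directory then has its own set of temporary files, at the cost of 
writing them to the (possibly slower) working file system.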


Best regards,

Guoping


On Thu, 10 Mar 2016, Maciej Polak wrote:

> Dear Prof. Blaha,
>
> Thank you very much for your support. I changed the code according to your 
> suggestions and it seems to work better now. I will keep testing it, though, 
> and let you know if any other issues arise.
>
> Thanks again,
>
> Best regards,
>
> Maciej Polak
>
>
>
> On 03/09/2016 05:06 PM, Peter Blaha wrote:
>> Just put it to /tmp.
>> 
>> (There was a suggestion on the mailing list from someone who wanted to 
>> change /tmp to some other directory, which is tedious in the old versions. 
>> With a variable $tmp_dir this is "easy" to change, and it will be active in 
>> the next release.)
>> 
>> 
>> On 03/09/2016 04:27 PM, Maciej Polak wrote:
>>> Dear Prof. Blaha,
>>> 
>>> Thank you for your answer. I found the corresponding part of my lapw2para 
>>> and replaced it with what you suggested.
>>> However, one change I noticed is that you replaced /tmp with 
>>> $tmp_dir. This $tmp_dir is a new variable which is not recognized by my 
>>> script (tmp_dir: Undefined variable.). What is the meaning of this 
>>> variable?
>>> 
>>> Thank you again for finding the time to reply to my email.
>>> 
>>> Best regards,
>>> 
>>> Maciej Polak
>>> 
>>> P.S. In case you need it, here is the unmodified part of my code:
>>> 
>>> 
>>> set i = 1
>>> while ($i <= $maxproc)
>>> # if ($debug > 0) echo -n "$i "
>>>   cp $def.def /tmp/.tmp.$user.$$
>>>   #subsituting in files:
>>>   cat <<theend >/tmp/.script.$user.$$
>>> 
>>> s/vectorso$dnup/&_$i/w /tmp/.mist.$user.$$
>>> s/vectorso$updn/&_$i/w /tmp/.mist.$user.$$
>>> s/vectordum$dnup/&_$i/w /tmp/.mist.$user.$$
>>> s/vectordum$updn/&_$i/w /tmp/.mist.$user.$$
>>> s/vector$dnup/&_$i/w /tmp/.mist.$user.$$
>>> s/vector$updn/&_$i/w /tmp/.mist.$user.$$
>>> s/energyso$dnup/&_$i/w /tmp/.mist.$user.$$
>>> s/energyso$updn/&_$i/w /tmp/.mist.$user.$$
>>> s/energydum/&_$i/w /tmp/.mist.$user.$$
>>> s/energy$dnup/&_$i/w /tmp/.mist.$user.$$
>>> s/energy$updn/&_$i/w /tmp/.mist.$user.$$
>>> s/\(weigh$dnup\)'/\1_$i'/w /tmp/.mist.$user.$$
>>> s/\(weigh$updn\)'/\1_$i'/w /tmp/.mist.$user.$$
>>> s/\(weightaverso$updn\)'/\1_$i'/w /tmp/.mist.$user.$$
>>> s/normso$dnup/&_$i/w /tmp/.mist.$user.$$
>>> s/normso$updn/&_$i/w /tmp/.mist.$user.$$
>>> s/output2${updn}$eece/&_$i/w /tmp/.mist.$user.$$
>>> s/clmval${updn}$eece/&_$i/w /tmp/.mist.$user.$$
>>> s/vrespval$updn/&_$i/w /tmp/.mist.$user.$$
>>> s/dmat$updn/&_$i/w /tmp/.mist.$user.$$
>>> s/scf2$updn/&_$i/w /tmp/.mist.$user.$$
>>> s/help$updn/&_$i/w /tmp/.mist.$user.$$
>>> s/almblm$updn/&_$i/w /tmp/.mist.$user.$$
>>> 
>>> theend
>>>
>>>   sed -f /tmp/.script.$user.$$ /tmp/.tmp.$user.$$ > /tmp/.tmp1.$user.$$
>>>   sed "s/vector_${i}dn_$i/vectordn_$i/" /tmp/.tmp1.$user.$$>/tmp/.tmp2.$user.$$
>>>   sed "s/vector_${i}up_$i/vectorup_$i/" /tmp/.tmp2.$user.$$>/tmp/.tmp1.$user.$$
>>>   sed "s/vector_${i}so_$i/vectorso_$i/" /tmp/.tmp1.$user.$$>/tmp/.tmp2.$user.$$
>>>   sed "s/energy_${i}up_$i/energyup_$i/" /tmp/.tmp2.$user.$$>/tmp/.tmp1.$user.$$
>>>   sed "s/energy_${i}dn_$i/energydn_$i/" /tmp/.tmp1.$user.$$>/tmp/.tmp2.$user.$$
>>>   sed "s/energy_${i}so_$i/energyso_$i/" /tmp/.tmp2.$user.$$>/tmp/.tmp1.$user.$$
>>>   sed "s/energyso_${i}dn_$i/energysodn_${i}/" /tmp/.tmp1.$user.$$>/tmp/.tmp2.$u$
>>>   sed "s/energy_${i}dum_$i/energydum_$i/" /tmp/.tmp2.$user.$$>/tmp/.tmp1.$user.$
>>>   sed "s/vector_${i}so_${i}dn_$i/vectorsodn_$i/" /tmp/.tmp1.$user.$$>/tmp/.tmp2$
>>>   sed "s/vector_${i}dum_${i}dn_$i/vectordumdn_$i/" /tmp/.tmp2.$user.$$>"$def"_$$
>>>   @ i ++
>>> end
>>> 
>>> 
>>> On 09/03/16 13:45 Peter Blaha <pblaha at theochem.tuwien.ac.at> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> Yes, we recently had such a problem as well. It comes from slow disk 
>>>> I/O.
>>>> 
>>>> A file like /tmp/.tmp2.mpolak.50255
>>>> 
>>>> is a temporary file created by lapw2para and is used when we modify the 
>>>> lapw2_xx.def files by a couple of sed commands.
>>>> 
>>>> Because of this I've reduced the number of sed commands in my version of 
>>>> lapw2para. Unfortunately, I cannot post the script because it will not be 
>>>> compatible with your WIEN2k version, but I can tell you what we did; since 
>>>> then these errors have not shown up anymore.
>>>> 
>>>> Please identify the following lines in your lapw2para and modify them as 
>>>> shown below:
>>>> ...
>>>> #creating def files
>>>> set i = 1
>>>> while ($i <= $maxproc)
>>>> # if ($debug > 0) echo -n "$i "
>>>> cp $def.def $tmp_dir/.tmp.$user.$$
>>>> #subsituting in files:
>>>> cat <<theend >$tmp_dir/.script.$user.$$
>>>> s/vectorso$dnup/&_$i/w $tmp_dir/.mist.$user.$$
>>>> s/vectorso$updn/&_$i/w $tmp_dir/.mist.$user.$$
>>>> s/vectordum$dnup/&_$i/w $tmp_dir/.mist.$user.$$
>>>> s/vectordum$updn/&_$i/w $tmp_dir/.mist.$user.$$
>>>> s/vector$dnup'/vector${dnup}_$i'/w $tmp_dir/.mist.$user.$$
>>>> s/vector$updn'/vector${updn}_$i'/w $tmp_dir/.mist.$user.$$
>>>> s/energyso$dnup/&_$i/w $tmp_dir/.mist.$user.$$
>>>> s/energyso$updn/&_$i/w $tmp_dir/.mist.$user.$$
>>>> s/energydum/&_$i/w $tmp_dir/.mist.$user.$$
>>>> s/energy$dnup'/energy${dnup}_$i'/w $tmp_dir/.mist.$user.$$
>>>> s/energy$updn'/energy${updn}_$i'/w $tmp_dir/.mist.$user.$$
>>>> s/\(weigh$dnup\)'/\1_$i'/w $tmp_dir/.mist.$user.$$
>>>> s/\(weigh$updn\)'/\1_$i'/w $tmp_dir/.mist.$user.$$
>>>> s/\(weightaverso$updn\)'/\1_$i'/w $tmp_dir/.mist.$user.$$
>>>> s/normso$dnup/&_$i/w $tmp_dir/.mist.$user.$$
>>>> s/normso$updn/&_$i/w $tmp_dir/.mist.$user.$$
>>>> s/output2${updn}$eece/&_$i/w $tmp_dir/.mist.$user.$$
>>>> s/clmval${updn}$eece/&_$i/w $tmp_dir/.mist.$user.$$
>>>> s/vrespval$updn/&_$i/w $tmp_dir/.mist.$user.$$
>>>> s/dmat$updn/&_$i/w $tmp_dir/.mist.$user.$$
>>>> s/scf2$updn/&_$i/w $tmp_dir/.mist.$user.$$
>>>> s/help$updn/&_$i/w $tmp_dir/.mist.$user.$$
>>>> s/almblm$updn/&_$i/w $tmp_dir/.mist.$user.$$
>>>> 
>>>> theend
>>>> 
>>>> sed -f $tmp_dir/.script.$user.$$ $tmp_dir/.tmp.$user.$$ > $tmp_dir/.tmp1.$user.$$
>>>> mv $tmp_dir/.tmp1.$user.$$ "$def"_$i.def
>>>> 
>>>> # sed "s/vector_${i}dn_$i/vectordn_$i/" $tmp_dir/.tmp1.$user.$$>$tmp_dir/.tmp2.$user.$$
>>>> # sed "s/vector_${i}up_$i/vectorup_$i/" $tmp_dir/.tmp2.$user.$$>$tmp_dir/.tmp1.$user.$$
>>>> # sed "s/vector_${i}so_$i/vectorso_$i/" $tmp_dir/.tmp1.$user.$$>$tmp_dir/.tmp2.$user.$$
>>>> # sed "s/energy_${i}up_$i/energyup_$i/" $tmp_dir/.tmp2.$user.$$>$tmp_dir/.tmp1.$user.$$
>>>> # sed "s/energy_${i}dn_$i/energydn_$i/" $tmp_dir/.tmp1.$user.$$>$tmp_dir/.tmp2.$user.$$
>>>> # sed "s/energy_${i}so_$i/energyso_$i/" $tmp_dir/.tmp2.$user.$$>$tmp_dir/.tmp1.$user.$$
>>>> # sed "s/energyso_${i}dn_$i/energysodn_${i}/" $tmp_dir/.tmp1.$user.$$>$tmp_dir/.tmp2.$user.$$
>>>> # sed "s/energy_${i}dum_$i/energydum_$i/" $tmp_dir/.tmp2.$user.$$>$tmp_dir/.tmp1.$user.$$
>>>> # sed "s/vector_${i}so_${i}dn_$i/vectorsodn_$i/" $tmp_dir/.tmp1.$user.$$>$tmp_dir/.tmp2.$user.$$
>>>> # sed "s/vector_${i}dum_${i}dn_$i/vectordumdn_$i/" $tmp_dir/.tmp2.$user.$$>"$def"_$i.def
>>>> @ i ++
>>>> end
>>>> 
>>>> As you can see, all these additional sed commands are now commented out, 
>>>> since after the modifications to the .script.xx file they are no longer 
>>>> needed.
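
To illustrate why the follow-up sed commands become unnecessary, here is a 
small sketch in POSIX sh (not the csh of lapw2para). The sample def-file line 
and the empty value of dnup are assumptions made for the example; real def 
files may look different. The old unanchored rule for "vector" also matches 
inside an already renamed "vectorso_1", which is what the extra sed calls had 
to repair. The new rule requires the closing quote right after the name, so 
it leaves that entry alone.

```sh
# Assumption: a def-file entry looks roughly like this; the exact layout
# of WIEN2k def files may differ.
line="10,'case.vectorso', 'unknown','unformatted',9000"
i=1
dnup=""   # assumed empty, as in a run without spin flags

# Old rules: the unanchored "vector" rule also hits "vectorso_1".
old=$(printf '%s\n' "$line" | sed -e "s/vectorso$dnup/&_$i/" -e "s/vector$dnup/&_$i/")

# New rules: "vector" must be followed by the closing quote to match.
new=$(printf '%s\n' "$line" | sed -e "s/vectorso$dnup/&_$i/" -e "s/vector$dnup'/vector${dnup}_$i'/")

echo "old: $old"
echo "new: $new"
```

With the old rules the name comes out as vector_1so_1 and needs a second pass; 
with the anchored rules it is already vectorso_1 after the first sed call, so 
fewer temporary files are written and read.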
>>>> 
>>>> 
>>>> On 03/08/2016 03:44 PM, Maciej Polak wrote:
>>>>> Dear WIEN2k users and developers,
>>>>> 
>>>>> I encountered a very strange problem. Sometimes (50/50 chance), 
>>>>> calculations using just k-parallel will not finish. The exact same 
>>>>> case, when submitted again (sometimes it takes several tries), finishes 
>>>>> with no problem. Sometimes it crashes after a few iterations, sometimes 
>>>>> after a hundred or more, and sometimes it just finishes successfully.
>>>>> 
>>>>> This is what I get in the output:
>>>>> 
>>>>> sed: can't read /tmp/.tmp2.mpolak.50255: No such file or directory
>>>>> cp: cannot stat `.in.tmp': No such file or directory
>>>>> 
>>>>> 
>>>>> There is also an error in stderr:
>>>>> 
>>>>> forrtl: No such file or directory
>>>>> forrtl: severe (29): file not found, unit 20, file /lustre/scratch/tmp/pbs.1275300.achilles/MoS2_LDA/fort.20
>>>>> Image PC Routine Line Source
>>>>> lapw2 00000000004B3E37 Unknown Unknown Unknown
>>>>> lapw2 00000000004D5BE0 Unknown Unknown Unknown
>>>>> lapw2 000000000048140D MAIN__ 155 lapw2_tmp_.F
>>>>> lapw2 0000000000403F0E Unknown Unknown Unknown
>>>>> libc.so.6 00002B1F2CD53D5D Unknown Unknown Unknown
>>>>> lapw2 0000000000403E19 Unknown Unknown Unknown
>>>>> 
>>>>> 
>>>>> Do you have any idea what may be the cause of this? Running on just one 
>>>>> CPU is always fine. There is certainly no error in my input file, 
>>>>> because after a few tries this exact same case will eventually finish 
>>>>> correctly.
>>>>> 
>>>>> Thank you for your help
>>>>> 
>>>>> Maciej Polak
>>>>> _______________________________________________
>>>>> Wien mailing list
>>>>> Wien at zeus.theochem.tuwien.ac.at
>>>>> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>>>>> SEARCH the MAILING-LIST at: 
>>>>> http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
>>>>> 
>>>> 
>>>> -- 
>>>> 
>>>> P.Blaha
>>>> 
>>>> -------------------------------------------------------------------------- 
>>>> Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
>>>> Phone: +43-1-58801-165300 FAX: +43-1-58801-165982
>>>> Email: blaha at theochem.tuwien.ac.at WIEN2k: http://www.wien2k.at
>>>> WWW: http://www.imc.tuwien.ac.at/staff/tc_group_e.php
>>>> 
>>>> -------------------------------------------------------------------------- 
>>> 
>> 
>
>

