Hi all,<div><br></div><div>I have managed to run Wien2k on our cluster with k-point parallelization. However, it looks like our network file system (which is actually AFS, not NFS) is still a bit unstable, since the cluster was upgraded and re-assembled very recently. The problem is that the sysadmins have gone on vacation, so I'll have to work around this as best I can until the beginning of next month.</div>
<div><br></div><div>My current problem is that some nodes of our cluster seem to be losing their connection to the AFS server intermittently, and from what I see (please correct me if I'm wrong) all the writing is done over the network to the home directory. So, if the connection is lost while the energyup files are being written, the files end up incomplete and lapw2 will crash. Indeed, one of the lapw1 instances ended up producing an energyup file of size 0. This in turn made lapw2 crash, and it happened overnight.</div>
<div><br></div><div>What I would like to do is make a small (I guess) change in the scripts, wherever needed. Instead of writing some files (only the ones that are critical for the execution of the next program) to the home directory, which goes over AFS, they would be written to the scratch directory, which is local. Then, at the end of the execution, they would be copied to the home directory, possibly with a check on the success of the operation. I don't know if this would be better, but at least the network load would be concentrated at a few well-defined moments, and it would also be easier to check for errors.</div>
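<div><br></div><div>To make the idea concrete, here is a rough bash sketch of the "copy back and check" step I have in mind. It is only an illustration: the function name copy_back_checked, the retry count and the RETRY_DELAY variable are all made up by me, and nothing here is taken from the actual Wien2k scripts (which are csh anyway).</div>

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT part of Wien2k: copy one file from the local
# scratch directory to a directory on the network file system and verify it.
# Usage: copy_back_checked <source file> <destination dir> [max tries]
copy_back_checked() {
    local src=$1 dest_dir=$2 tries=${3:-3}
    local dest="$dest_dir/$(basename "$src")"
    local i

    # A missing or zero-size source means the producing step already failed,
    # so there is no point in copying it.
    if [ ! -s "$src" ]; then
        echo "ERROR: $src is missing or empty" >&2
        return 1
    fi

    for ((i = 1; i <= tries; i++)); do
        # cmp -s compares the files byte by byte, so a truncated or
        # empty copy on the destination side is detected.
        if cp "$src" "$dest" && cmp -s "$src" "$dest"; then
            return 0
        fi
        echo "WARNING: copy of $src failed (attempt $i of $tries)" >&2
        # Wait before retrying; RETRY_DELAY is an assumed, optional setting.
        sleep "${RETRY_DELAY:-5}"
    done

    echo "ERROR: giving up on $src after $tries attempts" >&2
    return 1
}
```

<div><br></div><div>The calling script would then run something like copy_back_checked /scratch/case.energyup_1 /path/to/home/case (placeholder paths) after each lapw1 instance finishes, and stop before lapw2 if it returns non-zero.</div>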
<div><br></div><div>Since I do not have much knowledge of csh programming (I'm mostly a bash guy), and the Wien2k scripts are pretty complex beasts with which I am not very familiar, could you give me your opinions on the feasibility of my suggestion? And, if it is not too complex to implement, could you point out possible changes and/or the places in the scripts that would need to be changed?</div>
<div><br></div><div>Best regards,</div><div><br></div><div>Marcos</div>