[Wien] Problem with k-parallel in version 24.1?
Yichen Zhang
zycforphysics at gmail.com
Wed Oct 16 00:49:42 CEST 2024
Dear WIEN2k developers and users,
I'm running WIEN2k 24.1 on a SLURM cluster. In the case here, only
k-point parallelization is used (no OpenMP or MPI). For this set of
calculations I typically split the klist into 64 groups, one per core, on
64 cores. Hyperthreading is turned off.
I encounter the following error intermittently. Sometimes all SCF cycles
finish successfully, but there is roughly a 20-40% chance that the SCF stops
at sumpara, right after lapw2, in some cycle. Restarting the SCF may then run
fine until convergence, or may hit the same problem again in a later cycle.
The error is that a file case.scf2up_XX or case.scf2dn_XX is not found, where
XX is one of the k-parallel process numbers (for example, between 1 and 64
when running 64 k-point parallel processes).
One example of this error in the SLURM standard output is:
forrtl: No such file or directory
forrtl: severe (29): file not found, unit 21, file /scratch/yz155/UUD_U6p25eV/UUD_U6p25eV.scf2dn_62
Image      PC                Routine            Line     Source
sumpara    000000000042876C  Unknown            Unknown  Unknown
sumpara    000000000041303A  scfsum_            128      scfsum.f
sumpara    0000000000410F92  MAIN__             242      sumpara.F
sumpara    000000000040434D  Unknown            Unknown  Unknown
libc.so.6  000014D975829590  Unknown            Unknown  Unknown
libc.so.6  000014D975829640  __libc_start_main  Unknown  Unknown
sumpara    0000000000404265  Unknown            Unknown  Unknown
cp: cannot stat '.in.tmp': No such file or directory
grep: No match.
> stop error
The missing file is sometimes an scf2up file and sometimes an scf2dn file,
and the index ("62" here) appears to be random among the k-parallel numbers.
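
As a diagnostic, here is a minimal Python sketch (not part of WIEN2k) that could be run in the case directory between lapw2 and sumpara. It lists which per-process scf2 files are absent, and can optionally wait for them in case the filesystem is slow to make them visible. The function names, the timeout, and the polling interval are illustrative assumptions; only the naming pattern case.scf2up_N / case.scf2dn_N is taken from the error above.

```python
import os
import time


def missing_scf2_files(workdir, case, n_parallel, spins=("up", "dn")):
    """Return the expected per-process scf2 files that do not exist.

    Assumes the files are named <case>.scf2<spin>_<i> with i = 1..n_parallel,
    as in the error message above.
    """
    missing = []
    for spin in spins:
        for i in range(1, n_parallel + 1):
            path = os.path.join(workdir, f"{case}.scf2{spin}_{i}")
            if not os.path.isfile(path):
                missing.append(path)
    return missing


def wait_for_scf2_files(workdir, case, n_parallel, timeout=60.0, poll=1.0):
    """Poll until all files exist or the timeout expires.

    Returns the list of files still missing (empty if all were found).
    The timeout and poll values are arbitrary placeholders.
    """
    deadline = time.monotonic() + timeout
    while True:
        missing = missing_scf2_files(workdir, case, n_parallel)
        if not missing or time.monotonic() >= deadline:
            return missing
        time.sleep(poll)


if __name__ == "__main__":
    # By WIEN2k convention the case name equals the directory name.
    cwd = os.getcwd()
    for path in wait_for_scf2_files(cwd, os.path.basename(cwd), 64):
        print("still missing:", path)
```

If a file is reported missing at first but shows up after a short wait, that would point towards delayed file visibility; if it never shows up, the corresponding lapw2 process probably did not write it at all.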
I noticed a thread from 2016 in which Maciej Polak asked about "Problem
with k-parallel", but I guess much has been updated since then.
Could this still be caused by slow I/O? I already run in /scratch, which has
the fastest I/O on the cluster. Any insights or suggestions would be welcome.
Thank you very much in advance.
Best regards
Yichen
--
Yichen Zhang
Department of Physics and Astronomy
Rice University
6100 Main St., Houston, TX 77005-1892
Email: zycforphysics at gmail.com