[Wien] lapw2para exited due to an ERROR

Peter Blaha pblaha at theochem.tuwien.ac.at
Wed Aug 29 11:27:02 CEST 2007


Your fileserver (or your network) is overloaded and cannot handle so many
requests (are 44 parallel jobs too many?).

Please "think" and "check" what a meaningful number of parallel jobs is!
More processors can even run SLOWER than fewer processors because of network
overload.

With 140 k-points, I suspect it is a fairly small system and each
k-point takes only a few seconds. In such a case I'd use only a few
processors, and in particular a number that divides 140 without remainder.
Thus 44 processors is certainly a very bad choice. Try 10, 20 or 35
(depending on system size AND network speed).
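
The divisibility argument can be checked with a few lines of Python. This is
only a rough sketch under a simplifying assumption that is not taken from
WIEN2k itself: every k-point costs the same and each processor works through
its k-points one after another. The helper names (divisors, load_summary) are
made up for this illustration, and the real lapw1para/lapw2para distribution
also depends on the granularity and extrafine settings.

```python
import math


def divisors(n):
    """Return all positive integers that divide n without remainder."""
    return [d for d in range(1, n + 1) if n % d == 0]


def load_summary(n_kpoints, n_procs):
    """Return (rounds, idle_slots) for an even spread of k-points.

    rounds     : k-points handled by the busiest processor
    idle_slots : processor slots left empty because n_procs does not
                 divide n_kpoints (assumes equal cost per k-point)
    """
    rounds = math.ceil(n_kpoints / n_procs)
    idle_slots = rounds * n_procs - n_kpoints
    return rounds, idle_slots


N_KPOINTS = 140  # value from the original post

if __name__ == "__main__":
    print("processor counts that divide", N_KPOINTS, ":", divisors(N_KPOINTS))
    for n_procs in (10, 20, 35, 44):
        rounds, idle = load_summary(N_KPOINTS, n_procs)
        print(f"{n_procs:3d} processors: {rounds} rounds, {idle} idle slots")
```

In this simple model 35 and 44 processors both need 4 rounds, so the 9 extra
processors buy nothing and only add load on the fileserver.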

Two more hints:
You can try to trick slow NFS systems by increasing some time-delays in
lapw1para and lapw2para:
set delay       = 1             # delay launching of processes by n seconds
set sleepy      = 1             # additional sleep before checking

You can reduce NFS traffic by
    a) using the latest WIEN2k version (no help-files)
    b) using a SCRATCH variable (a local directory like /tmp ,...)
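
Before relying on hint b) it may be worth checking that SCRATCH is actually
set and points to a writable directory on each node. The following is a
minimal sketch, not part of WIEN2k; the function name check_scratch is
hypothetical, and the check does not tell you whether the directory is really
local (as opposed to NFS-mounted), which you still have to verify yourself.

```python
import os


def check_scratch(env=None):
    """Return (ok, message) describing whether SCRATCH looks usable.

    env defaults to the current process environment; a plain dict can be
    passed in for testing.
    """
    env = os.environ if env is None else env
    path = env.get("SCRATCH")
    if not path:
        return False, "SCRATCH is not set"
    if not os.path.isdir(path):
        return False, f"SCRATCH={path} is not a directory"
    # Write and search permission are both needed to create files there.
    if not os.access(path, os.W_OK | os.X_OK):
        return False, f"SCRATCH={path} is not writable"
    return True, f"SCRATCH={path} exists and is writable"


if __name__ == "__main__":
    ok, message = check_scratch()
    print(message)
```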

Regards

李海铭 wrote:
> Hello, wien users
>     I have a problem when using the latest WIEN2k_07. The SCF cycle stops
> after a few iterations (3 or more). The error file is dnlapw2.error:
> **  testerror: Error in Parallel LAPW2
>     My .machines file is:
> granularity:1
> 1:cu01
> 1:cu01
> 1:cu02
> 1:cu02
> 1:cu03
> 1:cu03
> 1:cu04
> 1:cu04
> 1:cu05
> 1:cu05
> 1:cu06
> 1:cu06
> 1:cu07
> 1:cu07
> 1:cu08
> 1:cu08
> 1:cu09
> 1:cu09
> 1:cu10
> 1:cu10
> 1:cu11
> 1:cu11
> 1:cu12
> 1:cu12
> 1:cu13
> 1:cu13
> 1:cu14
> 1:cu14
> 1:cu15
> 1:cu15
> 1:cu16
> 1:cu16
> extrafine:1
>     There are 140 k-points for the whole calculation, so STDOUT should list
> "FORTRAN STOP  LAPW2 END" 44 times. That is true for uplapw2, but there are
> only 43 "FORTRAN STOP  LAPW2 END" lines for dnlapw2. At the end it lists:
> o_ypcall: clnt_call: RPC: Timed out
> cp: cannot stat `.in.tmp': No such file or directory
> rm: cannot remove `.in.tmp': No such file or directory
> rm: cannot remove `.in.tmp1': No such file or directory
> I suspect that one of the dnlapw2 jobs hit an error. I have also run into
> this many times with other examples.
>     I can't fix it and need your help.
>     Best wishes!
>                            Haiming Li
>                            2007-08-28
> --------------
> 李海铭
> Synchrotron Radiation Laboratory, Institute of High Energy Physics,
> Chinese Academy of Sciences
> Haiming Li
> Beijing Synchrotron Radiation Facility
> Institute of High Energy Physics
> Chinese Academy of Sciences
> 19 Yu Quan Lu, 100049 Beijing, P.R. China
> Tel: 0086+10 8823 6437  /  0086+135 8190 2824
> E-mail: lihm at ihep.ac.cn

-- 

                                      P.Blaha
--------------------------------------------------------------------------
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-15671             FAX: +43-1-58801-15698
Email: blaha at theochem.tuwien.ac.at    WWW: http://info.tuwien.ac.at/theochem/
--------------------------------------------------------------------------

