[Wien] Cannot run kpoint parallel jobs - only serial. _An offer to developers_

Lyudmila Dobysheva lyuka17 at mail.ru
Thu May 30 10:00:29 CEST 2013


29.05.2013 23:58, Robert Nichol wrote:
> If I submit the script for k-point parallelization, lapw2 crashes.
> Contents of case.dayfile:
  n0523(1) 0.094u 0.014s 0.12 84.38%      0+0k 0+0io 0pf+0w
    n0523	 k=11	 user=0.88	 wallclock=622.038
0.974u 3.769s 0:07.71 61.3%	0+0k 424+11064io 4pf+0w

Dear Robert,

It looks like lapw1 does not run at all, because of a wrong setting in
the file parallel_options.

It should contain:
setenv USE_REMOTE 0

Two months ago there was already a letter about this problem here on
the mailing list ("error in lapw2 - parallel" of Mar 22 2013).
I'd like to suggest that the developers look into why the error files
are empty when lapw1 has not actually run.
Maybe the creation of the nonzero error file should be moved to an
earlier place in lapw1para. At present, when lapw1para fails because of
this wrong setenv option, the nonzero error files are never created, and
the old zero-size error files from a previous run remain in the
directory. totalexec checks testerror, concludes that everything is ok,
and goes on to lapw2.
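
Until something like this is changed, a user can at least check whether
the error files in the case directory belong to the current run. Below
is a minimal sketch of such a check in Python, not part of WIEN2k: the
function name stale_error_files and the file pattern "lapw1*.error" are
assumptions made for illustration. It lists error files whose
modification time is older than the start of the run, i.e. files that
the current lapw1 step never touched.

```python
import os
import time
from pathlib import Path


def stale_error_files(case_dir, run_started_at, pattern="lapw1*.error"):
    """Return names of error files NOT rewritten since the run started.

    An empty error file that is older than the start of the run says
    nothing about this run: it is a leftover from an earlier one.
    The pattern is an assumed naming scheme, adjust it to your case.
    """
    stale = []
    for path in sorted(Path(case_dir).glob(pattern)):
        # Modification time earlier than the run start: file was not touched.
        if path.stat().st_mtime < run_started_at:
            stale.append(path.name)
    return stale


if __name__ == "__main__":
    # Hypothetical usage: pretend the run started one hour ago and
    # inspect the current directory.
    started = time.time() - 3600
    for name in stale_error_files(os.getcwd(), started):
        print("not updated by this run:", name)
```

If the list is not empty after lapw1para has finished, an empty error
file should not be taken as proof that lapw1 worked.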

Best regards
   Lyudmila Dobysheva
------------------------------------------------------------------
Phys.-Techn. Institute of Ural Br. of Russian Ac. of Sci.
426001 Izhevsk, ul.Kirova 132
RUSSIA
------------------------------------------------------------------
Tel.:7(3412) 442118 (home), 218988(office), 722529(Fax)
E-mail: lyu at ftiudm.ru
         lyuka17 at mail.ru (office) lyuka17 at gmail.com (home)
Skype:  lyuka17 (home), lyuka18 (office)
http://fti.udm.ru/content/view/25/103/lang,english/
------------------------------------------------------------------



