[Wien] error in lapw2 - parallel

Lyudmila Dobysheva lyuka17 at mail.ru
Fri Mar 22 09:12:31 CET 2013


22.03.2013 11:43, Mathrubutham Rajagopalan wrote:
> lapw1  -p   -c  (11:23:10) starting parallel lapw1 at Mon Mar 24 11:23:10 IST 2003
>  4 number_of_parallel_jobs  ubuntu(11) ubuntu(11) ubuntu(11) ubuntu(11) ubuntu(1)
> ubuntu(1) ubuntu(1)    Summary of lapw1para:
>  ubuntu   k=11    user=0  wallclock=11
> 0.0u 0.0s 0:02.29 5.2% 0+0k 0+2040io 0pf+0w
> lapw2 -p   -c   (11:23:12) running LAPW2 in parallel mode **  LAPW2 crashed!
> 0.0u 0.0s 0:00.11 45.4% 0+0k 0+184io 0pf+0w

1. Although the error files after lapw1 are empty, it looks like lapw1 
did not actually run:
 > 0.0u 0.0s 0:02.29 5.2% 0+0k 0+2040io 0pf+0w
There is no sign of any processor work here (0.0 s of user time). Look 
carefully at the lapw1 output in *.scf1 and *.output1.
Are they OK? It looks like the job stopped in lapw1para without 
actually running lapw1.
Check the file parallel_options in the directory /home/raja/wien2k/

I think in your case it should contain something like:
setenv USE_REMOTE 0
setenv MPI_REMOTE 0
setenv WIEN_GRANULARITY 1
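
A quick way to check point 1 is to test whether lapw1 left any 
non-empty output behind. The following is only a sketch in POSIX sh: 
check_lapw1_outputs is a hypothetical helper, not a WIEN2k command, and 
the file patterns in the usage comment are an assumption about how the 
parallel run names its files.

```shell
# Hypothetical helper (not part of WIEN2k): report whether each given
# file exists and is non-empty. Returns 1 if any file is missing or empty.
check_lapw1_outputs() {
    status=0
    for f in "$@"; do
        if [ -s "$f" ]; then
            echo "ok      $f"
        else
            echo "EMPTY   $f"   # missing files are reported here too
            status=1
        fi
    done
    return $status
}

# Assumed usage from inside the case directory (patterns are a guess
# at the parallel file names, adjust them to what ls actually shows):
# check_lapw1_outputs *.scf1* *.output1*
```

If everything is reported as EMPTY, lapw1 itself most probably never 
started, which would agree with the zero CPU time above.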

2. You have some remnants of previous iterations:
 >         8 -rw-rw-r-- 1 raja raja 0 Mar 24 11:55 lcore.error
 >         8 -rw-rw-r-- 1 raja raja 0 Mar 24 11:55 mixer.error

Create a fresh directory and run one cycle in it.
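
Setting up the fresh directory could look roughly like this. It is a 
sketch only: fresh_case is a hypothetical helper, and it assumes the 
usual convention that the case files are named after the directory 
(old/old.struct becomes new/new.struct) and that a .machines file, if 
present, is what the -p runs read.

```shell
# Hypothetical helper (not part of WIEN2k).
# Usage: fresh_case OLD_CASE_DIR NEW_CASE_DIR
# Copies only the struct file (renamed to match the new directory)
# and, if present, the .machines file. Nothing else is carried over,
# so no remnants of previous iterations end up in the new directory.
fresh_case() {
    old_dir=$1
    new_dir=$2
    old_name=$(basename "$old_dir")
    new_name=$(basename "$new_dir")
    if [ ! -f "$old_dir/$old_name.struct" ]; then
        echo "no $old_name.struct in $old_dir" >&2
        return 1
    fi
    if [ -e "$new_dir" ]; then
        echo "$new_dir already exists" >&2
        return 1
    fi
    mkdir -p "$new_dir" || return 1
    cp "$old_dir/$old_name.struct" "$new_dir/$new_name.struct" || return 1
    if [ -f "$old_dir/.machines" ]; then
        cp "$old_dir/.machines" "$new_dir/.machines" || return 1
    fi
    return 0
}
```

The new directory then has to be initialized in the usual way before 
the cycle is run.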

3. Run these commands by hand in a terminal and send us the output:
x lapw0
x lapw1 -c -p
x lapw2 -c -p
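
To collect the output of these three steps in one file for the list, 
something like the following could be used. It is a sketch in POSIX 
sh; run_logged and the log file name steps.log are made up for this 
example.

```shell
# Hypothetical helper: run a command, append its stdout and stderr
# to a log file, and record the exit status.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    log=$1
    shift
    echo "### $*" >> "$log"
    rc=0
    "$@" >> "$log" 2>&1 || rc=$?
    echo "### exit status: $rc" >> "$log"
    return $rc
}

# Assumed usage from inside the case directory:
# run_logged steps.log x lapw0
# run_logged steps.log x lapw1 -c -p
# run_logged steps.log x lapw2 -c -p
```

The exit status lines show which step failed first, so it is easier 
to see whether the problem already starts in lapw1.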

Best wishes
Lyudmila Dobysheva
------------------------------------------------------------------
Phys.-Techn. Institute of Ural Br. of Russian Ac. of Sci.
426001 Izhevsk, ul.Kirova 132
RUSSIA
------------------------------------------------------------------
Tel.:7(3412) 442118 (home), 218988(office), 722529(Fax)
E-mail: lyu at ftiudm.ru
         lyuka17 at mail.ru (office) lyuka17 at gmail.com (home)
Skype:  lyuka17 (home), lyuka18 (office)
http://fti.udm.ru/content/view/25/103/lang,english/
------------------------------------------------------------------

