[Wien] Question about parallel mode calculation

Dong YuHui dongyh at mail.ihep.ac.cn
Fri Apr 16 02:43:55 CEST 2004


Sorry Jorissen, I have solved the problem. You are right: MPI is not
necessary for a k-point parallel job. The problem I met was due to my
PC farm. After I changed the ssh configuration as required by the
User's Guide (UG), it works.
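
For anyone hitting the same issue: the parallel jobs are started on the other machines through a remote shell, so each host has to accept logins without a password prompt. Below is a minimal Python sketch for checking that. It is not part of WIEN2k; the function names are hypothetical, and it assumes an OpenSSH client, whose `BatchMode=yes` option makes ssh fail instead of prompting.

```python
import subprocess


def ssh_check_command(host):
    """Build an ssh command that fails instead of asking for a password."""
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, "true"]


def can_login_without_password(host, run=subprocess.run):
    """Return True if running `true` on `host` over ssh succeeds non-interactively.

    `run` is injectable so the check can be exercised without a network.
    """
    try:
        result = run(
            ssh_check_command(host),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # The ssh client is missing or could not be started.
        return False
    return result.returncode == 0
```

Calling `can_login_without_password("node1")` (with `node1` standing in for a real host name) for every machine in the farm should return `True` before a k-point parallel run is attempted.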


On Thu, 15 Apr 2004, Jorissen Kevin wrote:

> Hi, I can't solve your problems right away. Just a few thoughts:
> * k-point parallelization and MPI (fine-grained) parallelization are two
>   completely different things. You can use k-point parallelization
>   without MPI! Since MPI usually takes more effort than k-point
>   parallelization to get working, I advise you to focus on k-point
>   parallelization first.
> * If you're working on a large machine that you don't control yourself,
>   you should probably get in touch with your system administrator for
>   things like MPI, passwordless access to machines, ...
> * lapw1para says it's going to run just one parallel job. That means no
>   k-point parallelization!
> Probably your .machines file is wrong. Does it contain only one line?
> It should contain more!
>  
> good luck,
>  
> Kevin.
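
To illustrate the point about the .machines file: for k-point parallelization it lists one line per parallel job, so a file with a single host line would give the `1 number_of_parallel_jobs` seen in the dayfile. Below is a minimal Python sketch that builds such a file. It assumes the `weight:host` line format and the `granularity` keyword as I recall them from the WIEN2k User's Guide, so check them against the UG; the helper name and the host names are hypothetical.

```python
def build_machines(hosts, weight=1, granularity=1):
    """Return the text of a .machines file with one k-point job per host.

    Assumed format (verify against the User's Guide): one `weight:host`
    line per parallel job, followed by a `granularity:N` line.
    """
    if len(hosts) < 2:
        raise ValueError("k-point parallelization needs more than one job line")
    lines = [f"{weight}:{host}" for host in hosts]
    lines.append(f"granularity:{granularity}")
    return "\n".join(lines) + "\n"


def write_machines(path, hosts):
    """Write the generated text to `path` (normally .machines in the case directory)."""
    with open(path, "w") as handle:
        handle.write(build_machines(hosts))
```

For example, `build_machines(["node1", "node2"])` produces two job lines, `1:node1` and `1:node2`, followed by `granularity:1`.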
> 
> 	-----Original Message----- 
> 	From: Dong YuHui [mailto:dongyh at mail.ihep.ac.cn] 
> 	Sent: Thu 4/15/2004 4:22 AM 
> 	To: wien at zeus.theochem.tuwien.ac.at 
> 	Cc: 
> 	Subject: [Wien] Question about parallel mode calculation
> 	
> 	
> 
> 	Dear Wien users,
> 	I want to run a parallel calculation on a PC farm, but I cannot find
> 	enough information about how to do it.
> 	I tried k-point parallelization and it failed. The dayfile reported:
> 	
> 	Calculating ZnO_para in /home/wien2k/w2work/ZnO_para
> 	on bsrf-serv.ihep.ac.cn
> 	
> 	    start       (Thu Apr 15 08:57:31 CST 2004) with lapw0 (20/20 to go)
> 	>   lapw0 -p    (08:57:31) starting parallel lapw0 at Thu Apr 15 08:57:31 CST 2004
> 	--------
> 	running lapw0 in single mode
> 	3.760u 0.110s 0:03.86 100.2%    0+0k 0+0io 2052pf+0w
> 	>   lapw1  -c -p        (08:57:35) starting parallel lapw1 at Thu Apr 15 08:57:35 CST 2004
> 	->  starting parallel LAPW1 jobs at Thu Apr 15 08:57:35 CST 2004
> 	running LAPW1 in parallel mode (using .machines)
> 	1 number_of_parallel_jobs
> 	**  LAPW1 crashed!
> 	0.100u 0.140s 0:03.20 7.5%      0+0k 0+0io 10315pf+0w
> 	
> 	>   stop error
> 	
> 	STDOUT reported:
> 	FORTRAN STOP  LAPW0 END
> 	cat: No match.
> 	
> 	testpara passed, testpara1 failed.
> 	
> 	I also checked the file lapw1para_lapw, and found
> 	if ( $?WIEN_MPIRUN ) then
> 	  set mpirun = "$WIEN_MPIRUN"
> 	else
> 	  set mpirun='mpirun -np _NP_ _EXEC_'
> 	endif
> 	
> 	I do not know how to define WIEN_MPIRUN. It seems to be essential for
> 	k-point parallelization. I also installed mpich-1.2.5 and set
> 	  setenv WIEN_MPIRUN /home/wien2k/mpi/mpich-1.2.5/bin/mpirun
> 	
> 	Before I set WIEN_MPIRUN, the program reported "/root/bin/mpirun:
> 	Permission denied". Now this message is gone, but testpara1 still fails.
> 	
> 	Does anybody have any suggestions or ideas about how to proceed? Many thanks.
> 	
> 	Yu-Hui Dong
> 	
> 	
