[Wien] Question about parallel mode calculation
Dong YuHui
dongyh at mail.ihep.ac.cn
Fri Apr 16 02:43:55 CEST 2004
Sorry Jorissen, I have solved the problem. You are right, it is not
necessary to use MPI for a k-point parallel job. The problem I had was
caused by my PC farm. After I changed the ssh configuration as required
by the User's Guide (UG), it worked.
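
For reference, a minimal sketch of a .machines file for k-point
parallelization without MPI, of the kind Kevin describes in the quoted
message. The host names node1 and node2 are placeholders, not machines
from this thread. Each "1:host" line is assumed to start one k-point
parallel job on that host with relative weight 1, so with two such lines
lapw1para should report more than one parallel job.

```
# node1 and node2 are placeholder host names (assumption)
1:node1
1:node2
granularity:1
extrafine:1
```

Every host listed must accept an ssh login from the submitting machine
without a password prompt, which is the part that was misconfigured here.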
On Thu, 15 Apr 2004, Jorissen Kevin wrote:
> Hi, I can't solve your problems right away. Just a few thoughts:
> * k-point parallelization and MPI (fine-grained) parallelization are two completely different things. You can use k-point parallelization without MPI! Since MPI usually takes more effort than k-point parallelization to get working, I advise you to focus on k-point parallelization first.
> * If you're working on a large machine that you don't control yourself, you should probably be in touch with your system administrator about things like MPI, passwordless access to machines, ...
> * lapw1para says it is going to run just one parallel job. That means no k-point parallelization!
> Probably your .machines file is wrong. Does it contain only one line? It should contain more!
>
> good luck,
>
> Kevin.
>
> -----Original Message-----
> From: Dong YuHui [mailto:dongyh at mail.ihep.ac.cn]
> Sent: Thu 4/15/2004 4:22 AM
> To: wien at zeus.theochem.tuwien.ac.at
> Cc:
> Subject: [Wien] Question about parallel mode calculation
>
>
>
> Dear Wien users,
> I want to run a parallel calculation on a PC farm, but I cannot find
> enough information about how to do it.
> I tried k-point parallelization and it failed. The dayfile reported:
>
> Calculating ZnO_para in /home/wien2k/w2work/ZnO_para
> on bsrf-serv.ihep.ac.cn
>
> start (Thu Apr 15 08:57:31 CST 2004) with lapw0 (20/20 to go)
> > lapw0 -p (08:57:31) starting parallel lapw0 at Thu Apr 15 08:57:31 CST 2004
> --------
> running lapw0 in single mode
> 3.760u 0.110s 0:03.86 100.2% 0+0k 0+0io 2052pf+0w
> > lapw1 -c -p (08:57:35) starting parallel lapw1 at Thu Apr 15 08:57:35 CST 2004
> -> starting parallel LAPW1 jobs at Thu Apr 15 08:57:35 CST 2004
> running LAPW1 in parallel mode (using .machines)
> 1 number_of_parallel_jobs
> ** LAPW1 crashed!
> 0.100u 0.140s 0:03.20 7.5% 0+0k 0+0io 10315pf+0w
>
> > stop error
>
> STDOUT reported:
> FORTRAN STOP LAPW0 END
> cat: No match.
>
> testpara passed, testpara1 failed.
>
> I also checked the file lapw1para_lapw, and found
> if ( $?WIEN_MPIRUN ) then
> set mpirun = "$WIEN_MPIRUN"
> else
> set mpirun='mpirun -np _NP_ _EXEC_'
> endif
>
> I do not know how to define WIEN_MPIRUN. It seems to be essential for
> k-point parallelization. I also installed mpich-1.2.5 and set
> setenv WIEN_MPIRUN /home/wien2k/mpi/mpich-1.2.5/bin/mpirun
>
> Before I set WIEN_MPIRUN, the program reported "/root/bin/mpirun:
> Permission denied". Now this message is gone, but testpara1 still fails.
>
> Does anybody have any suggestions or ideas about how to proceed? Many thanks.
>
> Yu-Hui Dong
>
>
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien