[Wien] Running Parallel Jobs
jadhikari at clarku.edu
Wed Oct 18 16:58:12 CEST 2006
Dear Prof. P Blaha,
Thank you very much for the help.
We will proceed according to your suggestions.
Regards,
Subin Adhikari
> Is this a "reasonable" case (of size > 5000)?
> Does it run with the sequential version?
>
> If yes, maybe your scalapack installation has a problem.
>
> Also your script does not seem right. In your email there are 2 lines (I
> don't know if they are broken accidentally):
>
> > /usr/local/mpich/bin/mpirun -np 4 -machinefile $PBS_NODEFILE
> > /usr/opt/WIEN2k/run_lapw -p
>
> The first line does not contain any "command" (no program for mpirun to
> start) and is not right; this is what "Missing: program name" refers to.
>
> For testing you could use something like (in one line, not broken!):
> /usr/local/mpich/bin/mpirun -np 4 -machinefile $PBS_NODEFILE /usr/opt/WIEN2k/lapw1 lapw1_1.def
>
> This runs only lapw1 in parallel and should be the same as
> x lapw1 -p
> when you have configured the parallel run (siteconfig) correctly. (The
> WIEN2k scripts will insert the mpirun commands automatically).
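
The one-line test command above can be sketched as a small shell fragment. The paths, the process count 4 and the file name lapw1_1.def are taken from the message; the helper name build_lapw1_test_cmd is hypothetical and only prints the command, so that it can be inspected before it is actually run.

```shell
#!/bin/sh
# Sketch only: paths and process count are copied from the message above.
MPIRUN=/usr/local/mpich/bin/mpirun
LAPW1=/usr/opt/WIEN2k/lapw1
NP=4

# Hypothetical helper: print the test command as ONE logical line.
#   $1 = machine file (e.g. "$PBS_NODEFILE"), $2 = def file (e.g. lapw1_1.def)
build_lapw1_test_cmd() {
    echo "$MPIRUN -np $NP -machinefile $1 $LAPW1 $2"
}

# In a job script the command itself would be written like this; the
# trailing backslash keeps both physical lines part of one command:
#   $MPIRUN -np $NP -machinefile "$PBS_NODEFILE" \
#       $LAPW1 lapw1_1.def
```

Printing the command first makes it easy to see whether the program name ended up on the same line as mpirun.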
>
> A full scf cycle is executed as usual with
>
> run_lapw -p
>
> and the selection between k-point parallel and mpi-parallel comes only
> from the corresponding generation of the .machines file.
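
As an illustration of how the .machines file could select the mode, here is a minimal sketch. It assumes the .machines layout as I understand it from the WIEN2k user's guide: separate "1:host" lines request k-point parallelism, while a single "1:host host ..." line requests one mpi-parallel lapw1 job. The function name make_machines is hypothetical; please check the format against your own WIEN2k version.

```shell
#!/bin/sh
# Hypothetical helper: write .machines content to stdout from a node list.
#   $1 = mode: "kpoint" (one "1:host" line per node entry)
#              "mpi"    (all node entries on a single "1:" line)
#   $2 = node file, one host name per line (e.g. "$PBS_NODEFILE")
make_machines() {
    mode=$1
    nodefile=$2
    case "$mode" in
        kpoint)
            awk '{print "1:" $1}' "$nodefile"
            ;;
        mpi)
            awk 'BEGIN {printf "1:"}
                 {printf "%s%s", (NR > 1 ? " " : ""), $1}
                 END {print ""}' "$nodefile"
            ;;
        *)
            echo "usage: make_machines kpoint|mpi nodefile" >&2
            return 1
            ;;
    esac
    echo "granularity:1"
}

# Usage inside a PBS job (no mpirun in front of run_lapw):
#   make_machines mpi "$PBS_NODEFILE" > .machines
#   /usr/opt/WIEN2k/run_lapw -p
```

With this layout the job script itself never calls mpirun; it only writes .machines and then starts run_lapw -p.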
>
> jadhikari at clarku.edu schrieb:
>> Hello,
>> Thank you very much for your previous help.
>>
>> We are trying to run a job in a parallel mode for the first time.
>> But it never runs past the LAPW1 step.
>> Here is the PBS script:
>> ________________________________________
>> #!/bin/sh
>> #PBS -l nodes=2:ppn=2:myrinet
>> #PBS -j oe
>> cd /home/lsmith/rut
>> rm -f .machines
>> awk '{print "1:"$1}' $PBS_NODEFILE > .machines
>> echo "granularity:1" >> .machines
>> /usr/local/mpich/bin/mpirun -np 4 -machinefile $PBS_NODEFILE
>> /usr/opt/WIEN2k/run_lapw -p
>> ________________________________________
>> Error message:
>>
>> Missing: program name
>> LAPW0 END
>> LAPW1 END
>> LAPW1 END
>> LAPW1 END
>> LAPW1 END
>> LAPW1 END
>>
>> We also got another error, as follows:
>>
>> --Cholesky INFO = 895
>> 'SECLR4' - POTRF (Scalapack/LAPACK) failed.
>>
>> Is it the case.in1 error or the script error? We could not figure it out.
>> The nodes run for 3-4 minutes and then stop running. TESTPARA1 shows this
>> error.
>>
>> Regards,
>> Subin Adhikari
>> _______________________________________________
>> Wien mailing list
>> Wien at zeus.theochem.tuwien.ac.at
>> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>
> --
>
> P.Blaha
> --------------------------------------------------------------------------
> Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
> Phone: +43-1-58801-15671 FAX: +43-1-58801-15698
> Email: blaha at theochem.tuwien.ac.at WWW:
> http://info.tuwien.ac.at/theochem/
> --------------------------------------------------------------------------