[Wien] Running Parallel Jobs

jadhikari at clarku.edu
Wed Oct 18 16:58:12 CEST 2006


Dear Prof. P Blaha,
Thank you very much for the help.
We will proceed according to your suggestions.

Regards,
Subin Adhikari

> Is this a "reasonable" case (of size > 5000)?
> Does it run with the sequential version ?
>
> If yes, maybe your scalapack installation has a problem.
>
> Also your script does not seem right. In your email there are 2 lines (I
> don't know if they are broken accidentally):
>
>  > /usr/local/mpich/bin/mpirun -np 4 -machinefile $PBS_NODEFILE
>  > /usr/opt/WIEN2k/run_lapw -p
>
> The first line does not have any "command" and is not right.
>
> For testing you could use something like (in one line, not broken!):
> /usr/local/mpich/bin/mpirun -np 4 -machinefile $PBS_NODEFILE /usr/opt/WIEN2k/lapw1 lapw1_1.def
>
> This runs only lapw1 in parallel and should be the same as
> x lapw1 -p
> when you have configured the parallel run (siteconfig) correctly. (The
> WIEN2k scripts will insert the mpirun commands automatically).
>
> A full SCF cycle is executed as usual:
>
> run_lapw -p
>
> and the selection between k-point parallel and mpi-parallel comes only
> from the corresponding generation of the .machines file.
>
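
The point about the .machines file can be illustrated with a small sketch. It assumes the usual WIEN2k convention that one line per host requests k-point parallelism, while a single line listing several hosts requests one MPI job across them. The MODE variable, the make_machines helper, and the fallback node file are hypothetical names used only for illustration; they are not part of WIEN2k or PBS.

```shell
#!/bin/sh
# Sketch: build a WIEN2k .machines file from a PBS node list.
# MODE, make_machines and nodes.example are illustrative assumptions.
MODE=${MODE:-kpoint}
NODEFILE=${PBS_NODEFILE:-./nodes.example}

# Outside PBS, create a small made-up node list so the sketch can be tried.
if [ ! -f "$NODEFILE" ]; then
    printf 'node1\nnode1\nnode2\nnode2\n' > "$NODEFILE"
fi

make_machines() {
    # $1 = mode (kpoint or mpi), $2 = node file; prints .machines content
    if [ "$1" = "mpi" ]; then
        # all hosts on one line: a single MPI job spread over all processors
        printf '1:%s\n' "$(tr '\n' ' ' < "$2" | sed 's/ *$//')"
    else
        # one line per host: k-points are distributed over the processors
        awk '{print "1:" $1}' "$2"
    fi
    echo "granularity:1"
}

make_machines "$MODE" "$NODEFILE" > .machines
cat .machines
```

With .machines generated this way, the job script would end with a plain /usr/opt/WIEN2k/run_lapw -p line and no mpirun wrapper, since the WIEN2k scripts insert the mpirun commands themselves.
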
> jadhikari at clarku.edu wrote:
>> Hello,
>> Thank you very much for the previous help.
>>
>> We are trying to run a job in a parallel mode for the first time.
>> But it never runs past the LAPW1 step.
>> Here is the PBS script:
>> ________________________________________
>> #!/bin/sh
>> #PBS -l nodes=2:ppn=2:myrinet
>> #PBS -j oe
>> cd /home/lsmith/rut
>> rm -f .machines
>> awk '{print "1:"$1}' $PBS_NODEFILE > .machines
>> echo "granularity:1" >> .machines
>> /usr/local/mpich/bin/mpirun -np 4 -machinefile $PBS_NODEFILE
>> /usr/opt/WIEN2k/run_lapw -p
>> ________________________________________
>> Error message:
>>
>> Missing: program name
>>  LAPW0 END
>>  LAPW1 END
>>  LAPW1 END
>>  LAPW1 END
>>  LAPW1 END
>>  LAPW1 END
>>
>> We also got another error:
>>
>> --Cholesky INFO =           895
>>  'SECLR4' - POTRF (Scalapack/LAPACK) failed.
>>
>> Is this an error in case.in1 or in the script? We could not figure it out.
>> The nodes run for 3-4 minutes and then stop. TESTPARA1 shows this error.
>>
>> Regards,
>> Subin Adhikari
>> _______________________________________________
>> Wien mailing list
>> Wien at zeus.theochem.tuwien.ac.at
>> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>
> --
>
>                                        P.Blaha
> --------------------------------------------------------------------------
> Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
> Phone: +43-1-58801-15671             FAX: +43-1-58801-15698
> Email: blaha at theochem.tuwien.ac.at    WWW: http://info.tuwien.ac.at/theochem/
> --------------------------------------------------------------------------