[Wien] lapw2 mpi parallelization limits
Peter Blaha
pblaha at theochem.tuwien.ac.at
Mon Mar 16 21:34:43 CET 2009
We routinely run on more CPUs.
Hard to guess what could go wrong.
A possible test: use a .machines file with
1:compute-0-13 compute-0-13 compute-0-13 ....
(I have not tested the :8 instruction, although it should work)
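For example, a sketch of such a test .machines file (patterned after the
one quoted below, but with the hostname repeated explicitly instead of
using the :8 shorthand; each repetition requests one MPI process on that
host):

lapw0: compute-0-13 compute-0-13 compute-0-13 compute-0-13
1: compute-0-13 compute-0-13 compute-0-13 compute-0-13
granularity:1
extrafine:1

This asks for 4 lapw2_mpi processes on compute-0-13; add or remove
repetitions to change the count, and the :8 form should be equivalent
to writing the hostname 8 times.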
A possible patch: lapw2para uses the ".processes" file (generated by the
lapw1 step). You may want to edit it so that lapw2 uses fewer CPUs.
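If you try that, a cautious sequence would be something like the sketch
below (the exact line format of .processes depends on the WIEN2k
version, so inspect the file before changing it):

cp .processes .processes.bak   # keep what lapw1 generated
cat .processes                 # see which hosts / process counts it lists
vi .processes                  # trim the list so lapw2 uses fewer CPUs
x lapw2 -up -p                 # retry the step that crashed

The next lapw1 run should regenerate .processes, so the edit only
affects the lapw2 step that follows it.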
Scott Beardsley wrote:
> I'm running into a problem where lapw2 crashes when using more than
> 4 CPUs. I'm hoping someone else has been down this road before. Are
> there any limits to lapw2_mpi? A job using 4 CPUs runs fine, but one
> using 5 crashes with a timeout. I'm using strictly MPI-only
> parallelization with
> WIEN 9.01 (plus patches). The problem is with the following command
> (this runs after "x lapw2 -up -p"):
>
> $ mpirun -np 16 -machinefile .machine1 /path/to/wien/lapw2_mpi
> uplapw2_1.def 1
>
> My parallel_options looks like:
>
> setenv USE_REMOTE 0
> setenv WIEN_GRANULARITY 1
> setenv WIEN_MPIRUN "mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_"
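>
> (For reference, lapw2para should expand _NP_, _HOSTS_ and _EXEC_ in
> that template into exactly the command shown above, i.e.
>
> mpirun -np 16 -machinefile .machine1 /path/to/wien/lapw2_mpi uplapw2_1.def 1
>
> so the placeholder substitution itself appears to be correct.)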
>
> If I set up my environment and run the following, everything works fine:
>
> $ mpirun -np 4 -machinefile .machine1 /path/to/wien/lapw2_mpi
> uplapw2_1.def 1
>
> I've noticed that the lapw2 step only takes ~25 seconds. Is there any
> way to specify how many CPUs to use for lapw2? At a quarter of a
> minute, 4 CPUs is fine for that step, but the other stages take longer
> and I'd like to use more than 4 CPUs for them. My .machines file looks
> like this:
>
> lapw0: compute-0-13:8 compute-0-19:8
> 1: compute-0-13:8 compute-0-19:8
> granularity:1
> extrafine:1
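>
> (As I understand it, the lapw0: line runs lapw0_mpi with 8 processes
> on each of the two nodes, the 1: line defines a single k-point job
> with 16 MPI processes, and granularity:1 / extrafine:1 control how the
> k-points are distributed over jobs; correct me if I'm misreading that.)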
>
> If you need more info or want to reproduce this, let me know what config
> I should send.
>
> Scott