[Wien] lapw2 mpi parallelization limits

Peter Blaha pblaha at theochem.tuwien.ac.at
Mon Mar 16 21:34:43 CET 2009


We routinely run on more CPUs.

It is hard to guess what could go wrong.

A possible test: use a .machines file with
1:compute-0-13 compute-0-13 compute-0-13 ....
(I have not tested the :8 notation, although it should work.)
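As a sketch only (hostnames taken from the post below, not a tested input), a .machines file for 16 MPI processes across the two nodes, listing each slot explicitly instead of using the :8 shorthand, could look like:

```
1: compute-0-13 compute-0-13 compute-0-13 compute-0-13 compute-0-13 compute-0-13 compute-0-13 compute-0-13 compute-0-19 compute-0-19 compute-0-19 compute-0-19 compute-0-19 compute-0-19 compute-0-19 compute-0-19
granularity:1
extrafine:1
```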

A possible patch: lapw2para uses the ".processes" file (generated by the
  lapw1 step). You may want to edit it so that lapw2 uses fewer CPUs.

Scott Beardsley schrieb:
> I'm running into a problem where lapw2 crashes when using more than 4
> CPUs. I'm hoping someone else has been down this road before. Are there
> any limits to lapw2_mpi? A job using 4 CPUs is fine, but 5 CPUs crashes
> with a timeout. I'm using strictly MPI-only parallelization with
> WIEN 9.01 (plus patches). The problem is with the following command
> (this runs after "x lapw2 -up -p"):
> 
> $ mpirun -np 16 -machinefile .machine1 /path/to/wien/lapw2_mpi
> uplapw2_1.def 1
> 
> My parallel_options looks like:
> 
> setenv USE_REMOTE 0
> setenv WIEN_GRANULARITY 1
> setenv WIEN_MPIRUN "mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_"
> 
> If I set up my environment and run the following everything works fine:
> 
> $ mpirun -np 4 -machinefile .machine1 /path/to/wien/lapw2_mpi
> uplapw2_1.def 1
> 
> I've noticed that the lapw2 step only takes ~25 seconds. Is there any way
> to specify how many CPUs to use for lapw2? If it only takes a quarter of
> a minute, that is OK with 4 CPUs, but the other stages take longer and
> I'd like to use more than 4 CPUs there. My machines file looks like this:
> 
> lapw0: compute-0-13:8 compute-0-19:8
> 1: compute-0-13:8 compute-0-19:8
> granularity:1
> extrafine:1
> 
> If you need more info or want to reproduce let me know what config files
> I should send.
> 
> Scott
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien

