[Wien] Performance with AMD Phenom 9850 2.5GHz & Intel Q9450 2.66GHz & Xeon x3230 2.66GHz

Peter Blaha pblaha at theochem.tuwien.ac.at
Fri Mar 6 08:46:50 CET 2009


> Q2. Above CPUs are quad core processor, what is the best way to evaluate 
> parallel calc. internally.(compile options, things to be checked)

When you have several k-points, use k-point parallelization. The
AMD processors (and in particular the new Intel Nehalem (Core i7) CPUs)
should be quite efficient.
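
A minimal sketch of setting up k-point parallelization on one quad-core
machine, assuming four jobs on "localhost" (both the host name and the
job count are assumptions, and make_machines is a hypothetical helper).
It writes a .machines file with one "1:host" line per k-point job:

```shell
#!/bin/sh
# Print a .machines file for k-point parallelization:
# one "1:host" line (weight 1) per parallel k-point job.
# Usage: make_machines NJOBS [HOST]   (HOST defaults to localhost)
make_machines() {
    njobs=$1
    host=${2:-localhost}
    echo "granularity:1"
    i=0
    while [ "$i" -lt "$njobs" ]; do
        echo "1:$host"
        i=$((i + 1))
    done
}

# One job per core of a single quad-core machine (assumed setup):
make_machines 4 > .machines
```

With this file in the case directory, the SCF cycle is started in
parallel mode with "run_lapw -p".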

For big cases (NMAT > 7000), and if you have more than ONE quad-core
processor with a good network, you can install MPI and ScaLAPACK and use
the fine-grained parallel version.

> Q3. With my AMD processor the test calculation of H2O calculation takes 
> a lots of time.( more than 5! hours for one ionic minimization)
>     (water molecule is in the cubic box. 10 anstrom) RKmax = 4, lmax = 
> 10, Gmax=12 , Rmt= O: 0.86 H:0.46

a) Of course, WIEN2k is VERY inefficient for an H2O molecule
(supercell approach; but also: our method is efficient when we can use
large RMTs, which is of course not possible for H2O).
b) Still, you can DRAMATICALLY speed it up by
   i) setting OMP_NUM_THREADS to 2 (or 4)  (automatic parallelization of
the diagonalization using the MKL routines)
   ii) using iterative diagonalization   (run_lapw -it). Diagonalization
may be 10 times faster !!!   5h --> 1h
   iii) Test if RKmax is needed. Always start with small RKmax (for H
you can start with 3.0), once the scf cycle has converged, "save_lapw"
the results, edit case.in1 and increase RKMAX; continue (without
init_lapw !!) with run_lapw -I -fc 1    and compare the forces with
smaller RKMAX. If the forces are "similar", go back to the smaller RKmax
and perform min_lapw (with small RKMAX). Eventually, when close to the
equilibrium, increase RKMAX again. (Reducing RKMAX: 1h --> 20 min ??)
"Sounds" complicated, but "convergence checks" should be made anyway in
EVERY CASE. (Can be "automated" with a little shell script ...)

iv) GMAX=12 sounds small for H systems !!!

v) Our recommended default is PORT, but yes, sometimes NEW1 works even
better. Try it out.
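
Steps i) to iii) above could be wrapped in a small script along these
lines. This is only a dry-run sketch: the save name "rkmax3", the script
layout, and the use of "-fc 1" already at the small RKmax are
assumptions, and RKmax in case.in1 still has to be edited by hand
between the two steps. By default the commands are only printed.

```shell
#!/bin/sh
# Dry-run sketch of steps i)-iii). By default the WIEN2k commands are
# only printed; set DRY_RUN=0 inside a real case directory to run them.
DRY_RUN=${DRY_RUN:-1}
export OMP_NUM_THREADS=${OMP_NUM_THREADS:-4}   # i) threaded diagonalization

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Converge the SCF cycle at the small RKmax and save the result.
# $1 is a hypothetical save name, e.g. "rkmax3".
scf_small() {
    run run_lapw -it -fc 1      # ii) iterative diagonalization; -fc 1 is an assumption
    run save_lapw "$1"
}

# Call this AFTER increasing RKmax in case.in1 by hand (no init_lapw).
scf_large() {
    case_name=$(basename "$PWD")      # case name = directory name
    run run_lapw -I -fc 1
    run grep :FOR "$1.scf"            # forces at the small RKmax (saved copy)
    run grep :FOR "$case_name.scf"    # forces at the larger RKmax
}

# Dispatch: "small NAME" or "large NAME"; does nothing without arguments.
case "${1:-}" in
    small) scf_small "${2:-rkmax3}" ;;
    large) scf_large "${2:-rkmax3}" ;;
esac
```

Saved under a hypothetical name such as rkmax_check.sh, it would be
called once with "small rkmax3", then again with "large rkmax3" after
editing case.in1. The two sets of :FOR lines are then compared by eye.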
>     Method of geometry minimization was NEW1(modified steepest-descent 
> method) . Is that right??
> Q4.How can I improve the performance.
> 
> 
> Thank you.
> Best regards.
> 
> 
> 
> 
> 
> Hyun Jung, Kim
> Seoul, Korea.
> 
-- 
-----------------------------------------
Peter Blaha
Inst. Materials Chemistry, TU Vienna
Getreidemarkt 9, A-1060 Vienna, Austria
Tel: +43-1-5880115671
Fax: +43-1-5880115698
email: pblaha at theochem.tuwien.ac.at
-----------------------------------------

