[Wien] Performance with AMD Phenom 9850 2.5GHz & Intel Q9450 2.66GHz & Xeon x3230 2.66GHz

Gerhard Fecher fecher at uni-mainz.de
Fri Mar 6 09:01:06 CET 2009

For quad cores, the use of the 32-bit MKL is not appropriate.
Change to 64 bit (this requires that you use a 64-bit Linux distribution)!
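
Before switching to the em64t MKL libraries, it may help to confirm that the machine really runs a 64-bit Linux. The Python sketch below is only an illustration and not part of WIEN2k; the helper name is_64bit_machine and the list of architecture strings are assumptions made for this example.

```python
import platform
import struct

# Architecture strings that commonly indicate a 64-bit kernel.
# This list is an assumption for illustration and is not exhaustive.
SIXTY_FOUR_BIT_MACHINES = {"x86_64", "amd64", "ia64", "aarch64", "ppc64", "ppc64le"}


def is_64bit_machine(machine: str) -> bool:
    """Return True if a `uname -m` style string names a 64-bit architecture."""
    return machine.strip().lower() in SIXTY_FOUR_BIT_MACHINES


if __name__ == "__main__":
    machine = platform.machine()             # same value as `uname -m` on Linux
    pointer_bits = struct.calcsize("P") * 8  # pointer size of this Python build
    print(f"kernel architecture:      {machine}")
    print(f"interpreter pointer size: {pointer_bits} bit")
    print("64-bit kernel" if is_64bit_machine(machine) else "not a 64-bit kernel")
```

Note that a 32-bit userland installed on a 64-bit kernel would still report x86_64 here, so the pointer size is printed as a second hint.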

With Intel processors I have no problems with

FOPT:    -FR -w -mp1 -prec_div -pad -ip -O3 -xT
LDFLAGS: -L/opt/intel/fce/10.1.017/lib -static -lsvml -lguide -lpthread
R_LIBS:  -L/opt/intel/mkl/ -lmkl_lapack -lmkl_em64t -lmkl_core -lguide -lpthread

That gave the best performance on Xeon and Core processors.
For AMD you may need to check whether the optimization switches -ip -O3 -xT work correctly (maybe you need -x instead of -xT
or some of the -axY switches; check the manual).
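
One thing that can be checked directly on the AMD machine is which SIMD instruction sets the CPU reports. As far as I know, -xT targets processors with SSSE3 (Core 2 class), so a CPU that does not list ssse3 is a hint that another switch is needed; treat that as an assumption and let the compiler manual decide. The Python sketch below reads the flags line of /proc/cpuinfo; the function name parse_cpu_flags is made up for this illustration.

```python
from pathlib import Path


def parse_cpu_flags(cpuinfo_text: str) -> set:
    """Collect the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        flags = parse_cpu_flags(cpuinfo.read_text())
        # Linux lists SSE3 under the name 'pni'; SSSE3 appears as 'ssse3'.
        for feature in ("sse2", "pni", "ssse3"):
            print(f"{feature:6s}", "yes" if feature in flags else "no")
    else:
        print("/proc/cpuinfo not found (not a Linux system?)")
```

This only shows what the hardware advertises. Whether a binary built with a given -x switch actually runs on a non-Intel CPU still has to be tested.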

The new ifort version 11 provides -xHost, which generates any instructions that are supported by the compilation host,
but I do not know whether this works with AMD; with Intel it works fine.


Dr. Gerhard H. Fecher
Institute of Inorganic and Analytical Chemistry
Johannes Gutenberg - University
55099 Mainz
From: wien-bounces at zeus.theochem.tuwien.ac.at [wien-bounces at zeus.theochem.tuwien.ac.at] on behalf of HyunJungKim [angpangmokjang at hanmail.net]
Sent: Friday, 6 March 2009 03:48
To: wien at zeus.theochem.tuwien.ac.at
Subject: [Wien] Performance with AMD Phenom 9850 2.5GHz & Intel Q9450   2.66GHz & Xeon x3230 2.66GHz

Dear all.

I'm now using three different architectures with WIEN2k_09. The compilation completed successfully with the Intel MKL and the ifort 10.1 compiler.

** Compilation options: the compiler was 10.1.021 on both machines
 With the AMD CPU, the compile options were "-FR -lowercase -assume byterecl".
 With the Intel CPU, they were "-FR -mp1 -w -prec_div -pc80 -pad -align -DINTEL_VML -traceback"

** MKL version (non-commercial distribution from Intel): 32 bit
    LINKER FLAG         : $(FOPT) -L/opt/intel/mkl/ -i-static
    R_LIB (LAPACK+BLAS) : -lmkl_lapack -lmkl_core -lmkl_ia32 -lguide -pthread

Q1. Are these options reliable?
Q2. The CPUs above are quad-core processors. What is the best way to evaluate parallel calculations internally (compile options, things to be checked)?
Q3. With my AMD processor, the H2O test calculation takes a lot of time (more than 5 hours for one ionic minimization!).
    (The water molecule is in a cubic box of 10 angstrom.) RKmax = 4, lmax = 10, Gmax = 12, Rmt = O: 0.86, H: 0.46
    The geometry minimization method was NEW1 (modified steepest-descent method). Is that right?
Q4. How can I improve the performance?

Thank you.
Best regards.

Hyun Jung, Kim
Seoul, Korea.



