[Wien] Benchmark

Gerhard H Fecher fecher at uni-mainz.de
Mon Nov 21 14:00:13 CET 2005


It seems the speed depends rather linearly on the CPU speed. Does anyone have
experience with AMD processors such as the Opteron?

On Monday, 21 November 2005 11:36, Michael Gurnett wrote:
> I did notice that having both OMP_NUM_THREADS=2 and k-point parallelization 
> running at the same time seemed to result in the CPU monitor fluctuating 
> less, i.e. it seemed to hold closer to 100%, but times were comparable, so it 
> doesn't slow the system down at least (it is just as easy to keep 
> OMP_NUM_THREADS=2 always on).
> 
> My ram timings are not optimal at the moment but I have got the following 
> benchmarks for the test case
> 
> 
> P4D dual-core (820), 3.3 GHz     128 sec    ifort9 + cmkl8.0, OMP_NUM_THREADS=2
> 
> This system should be able to get under 2 minutes with a bit of tweaking.
> 
> Michael
> 
> ----- Original Message ----- 
> From: <lombaeb at science.unisa.ac.za>
> To: "A Mailing list for WIEN2k users" <wien at zeus.theochem.tuwien.ac.at>
> Sent: Friday, November 18, 2005 1:57 PM
> Subject: Re: [Wien] Benchmark
> 
> 
> > OK great.  For the sake of others: just add it in .bashrc (or add setenv
> > OMP_NUM_THREADS 2  in .cshrc, if you use tcsh); and logout / login for the
> > new settings to take effect.
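
A minimal sketch of the setting described above, for a Bourne-type shell. The echo line is only a check and is not part of what was suggested; the tcsh form is shown as a comment.

```shell
# bash/sh: this is the line that would go into ~/.bashrc.
export OMP_NUM_THREADS=2

# Confirm what programs started from this shell will see.
echo "OMP_NUM_THREADS=${OMP_NUM_THREADS}"

# tcsh equivalent for ~/.cshrc (comment only, since this block is sh):
#   setenv OMP_NUM_THREADS 2
```

In a shell that is already open, running `. ~/.bashrc` (or `source ~/.cshrc` in tcsh) should pick up the change without a logout.
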
> >
> > Lapw0 does not use MKL/GOTO as far as I know, so OMP_NUM_THREADS should
> > not have an effect.
> >
> > I have not tried using parallel k-points and OMP_NUM_THREADS
> > simultaneously. This will probably slow down the calculation relative to
> > only using parallel k-points.
> >
> > Regards
> >
> > Enrico
> >
> > --
> > Dr E B Lombardi
> > Physics Department
> > University of South Africa
> > P.O. Box 392
> > 0003 UNISA
> > South Africa
> > Tel: +27 (0)12 429-8027
> > Fax: +27 (0)12 429-3643
> > e-mail: lombaeb at science.unisa.ac.za
> >
> >
> > On Fri, 18 Nov 2005, Michael Gurnett wrote:
> >
> >> Thank you for the answer. Just a few questions. Does this require a
> >> recompile of the code, or is it enough to just add it in the .bashrc? Is
> >> there any speed-up in lapw0? And finally, have you noticed any problems
> >> when using this while spreading several k-points over CPUs? (It would be
> >> nice if it worked for lapw0 without causing problems in lapw1.)
> >>
> >> Michael
> >>
> >> -----Original Message-----
> >> From: lombaeb at science.unisa.ac.za
> >> To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
> >> Date: Fri, 18 Nov 2005 11:45:53 +0200 (SAST)
> >> Subject: Re: [Wien] Benchmark
> >>
> >> > Using   export OMP_NUM_THREADS=2  in .bashrc should speed up
> >> > the calculations.
> >> >
> >> > For a 32 bit dual processor 3.0 GHz Xeon machine I get:
> >> > 1 kpoint,  1 processor:        247s (4:08)
> >> > 2 kpoints, 2 processors:       261s (4:22)
> >> > 1 kpoint,  OMP_NUM_THREADS=2:  275s (3:09 = 189)
> >> >
> >> > Note in the last line that the 'CPU seconds' does not equal the 'Real
> >> > time', since the former takes into account that two CPUs are used. The
> >> > difference of 1s in the first two lines is due to CPU usage never
> >> > being exactly 100.0%.
> >> >
> >> > So using OMP....=2, speeds up the calculation by about 50%, which
> >> > unfortunately implies only 50% efficiency for the 2nd CPU.
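
The ratios can be recomputed from the figures in the table above. This is only a sketch: the variable names are made up here, and the wall-clock values are the mm:ss figures converted to seconds (4:08 = 248, 4:22 = 262, 3:09 = 189).

```shell
# Figures quoted in the table above (seconds).
serial_wall=248   # 1 k-point, 1 processor, real time 4:08
kpar_wall=262     # 2 k-points on 2 processors, real time 4:22
omp_wall=189      # 1 k-point, OMP_NUM_THREADS=2, real time 3:09
omp_cpu=275       # CPU seconds reported for the OMP_NUM_THREADS=2 run

# Wall-clock ratio of the serial run to the threaded run.
speedup=$(awk -v a="$serial_wall" -v b="$omp_wall" 'BEGIN { printf "%.2f", a / b }')

# CPU seconds per wall-clock second, i.e. how busy the two CPUs were.
cpu_ratio=$(awk -v a="$omp_cpu" -v b="$omp_wall" 'BEGIN { printf "%.2f", a / b }')

# Wall-clock time per k-point when two k-points run side by side.
kpar_per_kpoint=$(awk -v a="$kpar_wall" 'BEGIN { printf "%.0f", a / 2 }')

echo "wall-clock ratio, serial / threaded: $speedup"
echo "CPU / wall ratio, threaded run:      $cpu_ratio"
echo "seconds per k-point, k-parallel:     $kpar_per_kpoint"
```

With these inputs the wall-clock ratio comes out near 1.31 and the CPU/wall ratio near 1.46; the second figure is the one closer to the "about 50%" quoted above. Per k-point, the k-parallel run (about 131 s) is well ahead of the threaded run (189 s).
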
> >> >
> >> > Note that both MKL and GOTO use the OMP_NUM_THREADS environment
> >> > variable.
> >> >
> >> > Regards
> >> >
> >> > Enrico
> >> >
> >> >
> >> > On Thu, 17 Nov 2005, Michael Gurnett wrote:
> >> >
> >> > > Yes. Using both processors by spreading k-points over them basically
> >> > > doubles the speed (and this is also the more typical setup we use).
> >> > > What I really want to do is just get an idea of how fast this system
> >> > > is on the test case, which most likely means MPI, which I believe is
> >> > > just not that good. It would be nice if MKL would use both processors.
> >> > > I will be setting up MPI for lapw0 (tried long ago but never got it
> >> > > working). So if someone has a "Getting Intel MPI to work with ifort
> >> > > for dummies" book I would be most grateful.
> >> > >
> >> > > Michael
> >> > > ----- Original Message ----- 
> >> > > From: <lombaeb at science.unisa.ac.za>
> >> > > To: "A Mailing list for WIEN2k users" <wien at zeus.theochem.tuwien.ac.at>
> >> > > Sent: Wednesday, November 16, 2005 7:13 PM
> >> > > Subject: Re: [Wien] Benchmark
> >> > >
> >> > >
> >> > > > Dear Wien users
> >> > > >
> >> > > > On an x86_64 machine with a 3.2 GHz (P4-640) CPU I got the following
> >> > > > benchmark times, using ifort 9.0 and mkl 8.0 (OPTIONS used are given
> >> > > > at the end of the e-mail) on a system with an Intel motherboard (945G
> >> > > > chipset) and DDR-II 533MHz RAM (dual channel configuration).
> >> > > >
> >> > > > HT disabled:  163s
> >> > > > HT enabled:  176s
> >> > > > HT enabled, with OMP_NUM_THREADS = 2:  194s
> >> > > > HT enabled, 2 k-points in parallel:  386s  (DIV 2 = 193s) (x lapw1 -c -p)
> >> > > >
> >> > > >
> >> > > > To Michael:
> >> > > > It may be possible to speed up the "throughput" time on a dual-core
> >> > > > machine by using "normal" MKL and running 2 k-points simultaneously
> >> > > > (using .machines).  The time for 1 k-point may be slower, but the time
> >> > > > for 2 k-points in parallel will probably be faster.
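
A sketch of what such a .machines file might look like for two k-point jobs on one dual-core machine. The host name "localhost", the granularity line and the OMP_NUM_THREADS=1 setting are assumptions made for illustration; the WIEN2k user's guide for the installed version is the place to check the exact keywords. The file is written to a scratch directory here so that nothing in a real case directory is overwritten.

```shell
# Scratch directory; in practice the file lives in the case directory.
workdir=$(mktemp -d)

# Assumption: keep each job single-threaded so the two k-point jobs
# do not compete with library threads for the same two cores.
export OMP_NUM_THREADS=1

# One "weight:host" line per k-point job; two lines naming the same host
# request two lapw1 jobs side by side on that machine.
cat > "$workdir/.machines" <<'EOF'
1:localhost
1:localhost
granularity:1
EOF

cat "$workdir/.machines"

# In a real case directory the parallel run would then be started with
# the command quoted in the benchmark list above:
#   x lapw1 -c -p
```
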
> >> > > >
> >> > > >
> >> > > >
> >> > > > OPTIONS used (thanks to Gerhard Fecher for the useful e-mail about
> >> > > > compiling Wien2k with ifort 9.0 at the end of August):
> >> > > >
> >> > > > current:FOPT:-FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -xP
> >> > > > current:FPOPT:-FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML
> >> > > > current:LDFLAGS:-L/opt/intel/fce/9.0/lib -L/opt/intel/mkl/8.0/lib/em64t -lsvml
> >> > > > current:DPARALLEL:'-DParallel'
> >> > > > current:R_LIBS:-lmkl_lapack -lmkl_em64t -lguide -lguide_stats -lpthread
> >> > > > current:RP_LIBS:-L /usr/local/SCALAPACK -L /usr/local/BLACS/LIB -lpblas -lredist -ltools -lscalapack -lfblacs -lblacs -lmpi
> >> > > >
> >> > > > Notes:
> >> > > > 1.  The speed difference between the new GOTO 1.00 library and MKL
> >> > > > 8.0 was negligible.
> >> > > > 2.  Omitting the -xP option for ifort (P4 only) slows down the
> >> > > > calculations by about 4s (there is a change of +- 1 in the last
> >> > > > significant digit of the output of lapw1 if -xP is included).
> >> > > > 3.  The paths to the compiler and mkl libraries will be
> >> > > > installation dependent.
> >> > > > 4.  These paths must also be included in .bashrc (or .cshrc) in the
> >> > > > LD_LIBRARY_PATH environment variable.
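
For note 4, a minimal sketch of the corresponding .bashrc lines. The two directories are copied from the LDFLAGS line above and will differ between installations; the variable names INTEL_FC_LIB and INTEL_MKL_LIB are made up for this example.

```shell
# Library directories as they appear in the LDFLAGS line above;
# adjust both to the local installation.
INTEL_FC_LIB=/opt/intel/fce/9.0/lib
INTEL_MKL_LIB=/opt/intel/mkl/8.0/lib/em64t

# Prepend them while keeping any existing value. The ${VAR:+...} form
# avoids a trailing colon when LD_LIBRARY_PATH was previously unset.
export LD_LIBRARY_PATH="${INTEL_FC_LIB}:${INTEL_MKL_LIB}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"

echo "$LD_LIBRARY_PATH"

# tcsh equivalent for ~/.cshrc when the variable is not yet set
# (comment only, since this block is sh):
#   setenv LD_LIBRARY_PATH /opt/intel/fce/9.0/lib:/opt/intel/mkl/8.0/lib/em64t
```
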
> >> > > >
> >> > > > Regards,
> >> > > >
> >> > > > Enrico
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > > On Mon, 14 Nov 2005, Michael Gurnett wrote:
> >> > > >
> >> > > >>
> >> > > >> PIV dual-core 820 at 3.3 GHz, ifort 9 and libgoto_prescott64p-r1.00
> >> > > >>
> >> > > >> 171 seconds
> >> > > >>
> >> > > >>
> >> > > >> The compiler options used were as follows:
> >> > > >>
> >> > > >> current:FOPT:-FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML
> >> > > >> current:LDFLAGS:-L/opt -L/opt/intel/cmkl/8.0/lib/em64t -Vaxlib -static-libcxa -pthread
> >> > > >> current:DPARALLEL:'-DParallel'
> >> > > >> current:R_LIBS:-lgoto_prescott64p-r1.00 -lmkl_lapack64 -lmkl_em64t -lguide
> >> > > >>
> >> > > >>
> >> > > >> If anyone has some recommendations to increase speed I would
> >> > > >> appreciate it.
> >> > > >>
> >> > > >> Michael
> >> > > >>
> >> > > >>
> >> > > >> _______________________________________________
> >> > > >> Wien mailing list
> >> > > >> Wien at zeus.theochem.tuwien.ac.at
> >> > > >> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
> >> > > >>

