[Wien] Wien2k 19.1 with linux+gfortran benchmarks

Hemza hemza.dz at gmail.com
Thu Dec 12 15:28:53 CET 2019


Thank you, Peter and Pavel, for the clarifications.
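
For anyone reading this later: the speedup factors Peter quotes below can be reproduced from the WALL times of my two runs. This is only a minimal sketch. The dictionary and function names are my own (hypothetical), the numbers are copied by hand from the quoted `grep HORB` output, and "efficiency" here simply means speedup divided by the number of threads.

```python
# WALL times in seconds, copied from the two `grep HORB` outputs quoted below.
wall_1_thread = {"HAMILT": 12.3, "HNS": 8.8, "DIAG": 48.2}
wall_4_threads = {"HAMILT": 3.7, "HNS": 3.8, "DIAG": 23.2}
threads = 4


def speedup_table(serial, parallel, n_threads):
    """Return {section: (speedup, efficiency)} for each timed section."""
    table = {}
    for section, t_serial in serial.items():
        speedup = t_serial / parallel[section]
        table[section] = (speedup, speedup / n_threads)
    return table


table = speedup_table(wall_1_thread, wall_4_threads, threads)
for section, (speedup, efficiency) in table.items():
    print(f"{section:7s} speedup {speedup:.1f}x  efficiency {efficiency:.0%}")

# Whole run, from the elapsed times printed after `x lapw1`
# (1:09.88 with one thread vs. 0:31.36 with four).
total_speedup = 69.88 / 31.36
print(f"total   speedup {total_speedup:.1f}x")
```

The per-section factors come out as 3.3, 2.3 and 2.1, matching Peter's numbers, and the whole run is about 2.2 times faster with four threads.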

On Thu, 12 Dec 2019 at 14:06, Pavel Ondračka <pavel.ondracka at email.cz>
wrote:

> I concur.
>
> In general, for the serial test case on a modern CPU (AVX2 instructions)
> your runtime should be around or below 30 seconds for a single thread.
>
> However, as this is an almost 10-year-old mobile CPU with just AVX
> instructions, the total runtime of slightly above 1 minute is expected.
>
> Regarding the scaling: even when not memory bound, I can get the runtime
> down to around 35% of the serial run with OpenBLAS (MKL scales slightly
> better). Small speedups could probably be gained with some work on the
> HNS section (as this is the worst-scaling part which we have more or less
> under control), but for the DIAG part we just depend on BLAS/LAPACK
> to scale properly.
>
> If you have multiple k-points and your total memory permits it, it's
> best to use k-point parallelization and use OpenMP just for lapw0 and
> mixer...
>
> Pavel
>
> On Thu, 2019-12-12 at 13:42 +0100, Peter Blaha wrote:
> > It is perfectly ok for your hardware.
> >
> > The CPU time is not so important for you; what counts is the WALL time
> > (this is the time it really takes until it finishes).
> >
> > You can see that HAMILT parallelizes fairly well (3.7 vs. 12.3 s /
> > speedup factor 3.3), HNS is not so good (3.8 vs. 8.8 s / factor 2.3),
> > and DIAG is worse (23.2 vs. 48.2 s / factor 2.1).
> >
> > Part of the reason that you can never see a factor of 4 is the slow
> > memory access: when 4 cores do some calculations, they sometimes have
> > to wait for data from memory.
> >
> > On machines with more cores and a better memory bus you will get
> > different speed-ups, but basically no machine can use all cores with
> > 100% efficiency because of this limited memory access.
> >
> >
> > On 12/12/19 1:07 PM, Hemza wrote:
> > > Hi everybody:
> > > I just finished updating my WIEN2k installation to 19.1 with OpenMP
> > > support (linux (4.19.88), gfortran (9.2.0), openblas-lapack-openmp
> > > (0.3.7), fftw3 (3.3.8), libxc (4.3.4)), and patches from
> > > "https://github.com/gsabo/WIEN2k-Patches".
> > > I intend to use it for relatively small cases (fewer than 25 atoms per
> > > unit cell). I ran 'x lapw1' on the test_case.
> > > With OMP_NUM_THREADS=4 in bashrc:
> > > --------------------------
> > > $ x lapw1
> > > STOP  LAPW1 END
> > > 113.876u 2.097s 0:31.36 369.7%  0+0k 424+37840io 2pf+0w
> > > $ grep HORB *output1*
> > > test_case.output1:       TIME HAMILT (CPU)  =    13.5, HNS =    12.6, HORB =     0.0, DIAG =    87.3, SYNC =     0.0
> > > test_case.output1:       TIME HAMILT (WALL) =     3.7, HNS =     3.8, HORB =     0.0, DIAG =    23.2, SYNC =     0.0
> > > --------------------------
> > >
> > > and with OMP_NUM_THREADS=1, I got:
> > > -------------------------------------
> > > $ x lapw1
> > > STOP  LAPW1 END
> > > 69.380u 0.339s 1:09.88 99.7%    0+0k 352+37848io 2pf+0w
> > > $ grep HORB *output1*
> > > test_case.output1:       TIME HAMILT (CPU)  =    12.0, HNS =     8.8, HORB =     0.0, DIAG =    48.1, SYNC =     0.0
> > > test_case.output1:       TIME HAMILT (WALL) =    12.3, HNS =     8.8, HORB =     0.0, DIAG =    48.2, SYNC =     0.0
> > > ------------------------------------
> > > I do not feel I really understand the output, and I do not know if
> > > these timings are good, so I am eager to read your comments!
> > >
> > > My machine ('inxi -dm' output):
> > > ------------------------
> > > System:    Host: dojo Kernel: 4.19.88-1-lts x86_64 bits: 64 Desktop: i3 4.17.1 Distro: Artix rolling
> > > Machine:   Type: Laptop System: ASUSTeK product: K53SD v: 1.0 serial: <root required>
> > >             Mobo: ASUSTeK model: K53SD v: 1.0 serial: <root required> BIOS: American Megatrends v: K53SD.202
> > >             date: 11/02/2011
> > > Battery:   ID-1: BAT0 charge: 33.8 Wh condition: 33.8/59.4 Wh (57%)
> > > Memory:    RAM: total: 7.57 GiB used: 4.84 GiB (63.9%)
> > >             RAM Report: permissions: Unable to run dmidecode. Are you root?
> > > CPU:       Quad Core: Intel Core i7-2670QM type: MT MCP speed: 849 MHz min/max: 800/3100 MHz
> > > Graphics:  Device-1: Intel 2nd Generation Core Processor Family Integrated Graphics driver: i915 v: kernel
> > >             Device-2: NVIDIA GF119M [GeForce 610M] driver: nouveau v: kernel
> > >             Display: x11 server: X.org 1.20.6 driver: intel,nouveau unloaded: fbdev,modesetting,vesa
> > >             resolution: <xdpyinfo missing>
> > >             Message: Unable to show advanced data. Required tool glxinfo missing.
> > > Network:   Device-1: Intel Centrino Wireless-N 100 driver: iwlwifi
> > >             Device-2: Qualcomm Atheros AR8151 v2.0 Gigabit Ethernet driver: atl1c
> > > Drives:    Local Storage: total: 2.05 TiB used: 1.45 TiB (70.8%)
> > > Info:      Processes: 300 Uptime: 1d 1h 46m Shell: bash inxi: 3.0.26
> > > -------------------------
> > >
> > > regards
> > >
> > > _______________________________________________
> > > Wien mailing list
> > > Wien at zeus.theochem.tuwien.ac.at
> > > http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
> > > SEARCH the MAILING-LIST at:
> > > http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
> > >
>