[Wien] Wien2k on AVX512 CPUs

Laurence Marks L-marks at northwestern.edu
Wed Feb 27 10:14:53 CET 2019


N.B., a seclr4 update was posted some time ago, I think by Thomas
Ruh. It may be needed, and may not be in the current Wien2k release on
the web page.

The next release will do a better job I suspect.

_____
Professor Laurence Marks
"Research is to see what everybody else has seen, and to think what nobody
else has thought", Albert Szent-Gyorgi
www.numis.northwestern.edu

On Wed, Feb 27, 2019, 03:07 Laurence Marks <L-marks at northwestern.edu> wrote:

> I think Peter may have misspoken about the latest elpa. I believe it will
> run OK if you compile it (--enable-AVX512 etc) so the highest kernel is
> equal to the lowest instruction set you use. You may also get it to work by
> using their environment variables. With the current Wien2k you cannot
> exploit elpa optimally if you have a heterogeneous set of nodes.
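
The rule "highest kernel equal to the lowest instruction set you use" can be sketched as below. This is only an illustration: the node names, the feature sets, and the SIMD_ORDER list are hypothetical and are not taken from ELPA or Wien2k.

```python
# Hypothetical ordering of SIMD levels, weakest first (an assumption for
# illustration, not an ELPA data structure).
SIMD_ORDER = ["sse", "avx", "avx2", "avx512"]


def common_kernel(node_features):
    """Return the highest SIMD level that every node supports."""
    if not node_features:
        raise ValueError("no nodes given")
    # Index of the best level available on each node.
    best_per_node = [
        max(SIMD_ORDER.index(feature) for feature in features)
        for features in node_features.values()
    ]
    # The weakest node limits the kernel that can run everywhere.
    return SIMD_ORDER[min(best_per_node)]


# Hypothetical heterogeneous cluster.
nodes = {
    "node-skylake": {"sse", "avx", "avx2", "avx512"},
    "node-broadwell": {"sse", "avx", "avx2"},
}
print(common_kernel(nodes))
```

With a mixed set of nodes like this one, the AVX512 kernel would have to be left out of the build, which is why a heterogeneous cluster cannot be used optimally.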
>
> I would say about 30% faster comparing a 6130 to an E5-2650. However, ifort
> compiler switches can make a big difference, as can the mpi version.
>
> N.B., I can dig up my elpa compiler options later if needed. I use
> ifort/icc/mpiifort/mpiicc.
>
> On Wed, Feb 27, 2019, 02:50 Peter Blaha <pblaha at theochem.tuwien.ac.at>
> wrote:
>
>> We have an Intel I7-7820X CPU @ 3.60GHz with 8 cores and avx512.
>>
>> The testcase with OMP_NUM_THREADS=1 runs a bit faster with avx512 than
>> with avx2, but it is a rather small effect (at least when working with
>> this MKL_ENABLE_INSTRUCTIONS variable):
>> ----------------------avx512
>>         TIME HAMILT (CPU)  =     5.1, HNS =     2.1, HORB =     0.0, DIAG =    15.3
>>         TIME HAMILT (WALL) =     5.4, HNS =     2.1, HORB =     0.0, DIAG =    15.3
>> ----------------------avx2
>>         TIME HAMILT (CPU)  =     5.8, HNS =     2.5, HORB =     0.0, DIAG =    16.3
>>         TIME HAMILT (WALL) =     6.1, HNS =     2.5, HORB =     0.0, DIAG =    16.3
>>
>> However, when using OMP_NUM_THREADS=8, this difference is further
>> reduced (probably because the run becomes memory bound?)
>> -----------------------avx512
>>         TIME HAMILT (CPU)  =    19.9, HNS =     7.7, HORB =     0.0, DIAG =    24.2
>>         TIME HAMILT (WALL) =     2.6, HNS =     1.0, HORB =     0.0, DIAG =     3.2
>> ------------------------avx2
>>         TIME HAMILT (CPU)  =    20.0, HNS =     7.4, HORB =     0.0, DIAG =    27.0
>>         TIME HAMILT (WALL) =     2.6, HNS =     1.0, HORB =     0.0, DIAG =     3.5
>> -------------------------------------------------------------------------
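
To put a number on "further reduced", the wall times quoted above can be compared directly. The sketch below sums HAMILT, HNS and DIAG as a rough stand-in for the total; that summation is an assumption made here for illustration, not something the timing output itself reports.

```python
# Wall-clock seconds (HAMILT, HNS, DIAG) copied from the timings above,
# keyed by OMP_NUM_THREADS.
wall = {
    1: {"avx512": (5.4, 2.1, 15.3), "avx2": (6.1, 2.5, 16.3)},
    8: {"avx512": (2.6, 1.0, 3.2), "avx2": (2.6, 1.0, 3.5)},
}


def percent_faster(t_ref, t_new):
    """Percent reduction in run time of t_new relative to t_ref."""
    return 100.0 * (t_ref - t_new) / t_ref


gains = {}
for threads, timings in wall.items():
    # Assumption: the three parts add up to the relevant total.
    gains[threads] = percent_faster(sum(timings["avx2"]), sum(timings["avx512"]))
    print(f"OMP_NUM_THREADS={threads}: avx512 is {gains[threads]:.1f}% faster")
```

On these numbers the gain from avx512 is roughly 8% with one thread and roughly 4% with eight threads.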
>>
>> Yes, we have the latest ELPA (elpa-2018.11.001) installed. It seems to run
>> without problems and is overall significantly better than the old ELPA,
>> but it requires a change in the user interface. The next release of
>> WIEN2k will have two elpa versions supported, an ELPA15 (which is in
>> WIEN2k_18), and a new ELPA interface for elpa versions later than 2017
>> (this is somehow like FFTW2 and FFTW3 versions).
>>
>> So in essence: with the present code one cannot use ELPA-versions from
>> 2017 or later.
>>
>> On 2/27/19 7:34 AM, Pavel Ondračka wrote:
>> > Dear mailing list,
>> >
>> > just out of curiosity, does anyone have any experience running Wien2k on
>> > an AVX512 capable machine (e.g. the KNL accelerators or recent Intel
>> > skylake-avx512 CPUs)?
>> >
>> > Recently my cluster was updated to these skylake-avx512 machines, however
>> > I'm unable to get any better performance for Wien2k. In particular MKL
>> > seems to perform poorly, for example in single core performance (with the
>> > serial test_case) the eigenvalue problem is actually faster when I forbid
>> > the usage of AVX512 instructions:
>> >
>> > running with MKL_VERBOSE=1 MKL_ENABLE_INSTRUCTIONS=AVX2
>> > MKL_VERBOSE ZHETRD(L,3481,0x2b74d8567cc0,3481,0x2b74d82121c0,0x2b74d8218e88,0x2b74ef769b00,0x2b74ef777490,452530,0) 10.21s CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:1
>> >
>> > with MKL_ENABLE_INSTRUCTIONS=AVX512
>> > MKL_VERBOSE ZHETRD(L,3481,0x2b5397c96cc0,3481,0x2b53979411c0,0x2b5397947e88,0x2b53aee98b00,0x2b53aeea6490,452530,0) 12.31s CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:1
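
For longer runs, per-routine totals can be pulled out of the MKL_VERBOSE output instead of reading single lines. The following is a minimal sketch: the regular expression is written against the two lines quoted above and assumes times carry a unit suffix (ns, us, ms or s); it is not an official MKL log parser.

```python
import re
from collections import defaultdict

# Matches e.g. "MKL_VERBOSE ZHETRD(...) 10.21s"; the unit handling is an assumption.
LINE = re.compile(r"MKL_VERBOSE\s+(\w+)\(.*?\)\s+([\d.]+)(ns|us|ms|s)\b")
UNIT = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0}


def time_per_routine(log_text):
    """Sum the reported time in seconds for each routine in the log."""
    totals = defaultdict(float)
    for match in LINE.finditer(log_text):
        routine, value, unit = match.groups()
        totals[routine] += float(value) * UNIT[unit]
    return dict(totals)


# The two calls quoted above, one per instruction-set setting.
avx2_log = (
    "MKL_VERBOSE ZHETRD(L,3481,0x2b74d8567cc0,3481,0x2b74d82121c0,"
    "0x2b74d8218e88,0x2b74ef769b00,0x2b74ef777490,452530,0) 10.21s "
    "CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:1"
)
avx512_log = (
    "MKL_VERBOSE ZHETRD(L,3481,0x2b5397c96cc0,3481,0x2b53979411c0,"
    "0x2b5397947e88,0x2b53aee98b00,0x2b53aeea6490,452530,0) 12.31s "
    "CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:1"
)

t_avx2 = time_per_routine(avx2_log)["ZHETRD"]
t_avx512 = time_per_routine(avx512_log)["ZHETRD"]
print(f"ZHETRD: AVX2 {t_avx2:.2f}s, AVX512 {t_avx512:.2f}s, "
      f"ratio {t_avx512 / t_avx2:.2f}")
```

Feeding the full log of each run to time_per_routine would show which routines gain and which lose under AVX512.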
>> >
>> > This is somewhat compensated by speedups in the hamilt part (the VML
>> > stuff and various ?GEMMs seem to be actually slightly faster), but
>> > overall the performance is mostly the same with and without the AVX512
>> > stuff. OpenBLAS is maybe 15% slower, so not an option either...
>> >
>> > Moreover, for the MPI version I'm not able to get a correctly working
>> > ELPA compiled with AVX512 support (I went for the latest
>> > elpa-2018.11.001 version); it just returns bogus results and diverges
>> > after a few iterations. If someone has this working I'd be really
>> > grateful for a working configure line, and advice on which elpa and
>> > which compiler version this was.
>> >
>> > Unfortunately I was not able to get any support from the cluster admins
>> > beyond "We see a 30% per-core performance increase in average",
>> > so I am asking here if anyone has experience with such machines.
>> >
>> > Any advice would be appreciated.
>> > Best regards
>> > Pavel
>> >
>> > _______________________________________________
>> > Wien mailing list
>> > Wien at zeus.theochem.tuwien.ac.at
>> > http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>> > SEARCH the MAILING-LIST at:
>> > http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
>> >
>>
>> --
>>
>>                                        P.Blaha
>> --------------------------------------------------------------------------
>> Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
>> Phone: +43-1-58801-165300             FAX: +43-1-58801-165982
>> Email: blaha at theochem.tuwien.ac.at    WIEN2k: http://www.wien2k.at
>> WWW:   http://www.imc.tuwien.ac.at/TC_Blaha
>> --------------------------------------------------------------------------
>>
>