<div dir="auto">That is quite an extraordinary time difference! The MKL/OpenBLAS comparison is very clear, but I wonder if it is just the vectorization routines for the rest.<div dir="auto"><br></div><div dir="auto">One observation about ifort: it is really a very good optimizer, albeit sometimes too aggressive. There are quite a few loops in Wien2k which are technically not the fastest, with indices the wrong way around or redundant variables. At various times I have rewritten some of these, and seen almost no timing change. I think this is because ifort is itself doing the reorganization, along with other optimizations such as those for cache and memory.</div><div dir="auto"><br></div><div dir="auto">I doubt that gfortran is as good an optimizing compiler as ifort. <br><br><div data-smartmail="gmail_signature" dir="auto">---<br>Professor Laurence Marks<br>"Research is to see what everybody else has seen, and to think what nobody else has thought", Albert Szent-Gyorgi<br><a href="http://www.numis.northwestern.edu">http://www.numis.northwestern.edu</a><br>Corrosion in 4D <a href="http://MURI4D.numis.northwestern.edu">http://MURI4D.numis.northwestern.edu</a><br>Partner of the CFW 100% gender equity project, <a href="http://www.cfw.org/100-percent">www.cfw.org/100-percent</a><br>Co-Editor, Acta Cryst A<br>    </div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Dec 12, 2016 04:04, "Peter Blaha" <<a href="mailto:pblaha@theochem.tuwien.ac.at">pblaha@theochem.tuwien.ac.at</a>> wrote:<br type="attribution"><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Inspired by the recent posts about gfortran and openblas, I made some<br>
timing tests myself.<br>
<br>
I was using the "Test-Case" (serial benchmark) from our website (a<br>
complex case with NMAT=3481).<br>
<br>
I tested it on an Intel I7-3939 (6 core) processor with either ifort+mkl<br>
(2016.3.210)  or   gfortran+openblas.<br>
I was using 1, 2, 4 or 6 cores (set via OMP_NUM_THREADS) of one PC:<br>
<br>
1 core:<br>
Intel:    TIME HAMILT (WALL) =     5.2, HNS =     4.2, DIAG =    25.8<br>
gfortran: TIME HAMILT (WALL) =    36.3, HNS =     4.0, DIAG =    25.0<br>
<br>
2 cores:<br>
Intel:    TIME HAMILT (WALL) =     5.3, HNS =     2.5, DIAG =    14.4<br>
gfortran: TIME HAMILT (WALL) =    36.3, HNS =     2.4, DIAG =    13.4<br>
<br>
4 cores:<br>
Intel:    TIME HAMILT (WALL) =     5.3, HNS =     1.7, DIAG =     7.7<br>
gfortran: TIME HAMILT (WALL) =    36.6, HNS =     1.7, DIAG =     7.9<br>
<br>
6 cores:<br>
Intel:    TIME HAMILT (WALL) =     5.3, HNS =     1.5, DIAG =     6.4<br>
gfortran: TIME HAMILT (WALL) =    36.4, HNS =     2.0, DIAG =     7.4<br>
<br>
So obviously, the openblas is really VERY good and basically of the same<br>
quality as the MKL (if not faster !!).<br>
<br>
But: Setting up the eigenvalue problems (HAMILT) involves the<br>
calculation of many cosines (exponentials) and we can use the<br>
"vector-cosines" from the mkl. This makes ifort in this part 7 times<br>
faster !!!!<br>
This can also be seen from the partial timing in case.output1 of the<br>
hamilt-times, where phase and us are significantly faster:<br>
<br>
ifort<br>
Time for al,bl    (hamilt, cpu/wall) :          0.3         0.3<br>
Time for legendre (hamilt, cpu/wall) :          0.1         0.1<br>
Time for phase    (hamilt, cpu/wall) :          1.1         1.3<br>
Time for us       (hamilt, cpu/wall) :          1.2         1.2<br>
Time for overlaps (hamilt, cpu/wall) :          2.0         1.9<br>
Time for distrib  (hamilt, cpu/wall) :          0.1         0.0<br>
gfortran<br>
Time for al,bl    (hamilt, cpu/wall) :          0.2         0.3<br>
Time for legendre (hamilt, cpu/wall) :          0.2         0.2<br>
Time for phase    (hamilt, cpu/wall) :         25.9        25.3<br>
Time for us       (hamilt, cpu/wall) :          6.3         6.8<br>
Time for overlaps (hamilt, cpu/wall) :          2.8         3.0<br>
Time for distrib  (hamilt, cpu/wall) :          0.0         0.0<br>
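<br>
To see which sub-steps account for the difference, the wall-clock columns of<br>
the two listings can be subtracted step by step. This is only a small<br>
bookkeeping sketch in Python; the dictionary and function names are made up<br>
for illustration and are not part of WIEN2k:<br>

```python
# Wall-clock seconds per HAMILT sub-step, copied from the two listings above.
ifort = {"al,bl": 0.3, "legendre": 0.1, "phase": 1.3,
         "us": 1.2, "overlaps": 1.9, "distrib": 0.0}
gfortran = {"al,bl": 0.3, "legendre": 0.2, "phase": 25.3,
            "us": 6.8, "overlaps": 3.0, "distrib": 0.0}


def extra_time(slow, fast):
    """Return {step: seconds by which `slow` exceeds `fast`}."""
    return {step: round(slow[step] - fast[step], 1) for step in slow}


gap = extra_time(gfortran, ifort)
total_gap = round(sum(gap.values()), 1)

# List the sub-steps from largest to smallest contribution to the gap.
for step, seconds in sorted(gap.items(), key=lambda kv: -kv[1]):
    share = 100.0 * seconds / total_gap
    print(f"{step:9s} +{seconds:5.1f} s  ({share:5.1f}% of the gap)")
```

With these numbers, phase and us together make up almost all of the extra<br>
wall time, which fits the explanation via the vectorized cosines.<br>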
<br>
This limits gfortran significantly, making it in these tests a factor of<br>
two (or, when using 4 cores, a factor of 3) slower.<br>
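<br>
These factors follow from adding HAMILT + HNS + DIAG for each compiler and<br>
taking the ratio. A minimal sketch of that arithmetic in Python, using only<br>
the numbers quoted above (the names `timings` and `slowdown` are just<br>
illustrative):<br>

```python
# (HAMILT, HNS, DIAG) wall times in seconds, keyed by number of cores,
# copied from the timing lines above.
timings = {
    1: {"ifort": (5.2, 4.2, 25.8), "gfortran": (36.3, 4.0, 25.0)},
    2: {"ifort": (5.3, 2.5, 14.4), "gfortran": (36.3, 2.4, 13.4)},
    4: {"ifort": (5.3, 1.7, 7.7), "gfortran": (36.6, 1.7, 7.9)},
    6: {"ifort": (5.3, 1.5, 6.4), "gfortran": (36.4, 2.0, 7.4)},
}


def slowdown(cores):
    """Ratio of total gfortran time to total ifort time for `cores` cores."""
    row = timings[cores]
    return sum(row["gfortran"]) / sum(row["ifort"])


for cores in sorted(timings):
    print(f"{cores} core(s): gfortran/ifort = {slowdown(cores):.2f}")
```

The ratio grows with the number of cores because HAMILT stays at about 36 s<br>
for gfortran in every run, while HNS and DIAG shrink as cores are added.<br>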
<br>
Anyway, the openblas is really good, and if somebody knows how to<br>
"vectorize" the cos, sin (exp) calls in gfortran, this would be very<br>
valuable.<br>
<br>
Peter Blaha<br>
<div class="quoted-text"><br>
On 12/08/2016 01:51 PM, John Rundgren wrote:<br>
> Dear Arthur,<br>
><br>
> "Linker Flags" and "R_LIB" are found by consulting google on<br>
> "xianyi-openblas user manual".<br>
><br>
> The "include" flag is necessary, otherwise there is a conflict with<br>
> /usr/link/ld.<br>
><br>
> Xianyi recommends -lopenblas and adds -lpthread -lgfortran with<br>
> motivations understood by wise Linuxers. They have not done any harm.<br>
><br>
> Could you improve calculation time ...? In a previous wien-bounces you<br>
> find a test where gfortran+openblas is fully competitive with intel+mkl.<br>
> A try is worthwhile.<br>
><br>
> Best regards / John<br>
><br>
><br>
> John Rundgren<br>
> Department of Theoretical Physics, KTH Royal Institute of Technology<br>
><br>
><br>
</div>> ______________________________<wbr>_________________<br>
> Wien mailing list<br>
> <a href="mailto:Wien@zeus.theochem.tuwien.ac.at">Wien@zeus.theochem.tuwien.ac.<wbr>at</a><br>
> <a href="http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien" rel="noreferrer" target="_blank">http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien</a><br>
> SEARCH the MAILING-LIST at:  <a href="http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html" rel="noreferrer" target="_blank">http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html</a><br>
><br>
<br>
--<br>
<br>
                                       P.Blaha<br>
------------------------------<wbr>------------------------------<wbr>--------------<br>
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna<br>
Phone: +43-1-58801-165300             FAX: +43-1-58801-165982<br>
Email: <a href="mailto:blaha@theochem.tuwien.ac.at">blaha@theochem.tuwien.ac.at</a>    WIEN2k: <a href="http://www.wien2k.at" rel="noreferrer" target="_blank">http://www.wien2k.at</a><br>
WWW:   <a href="http://www.imc.tuwien.ac.at/TC_Blaha" rel="noreferrer" target="_blank">http://www.imc.tuwien.ac.at/TC_Blaha</a><br>
------------------------------<wbr>------------------------------<wbr>--------------<br>
</blockquote></div><br></div>