<div dir="ltr">In fact Peter added the vector code in lapw1, although I added it to aim and lapw5. I wrote W2kinit with some help.<div><br></div><div>I suspect I use the -DINTEL_VML parameter in W2kinit, and perhaps in aim/lapw5, a bit sloppily, and it could be generalized. For instance, it would make sense to modify the code so that -DOPENBLAS or similar is set, and then have some compile-time #ifdef statements.</div><div><br></div><div>However, this gets somewhat tricky as I don't have access to all compilers (and I suspect Peter does not either). </div><div><br></div><div>Also, W2kinit does some important things, such as setting up some error handlers and setting ulimit. (If you go back a few years you will find every third email on the list was about ulimit issues!) Setting this for other systems can be tricky. I think we resolved the Mac issues, but they seem to recur.</div><div><br></div><div>And...one has to worry about compatibility and portability. While Fortran is standard, C is less so, and system calls embedded in compilers can change. Plus, gfortran is in a state of flux. When I recently tested the mixer with it I noticed that it gave a compile-time warning that DO loops with floating-point variables are a "deleted" feature. (Fortunately the mixer still seems to work.)</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, May 2, 2018 at 9:08 AM, Fecher, Gerhard <span dir="ltr"><<a href="mailto:fecher@uni-mainz.de" target="_blank">fecher@uni-mainz.de</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Dear Pavel,<br>
maybe it is better to ask Laurence; it seems he wrote the VML parts. <br>
<br>
I haven't looked at the code in recent years; what I found on a quick look is:<br>
<br>
The only place where INTEL_VML still seems to be used is in Hamilt.f of LAPW1.<br>
I found that it is commented out in all other places where it was once used.<br>
<br>
If you don't use INTEL_VML, the INTEL ifort will vectorize the loops in vectf.f of LAPW1 (see the code in Hamilt.f that calls it)<br>
(as I mentioned, maybe one has to link libsvml explicitly)<br>
<br>
For example <br>
-O2 -xHost -qopt-report=1 -qopt-report-phase=vec<br>
will show you which loops were vectorized<br>
<br>
I could not see that the SVML has reduced accuracy; however, you can set the performance/accuracy level in the VML.<br>
What you can do is set a threshold for the loop size (similar to unroll); this might need a short study of the manual.<br>
<br>
I could not see a threshold for the loops (size of the arrays) being set in W2kinit.F;<br>
only the precision was set there for INTEL_VML. However,<br>
I guess that Laurence used it only where large arrays appear.<br>
<br>
NB: I enjoy more questions about how to increase the speed or how to improve the code.<br>
<br>
<br>
Ciao<br>
Gerhard<br>
<br>
DEEP THOUGHT in D. Adams; Hitchhikers Guide to the Galaxy:<br>
"I think the problem, to be quite honest with you,<br>
is that you have never actually known what the question is."<br>
<br>
==============================<wbr>======<br>
Dr. Gerhard H. Fecher<br>
Institut of Inorganic and Analytical Chemistry<br>
Johannes Gutenberg - University<br>
55099 Mainz<br>
and<br>
Max Planck Institute for Chemical Physics of Solids<br>
01187 Dresden<br>
______________________________<wbr>__________<br>
From: Pavel Ondračka [<a href="mailto:pavel.ondracka@email.cz">pavel.ondracka@email.cz</a>]<br>
Sent: Wednesday, 2 May 2018 12:05<br>
To: Fecher, Gerhard<br>
Subject: Re: [Wien] Installation with MPI and GNU compilers<br>
<br>
I'm replying privately since this might be getting too technical for<br>
the list and is in fact not interesting for the majority of users...<br>
<br>
Fecher, Gerhard wrote on Wed, 02. 05. 2018 at 09:00 +0000:<br>
> I never checked that: does the -DINTEL_VML switch correspond to the<br>
> VML library routines of MKL<br>
> or to the<br>
> SVML library routines of the compiler<br>
<br>
lapw1 calls the VML library directly, for example the vdcos and vdsin<br>
functions, but I have not checked the rest of Wien2k.<br>
<br>
> this makes a difference, the svml routines are automatically invoked<br>
> by the INTEL compiler if one uses -O2 optimization or higher.<br>
> (check also the usage of the switches -vec, -no-vec, -vec-report)<br>
><br>
> The VML routines of the MKL only make sense for appropriate sizes of<br>
> the vectors; otherwise, they may even slow down the program (how much<br>
> might also depend on threads etc.).<br>
<br>
The common usage of the VML in Wien2k is to call the VML functions with<br>
a _large_ array as an argument. So if I understand it correctly the<br>
vectorization is done inside the VML and the VML chooses the best<br>
intrinsic. Since the arrays are large, there is a speedup in all cases.<br>
<br>
BTW are you sure the -O2 switch alone will give you the SVML<br>
intrinsics? IMO the SVML intrinsics have a different accuracy (they might not be<br>
strictly IEEE compliant, compared to the scalar variants), so I would<br>
expect you need to state explicitly with some additional flag that<br>
you are OK with this (e.g. for GCC you need the -ffast-math switch to<br>
get the vectorized SSE/AVX trigonometric functions from libmvec).<br>
<br>
> A note (for the INTEL Fortran):<br>
> I vaguely remember that the -DINTEL_VML switch did not bring any<br>
> better performance; at that time one needed to give -lsvml (with the<br>
> path to the compiler libs) explicitly.<br>
><br>
> Ciao<br>
> Gerhard<br>
><br>
Best regards<br>
Pavel<br>
______________________________<wbr>_________________<br>
Wien mailing list<br>
<a href="mailto:Wien@zeus.theochem.tuwien.ac.at">Wien@zeus.theochem.tuwien.ac.<wbr>at</a><br>
<a href="https://urldefense.proofpoint.com/v2/url?u=http-3A__zeus.theochem.tuwien.ac.at_mailman_listinfo_wien&d=DwIFBA&c=yHlS04HhBraes5BQ9ueu5zKhE7rtNXt_d012z2PA6ws&r=U_T4PL6jwANfAy4rnxTj8IUxm818jnvqKFdqWLwmqg0&m=wmyY1dhGbwZI3oD02dwXOGa_s3afe_LJdSea-dHnBZg&s=yKwJ1eDMhXNnx8lfTdFJzKZ2ky182xvw_qemg_Hap8c&e=" rel="noreferrer" target="_blank">https://urldefense.proofpoint.<wbr>com/v2/url?u=http-3A__zeus.<wbr>theochem.tuwien.ac.at_mailman_<wbr>listinfo_wien&d=DwIFBA&c=<wbr>yHlS04HhBraes5BQ9ueu5zKhE7rtNX<wbr>t_d012z2PA6ws&r=U_<wbr>T4PL6jwANfAy4rnxTj8IUxm818jnvq<wbr>KFdqWLwmqg0&m=<wbr>wmyY1dhGbwZI3oD02dwXOGa_s3afe_<wbr>LJdSea-dHnBZg&s=<wbr>yKwJ1eDMhXNnx8lfTdFJzKZ2ky182x<wbr>vw_qemg_Hap8c&e=</a><br>
SEARCH the MAILING-LIST at: <a href="https://urldefense.proofpoint.com/v2/url?u=http-3A__www.mail-2Darchive.com_wien-40zeus.theochem.tuwien.ac.at_index.html&d=DwIFBA&c=yHlS04HhBraes5BQ9ueu5zKhE7rtNXt_d012z2PA6ws&r=U_T4PL6jwANfAy4rnxTj8IUxm818jnvqKFdqWLwmqg0&m=wmyY1dhGbwZI3oD02dwXOGa_s3afe_LJdSea-dHnBZg&s=9-0hCMIR1GVVfXz3pmjs9JR2Su5gfqttq9X1dt11Rn8&e=" rel="noreferrer" target="_blank">https://urldefense.proofpoint.<wbr>com/v2/url?u=http-3A__www.<wbr>mail-2Darchive.com_wien-<wbr>40zeus.theochem.tuwien.ac.at_<wbr>index.html&d=DwIFBA&c=<wbr>yHlS04HhBraes5BQ9ueu5zKhE7rtNX<wbr>t_d012z2PA6ws&r=U_<wbr>T4PL6jwANfAy4rnxTj8IUxm818jnvq<wbr>KFdqWLwmqg0&m=<wbr>wmyY1dhGbwZI3oD02dwXOGa_s3afe_<wbr>LJdSea-dHnBZg&s=9-<wbr>0hCMIR1GVVfXz3pmjs9JR2Su5gfqtt<wbr>q9X1dt11Rn8&e=</a><br>
</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><span style="font-size:12.8px">Professor Laurence Marks</span><br></div><div dir="ltr"><span style="font-size:12.8px">"Research is to see what everybody else has seen, and to think what nobody else has thought", </span><span style="font-size:12.8px">Albert Szent-Gyorgi</span><br><a href="http://www.numis.northwestern.edu" target="_blank">www.numis.northwestern.edu</a> ; <span style="font-size:12.8px">Corrosion in 4D: </span><a href="http://MURI4D.numis.northwestern.edu" style="font-size:12.8px" target="_blank">MURI4D.numis.northwestern.edu</a><div><span style="font-size:12.8px">Partner of the CFW 100% program for gender equity, </span><a href="http://www.cfw.org/100-percent" style="font-size:12.8px" target="_blank">www.cfw.org/100-percent</a></div><div>Co-Editor, Acta Cryst A</div></div></div></div></div></div></div></div></div></div></div>
</div>