[Wien] Installation with MPI and GNU compilers

Rui Costa ruicosta.r15 at gmail.com
Mon Apr 30 19:35:48 CEST 2018


I was able to install WIEN2k with gfortran+MKL. Apparently the MKL
libraries are free [https://software.intel.com/en-us/performance-libraries],
but the compilers are not.

While running the benchmark tests we noticed a huge difference in the Hamilt
part between this build and an ifort+MKL compilation, and as Pavel said, this
comes from the VML functions. The DIAG part does not show this difference,
because DIAG is handled by MKL while Hamilt is WIEN2k's own code. I then tried
to compile with these VML functions but could not, because that requires an
ifcore.mod file which, I think, ships only with the Intel compilers; at least
it is not included in the free MKL package.
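
In case it is useful to anyone checking the same thing, this is how I looked
for the module (a trivial sketch, assuming the standalone MKL package is
installed under /opt/intel as in the library paths further below):

   # sketch: look for ifcore.mod anywhere under the Intel install tree
   find /opt/intel -name ifcore.mod 2>/dev/null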

Do you have any recommendations for compilation options that would better
optimize WIEN2k?
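
For reference, this is the kind of thing I was planning to test myself (just
an untested sketch of generic gfortran optimization flags, not WIEN2k-specific
advice):

   # untested sketch: more aggressive gfortran flags to compare against the -O2
   # build shown below (the -I include paths from below still need to be appended)
   FOPT="-ffree-form -O3 -march=native -ftree-vectorize -ffree-line-length-none -fopenmp -m64"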

The ones I used are the following:

 ***********************************************************************
 *                 Specify compiler and linker options                 *
 ***********************************************************************


 Recommended options for system linuxgfortran are:
      Compiler options:        -ffree-form -O2 -ffree-line-length-none
      Linker Flags:            $(FOPT) -L../SRC_lib
      Preprocessor flags:      '-DParallel'
      R_LIB (LAPACK+BLAS):     -lopenblas -llapack -lpthread

 Current settings:
  O   Compiler options:        -ffree-form -O2 -ftree-vectorize -ffree-line-length-none
                               -fopenmp -m64 -I$(MKLROOT)/include -I/opt/openmpi/include
  L   Linker Flags:            $(FOPT) -L$(MKLROOT)/lib/$(MKL_TARGET_ARCH)
                               -L/opt/openmpi/lib -L/opt/fftw3/lib -pthread
  P   Preprocessor flags:      '-DParallel'
  R   R_LIBS (LAPACK+BLAS):    /opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64_lin/libmkl_blas95_lp64.a
                               /opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64_lin/libmkl_lapack95_lp64.a
                               -Wl,--no-as-needed -lmkl_gf_lp64 -lmkl_gnu_thread -lmkl_core -lgomp -lpthread -lm -ldl
  X   LIBX options:            -DLIBXC -I/opt/etsf/include
      LIBXC-LIBS:              -L/opt/etsf/lib -lxcf03 -lxc

   ***********************************************************************
   *             Specify parallel options and library settings           *
   ***********************************************************************

   Your current parallel settings (options and libraries) are:

     C   Parallel Compiler:          mpifort
     FP  Parallel Compiler Options:  -ffree-form -O2 -ftree-vectorize -ffree-line-length-none
                                     -fopenmp -m64 -I$(MKLROOT)/include -I/opt/openmpi/include
     MP  MPIRUN command:             mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_

   Additional setting for SLURM batch systems (is set to 1 otherwise):

     CN  Number of Cores:            1

   Libraries:

     F   FFTW options:                -DFFTW3 -I/opt/fftw3/include
         FFTW-LIBS:                   -L/opt/fftw3/lib -lfftw3
         FFTW-PLIBS:                  -lfftw3_mpi
     Sp  SCALAPACK:                   -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 -lmkl_scalapack_lp64
                                      -L/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64 -lmkl_blacs_openmpi_lp64
     E   ELPA options:
         ELPA-LIBS:

   Since you use gfortran you might need to specify additional libraries.
   You have to make sure that all necessary libraries are present (e.g. MPI, ...)
   and can be found by the linker (specify, if necessary, -L/Path_to_library)!

     RP  Parallel-Libs for gfortran:
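
After building with these settings, a quick sanity check I use (just a sketch,
assuming the standard WIEN2k binary names and that $WIENROOT is set) is to
confirm that the MPI executables actually resolve MKL, FFTW and OpenMPI at
run time:

     # sketch: verify that the MPI binary picks up the MKL, FFTW3 and OpenMPI
     # shared libraries configured above
     ldd $WIENROOT/lapw1_mpi | grep -E 'mkl|fftw|mpi'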



Additionally, whenever I try to run a simulation with the "-it" flag, the
calculation fails in the second cycle with a "Fortran runtime error". In this
example I am running TiC from the UG and executing the command "run_lapw
-it":

hup: Command not found.
STOP  LAPW0 END
foreach: No match.
Note: The following floating-point exceptions are signalling: IEEE_DENORMAL
STOP  LAPW1 END
STOP  LAPW2 END
STOP  CORE  END
STOP  MIXER END
ec cc and fc_conv 0 1 1
in cycle 2    ETEST: 0   CTEST: 0
hup: Command not found.
STOP  LAPW0 END
At line 140 of file jacdavblock_tmp_.F (unit = 200, file = './TiC.storeHinv_proc_0')
Fortran runtime error: Sequential READ or WRITE not allowed after EOF marker, possibly use REWIND or BACKSPACE

>   stop error


The "TiC.storeHinv_proc_0" file is empty and I can't find the file "
jacdavblock_tmp_.F". What could be the problem?
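
What I plan to try next (an untested sketch; I am assuming from other posts
that "-noHinv" skips the stored H^-1 files during the iterative
diagonalization, please correct me if that is wrong) is to remove the stale
files and restart:

   # untested sketch: clear the stale (empty) storeHinv files left by the aborted
   # run and restart the iterative diagonalization without storing H^-1
   rm -f TiC.storeHinv_proc_*
   run_lapw -it -noHinv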

Best regards,
Rui Costa.

On 5 April 2018 at 11:18, Pavel Ondračka <pavel.ondracka at email.cz> wrote:

> Laurence Marks wrote on Wed, 04. 04. 2018 at 16:01 +0000:
> > I confess to being rather doubtful that gfortran+... is comparable to
> > ifort+... for Intel cpu, it might be for AMD. While the mkl vector
> > libraries are useful in a few codes such as aim, they are minor for
> > the main lapw[0-2].
>
> Well, some fast benchmark data then (serial benchmark single core):
> Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz (haswell)
> Wien2k 17.1
>
> -------------
>
> gfortran 7.3.1 + OPENBLAS 0.2.20 + glibc 2.26 (with the custom patch to
> use libmvec):
>
> Time for al,bl    (hamilt, cpu/wall) :          0.2         0.2
> Time for legendre (hamilt, cpu/wall) :          0.1         0.2
> Time for phase    (hamilt, cpu/wall) :          1.2         1.2
> Time for us       (hamilt, cpu/wall) :          1.2         1.2
> Time for overlaps (hamilt, cpu/wall) :          2.6         2.8
> Time for distrib  (hamilt, cpu/wall) :          0.1         0.1
> Time sum iouter   (hamilt, cpu/wall) :          5.5         5.8
>  number of local orbitals, nlo (hamilt)      304
>        allocate YL           2.5 MB          dimensions    15  3481     3
>        allocate phsc         0.1 MB          dimensions  3481
> Time for los      (hamilt, cpu/wall) :          0.4         0.3
> Time for alm         (hns) :          0.1
> Time for vector      (hns) :          0.3
> Time for vector2     (hns) :          0.3
> Time for VxV         (hns) :          2.1
> Wall Time for VxV    (hns) :          0.1
>          245  Eigenvalues computed
>  Seclr4(Cholesky complete (CPU)) :               1.380     40754.14 Mflops
>  Seclr4(Transform to eig.problem (CPU)) :        4.470     37745.44 Mflops
>  Seclr4(Compute eigenvalues (CPU)) :            12.750     17643.13 Mflops
>  Seclr4(Backtransform (CPU)) :                   0.290     10237.08 Mflops
>        TIME HAMILT (CPU)  =     5.8, HNS =     2.5, HORB =     0.0, DIAG =    18.9
>        TIME HAMILT (WALL) =     6.1, HNS =     2.5, HORB =     0.0, DIAG =    19.0
>
> real    0m28.610s
> user    0m27.817s
> sys     0m0.394s
>
> -----------
>
> Ifort 17.0.0 + MKL 2017.0:
>
> Time for al,bl    (hamilt, cpu/wall) :          0.2         0.2
> Time for legendre (hamilt, cpu/wall) :          0.1         0.2
> Time for phase    (hamilt, cpu/wall) :          1.2         1.3
> Time for us       (hamilt, cpu/wall) :          1.0         1.0
> Time for overlaps (hamilt, cpu/wall) :          2.6         2.8
> Time for distrib  (hamilt, cpu/wall) :          0.1         0.1
> Time sum iouter   (hamilt, cpu/wall) :          5.4         5.6
>  number of local orbitals, nlo (hamilt)      304
>        allocate YL           2.5 MB          dimensions    15  3481     3
>        allocate phsc         0.1 MB          dimensions  3481
> Time for los      (hamilt, cpu/wall) :          0.2         0.2
> Time for alm         (hns) :          0.0
> Time for vector      (hns) :          0.4
> Time for vector2     (hns) :          0.4
> Time for VxV         (hns) :          2.1
> Wall Time for VxV    (hns) :          0.1
>          245  Eigenvalues computed
>  Seclr4(Cholesky complete (CPU)) :               1.110     50667.31 Mflops
>  Seclr4(Transform to eig.problem (CPU)) :        3.580     47129.09 Mflops
>  Seclr4(Compute eigenvalues (CPU)) :            11.320     19873.04 Mflops
>  Seclr4(Backtransform (CPU)) :                   0.250     11875.01 Mflops
>        TIME HAMILT (CPU)  =     5.7, HNS =     2.6, HORB =     0.0, DIAG =    16.3
>        TIME HAMILT (WALL) =     5.9, HNS =     2.6, HORB =     0.0, DIAG =    16.3
>
> real    0m25.587s
> user    0m24.857s
> sys     0m0.321s
> -------------
>
> So I apologize for my statement in the last email, which was too
> ambitious. Indeed, in this particular case the opensource stack is ~12%
> slower (25 vs 28 seconds). Most of this is in the DIAG part (which I
> believe is where OpenBLAS comes into play). However, on some other (older)
> Intel CPUs the DIAG part can be even faster with OpenBLAS, see the
> already mentioned email by prof. Blaha,
> https://www.mail-archive.com/wien at zeus.theochem.tuwien.ac.at/msg15106.html,
> where he tested on an i7-3930K
> (sandybridge), hence for those older CPUs I would expect the
> performance to be really comparable (with the small patch to utilize
> the libmvec in order to speed up the HAMILT part).
>
> In general, opensource support is usually slow to materialize, hence the
> performance on older CPUs is better. This is especially true for OpenBLAS,
> where the optimizations for new CPUs and instruction sets are not
> provided by Intel (contrary to gcc, gfortran and glibc, where Intel
> engineers contribute directly), while MKL and ifort have good
> support from day 1.
>
> I do agree that it is better to advise users to use MKL+ifort, since
> when they have it properly installed, siteconfig is almost always
> able to detect and build everything out of the box with the default config.
> This is unfortunately not the case with the opensource libraries, where
> the detection does not work most of the time due to distro differences and
> the unfortunate fact that the majority of the needed libraries do not
> provide any good means for autodetection (e.g. proper package config
> files), hence the user must edit the compiler flags by hand. I just
> believe that the "ifort is always much faster than gfortran" dogma is
> no longer always true.
>
> Best regards
> Pavel
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
> SEARCH the MAILING-LIST at:
> http://www.mail-archive.com/wien at zeus.theochem.tuwien.ac.at/index.html
>

