[Wien] Parallel calculation with Dual Quad Core Processors

Gerhard Fecher fecher at uni-mainz.de
Sun Jan 10 09:42:35 CET 2010


Do you use the serial or the parallel version of the MKL?
The MKL that is used for matrix operations is internally parallelized; its behaviour is
usually controlled by the environment variables OMP_NUM_THREADS or MKL_NUM_THREADS
as well as some other MKL-related ones (check the MKL manual).
The use of the parallel MKL may therefore cause all of your cores to be used.
Usually the Wien2k config script sets OMP_NUM_THREADS=1 in the .bashrc.
If you cannot figure out what set the thread number for the MKL, try to force Wien2k to use the serial version (check the compiler and linker options
in the Intel manuals).
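
As a minimal sketch of the setting described above (assuming a bash-like shell; where the lines end up, e.g. in .bashrc, depends on your setup), the thread count can be pinned and then checked like this:

```shell
# Restrict the threaded MKL/OpenMP layer to a single thread per process.
# OMP_NUM_THREADS and MKL_NUM_THREADS are the variables named above;
# other MKL-related variables are described in the MKL manual.
export OMP_NUM_THREADS=1
export MKL_NUM_THREADS=1

# Show what the current shell will pass on to programs started from it.
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS MKL_NUM_THREADS=$MKL_NUM_THREADS"
```

Programs that were started before the change keep the values they inherited at start-up.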

Indeed, using several threads for the MKL together with k-point parallelization may be counterproductive and can even slow down the program.
Note that if you change the number of threads, e.g. by setting OMP_NUM_THREADS=1, you may need to restart W2WEB; otherwise it stays
with the old value.
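
One way to avoid running more threads than you have cores is to choose the thread count from the number of k-point jobs. The following is only a sketch: KPOINT_JOBS is a hypothetical variable that you would set by hand to match your .machines file, and the rule "threads = cores / jobs" is an assumption, not something Wien2k does for you.

```shell
# Hypothetical value: number of k-point jobs run in parallel on this machine
# (set it by hand to match your .machines file).
KPOINT_JOBS=4

# Number of online cores; fall back to 1 if getconf cannot report it.
CORES=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Integer division, but never fewer than one thread per job.
THREADS=$(( CORES / KPOINT_JOBS ))
if [ "$THREADS" -lt 1 ]; then
    THREADS=1
fi

export OMP_NUM_THREADS=$THREADS
echo "cores=$CORES jobs=$KPOINT_JOBS threads_per_job=$OMP_NUM_THREADS"
```

With as many k-point jobs as cores, this reduces to OMP_NUM_THREADS=1.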

Ciao
Gerhard

====================================
Dr. Gerhard H. Fecher
Institut of Inorganic and Analytical Chemistry
Johannes Gutenberg - University
55099 Mainz
________________________________________
From: wien-bounces at zeus.theochem.tuwien.ac.at [wien-bounces at zeus.theochem.tuwien.ac.at] on behalf of 马超 [cma at blem.ac.cn]
Sent: Sunday, 10 January 2010 00:32
To: wien at zeus.theochem.tuwien.ac.at
Subject: Re: [Wien] Parallel calculation with Dual Quad Core Processors

Dear Prof. Marks,

Thanks very much for your suggestions. Now, I know how to set up the parallel calculation and edit .machines.

In addition, I have another question about the performance of my machine with Dual Quad Core Processors (i.e. 8 CPUs). By checking the processes, I found that all eight CPUs of my machine were used at more than 90%, even when I performed the calculation without k-point parallelization. I have compared the two cases, with and without k-point parallelization, and found that they took almost the same time. So I think it is not necessary to perform a k-point parallel calculation in my case, and that this workstation will automatically distribute the work evenly over all eight CPUs. Is my understanding right? Thanks,

Best regards,

Chao

