[Wien] OMP_NUM_THREADS

Gerhard Fecher fecher at uni-mainz.de
Wed Jul 27 21:24:19 CEST 2011


Here is a simplified answer for a machine with a single multi-core processor.

The hyperthreading of the processor has nothing to do with the threads that MKL uses.
MKL is (internally) parallelized, that is, it can execute certain loops in parallel. Setting OMP_NUM_THREADS
tells MKL how many processor cores to use for this parallel handling.

Roughly speaking, hyperthreading lets each core act as two logical cores that share one set of floating point units,
so it pays off when one thread does integer work while the other does floating point work,
but that's usually not what you have when solving numerical things in Wien.
Now if you tell MKL to use 8 cores for floating point operations but you have only 4 floating point units (because you have only 4 cores),
hyperthreading will most probably not distribute the work in a way that speeds things up.
(In some cases hyperthreading also helps if you have a lot of disk or other operations (playing games during calculations?), that is, if the processor can
run the floating point operations and use the integer unit for other stuff.)
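
To see how many real cores sit behind the CPUs that the OS reports, something like the following sketch may help. It assumes a Linux system with lscpu (from util-linux) installed; count_physical_cores is a made-up helper name for illustration, not part of WIEN2k or MKL.

```shell
#!/bin/sh
# count_physical_cores is a hypothetical helper, not a WIEN2k or MKL tool.
# It reads "core,socket" pairs, one per logical CPU, as printed by
# `lscpu -p=CORE,SOCKET`, drops the '#' header lines and counts unique pairs.
count_physical_cores() {
    grep -v '^#' | sort -u | wc -l | tr -d ' '
}

if command -v lscpu >/dev/null 2>&1; then
    physical=$(lscpu -p=CORE,SOCKET | count_physical_cores)
    logical=$(getconf _NPROCESSORS_ONLN)
    echo "logical CPUs seen by the OS: $logical"
    echo "physical cores:              $physical"
    echo "k-point processes x OMP_NUM_THREADS should not exceed $physical"
else
    echo "lscpu not found; check the core count by other means" >&2
fi
```

On a four-core machine with hyperthreading switched on, one would expect this to report 8 logical CPUs and 4 physical cores.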

If you use k-point parallelization, you have in principle the same thing as with MKL, just one level higher.

The behavior will depend on how the memory is used: if MKL can keep the intermediate results in the processor (registers, first-level cache),
then using more than one thread is often faster than using k-point parallelization, because the data are held in the cache.

With a quad-core you have three choices:
1) 4 k-point processes (4 times "1:localhost")
2) 2 k-point processes (2 times "1:localhost") and OMP_NUM_THREADS=2
3) OMP_NUM_THREADS=4
There is no definite answer as to which will be faster, more k-points in parallel or more MKL threads; it will depend on when and how much data
the process needs at a given moment, and on what makes the best use of the cache.
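
The three choices can be written out with a small sketch like the one below. The helper names print_machines and threads_per_process are invented for illustration and are not part of WIEN2k. The file layout (granularity and extrafine lines) is copied from the .machines file quoted later in this thread. Using a one-line .machines file for choice 3 is an assumption; a plain serial run may do just as well.

```shell
#!/bin/sh
# Hypothetical helper (not part of WIEN2k): print a .machines file with
# N k-point lines, using the layout quoted later in this thread.
print_machines() {
    n=$1
    echo "granularity:1"
    i=0
    while [ "$i" -lt "$n" ]; do
        echo "1:localhost"
        i=$((i + 1))
    done
    echo "extrafine:1"
}

# Threads per k-point process so that processes x threads = physical cores.
threads_per_process() {
    cores=$1
    kprocs=$2
    echo $((cores / kprocs))
}

CORES=4   # assumption: four physical cores, hyperthreading not counted
for k in 4 2 1; do
    echo "# $k k-point process(es), OMP_NUM_THREADS=$(threads_per_process "$CORES" "$k")"
    print_machines "$k"
done
```

To try choice 2, for example, one could redirect the output of "print_machines 2" into .machines and export OMP_NUM_THREADS=2 before starting the parallel run, then compare the timing against the other two choices.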

Actually I found for my purpose that just OMP_NUM_THREADS=2 and running two Wien calculations
in parallel is the fastest, as I am not an input machine and like to drink a lot of coffee.

On the older dual cores it was never a good choice to use hyperthreading and OMP_NUM_THREADS together.
What happened was that all the work was still done on a single core while the other did almost nothing,
and even Intel recommended switching off hyperthreading when using OMP_NUM_THREADS (I tried it some years ago with Wien on a 2-processor dual-core Xeon machine
and found that Intel is right).
I don't know whether this behavior in distributing the processes has meanwhile been changed by hardware or software management.

Ciao
Gerhard

====================================
Dr. Gerhard H. Fecher
Institut of Inorganic and Analytical Chemistry
Johannes Gutenberg - University
55099 Mainz
________________________________________
From: wien-bounces at zeus.theochem.tuwien.ac.at [wien-bounces at zeus.theochem.tuwien.ac.at] on behalf of Dr Qiwen YAO [Yao.Qiwen at nims.go.jp]
Sent: Wednesday, 27 July 2011 17:30
To: A Mailing list for WIEN2k users
Subject: Re: [Wien] OMP_NUM_THREADS

Dear Gerhard,
Thank you very much for your response.
I am a bit slow in catching up with what you are saying. May I rephrase what you've suggested, to see if I understand it correctly:

For 4 k-points and 4 MKL threads - do you mean I would put 4 lines of "1:localhost" in the .machines file and set OMP_NUM_THREADS=4?

And for the 2 k-points and 2 MKL threads - do I put only 2 lines of "1:localhost" in the .machines file and set OMP_NUM_THREADS=2?

If I am understanding you correctly, I will try both scenarios and see which one is more efficient.

Thank you so much for your time and help!

Qiwen


------Original Message------
From:"Gerhard Fecher"<fecher at uni-mainz.de>
To:"A Mailing list for WIEN2k users"<wien at zeus.theochem.tuwien.ac.at>
Cc:
Subject:Re: [Wien] OMP_NUM_THREADS
Date:07/27/2011 03:17:21 PM(+0000)
>If you have four "real" cores, you may run in parallel either 4 k-points, or 4 MKL threads, or 2 k-points and 2 MKL threads.
>
>In some cases it might be good to "switch off the virtual cores" in the BIOS; at least with older processors/compilers this was faster,
>but I have not checked this recently.
>
>Ciao
>Gerhard
>
>====================================
>Dr. Gerhard H. Fecher
>Institut of Inorganic and Analytical Chemistry
>Johannes Gutenberg - University
>55099 Mainz
>________________________________________
>From: wien-bounces at zeus.theochem.tuwien.ac.at [wien-bounces at zeus.theochem.tuwien.ac.at] on behalf of Dr Qiwen YAO [Yao.Qiwen at nims.go.jp]
>Sent: Wednesday, 27 July 2011 14:13
>To: A Mailing list for WIEN2k users
>Subject: [Wien] OMP_NUM_THREADS
>
>Dear Wien2k users,
>
>We were told at the WIEN workshop that for MKL on multi-core machines it might be better to set OMP_NUM_THREADS=2.
>
>I have two questions in my mind:
>
>Q1. Does this apply to a 2-core system with 4GB RAM that is not running parallel calculations (neither k-point parallel nor MPI-parallel)?
>
>
>Q2. Or does this only apply to, e.g., a quad-core machine that runs k-point parallel or MPI-parallel calculations?
>
>I have a 4-core Dell T7500 PC with 12GB RAM. Each core has two threads, so SUSE Linux and even Windows display it as an 8-CPU machine (it is actually a four-core CPU, but since each core has 2 threads, the OS sees it as an 8-core CPU). The details of this CPU are here, if you would like to see them: http://ark.intel.com/products/37111
>
>I am setting up this machine to run k-point parallel calculations (not MPI-parallel, as I have only one such machine at the moment), and I am wondering:
>
>Which of the following 2 scenarios is the better choice for a 90-atom supercell calculation?
>
>Scenario  1.
>The .machines file is this:
>-------
>granularity:1
>1:localhost
>1:localhost
>1:localhost
>1:localhost
>1:localhost
>1:localhost
>1:localhost
>1:localhost
>extrafine:1
>----------
>and OMP_NUM_THREADS=1 as the default in my .bashrc file,
>so no multi-threading, only k-point parallelism. With this setting I notice, after the job has run for a while (more than an hour, say), that of the 8 CPUs shown in the System Monitor only two are really utilized at a time. Which CPUs are fully loaded keeps switching, but mostly only two are fully loaded at any moment; the other 6 CPUs are not doing much, some at a few percent load and others even at 0%. So I was wondering whether this setting is not optimal?
>
>Scenario  2.
>The .machines file would look like this:
>-------
>granularity:1
>1:localhost
>1:localhost
>1:localhost
>1:localhost
>extrafine:1
>----------
>and OMP_NUM_THREADS=2 set in my .bashrc file. I have not tried this setting, as I am not sure whether it would work.
>
>Or would both settings work and make little difference in the calculation time for a 90-atom supercell? I am new to WIEN, so I do not yet fully understand how threading works in WIEN.
>
>In addition, for the above two .machines files, would it make any difference if I put the real hostname in place of "localhost"?
>
>Any comment would be greatly appreciated.
>
>Thank you!
>
>Kind regards,
>Qiwen
>
>**********************************************************
>
>Dr QiWen YAO
>
>JSPS Fellow
>Multifunctional Materials Group
>Optical and Electronic Materials Unit
>Environment and Energy Materials Research Division
>
>National Institute for Materials Science
>
>1-2-1 Sengen, Tsukuba, Ibaraki 305-0047, Japan
>Phone: +81-29-851-3354, ext. no. 6482, Fax: +81-29-859-2501
>
>**********************************************************
>
>_______________________________________________
>Wien mailing list
>Wien at zeus.theochem.tuwien.ac.at
>http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
