[Wien] OMP_NUM_THREADS

Dr Qiwen YAO Yao.Qiwen at nims.go.jp
Thu Jul 28 03:05:12 CEST 2011


Dear Gerhard,
Thank you very much for your detailed reply. It is very clear. I am very grateful for your kind help.

Have a nice day.

Kind regards,
Qiwen


------Original Message------
From:"Gerhard Fecher"<fecher at uni-mainz.de>
To:"A Mailing list for WIEN2k users"<wien at zeus.theochem.tuwien.ac.at>
Cc:
Subject:Re: [Wien] OMP_NUM_THREADS
Date:07/27/2011 07:24:19 PM(+0000)
>Here is a simplified answer for a single-processor, multi-core machine.
>
>Hyperthreading of the processor has nothing to do with the threads that MKL uses.
>MKL is internally parallelized, that is, it can execute certain loops in parallel; setting OMP_NUM_THREADS
>tells MKL how many processor cores to use for that parallel work.
>
>Hyperthreading means that each physical core appears as two logical cores that share its execution units, so, for example, one thread can do integer work while the other does floating-point work,
>but that's usually not the situation when solving numerical problems in Wien.
>Now if you tell MKL to use 8 cores for floating-point operations but you have only 4 floating-point units (because you have only 4 cores),
>hyperthreading will most probably not distribute the work in a way that speeds things up.
>(In some cases hyperthreading also helps if you have a lot of disk or other operations (playing games during calculations?), that is, if the processor can
>run the floating-point operations and use the integer unit for other stuff.)
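
To see how many physical cores sit behind the logical CPUs the OS reports, something like the following may help. It is a minimal sketch, assuming a Linux system with `lscpu` (util-linux) and `nproc` (coreutils); the helper name `count_physical_cores` is hypothetical and only counts distinct core/socket pairs.

```shell
# Count distinct (core, socket) pairs read from stdin.
# Expected input: the output of `lscpu -p=CORE,SOCKET`, i.e. comment lines
# starting with '#' followed by one "core,socket" line per logical CPU.
count_physical_cores() {
    grep -v '^#' | sort -u | wc -l | tr -d ' '
}

# Only query the machine if the tools are actually installed.
if command -v lscpu >/dev/null 2>&1 && command -v nproc >/dev/null 2>&1; then
    echo "logical CPUs available: $(nproc)"
    echo "physical cores:         $(lscpu -p=CORE,SOCKET | count_physical_cores)"
fi
```

On a four-core machine with hyperthreading enabled, the first number would be expected to be 8 and the second 4.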
>
>If you use k-point parallelization, you have in principle something like the MKL threading, just one level higher.
>
>The behavior depends on how the memory is used: if MKL can keep the intermediate results in the processor (registers, first-level cache),
>then using more than one thread is often faster than using k-point parallelization because the data are held in the cache.
>
>With a quad-core you have three choices:
>1) 4 k-point processes (4 times "1:localhost")
>2) 2 k-point processes (2 times "1:localhost") and OMP_NUM_THREADS=2
>3) OMP_NUM_THREADS=4
>but there is no definite answer as to which is faster, more parallel k-points or more MKL threads; it depends on when and how much data
>the process needs and on what makes the best use of the cache.
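
As an illustration of the three choices, here is a rough sketch that generates a .machines file in the layout quoted later in this thread (granularity, N lines of "1:localhost", extrafine) and sets OMP_NUM_THREADS to match. The helper `write_machines` and the file name `machines.example` are hypothetical; in a real case directory the file would be named `.machines`.

```shell
# write_machines N [FILE]: write N k-point processes on localhost.
# FILE defaults to .machines; the layout copies the example in this thread.
write_machines() {
    n=$1
    file=${2:-.machines}
    {
        echo "granularity:1"
        i=0
        while [ "$i" -lt "$n" ]; do
            echo "1:localhost"
            i=$((i + 1))
        done
        echo "extrafine:1"
    } > "$file"
}

# Choice 1: 4 k-point processes, 1 MKL thread each
#   write_machines 4; export OMP_NUM_THREADS=1
# Choice 3: no k-point parallelization, 4 MKL threads
#   export OMP_NUM_THREADS=4

# Choice 2: 2 k-point processes, 2 MKL threads each
write_machines 2 machines.example
export OMP_NUM_THREADS=2
```

Timing one SCF cycle with each setting on the actual case is probably the only reliable way to pick between them.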
>
>Actually, I found for my purposes that just setting OMP_NUM_THREADS=2 and running two Wien calculations
>in parallel is the fastest, as I am not an input machine and like to drink a lot of coffee.
>
>On the older dual cores it was never a good choice to use hyperthreading and OMP_NUM_THREADS together;
>what happened was that all the work was still done on a single core while the other did almost nothing,
>and even Intel recommended switching off hyperthreading when using OMP_NUM_THREADS (I tried it some years ago with Wien on a two-processor dual-core Xeon machine
>and found that Intel was right).
>I don't know whether this behavior in distributing the processes has since been changed by hardware or software management.
>
>Ciao
>Gerhard
>
>====================================
>Dr. Gerhard H. Fecher
>Institut of Inorganic and Analytical Chemistry
>Johannes Gutenberg - University
>55099 Mainz
>________________________________________
>From: wien-bounces at zeus.theochem.tuwien.ac.at [wien-bounces at zeus.theochem.tuwien.ac.at] on behalf of Dr Qiwen YAO [Yao.Qiwen at nims.go.jp]
>Sent: Wednesday, 27 July 2011 17:30
>To: A Mailing list for WIEN2k users
>Subject: Re: [Wien] OMP_NUM_THREADS
>
>Dear Gerhard,
>Thank you very much for your response.
>I am a bit slow in catching up with what you are saying; may I rephrase what you've suggested to check that I understand it:
>
>For 4 k-points and 4 mkl threads, do you mean I would set 4 lines of "1:localhost" in the .machines file and set OMP_NUM_THREADS=4?
>
>And for the 2 k-points and 2 mkl threads, do I set only 2 lines of "1:localhost" in the .machines file and set OMP_NUM_THREADS=2?
>
>If I am understanding you correctly, I will try both scenarios and see which one is more efficient.
>
>Thank you so much for your time and help!
>
>Qiwen
>
>
>------Original Message------
>From:"Gerhard Fecher"<fecher at uni-mainz.de>
>To:"A Mailing list for WIEN2k users"<wien at zeus.theochem.tuwien.ac.at>
>Cc:
>Subject:Re: [Wien] OMP_NUM_THREADS
>Date:07/27/2011 03:17:21 PM(+0000)
>>If you have four "real" cores you may run in parallel either 4 k-points or 4 mkl threads or 2 k points and 2 mkl threads
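
The combinations named above are the ways to split the physical cores so that (k-point processes) x (MKL threads) equals the core count. A small sketch that enumerates them; the function name `list_splits` is hypothetical and the core count is passed in by hand:

```shell
# list_splits CORES: print every pair (k-point processes, MKL threads)
# whose product equals the number of physical cores.
list_splits() {
    cores=$1
    k=$cores
    while [ "$k" -ge 1 ]; do
        if [ $((cores % k)) -eq 0 ]; then
            echo "$k k-point process(es) x $((cores / k)) MKL thread(s)"
        fi
        k=$((k - 1))
    done
}

list_splits 4
```

For 4 cores this lists the three options 4 x 1, 2 x 2 and 1 x 4; which of them runs fastest still has to be measured.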
>>
>>In some cases it might be good to "switch off the virtual cores" in the BIOS; at least with older processors/compilers this was faster,
>>but I have not checked recently.
>>
>>Ciao
>>Gerhard
>>
>>====================================
>>Dr. Gerhard H. Fecher
>>Institut of Inorganic and Analytical Chemistry
>>Johannes Gutenberg - University
>>55099 Mainz
>>________________________________________
>>From: wien-bounces at zeus.theochem.tuwien.ac.at [wien-bounces at zeus.theochem.tuwien.ac.at] on behalf of Dr Qiwen YAO [Yao.Qiwen at nims.go.jp]
>>Sent: Wednesday, 27 July 2011 14:13
>>To: A Mailing list for WIEN2k users
>>Subject: [Wien] OMP_NUM_THREADS
>>
>>Dear Wien2k users,
>>
>>We were told at the WIEN workshop that for MKL on multi-core machines it might be better to set OMP_NUM_THREADS=2.
>>
>>I have two questions in my mind:
>>
>>Q1. Does this apply to a 2-core system with 4GB RAM that is not running parallel calculations (neither k-point parallel nor mpi-parallel)?
>>
>>Q2. Or does this only apply to, e.g., a quad-core machine that runs k-point parallel or mpi-parallel calculations?
>>
>>I have a 4-core Dell T7500 PC with 12GB RAM. Each core has two threads, so SUSE Linux and even Windows display it as an 8-CPU machine (it is actually a four-core CPU, but with 2 threads per core the OS sees 8 CPUs). The details of this CPU are here: http://ark.intel.com/products/37111
>>
>>I am setting up this machine to run k-parallel calculations (not mpi-parallel, as I have only one such machine for the moment), and I am wondering:
>>
>>Which of the following 2 scenarios is a better choice for a 90 atom supercell calculation?
>>
>>Scenario  1.
>>The .machines file is this:
>>-------
>>granularity:1
>>1:localhost
>>1:localhost
>>1:localhost
>>1:localhost
>>1:localhost
>>1:localhost
>>1:localhost
>>1:localhost
>>extrafine:1
>>----------
>>and OMP_NUM_THREADS=1 as the default in my .bashrc file,
>>so no multi-threading but all k-parallelism. (With this setting, I notice that after the job has run for a while, say more than an hour, the System Monitor shows only two of the 8 CPUs really being utilized at a time. Which CPUs are fully loaded keeps switching, but mostly only two are fully loaded at once, while the other 6 are not doing much: some carry a few percent of load and others 0%. So I was wondering whether this setting is not optimal.)
>>
>>Scenario  2.
>>The .machines file would be like this:
>>-------
>>granularity:1
>>1:localhost
>>1:localhost
>>1:localhost
>>1:localhost
>>extrafine:1
>>----------
>>and set OMP_NUM_THREADS=2 in my .bashrc file. I have not tried this setting, as I am not sure whether it would work.
>>
>>Or would both settings work without much difference in calculation time for a 90-atom supercell? I am new to WIEN, so I do not fully understand threading in the WIEN context.
>>
>>In addition, for the above two .machines files, would it make any difference if I put the real hostname in place of "localhost"?
>>
>>Any comment would be greatly appreciated.
>>
>>Thank you!
>>
>>Kind regards,
>>Qiwen
>>
>>**********************************************************
>>
>>Dr QiWen YAO
>>
>>JSPS Fellow
>>Multifunctional Materials Group
>>Optical and Electronic Materials Unit
>>Environment and Energy Materials Research Division
>>
>>National Institute for Materials Science
>>
>>1-2-1 Sengen, Tsukuba, Ibaraki 305-0047, Japan
>>Phone: +81-29-851-3354, ext. no. 6482, Fax: +81-29-859-2501
>>
>>**********************************************************
>>
>>_______________________________________________
>>Wien mailing list
>>Wien at zeus.theochem.tuwien.ac.at
>>http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>

**********************************************************

Dr QiWen YAO

JSPS Fellow
Multifunctional Materials Group
Optical and Electronic Materials Unit
Environment and Energy Materials Research Division

National Institute for Materials Science

1-2-1 Sengen, Tsukuba, Ibaraki 305-0047, Japan
Phone: +81-29-851-3354, ext. no. 6482, Fax: +81-29-859-2501

**********************************************************


