<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>You might also check what OMP_NUM_THREADS is set to on your
system in .bashrc or .cshrc.</p>
<p>For example, on my Ubuntu system, I do:<br>
</p>
<p>username@computername:~/Desktop$ grep OMP_NUM_THREADS ~/.bashrc<br>
export OMP_NUM_THREADS=1<br>
</p>
<p>As you can see I'm using a different value than the default that
would have been set by userconfig_lapw during installation of
WIEN2k. I believe the default value is OMP_NUM_THREADS=4.</p>
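<p>The value in the startup file and the value active in the current
shell can differ, for instance if .bashrc was edited after the
terminal was opened. A minimal sketch for comparing the two is below.
The function names current_omp_threads and omp_settings_in are made up
for this illustration, and the list of startup files is only an
assumption about where the variable might be set:</p>
<pre>
```shell
# Print the OMP_NUM_THREADS value active in this shell, or "unset".
current_omp_threads() {
    printf '%s\n' "${OMP_NUM_THREADS:-unset}"
}

# Print every line mentioning OMP_NUM_THREADS in the given startup files.
# Files that do not exist are skipped; the file list is an assumption.
omp_settings_in() {
    for rc in "$@"; do
        if [ -f "$rc" ]; then
            grep -H OMP_NUM_THREADS "$rc" || true
        fi
    done
}

current_omp_threads
omp_settings_in "$HOME/.bashrc" "$HOME/.cshrc" "$HOME/.profile"
```
</pre>
<p>If the first line printed does not match what the startup file
says, opening a new terminal (or sourcing the file again) should bring
them back in sync.</p>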
<p>Is your Xeon processor an E5-2698 v3? If it is, the following
link lists its "# of Threads" as 32:<br>
</p>
<p><a class="moz-txt-link-freetext" href="https://ark.intel.com/content/www/us/en/ark/products/81060/intel-xeon-processor-e5-2698-v3-40m-cache-2-30-ghz.html">https://ark.intel.com/content/www/us/en/ark/products/81060/intel-xeon-processor-e5-2698-v3-40m-cache-2-30-ghz.html</a><br>
</p>
<p>With your .machines file requesting 16 parallel jobs (one per
1:localhost line), if your OMP_NUM_THREADS is
<font color="#0000ff">4</font>, you would be
requesting 16 jobs * <font color="#0000ff">4</font> threads/job
= 64 threads. That would be 32 threads (= 64 requested threads -
32 hardware threads) more than your processor could handle
at one time.</p>
<p>If you are using a different processor, you would have to look on
Intel's website to find the "# of Threads" your particular
processor can handle.</p>
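<p>The arithmetic above can be sketched as a small shell check. The
function names requested_threads and check_oversubscription are
hypothetical, and the figure of 32 hardware threads only applies if
the processor really is the E5-2698 v3:</p>
<pre>
```shell
# Threads requested = (parallel jobs in .machines) x (OpenMP threads per job).
requested_threads() {
    echo $(( $1 * $2 ))
}

# Report whether a request fits within the available hardware threads.
# $1 = parallel jobs, $2 = OpenMP threads per job, $3 = hardware threads
check_oversubscription() {
    total=$(requested_threads "$1" "$2")
    if [ "$total" -gt "$3" ]; then
        echo "oversubscribed by $(( total - $3 )) threads ($total requested, $3 available)"
    else
        echo "ok ($total requested, $3 available)"
    fi
}

# 16 jobs with OMP_NUM_THREADS=4, then with OMP_NUM_THREADS=1.
check_oversubscription 16 4 32
check_oversubscription 16 1 32
```
</pre>
<p>On Linux, the nproc command prints the number of logical CPUs
available to the current process and could be used for the third
argument instead of a number read off Intel's website.</p>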
<p>Of course, OMP_NUM_THREADS can be overridden by using
omp_global in the .machines file.</p>
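<p>As a rough illustration of how the number of 1:localhost lines and
omp_global trade off against each other, the sketch below prints a
.machines file for a single host. The function name make_machines is
hypothetical. The keywords it writes (1:localhost, granularity:1,
extrafine:1, omp_global) are the ones that appear in the .machines
file quoted below, and 4 jobs with 4 threads each is just one
combination that adds up to 16 threads:</p>
<pre>
```shell
# Print a k-point-parallel .machines file for a single host.
# $1 = number of "1:localhost" lines (parallel jobs)
# $2 = OpenMP threads per job, written as omp_global
make_machines() {
    i=0
    while [ "$i" -lt "$1" ]; do
        echo "1:localhost"
        i=$(( i + 1 ))
    done
    echo "granularity:1"
    echo "extrafine:1"
    echo "omp_global:$2"
}

# 4 jobs x 4 threads = 16 threads in total.
make_machines 4 4
```
</pre>
<p>The output is only printed to the terminal, so an existing
.machines file is not touched unless the output is deliberately saved
over it.</p>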
<p>If the problem comes from a memory error, which was previously
discussed as a possibility in this post:<br>
</p>
<p><a class="moz-txt-link-freetext" href="https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg20807.html">https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg20807.html</a><br>
</p>
<p>then you might want to check the logs in /var/log. The following
post might help with that:<br>
</p>
<p><a class="moz-txt-link-freetext" href="https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg19703.html">https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg19703.html</a><br>
</p>
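<p>One thing that could be looked for in /var/log is a message from
the Linux out-of-memory killer. The sketch below is only a guess at a
useful search and is not taken from the linked post: the function name
oom_lines is hypothetical, the search patterns assume the usual kernel
wording, and the two log paths assume an Ubuntu-style layout:</p>
<pre>
```shell
# Print lines from a log file that look like Linux OOM-killer messages.
# $1 = path to a log file
oom_lines() {
    grep -i -E "out of memory|oom-killer|killed process" "$1" || true
}

# Scan the assumed log locations, skipping any that are missing or unreadable.
for log in /var/log/syslog /var/log/kern.log; do
    if [ -r "$log" ]; then
        oom_lines "$log"
    fi
done
```
</pre>
<p>If nothing is printed, that does not rule out a memory problem.
The logs may have been rotated, or they may need root permission to
read.</p>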
<p>You might also check the settings in parallel_options with the
command:<br>
</p>
<p>cat $WIENROOT/parallel_options</p>
<p>If the problem is related to passwordless login, one of the
posts in the mailing list archive that might help is:<br>
</p>
<p><a class="moz-txt-link-freetext" href="https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg02295.html">https://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg02295.html</a><br>
</p>
<p><br>
</p>
<div class="moz-cite-prefix">On 2/8/2021 5:33 AM, Peter Blaha wrote:<br>
</div>
<blockquote type="cite"
cite="mid:3d0cf269-d25a-ea9b-fab4-b06589adebf2@theochem.tuwien.ac.at">We
still don't know much about your case.
<br>
<br>
Please modify your .machines file and use only 2 instead of 16
lines with
<br>
1:localhost
<br>
If this solves the problem, increase it to 4 or 6 (when you have
12 k-points) or 8 (if you have more k-points).
<br>
Also uncomment
<br>
omp_global:2 or 4
<br>
Then you are still using all your cores, but you will need less
memory.
<br>
<br>
On 08.02.2021 at 11:24, Murat Aycibin wrote:
<br>
<blockquote type="cite">Hi Dr. Blaha,
<br>
Thank you for your reply.
<br>
I used a smaller k-mesh: 12 k-points in a 5x5x3 grid (I specified
100 k-points). I got the same error. My computer has 64 GB of RAM and
16 cores (Intel Xeon processor). My .machines file is:
<br>
<br>
# .machines is the control file for parallel execution. Add
lines like
<br>
#
<br>
# speed:machine_name
<br>
#
<br>
# for each machine specifying there relative speed. For mpi
parallelization use
<br>
#
<br>
# speed:machine_name:1 machine_name:1
<br>
# lapw0:machine_name:1 machine_name:1
<br>
#
<br>
# further options are:
<br>
#
<br>
# granularity:number (for loadbalancing on irregularly used
machines)
<br>
# residue:machine_name (on shared memory machines)
<br>
# extrafine (to distribute the remaining k-points one
after the other)
<br>
#
<br>
# granularity sets the number of files that will be
approximately
<br>
# be generated by each processor; this is used for
load-balancing.
<br>
# On very homogeneous systems set number to 1
<br>
# if after distributing the k-points to the various machines
residual
<br>
# k-points are left, they will be distributed to the
residual-machine_name.
<br>
#
<br>
1:localhost
<br>
1:localhost
<br>
1:localhost
<br>
1:localhost
<br>
1:localhost
<br>
1:localhost
<br>
1:localhost
<br>
1:localhost
<br>
1:localhost
<br>
1:localhost
<br>
1:localhost
<br>
1:localhost
<br>
1:localhost
<br>
1:localhost
<br>
1:localhost
<br>
1:localhost
<br>
granularity:1
<br>
extrafine:1
<br>
#
<br>
# Uncomment for specific OMP-parallelization (overwriting a
global OMP_NUM_THREADS)
<br>
#
<br>
#omp_global:4
<br>
# or use program-specific parallelization:
<br>
#omp_lapw0:4
<br>
#omp_lapw1:4
<br>
#omp_lapw2:4
<br>
#omp_lapwso:4
<br>
#omp_dstart:4
<br>
#omp_sumpara:4
<br>
#omp_nlvdw:4
<br>
<br>
I had RTmax 7 percent. The error
<br>
<br>
<br>
. I do not have any idea what is wrong now.
<br>
<br>
-- <br>
Yrd Doc Dr. Murat Aycibin
<br>
Van Yuzuncu Yil Universitesi
<br>
Fizik Bolumu
<br>
</blockquote>
</blockquote>
</body>
</html>