[Wien] .machines problem

Peter Blaha pblaha at theochem.tuwien.ac.at
Sat Oct 1 19:37:13 CEST 2016


No, your Linux is probably ok.

I guess you made an error during siteconfig_lapw when installing wien2k.

Please check $WIENROOT/parallel_options.

Most likely it contains use_remote=0, which is meant for a single shared-memory machine.

Set it to 1.
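
A quick way to see the current value is sketched below. It assumes the
setting appears in $WIENROOT/parallel_options as a csh line of the form
"setenv USE_REMOTE <value>" (the exact spelling may differ between WIEN2k
versions); the helper name show_use_remote is only illustrative.

```shell
#!/bin/sh
# Print the USE_REMOTE value(s) found in a parallel_options-style file.
# Assumption: the setting sits on a csh "setenv USE_REMOTE <value>" line,
# possibly preceded by an "if ( ... )" guard on the same line.
show_use_remote() {
    awk '{
        for (i = 1; i < NF - 1; i++)
            if ($i == "setenv" && toupper($(i + 1)) == "USE_REMOTE")
                print $(i + 2)
    }' "$1"
}

# Only inspect the real file when WIENROOT points at an installation.
if [ -n "${WIENROOT:-}" ] && [ -f "$WIENROOT/parallel_options" ]; then
    show_use_remote "$WIENROOT/parallel_options"
fi
```

If this prints 0, edit the file (or rerun siteconfig_lapw) so that the
value becomes 1.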

PS: Test the timing with just 4 parallel jobs per node. Both machines have
only 4 physical cores, so I expect this to be faster than using 8.
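
As a starting point for such a test, the sketch below prints a .machines
layout with 4 k-point jobs on each of the two hosts named in the quoted
message. The helper make_machines is a hypothetical name, and the
granularity and extrafine lines are simply copied from the files quoted
below.

```shell
#!/bin/sh
# Print a k-point-parallel .machines layout: N jobs on each listed host.
# Usage: make_machines <jobs_per_host> <host> [<host> ...]
make_machines() {
    jobs_per_host=$1
    shift
    for host in "$@"; do
        i=0
        while [ "$i" -lt "$jobs_per_host" ]; do
            echo "1:$host"
            i=$((i + 1))
        done
    done
    # Copied unchanged from the .machines files in the quoted message.
    echo "granularity:1"
    echo "extrafine:1"
}

# 4 jobs on homer and 4 on odysvs; redirect to .machines in the case directory.
make_machines 4 homer odysvs
```

Running the same case with "make_machines 8 homer" and comparing the
timings in the dayfile shows whether the second node actually helps.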

Am 01.10.2016 um 15:19 schrieb John Rundgren:
> On 09/30/2016 09:44 AM, Peter Blaha wrote:
>> Parallelization always has its limits, and you need to do it
>> "sensibly". In general it is NOT true that more cores always mean
>> faster execution; using more cores can even slow down the
>> calculations dramatically.
>>
>> a) Hardware: You say your computers have "8 threads". Do you mean they
>> really have 8 cores (like some Xeons), or are these 4-core machines
>> with hyperthreading? In the latter case 8 parallel jobs are useless,
>> as hyperthreading provides only a logical core, not a "real" one.
>> In addition, it is well known that modern multi-core CPUs are very
>> often "memory-bound": their memory bus is too slow to keep all cores
>> busy simultaneously. Thus it is often "natural" that an N-core job is
>> NOT N times as fast as a single-core job.
>> Another factor is disk I/O, which on some systems can become VERY slow
>> (over the network or on a single node) the more jobs are running.
>>
>> b) Software: There is a "multithreading" option with Intel, and
>> setting OMP_NUM_THREADS=2 makes lapw1 nearly twice as fast as
>> OMP_NUM_THREADS=1. Of course, when using this, you should reduce the
>> number of parallel jobs by a factor of 2. Check your CPU usage with
>> "top". When you see "200 %" for an lapw1 process, it is this
>> multithreading.
>> lapw1para starts the parallel processes with some "DELAY" between
>> them, because otherwise there are problems on some systems. If for
>> instance DELAY=1, spawning 16 lapw1 processes will take at least 16
>> seconds. If your test case runs only for 2 seconds per lapw1, you can
>> imagine that you will not get any speedup, but a drastic slowdown. If
>> it runs for 5 min, the 16 seconds are negligible and you should see a
>> speedup from 5 to 2.5 min (provided you have enough k-points! Check
>> with "testpara").
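
To put numbers on the start-up cost described above, here is a rough
estimate only. It assumes the jobs are launched one after another, DELAY
seconds apart, so the overhead is about the number of jobs times DELAY;
spawn_overhead is a made-up helper name, not part of WIEN2k.

```shell
#!/bin/sh
# Rough start-up overhead in seconds: number of jobs times DELAY.
# Assumption: jobs are launched sequentially, DELAY seconds apart.
spawn_overhead() {
    awk -v n="$1" -v d="$2" 'BEGIN { print n * d }'
}

# Example from the text: 16 lapw1 jobs with DELAY=1.
spawn_overhead 16 1   # prints 16
```

Against a 2 s lapw1 run this overhead dominates; against a 5 min (300 s)
run it is roughly 5%.
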
>>
>> It is always good if you can "watch" your parallel job on the two
>> nodes with "top" (in two different windows). You should see how the
>> processes start, how they run (do they get nearly 100 or 200% of the
>> cores most of the time?), and how they stop (at nearly the same time,
>> or very unbalanced).
>>
>>
>> On 09/28/2016 03:21 PM, John Rundgren wrote:
>>> Dear W2k team,
>>> On my desk are two identical computers, alpha and beta, with 8 threads each.
>>>
>>> How should .machines be set up so that k-point parallelization runs twice
>>> as fast using alpha & beta as it does using alpha alone?
>>>
>>> Unfortunately, my testing UG 5.5.4 responds with error diagnostics.
>>>
>>> When I try the following .machines with and without #,
>>>   1:alpha
>>>   #1:beta
>>>   1:alpha
>>>   #1:beta
>>>   1:alpha
>>>   #1:beta
>>>   1:alpha
>>>   #1:beta
>>>   1:alpha
>>>   #1:beta
>>>   1:alpha
>>>   #1:beta
>>>   1:alpha
>>>   #1:beta
>>>   1:alpha
>>>   #1:beta
>>>   granularity:1
>>>   extrafine:1
>>> computing time comes out similar in both cases. I would like to see
>>> sixteen threads executing twice as fast as eight.
>>>
>>> Regards,
>>> John Rundgren KTH
>>>
>>>
>>> _______________________________________________
>>> Wien mailing list
>>> Wien at zeus.theochem.tuwien.ac.at
>>> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>>> SEARCH the MAILING-LIST at:
>>> http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
>>
> Dear Peter,
> Thanks for your comments on the use of several computers. A simple
> reason for my failure seems to be that my Linux set-up is defective.
>
> My computers are,
>   homer = Xeon E3-1270 v3 @ 3.50GHz, 4 cpus and 8 threads,
>   odysvs = i7-3770 @ 3.40GHz, 4 cpus and 8 threads,
> homer being the main computer.
>
> When the following .machines files,
>   1:homer
>   1:homer
>   1:homer
>   1:homer
>   1:homer
>   1:homer
>   1:homer
>   1:homer
>   granularity:1
>   extrafine:1
> and
>   1:odysvs
>   1:odysvs
>   1:odysvs
>   1:odysvs
>   1:odysvs
>   1:odysvs
>   1:odysvs
>   1:odysvs
>   granularity:1
>   extrafine:1
> are used separately as input on homer, the execution takes place on
> homer. In both cases the System Monitor of odysvs shows it idle,
> although in the second case the dayfile refers to odysvs.
>
> The following ssh commands were run beforehand:
>  homer> ssh-keygen -t rsa
>  homer> ssh-copy-id odysvs
> and tested with
>  homer> ssh odysvs pwd
> which prints /home/jru without asking for a password.
> Any computer mentioned in .machines seems to be treated as "localhost".
>
> Does this test give a clue to what fails?
> Regards, John

-- 
--------------------------------------------------------------------------
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300             FAX: +43-1-58801-165982
Email: blaha at theochem.tuwien.ac.at    WIEN2k: http://www.wien2k.at
WWW:   http://www.imc.tuwien.ac.at/staff/tc_group_e.php
--------------------------------------------------------------------------

