<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"

"http://www.w3.org/TR/REC-html40/loose.dtd">

<html>

<head>

<meta http-equiv="content-type" content="text/html; charset=utf-8">

<title></title>

</head>

<body style="font-family:Arial;font-size:14px">

<p>Dear Lukasz,</p>

<p>A longer runtime for lapwso than for lapw1 is not unusual in band calculations. lapw1 computes the bands without spin-orbit coupling, while lapwso adds the spin-orbit coupling on top of the lapw1 results, which can be the more demanding step.</p>

<p>Spin-orbit coupling couples the spin of the electron to its orbital motion, so the two spin directions can no longer be treated separately. In your log a single lapwso call follows two separate lapw1 calls (-up and -dn). It is therefore not surprising that lapwso runs longer than either lapw1 step.</p>
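<p>As a quick check of the numbers, the step durations can be read off the timestamps in your log. The following is a minimal Python sketch, not part of WIEN2k: the names LOG and step_durations are made up for this illustration, the time zone label is dropped from the timestamps, and each step is assumed to end when the next one starts.</p>

<pre>
```python
from datetime import datetime

# Timestamps copied from the quoted log (CEST label dropped, year appended).
LOG = [
    ("Sun May 14 12:33:03 PM 2023", "lapw1 -band -up -p"),
    ("Sun May 14 02:25:26 PM 2023", "lapw1 -band -dn -p"),
    ("Sun May 14 04:17:22 PM 2023", "lapwso -up -p"),
    ("Mon May 15 01:30:05 AM 2023", "qtl -up -p -band -so"),
]

# 12-hour clock with AM/PM, as printed in the log (English locale assumed).
FMT = "%a %b %d %I:%M:%S %p %Y"


def step_durations(log):
    """Return (command, hours) pairs; a step ends when the next one starts."""
    times = [datetime.strptime(stamp, FMT) for stamp, _ in log]
    out = []
    for i in range(len(log) - 1):
        hours = (times[i + 1] - times[i]).total_seconds() / 3600.0
        out.append((log[i][1], round(hours, 2)))
    return out


for command, hours in step_durations(LOG):
    print(f"{command:22s} {hours:5.2f} h")
```
</pre>

<p>With these timestamps each lapw1 run comes out at about 1.87 h and lapwso at about 9.21 h, which matches your estimate of roughly 2h vs 9h.</p>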

<p>In your top output, the four lapwso processes each use between about 100% and 200% CPU (%CPU column), which is in line with OMP=2. Their memory usage (%MEM column) is modest, between 1.9% and 3.8% each.</p>

<p>It appears that you have allocated sufficient resources (OMP=2) for the lapwso step, and your system has ample memory available. Therefore, the longer runtime can be attributed to the inherent complexity of the spin-orbit coupling calculations rather than a resource limitation.</p>
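<p>To put the top output into numbers, one can compare the %CPU values with the threads that were requested (4 processes with OMP=2). This is only a rough Python sketch: the helper name busy_fraction is hypothetical, and a single top snapshot is an instantaneous sample rather than an average over the whole run.</p>

<pre>
```python
# %CPU column for the four lapwso processes, copied from the quoted top output.
CPU_PERCENT = [199.3, 146.8, 130.6, 99.7]

PROCESSES = 4    # number of "1:localhost" lines in .machines
OMP_THREADS = 2  # omp_lapwso:2


def busy_fraction(cpu_percent, processes, omp_threads):
    """Fraction of the requested threads that top reports as busy."""
    requested = processes * omp_threads * 100.0  # in top's percent units
    return sum(cpu_percent) / requested


total = sum(CPU_PERCENT)
fraction = busy_fraction(CPU_PERCENT, PROCESSES, OMP_THREADS)
print(f"sum of %CPU: {total:.1f}")
print(f"busy fraction of requested threads: {fraction:.2f}")
```
</pre>

<p>For this snapshot the sum is 576.4%, i.e. about 0.72 of the 8 requested threads were busy at that moment.</p>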

<p>If you want to speed things up further, you can adjust the OMP settings or distribute the workload over more cores with k-point or MPI parallelization. Even so, expect lapwso to remain the longest step of this calculation.<br>

<br>

Settings to try in the .machines file:<br>

omp_global:4<br>

#omp_lapw1:2<br>

#omp_lapw2:2<br>

#omp_lapwso:2<br>

<br>

In WIEN2k, k-point parallelization is often more efficient than OpenMP threading alone. Before changing the number of "1:localhost" lines in your .machines file, you can run <code>testpara_lapw</code>, a utility in the WIEN2k package that checks the .machines file and reports how the k-points would be distributed over the parallel jobs.<br>

<br>
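</p>

<p>If you want to try several combinations of k-point jobs and OMP threads, the .machines file can be generated rather than edited by hand. The following Python sketch only uses the keys that already appear in your .machines file; the function name machines_text and its parameters are hypothetical, and the 8 jobs with 1 thread each are just an illustrative regrouping of your current 4 x 2 threads, not a measured optimum.</p>

<pre>
```python
def machines_text(k_jobs, omp_global, host="localhost", granularity=1):
    """Build .machines content for k-point parallel runs on a single host."""
    lines = [f"omp_global:{omp_global}"]
    lines += [f"1:{host}"] * k_jobs  # one line per parallel k-point job
    lines.append(f"granularity:{granularity}")
    return "\n".join(lines) + "\n"


# Example (assumption): regroup 4 jobs x 2 threads into 8 jobs x 1 thread.
text = machines_text(k_jobs=8, omp_global=1)
print(text)
```
</pre>

<p>The printed text can be saved as .machines and checked with <code>testpara_lapw</code> before starting the run. Whether more k-point jobs actually help depends on the number of k-points and on the cores and memory available.</p>

<p>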

Compiling the code with appropriate optimization flags can significantly improve the performance and speed of calculations. Here are some additional suggestions related to code compilation:<br>

Experiment with different optimization levels. Most compilers offer several, such as -O1, -O2 and -O3. Higher levels generally give better performance but may increase compilation time. The compiler options proposed by siteconfig_lapw are a sensible starting point.<br>

Make sure that you are using the latest version of the code, i.e., WIEN2k_23.2. The developers have recently released significant updates and bug fixes that can improve performance; many thanks to Peter Blaha and all the developers.<br>

<br>

Best regards,</p>

<p>Saeid<br></p>

<p><br>

Quoting pluto via Wien &lt;<a href="mailto:wien@zeus.theochem.tuwien.ac.at">wien@zeus.theochem.tuwien.ac.at</a>&gt;:</p>

<blockquote style="border-left:2px solid blue;margin-left:2px;padding-left:12px;" type="cite">

<p>Dear All,<br>

<br>

When calculating bands for a large slab I have the following sequence:<br>

<br>

Sun May 14 12:33:03 PM CEST 2023> (x) lapw1 -band -up -p<br>

Sun May 14 02:25:26 PM CEST 2023> (x) lapw1 -band -dn -p<br>

Sun May 14 04:17:22 PM CEST 2023> (x) lapwso -up -p<br>

Mon May 15 01:30:05 AM CEST 2023> (x) qtl -up -p -band -so<br>

Mon May 15 01:30:05 AM CEST 2023> (x) lapw2 -p -fermi -so -up<br>

Mon May 15 01:51:51 AM CEST 2023> (x) qtl -dn -p -band -so<br>

Mon May 15 01:51:51 AM CEST 2023> (x) lapw2 -p -fermi -so -dn<br>

<br>

As you can see lapwso takes much longer than lapw1 (approx. 9h vs 2h). Is this normal for band calculations?<br>

<br>

I have 128 GB of RAM in this computer, so this is not a RAM issue. Here is what top shows for the lapwso calculation (I have 4 parallel localhost processes in .machines, OMP=2 and no mpi):<br>

<br>

Tasks: 505 total,   2 running, 503 sleeping,   0 stopped,   0 zombie<br>

%Cpu(s): 24.0 us,  0.2 sy,  0.0 ni, 75.7 id,  0.0 wa,  0.1 hi,  0.0 si,  0.0 st<br>

MiB Mem : 128047.1 total,   1845.8 free,  16809.0 used, 111158.6 buff/cache<br>

MiB Swap:  32088.0 total,  31471.5 free,    616.5 used. 111238.1 avail Mem<br>

<br>

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND<br>

1336417 lplucin   20   0 6417856   4.8g  15840 R 199.3   3.8   1294:13 lapwso<br>

1336392 lplucin   20   0 2848204   2.3g  15880 S 146.8   1.9   1295:30 lapwso<br>

1336391 lplucin   20   0 2848188   2.4g  15916 S 130.6   1.9   1304:23 lapwso<br>

1336396 lplucin   20   0 2848060   2.3g  15816 S  99.7   1.9   1288:06 lapwso<br>

<br>

.machines file:<br>

<br>

omp_global:8<br>

omp_lapw1:2<br>

omp_lapw2:2<br>

omp_lapwso:2<br>

1:localhost<br>

1:localhost<br>

1:localhost<br>

1:localhost<br>

granularity:1<br>

<br>

Best,<br>

Lukasz<br>

_______________________________________________<br>

Wien mailing list<br>

<a href="mailto:Wien@zeus.theochem.tuwien.ac.at">Wien@zeus.theochem.tuwien.ac.at</a><br>

<a href="http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien" target="_blank">http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien</a><br>
SEARCH the MAILING-LIST at: <a href="http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html" target="_blank">http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html</a></p>

</blockquote>


</body>

</html>