<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>A list of potential causes of exit code 9 (KILLED BY SIGNAL: 9)
is given at [1].</p>
<p>Hitting the walltime limit (--time [2,3]) is listed as one of
them.</p>
<p>The Slurm seff command might be helpful for determining whether it
was caused by running out of memory (OOM). Refer to [4,5].<br>
</p>
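<p>As a rough illustration of that check, here is a minimal Python
sketch built around sacct, which reports accounting data for a finished
Slurm job. The helper names (sacct_command, likely_cause, report) are
made up for this example, and the mapping assumes Slurm itself recorded
the job state as OUT_OF_MEMORY or TIMEOUT; if the MPI launcher caught the
kill first, the state may only read FAILED, and comparing MaxRSS with
ReqMem is then the more useful hint.</p>
<pre>
```python
import subprocess

# Standard sacct format fields; adjust the list to taste.
FIELDS = "JobID,State,ExitCode,Elapsed,Timelimit,ReqMem,MaxRSS"


def sacct_command(job_id):
    """Build the sacct call for one job (job_id is the number sbatch printed)."""
    return ["sacct", "-j", str(job_id), "--format=" + FIELDS, "--parsable2"]


def likely_cause(state):
    """Map a Slurm job state string to a probable reason for the kill."""
    state = state.upper()
    if state.startswith("OUT_OF_MEMORY"):
        return "memory limit exceeded"
    if state.startswith("TIMEOUT"):
        return "walltime limit reached"
    return "not obvious from the state, compare MaxRSS with ReqMem"


def report(job_id):
    """Run sacct and print one line per job step (needs a Slurm login node)."""
    out = subprocess.run(sacct_command(job_id), capture_output=True,
                         text=True, check=True).stdout
    lines = out.strip().splitlines()
    header = lines[0].split("|")
    for line in lines[1:]:
        row = dict(zip(header, line.split("|")))
        print(row["JobID"], row["State"], likely_cause(row["State"]))
```
</pre>
<p>Calling report() with the ID of the failed job on a login node would
print one line per job step; running "seff" with the same job ID gives a
shorter summary of the same job.</p>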
[1]
<a class="moz-txt-link-freetext" href="https://www.intel.com/content/www/us/en/docs/mpi-library/developer-guide-linux/2021-6/error-message-bad-termination.html">https://www.intel.com/content/www/us/en/docs/mpi-library/developer-guide-linux/2021-6/error-message-bad-termination.html</a><br>
[2]
<a class="moz-txt-link-freetext" href="https://docs.hpc.uwec.edu/slurm/determining-resources/#time-walltime">https://docs.hpc.uwec.edu/slurm/determining-resources/#time-walltime</a><br>
[3] <a class="moz-txt-link-freetext" href="https://hpcc.umd.edu/hpcc/help/jobs.html#walltime">https://hpcc.umd.edu/hpcc/help/jobs.html#walltime</a><br>
[4] <a class="moz-txt-link-freetext" href="https://www.nsc.liu.se/support/memory-management/">https://www.nsc.liu.se/support/memory-management/</a><br>
[5]
<a class="moz-txt-link-freetext" href="https://documentation.sigma2.no/jobs/choosing-memory-settings.html">https://documentation.sigma2.no/jobs/choosing-memory-settings.html</a><br>
<br>
Hope that can help,<br>
Gavin<br>
WIEN2k user<br>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">On 1/24/2025 8:40 AM, Laurence Marks
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CANkSMZD2uCF20kvLsyfNdSbcN=DFU=osaA2i1bCHAPxUwOD09A@mail.gmail.com">
<div dir="auto">
<div>
<div>Sorry, but you have not provided enough information for
more than a guess.</div>
<div dir="auto"><br>
</div>
<div dir="auto">Exit code 9 means that the OS killed the task,
often because it ran out of memory (OOM), although other
causes are possible. The larger calculation will require
about 8*8 times more memory (perhaps more) than your simple
calculation: do "grep "Matrix size" *output1* -18". You
probably ran out of memory, and will need to use more MPI
processes per k-point for the larger calculation.</div>
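<div dir="auto"><br>
</div>
<div dir="auto">The 8*8 estimate can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the
matrix dimension is taken to grow in proportion to the number of atoms
(8 to 64, a factor of 8), the matrix is stored densely, and each
element takes 16 bytes (double-precision complex). The size 2000 is a
made-up placeholder, not a value from this thread; the real number is
what the "Matrix size" line in the output1 files reports.</div>
<pre>
```python
def dense_matrix_gib(matrix_size, bytes_per_element=16):
    """Memory for one dense matrix_size x matrix_size matrix, in GiB."""
    return matrix_size ** 2 * bytes_per_element / 2 ** 30


def per_rank_gib(matrix_size, mpi_ranks, bytes_per_element=16):
    """Share per MPI rank if the matrix is spread evenly over the ranks."""
    return dense_matrix_gib(matrix_size, bytes_per_element) / mpi_ranks


# Hypothetical matrix size for the 8-atom cell; read the real one from output1.
small_size = 2000
# Assumption: 8 times more atoms gives a matrix dimension about 8 times larger.
large_size = 8 * small_size

ratio = dense_matrix_gib(large_size) / dense_matrix_gib(small_size)
print("memory ratio supercell / simple cell:", ratio)
print("GiB per rank with 4 ranks:", per_rank_gib(large_size, 4))
print("GiB per rank with 16 ranks:", per_rank_gib(large_size, 16))
```
</pre>
<div dir="auto">Doubling the matrix dimension quadruples the memory of
a dense matrix, which is why a factor of 8 in dimension gives 8*8, and
why spreading one k-point over more MPI ranks lowers the amount each
rank has to hold.</div>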
<div dir="auto"><br>
</div>
<div dir="auto">N.B., using 2 OpenMP threads (omp) per task is also
useful in reducing the total memory usage. Combine this with mpi.</div>
<div dir="auto"><br>
</div>
<div><br>
</div>
<div data-smartmail="gmail_signature">---<br>
Emeritus Professor Laurence Marks (Laurie)<br>
<a href="http://www.numis.northwestern.edu"
moz-do-not-send="true">www.numis.northwestern.edu</a><br>
<a
href="https://scholar.google.com/citations?user=zmHhI9gAAAAJ&hl=en"
moz-do-not-send="true">https://scholar.google.com/citations?user=zmHhI9gAAAAJ&hl=en</a><br>
"Research is to see what everybody else has seen, and to
think what nobody else has thought" Albert Szent-Györgyi</div>
<br>
<div class="gmail_quote gmail_quote_container">
<div dir="ltr" class="gmail_attr">On Fri, Jan 24, 2025,
07:46 Sergeev Gregory &lt;<a
href="mailto:sgregory@live.ru" moz-do-not-send="true"
class="moz-txt-link-freetext">sgregory@live.ru</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote"
style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
Dear developers,</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
I run my calculations on an HPC cluster with the Slurm
scheduler, and I see strange behaviour of parallel WIEN2k jobs:</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<br>
</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
I have two structures:</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
1. A structure with 8 atoms in the unit cell ("simple
structure")</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
2. A supercell with 64 atoms (2*2*2 "supercell
structure") based on the cell of the simple structure</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<br>
</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
I am trying to run WIEN2k calculations in parallel mode with
two configurations:</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
1. Calculations on 1 node (each node has 48 processors)
with 12 parallel jobs of 4 processors each
("one node job")</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
2. Calculations on 2 nodes (48*2=96 processors in
total) with 24 parallel jobs of 4 processors
each ("two node job")</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<br>
</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
For the "simple structure", both the "one node job" and the
"two node job" work without problems.</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<br>
</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
For the "supercell structure", the "one node job" works well,
but the "two node job" crashes with errors in the .time1_*
files (I use Intel MPI):</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<br>
</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
-----------------</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
n053 n053 n053 n053(21) </div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
===================================================================================</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
= BAD TERMINATION OF ONE OF YOUR APPLICATION
PROCESSES</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
= PID 21859 RUNNING AT n053</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
= EXIT CODE: 9</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
= CLEANING UP REMAINING PROCESSES</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
===================================================================================</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<br>
</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
===================================================================================</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
= BAD TERMINATION OF ONE OF YOUR APPLICATION
PROCESSES</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
= PID 21859 RUNNING AT n053</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
= EXIT CODE: 9</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
= CLEANING UP REMAINING PROCESSES</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
===================================================================================</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
Intel(R) MPI Library troubleshooting guide:</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<a
href="https://software.intel.com/node/561764"
target="_blank" rel="noreferrer"
moz-do-not-send="true" class="moz-txt-link-freetext">https://software.intel.com/node/561764</a></div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
===================================================================================</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
0.042u 0.144s 2:45.42 0.1% 0+0k 4064+8io 60pf+0w </div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
-----------------</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<br>
</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
At first I thought that the "two node job" had
insufficient memory (but why, if the "one node job"
works with the same number of processors per parallel
job?). I tried doubling the memory per task
(#SBATCH --cpus-per-task 2), but this did not
solve the problem. The error is the same.</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<br>
</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
Any ideas what could cause such strange behavior?</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
Does Wien2k have problems scaling to multiple nodes?</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<br>
</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
I would appreciate your help. I want to speed up
calculations for complex structures and I have the
resources, but I cannot get it to work.</div>
<div
style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
<span style="white-space: pre-wrap">
</span></div>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
</body>
</html>