<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">
Hi Marks and Peter,<br class="">
<br class="">
Thank you for your suggestions. I have several follow-up questions about your reply. I'm using a mid-sized cluster at my university, with 16 cores and 64 GB of memory per standard node. My calculation is k-point parallelized, but not MPI parallelized.
From the :RKM flag I posted in my first email, I estimate that the matrix size needed for RKmax of 5 or higher will be at least 40000. In my current calculation, lapw1 already occupies about 3 GB per slot (1 k-point per slot), so I estimate that
each slot will need at least 12 GB. With 8 k-points, at least 96 GB of memory would be required (if my estimate is correct). Given the computing resources available to me, this is far too memory demanding: our cluster imposes a 4 GB memory
limit per slot on standard nodes. I can request a high-memory node, but those are heavily contended among cluster users. Do you have any suggestions for accomplishing this calculation within the limits of my cluster?<br class="">
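To make the scaling behind this estimate explicit, here is a back-of-the-envelope sketch (my own assumption, not a WIEN2k formula: the dominant cost is dense double-precision complex matrices at 16 bytes per element, so memory grows with the square of the matrix size):

```python
# Back-of-the-envelope lapw1 memory estimate (assumption: dense
# double-precision complex matrices, 16 bytes per element, memory
# growing quadratically with matrix size).
BYTES_PER_COMPLEX = 16

def matrix_gb(n):
    """Memory in GB for one n x n double-complex matrix."""
    return n * n * BYTES_PER_COMPLEX / 1024**3

# Current run: matrix size 11292 (the RKM= 2.05 line from case.scf)
print(f"size 11292: {matrix_gb(11292):.1f} GB per matrix")
# Target run: matrix size ~40000 (RKmax of 5 or more)
print(f"size 40000: {matrix_gb(40000):.1f} GB per matrix")
# The ratio (40000/11292)^2 is about 12.5, so per-slot memory grows
# by more than an order of magnitude.
```

By this crude estimate even a single Hamiltonian or overlap matrix at size 40000 needs roughly 24 GB, so the "at least 12 GB per slot" figure above may still be optimistic.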
<br class="">
As for the details of my calculation: the material I'm looking at is a hydrogen-terminated silicon carbide nanowire with 56 atoms, and a 1x1x14 k-mesh is used for k-point sampling. The radius of 1.2 actually came from setrmt_lapw. The hydrogen radius
is indeed too large, and I keep adjusting it as the optimization progresses. The main reason for such a huge matrix is the size of my unit cell: I use a large unit cell to isolate the coupling between neighboring nanowires.<br class="">
<br class="">
Besides the questions above, I also ran into problems with the MPI calculation. Following Marks' suggestion on parallelization, I wanted to test the efficiency of an MPI calculation, since I had only used k-point parallelization before. The MPI installed
on my cluster is Open MPI. In the output file, I get the following error:<br class="">
<br class="">
--------------------------------------------------------------------------<br class="">
<font face="Menlo" class=""> LAPW0 END<br class="">
<br class="">
lapw1c_mpi:19058 terminated with signal 11 at PC=2b56d9118f79 SP=7fffc23d6890. Backtrace:</font>
<div class=""><font face="Menlo" class="">...<br class="">
mpirun has exited due to process rank 14 with PID 19061 on<br class="">
node neon-compute-2-25.local exiting improperly. There are two reasons this could occur:<br class="">
<br class="">
1. this process did not call "init" before exiting, but others in<br class="">
the job did. This can cause a job to hang indefinitely while it waits<br class="">
for all processes to call "init". By rule, if one process calls "init",<br class="">
then ALL processes must call "init" prior to termination.<br class="">
<br class="">
2. this process called "init", but exited without calling "finalize".<br class="">
By rule, all processes that call "init" MUST call "finalize" prior to<br class="">
exiting or it will be considered an "abnormal termination"<br class="">
<br class="">
This may have caused other processes in the application to be<br class="">
terminated by signals sent by mpirun (as reported here).<br class="">
--------------------------------------------------------------------------<br class="">
Uni_+6%.scf1up_1: No such file or directory.<br class="">
grep: *scf1up*: No such file or directory</font></div>
<div class=""><br class="">
</div>
<div class="">-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------</div>
<div class=""><br class="">
</div>
<div class="">The job script I’m using is:</div>
<div class=""> </div>
<div class="">-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------</div>
<div class=""><font face="Menlo" class="">!/bin/csh -f<br class="">
# -S /bin/sh<br class="">
#<br class="">
#$ -N uni_6<br class="">
#$ -q MF<br class="">
#$ -m be<br class="">
#$ -M <a href="mailto:wenhao-hu@uiowa.edu" class="">wenhao-hu@uiowa.edu</a><br class="">
#$ -pe smp 16<br class="">
#$ -cwd<br class="">
#$ -j y<br class="">
<br class="">
cp $PE_HOSTFILE hostfile<br class="">
echo "PE_HOSTFILE:"<br class="">
echo $PE_HOSTFILE<br class="">
rm .machines<br class="">
echo granularity:1 >>.machines<br class="">
while read hostname slot useless; do<br class="">
i=0<br class="">
l0=$hostname<br class="">
while [ $i -lt $slot ]; do<br class="">
echo 1:$hostname:2 >>.machines<br class="">
let i=i+2<br class="">
done<br class="">
done<hostfile<br class="">
<br class="">
echo lapw0:$l0:16 >>.machines<br class="">
<br class="">
runsp_lapw -p -min -ec 0.0001 -cc 0.001 -fc 0.5</font></div>
<div class="">-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------<br class="">
<br class="">
Is there a mistake in my script, or is something missing?</div>
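<div class="">For reference, here is what the loop in the script produces when run against a hypothetical one-node hostfile (the hostname and slot count below are made up for illustration; I have substituted a POSIX arithmetic expansion for csh's "let"):</div>

```shell
# Reproduce the .machines-generation loop from the job script above,
# using a made-up hostfile entry: hostname "node-01" with 16 slots.
printf 'node-01 16 UNDEFINED\n' > hostfile
rm -f .machines
echo granularity:1 >> .machines
while read hostname slot useless; do
  i=0
  l0=$hostname
  while [ $i -lt $slot ]; do
    echo 1:$hostname:2 >> .machines
    i=$((i+2))              # POSIX equivalent of "let i=i+2"
  done
done < hostfile
echo lapw0:$l0:16 >> .machines
cat .machines
# -> granularity:1, eight "1:node-01:2" lines, then lapw0:node-01:16
```

<div class="">If I read the .machines format correctly, each 1:host:2 line requests one parallel job running 2 MPI processes, i.e. 8 two-way jobs filling the 16 slots, with lapw0 using all 16.</div>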
<div class=""><br class="">
</div>
<div class="">Thank your very much for your help.</div>
<div class=""><br class="">
</div>
<div class="">Wenhao</div>
<div class=""><br class="">
</div>
<div class="">
<blockquote type="cite" class="">I do not know many compounds, for which an RMT=1.2 bohr for H makes any sense (maybe LiH). Use setrmt and follow the suggestion. Usually, H spheres of CH or OH bonds should be less than 0.6 bohr. Experimental H-position are
often very unreliable.<br class="">
How many k-points? Often 1 k-point is enough for 50+ atoms (at least at the beginning), in particular when you have an insulator.<br class="">
<div class="">Otherwise, follow the suggestions of L.Marks about parallelization.</div>
<div class=""><br class="">
</div>
<div class=""></div>
<blockquote type="cite" class="">
<div class="">Am 08.01.2016 um 07:28 schrieb Hu, Wenhao:</div>
<br class="Apple-interchange-newline">
<div class="">Hi, all:</div>
<div class=""><br class="">
</div>
<div class="">I have some confusions on the Rkm in calculations with 50+ atoms. In my wien2k, </div>
<div class="">the NATMAX and NUME are set to 15000 and 1700. With the highest NE and NAT, the </div>
<div class="">Rkmax can only be as large as 2.05, which is much lower than the suggested </div>
<div class="">value in FAQ page of WIEN2K (the smallest atom in my case is a H atom with </div>
<div class="">radius of 1.2). By checking the :RKM flag in case.scf, I have the following </div>
<div class="">information:</div>
<div class=""><br class="">
</div>
<div class="">:RKM : MATRIX SIZE 11292LOs: 979 RKM= 2.05 WEIGHT= 1.00 PGR:</div>
<div class=""><br class="">
</div>
<div class="">With such a matrix size, the single cycle can take as long as two and half </div>
<div class="">hours. Although I can increase the NATMAX and NUME to raise Rkmax, the </div>
<div class="">calculation will be way slower, which will make the optimization calculation </div>
<div class="">almost impossible. Before making convergence test on Rkmax, can anyone tell me </div>
<div class="">whether such a Rkmax is a reasonable value?</div>
<div class=""><br class="">
</div>
<div class="">If any further information is needed, please let me know. Thanks in advance.</div>
<div class=""><br class="">
</div>
<div class="">Best,</div>
<div class="">Wenhao</div>
</blockquote>
<div class=""></div>
<blockquote type="cite" class="">
<div class="">_______________________________________________</div>
<div class="">Wien mailing list</div>
<div class=""><a href="mailto:Wien@zeus.theochem.tuwien.ac.at" class="">Wien@zeus.theochem.tuwien.ac.at</a></div>
<br class="Apple-interchange-newline">
<a href="http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien" class="">http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien</a>
<div class=""><br class="">
</div>
<div class="">SEARCH the MAILING-LIST at: </div>
<br class="Apple-interchange-newline">
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html</blockquote>
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<br class="Apple-interchange-newline">
<div class="">--</div>
<div class="">--------------------------------------------------------------------------</div>
<div class="">Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna</div>
<div class="">Phone: +43-1-58801-165300 FAX: +43-1-58801-165982</div>
<div class="">Email: bl...@<a href="http://theochem.tuwien.ac.at" class="">theochem.tuwien.ac.at</a> WIEN2k: </div>
<a href="http://www.wien2k.at" class="">http://www.wien2k.at</a>
<div class=""><br class="">
</div>
<div class="">WWW: </div>
http://www.imc.tuwien.ac.at/staff/tc_group_e.php
<div class=""><br class="">
</div>
<div class="">--------------------------------------------------------------------------</div>
<div class="">_______________________________________________</div>
<div class="">Wien mailing list</div>
<div class="">Wien@zeus.theochem.tuwien.ac.at</div>
<br class="Apple-interchange-newline">
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
<div class=""><br class="">
</div>
<div class="">SEARCH the MAILING-LIST at: </div>
<br class="Apple-interchange-newline">
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html<br class="">
<br class="">
</blockquote>
</div>
</body>
</html>