<div style="line-height:1.7;color:#000000;font-size:14px;font-family:arial"><div style="line-height:1.7;color:#000000;font-size:14px;font-family:arial"><span style="font-size: 14px;">Dear all,<br>
 I compiled WIEN2k 11</span><span style="font-size: 14px;"> on Linux CentOS 5.5 with icc and ifort 11.1, Open MPI (mpif90), and Intel MKL, using the following parameters:</span><br>
<blockquote><span style="font-size: 14px;"><span style="color: rgb(136, 0, 0); font-size: 10px;"><span style="color: rgb(136, 0, 0);"><span style="color: rgb(0, 0, 255);">K1 Linux (Intel ifort 11.1 compiler + mkl )<br>
<span style="color: rgb(136, 0, 0);"> O Compiler options: -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -traceback</span><span style="color: rgb(136, 0, 0);"><br>
<span style="color: rgb(136, 0, 0);"> L Linker Flags: $(FOPT) -L/home/yljia/intel/Compiler/11.1/072/mkl/lib/em64t -pthread</span><br>
<span style="color: rgb(136, 0, 0);"> P Preprocessor flags '-DParallel'</span><br>
<span style="color: rgb(136, 0, 0);"> R R_LIB (LAPACK+BLAS): -lmkl_lapack -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -openmp -lpthread -lguide</span></span></span></span></span></span><br>
<span style="font-size: 10px;"></span><span style="color: rgb(0, 0, 255);"><span style="font-size: 10px;"><span style="color: rgb(0, 0, 255);">RP RP_LIB(SCALAPACK+PBLAS): -lmkl_scalapack_lp64 -lmkl_solver_lp64 -lmkl_blacs_openmpi_lp64 -L/home/yljia/compiler_library/fftw-2.1.5/lib/ -lfftw_mpi -lfftw $(R_LIBS)<br>
FP FPOPT(par.comp.options): -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -traceback<br>
MP MPIRUN commando : mpirun -np _NP_ --hostfile _HOSTS_ _EXEC_</span></span></span><br>
</blockquote> The program runs in non-parallel mode and in k-point parallel mode. But in MPI parallel mode, it produces error messages in the following two files:<br>
1. STDOUT:<br>
<blockquote style="color: rgb(136, 0, 0);"><span style="font-size: 10px; color: rgb(136, 0, 0);"> LAPW0 END</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> LAPW0 END</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> .........</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> LAPW0 END</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> LAPW1 END</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> LAPW1 END</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> LAPW1 END</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> LAPW1 END</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> --------------------------------------------------------------------------</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> There are no allocated resources for the application</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> /home/yljia/software/wien2k_11/lapw1_mpi</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> that match the requested mapping:</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> .machine5</span><br>
<br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> Verify that you have mapped the allocated resources properly using the</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> --host or --hostfile specification.</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> --------------------------------------------------------------------------</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> LAPW1 END</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> LAPW1 END</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> --------------------------------------------------------------------------</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> There are no allocated resources for the application</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> /home/yljia/software/wien2k_11/lapw1_mpi</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> that match the requested mapping:</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> .machine6</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> ...........</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);">&nbsp; ...........</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> .machine8</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> </span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> Verify that you have mapped the allocated resources properly using the</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> --host or --hostfile specification.</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> --------------------------------------------------------------------------</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> &nbsp; FERMI - Error</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> cp: cannot stat `.in.tmp': No such file or directory</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> rm: cannot remove `.in.tmp': No such file or directory</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> rm: cannot remove `.in.tmp1': No such file or directory</span><br>
<br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> > stop error</span><br>
</blockquote><span style="font-size: 10px; color: rgb(136, 0, 0);"></span><span style="font-size: 14px; color: rgb(0, 0, 0);"><span style="color: rgb(0, 0, 0);">2. TiC.dayfile:</span></span><br>
<blockquote style="color: rgb(136, 0, 0);"><span style="font-size: 10px; color: rgb(136, 0, 0);"> Calculating TiC in /home/yljia/wien2k/TiC/testqsub/TiC</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> on compute-0-12.local with PID 16027</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> using WIEN2k_11.1 (Release 14/6/2011) in /home/yljia/software/wien2k_11</span><br>
<br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> start (Sat Aug 3 00:42:07 CST 2013) with lapw0 (40/99 to go)</span><br>
<br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> cycle 1 (Sat Aug 3 00:42:07 CST 2013) (40/99 to go)</span><br>
<br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> > lapw0 -p (00:42:07) starting parallel lapw0 at Sat Aug 3 00:42:07 CST 2013</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> -------- .machine0 : 16 processors</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> 5.812u 22.540s 0:04.23 670.2% 0+0k 0+0io 205pf+0w</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> > lapw1 -p (00:42:11) starting parallel lapw1 at Sat Aug 3 00:42:12 CST 2013</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> -> starting parallel LAPW1 jobs at Sat Aug 3 00:42:12 CST 2013</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> running LAPW1 in parallel mode (using .machines)</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> 8 number_of_parallel_jobs</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> compute-0-12 compute-0-12(32) 3.181u 0.181s 0:02.77 121.2% 0+0k 0+0io 33pf+0w</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> compute-0-12 compute-0-12(32) 2.781u 0.117s 0:02.58 112.0% 0+0k 0+0io 18pf+0w</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> &nbsp; compute-0-12 compute-0-12(32) 2.343u 0.089s 0:02.28 106.1% 0+0k 0+0io 17pf+0w</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> compute-0-12 compute-0-12(32) 2.818u 0.126s 0:02.52 116.2% 0+0k 0+0io 17pf+0w</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> compute-0-2 compute-0-2(32) 0.010u 0.012s 0:00.03 66.6% 0+0k 0+0io 0pf+0w</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> compute-0-2 compute-0-2(32) 0.009u 0.014s 0:00.03 33.3% 0+0k 0+0io 0pf+0w</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> compute-0-2 compute-0-2(32) 0.010u 0.020s 0:00.04 75.0% 0+0k 0+0io 0pf+0w</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> compute-0-2 compute-0-2(32) 0.012u 0.020s 0:00.04 75.0% 0+0k 0+0io 0pf+0w</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> Summary of lapw1para:</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> compute-0-12 k=0 user=128 wallclock=30.78</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> 11.349u 1.617s 0:10.77 120.2% 0+0k 0+0io 85pf+0w</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> > lapw2 -p (00:42:22) running LAPW2 in parallel mode</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> ** LAPW2 crashed!</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> 0.076u 0.108s 0:00.20 85.0% 0+0k 0+0io 9pf+0w</span><br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> &nbsp; error: command /home/yljia/software/wien2k_11/lapw2para lapw2.def failed</span><br>
<br>
<span style="font-size: 10px; color: rgb(136, 0, 0);"> > stop error </span><br>
</blockquote> The following is the shell script I submit. I have 2 nodes, each with 8 cores (the host node excepted):<br>
<blockquote><blockquote style="color: rgb(136, 0, 0); font-size: 10px;"><span style="color: rgb(136, 0, 0);">#!/bin/tcsh</span><br>
<span style="color: rgb(136, 0, 0);">#$ -S /bin/tcsh</span><br>
<span style="color: rgb(136, 0, 0);">#$ -N W2web_Job</span><br>
<span style="color: rgb(136, 0, 0);"># MPIR_HOME from submitting environment</span><br>
<span style="color: rgb(136, 0, 0);">#$ -v MPIR_HOME</span><br>
<span style="color: rgb(136, 0, 0);"># needs in</span><br>
<span style="color: rgb(136, 0, 0);"># $NSLOTS</span><br>
<span style="color: rgb(136, 0, 0);"># the number of tasks to be used</span><br>
<span style="color: rgb(136, 0, 0);"># $TMPDIR/machines</span><br>
<span style="color: rgb(136, 0, 0);"># a valid machine file to be passed to mpirun</span><br>
<span style="color: rgb(136, 0, 0);">#$ -cwd</span><br>
<span style="color: rgb(136, 0, 0);">#$ -o job.out</span><br>
<span style="color: rgb(136, 0, 0);">#$ -e job.err</span><br>
<span style="color: rgb(136, 0, 0);">#$ -q parallel.q </span><br>
<span style="color: rgb(136, 0, 0);">#$ -pe mpich 8</span><br>
<span style="color: rgb(136, 0, 0);"># mpich / jobs_per_node = number of nodes </span><br>
<br>
<span style="color: rgb(136, 0, 0);">set mpijob=1</span><br>
<span style="color: rgb(136, 0, 0);">set jobs_per_node=8</span><br>
<span style="color: rgb(136, 0, 0);">setenv OMP_NUM_THREADS 1</span><br>
<span style="color: rgb(136, 0, 0);">setenv USE_REMOTE 0</span><br>
<br>
<span style="color: rgb(136, 0, 0);">echo "Got $NSLOTS slots." > job.out</span><br>
<span style="color: rgb(136, 0, 0);">echo "Got $NSLOTS slots." > job.err</span><br>
<br>
<span style="color: rgb(136, 0, 0);">pwd</span><br>
<br>
<span style="color: rgb(136, 0, 0);">set proclist=`cat $TMPDIR/machines`</span><br>
<span style="color: rgb(136, 0, 0);">set nproc=$NSLOTS</span><br>
<span style="color: rgb(136, 0, 0);">echo $nproc nodes for this job: $proclist</span><br>
<span style="color: rgb(136, 0, 0);">if( -e .proclist_tmp) rm .proclist_tmp<br>
if ($jobs_per_node != 8 ) then<br>
set j=1<br>
while ($j <= $nproc )<br>
@ j1 = $j + $jobs_per_node<br>
@ j1 = $j1 - 1<br>
echo $proclist[$j-$j1] >>.proclist_tmp<br>
@ j = $j + 8<br>
end<br>
set proclist=`cat .proclist_tmp`<br>
rm .proclist_tmp<br>
set nproc=$#proclist<br>
endif<br>
echo $nproc nodes for this job: $proclist<br>
<br>
echo '#' > .machines<br>
<br>
# example for an MPI parallel lapw0<br>
echo -n 'lapw0:' >> .machines<br>
echo $proclist >>.machines</span><br>
#example for k-point and mpi parallel lapw1/2<br>
#set j=1<br>
#while ($j <= $jobs_per_node )<br>
set i=1<br>
while ($i <= $nproc )<br>
echo -n '1:' >>.machines<br>
@ i1 = $i + $mpijob<br>
@ i2 = $i1 - 1<br>
echo $proclist[$i-$i2] >>.machines<br>
set i=$i1<br>
end<br>
echo 'granularity:1' >>.machines<br>
echo 'extrafine:1' >>.machines<br>
<br>
date<br>
<br>
run_lapw -p -ec 0.0001 -NI >& STDOUT<br>
</blockquote></blockquote>Any comments are welcome! Thanks in advance!<br>
<br>
Have a nice weekend!<br>
Jia Yalei<br>
<span style="font-size: 14px;"></span> </div></div>