[Wien] k-point parallel problem

liyh lyhua at fudan.edu.cn
Wed Jan 5 04:25:22 CET 2005


Dear WIEN users,

I am trying to run WIEN2k.04 with the k-point parallel method on our NFS cluster.
We use PBS to submit jobs, but the run fails at lapw1para without giving any useful error message.
It only says "you should submit the program using PBS". There is no problem when running lapw1 in parallel.
I also find that the results differ depending on the number of nodes. With two nodes we get only this
error message, and the case.output1_* and case.scf1_* files are empty. With 16 nodes the case.output1_*
files are not empty, but they do not all end the same way:
  28 -rw-rw-r--    1 gong     gong        28672 Jan  5 10:33 sisiy4.output1_13
  28 -rw-rw-r--    1 gong     gong        28672 Jan  5 10:33 sisiy4.output1_14
  28 -rw-rw-r--    1 gong     gong        28672 Jan  5 10:33 sisiy4.output1_15
  28 -rw-rw-r--    1 gong     gong        28672 Jan  5 10:33 sisiy4.output1_16
  28 -rw-rw-r--    1 gong     gong        28672 Jan  5 10:34 sisiy4.output1_17
  32 -rw-rw-r--    1 gong     gong        31928 Jan  5 10:38 sisiy4.output1_18  <-------
  28 -rw-rw-r--    1 gong     gong        28672 Jan  5 10:33 sisiy4.output1_2
  28 -rw-rw-r--    1 gong     gong        28672 Jan  5 10:33 sisiy4.output1_3
  28 -rw-rw-r--    1 gong     gong        28672 Jan  5 10:33 sisiy4.output1_4 
For sisiy4.output1_18, the end of the file is:
      1.3149241    1.3153842    1.3165951    1.3169547    1.3543941
      1.3546119    1.3603153    1.3606674    1.4093531    1.4103108
      1.4156424    1.4167195    1.4719094    1.4720293
            0 EIGENVALUES BELOW THE ENERGY   -7.00000
       ********************************************************

       NUMBER OF K-POINTS:           1
   ===> TOTAL CPU       TIME:    211.2 (INIT =      0.6 + K-POINTS =    210.6)
   > SUM OF WALL CLOCK TIMES:    212.8 (INIT =      0.6 + K-POINTS =    212.2)
      Maximum WALL clock time:    213.306272029877
      Maximum CPU time:           211.330000000000
For the other files, the end of the file is:
   K=    0    0    1  IND= 1
                     1. WAVE=    0    0    1    TAUP=   1.00000   0.00000
                                                WARPING=  -0.00010  -0.00272
   K=    0    0   -2  IND= 1
                     1. WAVE=    0    0   -2    TAUP=   1.00000   0.00000
                                                WARPING=   0.00004   0.00000
   K=    0    0    2  IND= 1
                     1. WAVE=    0    0    2    TAUP=   1.00000   0.00000
The scf files also differ:
   0 -rw-rw-r--    1 gong     gong            0 Jan  5 10:33 sisiy4.scf1_13
   0 -rw-rw-r--    1 gong     gong            0 Jan  5 10:33 sisiy4.scf1_14
   0 -rw-rw-r--    1 gong     gong            0 Jan  5 10:33 sisiy4.scf1_15
   0 -rw-rw-r--    1 gong     gong            0 Jan  5 10:33 sisiy4.scf1_16
   4 -rw-rw-r--    1 gong     gong         4096 Jan  5 10:38 sisiy4.scf1_17 <--------
   8 -rw-rw-r--    1 gong     gong         5778 Jan  5 10:38 sisiy4.scf1_18 <--------
   0 -rw-rw-r--    1 gong     gong            0 Jan  5 10:33 sisiy4.scf1_2
   0 -rw-rw-r--    1 gong     gong            0 Jan  5 10:33 sisiy4.scf1_3
   0 -rw-rw-r--    1 gong     gong            0 Jan  5 10:33 sisiy4.scf1_4
The end of sisiy4.scf1_18 is:
         1.2477656    1.2874735    1.2881612    1.2952041    1.2954086
         1.3149241    1.3153842    1.3165951    1.3169547    1.3543941
         1.3546119    1.3603153    1.3606674    1.4093531    1.4103108
         1.4156424    1.4167195    1.4719094    1.4720293
       ********************************************************

       NUMBER OF K-POINTS:           1
The end of sisiy4.scf1_17 is:

          ATOMIC SPHERE DEPENDENT PARAMETERS FOR ATOM  Si
          OVERALL ENERGY PARAMETER IS    0.3000
          OVERALL BASIS SET ON ATOM IS LAPW
          E( 0)=   -0.2500
             APW+lo
          E( 1)=    0.3000
             APW+lo

       K=   0.45000   0.35714   0.25000
                                               
Does this mean that as soon as one node finishes lapw1, the whole program is forcibly terminated?
This is our .machines file:
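A quick way to see which of the parallel lapw1 jobs stopped early is to list the output1 files that lack the final timing summary. The sketch below assumes that a finished file contains the string "TOTAL CPU", as sisiy4.output1_18 does above; the function name unfinished_outputs is made up for illustration and is not part of WIEN2k.

```shell
#!/bin/sh
# Print every <case>.output1_* file that does not contain the final
# timing line. "TOTAL CPU" is taken from the finished output shown above;
# unfinished_outputs is a hypothetical helper name.
unfinished_outputs() {
    case_name=$1
    for f in "$case_name".output1_*; do
        # Skip the unexpanded pattern when no files match.
        [ -e "$f" ] || continue
        grep -q 'TOTAL CPU' "$f" || echo "$f"
    done
}

# Example call from the case directory:
#   unfinished_outputs sisiy4
```

Any file printed here, including an empty one, belongs to a k-point job that did not reach the end of lapw1.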
1:comp29
1:comp94
1:comp28
1:comp41
1:comp21
1:comp11
1:comp39
1:comp18
1:comp9
1:comp35
1:comp61
1:comp58
1:comp54
1:comp68
1:comp95
1:comp57
granularity:1
extrafine
This is our PBS script file:
${RSH} ${MASTERNODE}
cd ${WORKDIR}
#${MKDIR} ${SCRATCHDIR}
#cd ${SCRATCHDIR}
#${CP} ${WORKDIR}/* .
#start creating .machines
rm -f .machines
awk '{print "1:"$1}' $PBS_NODEFILE > .machines
echo 'granularity:1' >>.machines
echo 'extrafine' >>.machines

#define here your WIEN2k command
#${LAUNCH} $PBS_NODEFILE ${PROGRAMEXEC}>run.out
run_lapw -p -i 15 -ec 0.000001 >run.out
#${CP} ./* ${WORKDIR}
#cd ${WORKDIR}
#rm -rf ${SCRATCHDIR}
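For reference, the .machines step of the script can be isolated and checked on its own. This is only a sketch of the same three commands wrapped in a function; the name write_machines and its two arguments are assumptions for illustration, and the node file is expected to hold one host name per line, as PBS_NODEFILE does.

```shell
#!/bin/sh
# Build a .machines file for a k-point parallel run from a node list.
# write_machines is a hypothetical helper; it repeats the awk/echo
# commands from the script above without changing what they write.
write_machines() {
    nodefile=$1   # one host name per line, e.g. "$PBS_NODEFILE"
    outfile=$2    # normally .machines in the case directory
    awk '{print "1:" $1}' "$nodefile" > "$outfile"
    echo 'granularity:1' >> "$outfile"
    echo 'extrafine' >> "$outfile"
}

# Example call inside a PBS job:
#   write_machines "$PBS_NODEFILE" .machines
```

Running it by hand against a short test node list makes it easy to confirm that the generated file matches the .machines file shown earlier.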

Best wishes,
    
        Yonghua
  Fudan University of China 



