[Wien] k-point parallel problem
liyh
lyhua at fudan.edu.cn
Wed Jan 5 04:25:22 CET 2005
Dear wien users,
I am trying to run wien2k.04 with the k-point parallel method on our NFS cluster.
We use PBS to submit jobs, but the run fails at lapw1para and gives no further error message.
It only says "you should submit the program using PBS". There is no problem when we run lapw1 in parallel.
We also find that the results depend on the number of nodes. With two nodes we get only that error message, and the case.output1_* and case.scf1_* files are empty. With 16 nodes the case.output1_* files are not empty, but they end differently:
28 -rw-rw-r-- 1 gong gong 28672 Jan 5 10:33 sisiy4.output1_13
28 -rw-rw-r-- 1 gong gong 28672 Jan 5 10:33 sisiy4.output1_14
28 -rw-rw-r-- 1 gong gong 28672 Jan 5 10:33 sisiy4.output1_15
28 -rw-rw-r-- 1 gong gong 28672 Jan 5 10:33 sisiy4.output1_16
28 -rw-rw-r-- 1 gong gong 28672 Jan 5 10:34 sisiy4.output1_17
32 -rw-rw-r-- 1 gong gong 31928 Jan 5 10:38 sisiy4.output1_18 <-------
28 -rw-rw-r-- 1 gong gong 28672 Jan 5 10:33 sisiy4.output1_2
28 -rw-rw-r-- 1 gong gong 28672 Jan 5 10:33 sisiy4.output1_3
28 -rw-rw-r-- 1 gong gong 28672 Jan 5 10:33 sisiy4.output1_4
For sisiy4.output1_18, the end of the file is:
1.3149241 1.3153842 1.3165951 1.3169547 1.3543941
1.3546119 1.3603153 1.3606674 1.4093531 1.4103108
1.4156424 1.4167195 1.4719094 1.4720293
0 EIGENVALUES BELOW THE ENERGY -7.00000
********************************************************
NUMBER OF K-POINTS: 1
===> TOTAL CPU TIME: 211.2 (INIT = 0.6 + K-POINTS = 210.6)
> SUM OF WALL CLOCK TIMES: 212.8 (INIT = 0.6 + K-POINTS = 212.2)
Maximum WALL clock time: 213.306272029877
Maximum CPU time: 211.330000000000
For the other files, the end of the file is:
K= 0 0 1 IND= 1
1. WAVE= 0 0 1 TAUP= 1.00000 0.00000
WARPING= -0.00010 -0.00272
K= 0 0 -2 IND= 1
1. WAVE= 0 0 -2 TAUP= 1.00000 0.00000
WARPING= 0.00004 0.00000
K= 0 0 2 IND= 1
1. WAVE= 0 0 2 TAUP= 1.00000 0.00000
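The difference between the two endings can be checked mechanically. The following is only a sketch: it assumes that a finished lapw1 run always ends with the "TOTAL CPU TIME" summary seen in sisiy4.output1_18, and the function names are ours, not part of WIEN2k.

```sh
#!/bin/sh
# is_finished_output1 FILE: succeed if FILE contains the final timing
# summary (assumption: every completed lapw1 run prints this line).
is_finished_output1() {
    grep -q 'TOTAL CPU TIME' "$1"
}

# report_output1 CASE: print "finished" or "truncated" for each
# CASE.output1_* file in the current directory.
report_output1() {
    for f in "$1".output1_*; do
        [ -e "$f" ] || continue   # glob matched nothing: print nothing
        if is_finished_output1 "$f"; then
            echo "$f finished"
        else
            echo "$f truncated"
        fi
    done
}
```

Calling `report_output1 sisiy4` in the case directory would then print one line per parallel output file.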
The scf files also differ:
0 -rw-rw-r-- 1 gong gong 0 Jan 5 10:33 sisiy4.scf1_13
0 -rw-rw-r-- 1 gong gong 0 Jan 5 10:33 sisiy4.scf1_14
0 -rw-rw-r-- 1 gong gong 0 Jan 5 10:33 sisiy4.scf1_15
0 -rw-rw-r-- 1 gong gong 0 Jan 5 10:33 sisiy4.scf1_16
4 -rw-rw-r-- 1 gong gong 4096 Jan 5 10:38 sisiy4.scf1_17 <--------
8 -rw-rw-r-- 1 gong gong 5778 Jan 5 10:38 sisiy4.scf1_18 <--------
0 -rw-rw-r-- 1 gong gong 0 Jan 5 10:33 sisiy4.scf1_2
0 -rw-rw-r-- 1 gong gong 0 Jan 5 10:33 sisiy4.scf1_3
0 -rw-rw-r-- 1 gong gong 0 Jan 5 10:33 sisiy4.scf1_4
The end of sisiy4.scf1_18 is:
1.2477656 1.2874735 1.2881612 1.2952041 1.2954086
1.3149241 1.3153842 1.3165951 1.3169547 1.3543941
1.3546119 1.3603153 1.3606674 1.4093531 1.4103108
1.4156424 1.4167195 1.4719094 1.4720293
********************************************************
NUMBER OF K-POINTS: 1
The end of sisiy4.scf1_17 is:
ATOMIC SPHERE DEPENDENT PARAMETERS FOR ATOM Si
OVERALL ENERGY PARAMETER IS 0.3000
OVERALL BASIS SET ON ATOM IS LAPW
E( 0)= -0.2500
APW+lo
E( 1)= 0.3000
APW+lo
K= 0.45000 0.35714 0.25000
Does this mean that when one node finishes lapw1, the whole program is forcibly terminated?
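To see at a glance which parallel jobs wrote nothing at all, the zero-length files can be listed. This is a small sketch using only standard shell tests; the helper name list_empty_parts is made up for illustration.

```sh
#!/bin/sh
# list_empty_parts CASE: print every CASE.output1_* and CASE.scf1_* file
# in the current directory that exists but has size zero.
list_empty_parts() {
    for f in "$1".output1_* "$1".scf1_*; do
        [ -e "$f" ] || continue   # skip a glob pattern that matched nothing
        [ -s "$f" ] || echo "$f"  # -s is true only for non-empty files
    done
}
```

In our case directory this would be called as `list_empty_parts sisiy4`.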
This is our .machines file:
1:comp29
1:comp94
1:comp28
1:comp41
1:comp21
1:comp11
1:comp39
1:comp18
1:comp9
1:comp35
1:comp61
1:comp58
1:comp54
1:comp68
1:comp95
1:comp57
granularity:1
extrafine
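A quick consistency check on this file is to count its "weight:host" lines and compare the number with the nodes PBS actually assigned. The sketch below assumes that k-point host lines always start with a numeric weight, as in the file above; count_kpoint_hosts is a hypothetical helper name.

```sh
#!/bin/sh
# count_kpoint_hosts FILE: count the "weight:host" lines of a .machines
# file. Keyword lines such as "granularity:1" and "extrafine" do not
# start with a digit and are therefore not counted.
count_kpoint_hosts() {
    grep -c '^[0-9][0-9]*:' "$1"
}
```

For the file above, `count_kpoint_hosts .machines` should print 16, which can be compared with `wc -l < $PBS_NODEFILE` inside the job.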
This is our script file for PBS:
${RSH} ${MASTERNODE}
cd ${WORKDIR}
#${MKDIR} ${SCRATCHDIR}
#cd ${SCRATCHDIR}
#${CP} ${WORKDIR}/* .
#start creating .machines
rm -f .machines
awk '{print "1:"$1}' $PBS_NODEFILE > .machines
echo 'granularity:1' >>.machines
echo 'extrafine' >>.machines
#define here your WIEN2k command
#${LAUNCH} $PBS_NODEFILE ${PROGRAMEXEC}>run.out
run_lapw -p -i 15 -ec 0.000001 >run.out
#${CP} ./* ${WORKDIR}
#cd ${WORKDIR}
#rm -rf ${SCRATCHDIR}
Best wishes,
Yonghua
Fudan University of China