<DIV>
<BLOCKQUOTE class=replbq style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #1010ff 2px solid">
<DIV>
<BLOCKQUOTE class=replbq style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #1010ff 2px solid">
<DIV>
<BLOCKQUOTE class=replbq style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #1010ff 2px solid">
<DIV>
<DIV>Dear all:</DIV>
<DIV>I am running a calculation on a supercomputer, using four machines in parallel, but it always fails with the error shown below (from the case.dayfile and nohup.out files):</DIV>
<DIV>case.dayfile:</DIV>
<DIV>*************************************************************************************</DIV>
<DIV>Calculating CoCNP in /home/zul3/CoCNP<BR>on n12</DIV>
<DIV> start (Fri May 13 18:03:48 PDT 2005) with lapw0 (50/20 to go)<BR>> lapw0 -p (18:03:48) starting parallel lapw0 at Fri May 13 18:03:48 PDT 2005<BR>--------<BR>running lapw0 in single mode<BR>34.667u 0.156s 0:35.91 96.9% 0+0k 0+0io 1pf+0w<BR>> lapw1 -up -p (18:04:24) starting parallel lapw1 at Fri May 13 18:04:24 PDT 2005<BR>-> starting parallel LAPW1 jobs at Fri May 13 18:04:24 PDT 2005<BR>running LAPW1 in parallel mode (using .machines)<BR>4 number_of_parallel_jobs<BR> 192.20.110.113(6) 192.20.110.102(6) 192.20.110.103(6) 192.20.110.112(6) 192.20.110.113(3) 'unknown','formatted',0<BR> 5,'CoCNP.in1', 'old',
'formatted',0<BR> 6,'CoCNP.output1up','unknown','formatted',0<BR>10,'./CoCNP.vectorup', 'unknown','unformatted',9000<BR>11,'CoCNP.energyup', 'unknown','formatted',0<BR>18,'CoCNP.vspup', 'old', 'formatted',0<BR>19,'CoCNP.vnsup', 'unknown','formatted',0<BR>20,'CoCNP.struct', 'old', 'formatted',0<BR>21,'CoCNP.scf1up', 'unknown','formatted',0<BR>55,'CoCNP.vec', 'unknown','formatted',0<BR>71,'CoCNP.nshup', 'unknown','formatted',0<BR> Summary of lapw1para:<BR> 192.20.110.113 k=6 user=192.2 wallclock=11538<BR> 192.20.110.102 k=6 user=192.2 wallclock=11538<BR> 192.20.110.103 k=6 user=192.2 wallclock=11538<BR>
192.20.110.112 k=6 user=192.2 wallclock=11538<BR>0.485u 0.636s 3:56.77 0.4% 0+0k 0+0io 0pf+0w<BR>> lapw1 -dn -p (18:08:21) starting parallel lapw1 at Fri May 13 18:08:21 PDT 2005<BR>-> starting parallel LAPW1 jobs at Fri May 13 18:08:21 PDT 2005<BR>running LAPW1 in parallel mode (using .machines.help)<BR>4 number_of_parallel_jobs<BR> 192.20.110.113(6) 192.20.110.102(6) 192.20.110.103(6) 192.20.110.112(6) 192.20.110.113(3) Summary of lapw1para:<BR> 192.20.110.113 k=6 user=192.2 wallclock=11538<BR> 192.20.110.102 k=6 user=192.2 wallclock=11538<BR> 192.20.110.103 k=6 user=192.2 wallclock=11538<BR> 192.20.110.112 k=6 user=192.2
wallclock=11538<BR>0.497u 0.616s 3:56.99 0.4% 0+0k 0+0io 0pf+0w<BR>> lapw2 -up -p (18:12:18) running LAPW2 in parallel mode<BR> 192.20.110.113<BR> 192.20.110.102<BR> 192.20.110.103<BR> 192.20.110.112<BR> 192.20.110.113<BR> Summary of lapw2para:<BR> 192.20.110.113 user=192.2 wallclock=11724.2<BR> 192.20.110.102 user=192.2 wallclock=11724.2<BR> 192.20.110.103 user=192.2 wallclock=11724.2<BR> 192.20.110.112 user=192.2 wallclock=11724.2<BR>14.088u 0.602s 1:07.72 21.6% 0+0k 0+0io 26pf+0w<BR>> lapw2 -dn -p (18:13:26) running LAPW2 in parallel mode<BR> 192.20.110.113<BR> 192.20.110.102<BR>
192.20.110.103<BR> 192.20.110.112<BR> 192.20.110.113<BR> Summary of lapw2para:<BR> 192.20.110.113 user=192.2 wallclock=11724.2<BR> 192.20.110.102 user=192.2 wallclock=11724.2<BR> 192.20.110.103 user=192.2 wallclock=11724.2<BR> 192.20.110.112 user=192.2 wallclock=11724.2<BR>4.179u 0.666s 0:51.71 9.3% 0+0k 0+0io 0pf+0w<BR>> lcore -up (18:14:18) 0.306u 0.005s 0:00.51 58.8% 0+0k 0+0io 5pf+0w<BR>> lcore -dn (18:14:19) 0.305u 0.006s 0:00.45 66.6% 0+0k 0+0io 0pf+0w<BR>> mixer (18:14:21) 4.539u 0.156s 0:06.25 74.8% 0+0k 0+0io 10pf+0w<BR>:ENERGY convergence: 0 0 0<BR>:CHARGE convergence: 0 0.0001 0<BR>49/19 to go</DIV>
<DIV>.................</DIV>
<DIV> </DIV>
<DIV> 46/16 to go</DIV>
<DIV>> lapw0 -p (18:45:30) starting parallel lapw0 at Fri May 13 18:45:30 PDT 2005<BR>--------<BR>running lapw0 in single mode<BR>34.723u 0.128s 0:35.82 97.2% 0+0k 0+0io 0pf+0w<BR>> lapw1 -up -p (18:46:06) starting parallel lapw1 at Fri May 13 18:46:06 PDT 2005<BR>-> starting parallel LAPW1 jobs at Fri May 13 18:46:06 PDT 2005<BR>running LAPW1 in parallel mode (using .machines)<BR>4 number_of_parallel_jobs<BR> 192.20.110.113(6) 192.20.110.102(6) 'unknown','formatted',0<BR> 5,'CoCNP.in1', 'old', 'formatted',0<BR> 6,'CoCNP.output1up','unknown','formatted',0<BR>10,'./CoCNP.vectorup', 'unknown','unformatted',9000<BR>11,'CoCNP.energyup', 'unknown','formatted',0<BR>18,'CoCNP.vspup', 'old',
'formatted',0<BR>19,'CoCNP.vnsup', 'unknown','formatted',0<BR>20,'CoCNP.struct', 'old', 'formatted',0<BR>21,'CoCNP.scf1up', 'unknown','formatted',0<BR>55,'CoCNP.vec', 'unknown','formatted',0<BR>71,'CoCNP.nshup', 'unknown','formatted',0<BR> 192.20.110.103(6) 192.20.110.112(6) 192.20.110.113(3) Summary of lapw1para:<BR> 'unknown','formatted',0<BR> 192.20.110.113 k=12 user=384.4 wallclock=11535<BR> 192.20.110.102 k=6 user=192.2 wallclock=0<BR> 192.20.110.103 k=6 user=192.2 wallclock=11535<BR> 192.20.110.112 k=6 user=192.2 wallclock=11535<BR>0.503u 0.583s 3:57.90 0.4% 0+0k 0+0io 0pf+0w<BR>> lapw1 -dn -p (18:50:04) starting parallel lapw1 at Fri May 13 18:50:04 PDT 2005<BR>-> starting parallel LAPW1 jobs at Fri May 13 18:50:04 PDT 2005<BR>running LAPW1 in parallel mode (using .machines.help)<BR>4 number_of_parallel_jobs<BR> 192.20.110.113(6) 192.20.110.102(6) 192.20.110.103(6) 192.20.110.112(6) 192.20.110.113(3) Summary of lapw1para:<BR> 192.20.110.113 k=6 user=192.2 wallclock=11538<BR> 192.20.110.102 k=6 user=192.2 wallclock=11538<BR> 192.20.110.103 k=6 user=192.2 wallclock=11538<BR> 192.20.110.112 k=6 user=192.2 wallclock=11538<BR>0.528u 0.587s 3:56.11 0.4% 0+0k 0+0io 0pf+0w<BR>> lapw2 -up -p (18:54:00) running LAPW2 in parallel mode<BR> 192.20.110.113<BR> 192.20.110.102<BR> 192.20.110.103<BR> 192.20.110.112<BR> 192.20.110.113<BR> Summary of lapw2para:<BR> 192.20.110.113 user=192.2 wallclock=11724.2<BR> 192.20.110.102 user=192.2 wallclock=11724.2<BR> 192.20.110.103 user=192.2 wallclock=11724.2<BR> 192.20.110.112 user=192.2 wallclock=11724.2<BR>4.214u 0.645s 0:57.40 8.4% 0+0k 0+0io 0pf+0w<BR>> lapw2 -dn -p (18:54:58) running LAPW2 in parallel mode<BR> 192.20.110.113<BR> 192.20.110.102<BR> 192.20.110.103<BR> 192.20.110.112<BR>
192.20.110.113<BR> Summary of lapw2para:<BR> 192.20.110.113 user=192.2 wallclock=11724.2<BR> 192.20.110.102 user=192.2 wallclock=11724.2<BR> 192.20.110.103 user=192.2 wallclock=11724.2<BR> 192.20.110.112 user=192.2 wallclock=11724.2<BR>4.276u 0.566s 0:51.71 9.3% 0+0k 0+0io 0pf+0w<BR>> lcore -up (18:55:50) 0.302u 0.007s 0:00.45 66.6% 0+0k 0+0io 0pf+0w<BR>> lcore -dn (18:55:50) 0.302u 0.008s 0:00.45 66.6% 0+0k 0+0io 0pf+0w<BR>> mixer (18:55:53) 4.592u 0.227s 0:06.69 71.8% 0+0k 0+0io 0pf+0w<BR>:ENERGY convergence: 0 0 25.9122400000000000<BR>:CHARGE convergence: 0 0.0001 .9523060<BR>45/15 to go</DIV>
<DIV>> lapw0 -p (18:56:00) starting parallel lapw0 at Fri May 13 18:56:00 PDT 2005<BR>--------<BR>running lapw0 in single mode<BR>34.770u 0.142s 0:35.85 97.3% 0+0k 0+0io 0pf+0w<BR>> lapw1 -up -p (18:56:36) starting parallel lapw1 at Fri May 13 18:56:36 PDT 2005<BR>-> starting parallel LAPW1 jobs at Fri May 13 18:56:36 PDT 2005<BR>running LAPW1 in parallel mode (using .machines)<BR>4 number_of_parallel_jobs<BR> 192.20.110.113(6) 192.20.110.102(6) 192.20.110.103(6) 192.20.110.112(6) 192.20.110.113(3) Summary of lapw1para:<BR> 'unknown','formatted',0<BR> 192.20.110.113 k=6 user=192.2 wallclock=11538<BR> 192.20.110.102 k=6 user=192.2 wallclock=11538<BR> 192.20.110.103 k=6 user=192.2 wallclock=11538<BR> 192.20.110.112 k=6 user=192.2 wallclock=11538<BR>0.516u 0.556s 3:57.05 0.4% 0+0k 0+0io 0pf+0w<BR>> lapw1 -dn -p (19:00:34) starting parallel lapw1 at Fri May 13 19:00:34 PDT 2005<BR>-> starting parallel LAPW1 jobs at Fri May 13 19:00:34 PDT 2005<BR>running LAPW1 in parallel mode (using .machines.help)<BR>4 number_of_parallel_jobs<BR> 192.20.110.113(6) 192.20.110.102(6) 192.20.110.103(6) 192.20.110.112(6) 192.20.110.113(3) Summary of lapw1para:<BR> 192.20.110.113 k=6 user=192.2 wallclock=11538<BR> 192.20.110.102 k=6 user=192.2 wallclock=11538<BR> 192.20.110.103 k=6 user=192.2 wallclock=11538<BR>
192.20.110.112 k=6 user=192.2 wallclock=11538<BR>0.526u 0.545s 3:56.46 0.4% 0+0k 0+0io 0pf+0w<BR>> lapw2 -up -p (19:04:30) running LAPW2 in parallel mode<BR> 192.20.110.113<BR> 192.20.110.102<BR> 192.20.110.103<BR> 192.20.110.112<BR> 192.20.110.113<BR> Summary of lapw2para:<BR> 192.20.110.113 user=192.2 wallclock=11724.2<BR> 192.20.110.102 user=192.2 wallclock=11724.2<BR> 192.20.110.103 user=192.2 wallclock=11724.2<BR> 192.20.110.112 user=192.2 wallclock=11724.2<BR>4.343u 0.520s 0:57.59 8.4% 0+0k 0+0io 0pf+0w<BR>> lapw2 -dn -p (19:05:28) running LAPW2 in parallel mode<BR> 192.20.110.113<BR>
192.20.110.102<BR> 192.20.110.103<BR> 192.20.110.112<BR> 192.20.110.113<BR> Summary of lapw2para:<BR> 192.20.110.113 user=192.2 wallclock=11724.2<BR> 192.20.110.102 user=192.2 wallclock=11724.2<BR> 192.20.110.103 user=192.2 wallclock=11724.2<BR> 192.20.110.112 user=192.2 wallclock=11724.2<BR>4.282u 0.576s 0:52.12 9.3% 0+0k 0+0io 0pf+0w<BR>> lcore -up (19:06:20) 0.299u 0.007s 0:00.45 64.4% 0+0k 0+0io 0pf+0w<BR>> lcore -dn (19:06:21) 0.301u 0.002s 0:00.45 66.6% 0+0k 0+0io 0pf+0w<BR>> mixer (19:06:24) 4.579u 0.228s 0:06.68 71.7% 0+0k 0+0io 0pf+0w<BR>:ENERGY convergence: 0 0 25.9101540000000000<BR>:CHARGE convergence: 0 0.0001 .9395848<BR>44/14 to go</DIV>
<DIV>...........</DIV>
<DIV> </DIV>
<DIV>>41/11 to go</DIV>
<DIV>> lapw0 -p (19:38:05) starting parallel lapw0 at Fri May 13 19:38:06 PDT 2005<BR>--------<BR>running lapw0 in single mode<BR>34.719u 0.130s 0:35.85 97.1% 0+0k 0+0io 0pf+0w<BR>> lapw1 -up -p (19:38:41) starting parallel lapw1 at Fri May 13 19:38:42 PDT 2005<BR>-> starting parallel LAPW1 jobs at Fri May 13 19:38:42 PDT 2005<BR>running LAPW1 in parallel mode (using .machines)<BR>4 number_of_parallel_jobs<BR> 192.20.110.113(6) 192.20.110.102(6) 192.20.110.103(6) 192.20.110.112(6) 192.20.110.113(3) Summary of lapw1para:<BR> 192.20.110.113 k=6 user=192.2 wallclock=11538<BR> 192.20.110.102 k=6 user=192.2 wallclock=11538<BR> 192.20.110.103 k=6 user=192.2
wallclock=11538<BR> 192.20.110.112 k=6 user=192.2 wallclock=11538<BR>0.499u 0.597s 3:57.62 0.4% 0+0k 0+0io 0pf+0w<BR>> lapw1 -dn -p (19:42:39) starting parallel lapw1 at Fri May 13 19:42:39 PDT 2005<BR>-> starting parallel LAPW1 jobs at Fri May 13 19:42:39 PDT 2005<BR>running LAPW1 in parallel mode (using .machines.help)<BR>4 number_of_parallel_jobs<BR> 192.20.110.113(6) 192.20.110.102(6) 192.20.110.103(6) 192.20.110.112(6) 192.20.110.113(3) Summary of lapw1para:<BR> 'unknown','formatted',0<BR> 192.20.110.113 k=6 user=192.2 wallclock=11538<BR> 192.20.110.102 k=6 user=192.2 wallclock=11538<BR> 192.20.110.103 k=6 user=192.2
wallclock=11538<BR> 192.20.110.112 k=6 user=192.2 wallclock=11538<BR>0.524u 0.572s 3:56.08 0.4% 0+0k 0+0io 0pf+0w<BR>> lapw2 -up -p (19:46:35) running LAPW2 in parallel mode<BR>** LAPW2 crashed!<BR>0.031u 0.054s 0:00.46 17.3% 0+0k 0+0io 0pf+0w</DIV>
<DIV>> stop error<BR>*******************************************************************************************</DIV>
<DIV>nohup.out file:</DIV>
<DIV>**************************************************************************************</DIV>
<DIV>real 0m16.442s<BR>user 0m15.235s<BR>sys 0m0.206s<BR> SUMPARA END<BR> SUMPARA END<BR>LAPW2 - FERMI; weighs written<BR>What manual page do you want?<BR>What manual page do you want?<BR>What manual page do you want?<BR>What manual page do you want?<BR> LAPW2 END</DIV>
<DIV>real 0m28.442s<BR>user 0m27.135s<BR>sys 0m0.335s<BR>What manual page do you want?<BR> LAPW2 END</DIV>
<DIV>real 0m28.615s<BR>user 0m27.177s<BR>sys 0m0.367s<BR> LAPW2 END</DIV>
<DIV>real 0m28.479s<BR>user 0m27.053s<BR>sys 0m0.363s<BR> LAPW2 END</DIV>
<DIV>real 0m28.533s<BR>user 0m27.121s<BR>sys 0m0.360s<BR> LAPW2 END</DIV>
<DIV>real 0m15.251s<BR>user 0m14.101s<BR>sys 0m0.213s<BR> SUMPARA END<BR> SUMPARA END<BR> CORE END<BR> CORE END<BR> MIXER END<BR>in cycle 10 ETEST: .0047225000000000 CTEST: .9605146<BR> LAPW0 END<BR>What manual page do you want?<BR>What manual page do you want?<BR>What manual page do you want?<BR>What manual page do you want?<BR> LAPW1 END</DIV>
<DIV>real 2m34.056s<BR>user 2m32.098s<BR>sys 0m1.343s<BR> LAPW1 END</DIV>
<DIV>real 2m34.199s<BR>user 2m32.522s<BR>sys 0m1.257s<BR> LAPW1 END</DIV>
<DIV>real 2m33.341s<BR>user 2m31.461s<BR>sys 0m1.460s<BR> LAPW1 END</DIV>
<DIV>real 2m33.341s<BR>user 2m31.567s<BR>sys 0m1.366s<BR>What manual page do you want?<BR> LAPW1 END</DIV>
<DIV>real 1m17.194s<BR>user 1m16.236s<BR>sys 0m0.669s<BR>What manual page do you want?<BR>What manual page do you want?<BR>What manual page do you want?<BR>What manual page do you want?<BR> LAPW1 END</DIV>
<DIV>real 2m32.828s<BR>user 2m31.137s<BR>sys 0m1.306s<BR> LAPW1 END</DIV>
<DIV>real 2m32.411s<BR>user 2m30.582s<BR>sys 0m1.414s<BR> LAPW1 END<BR> LAPW1 END</DIV>
<DIV>real 2m34.797s<BR>user 2m32.246s<BR>sys 0m1.520s</DIV>
<DIV>real 2m32.347s<BR>user 2m30.540s<BR>sys 0m1.329s<BR>What manual page do you want?<BR> LAPW1 END</DIV>
<DIV>real 1m16.317s<BR>user 1m15.325s<BR>sys 0m0.715s<BR>PGFIO-F-231/formatted read/unit=5/error on data conversion.<BR> File name = CoCNP.in2 formatted, sequential access record = 3<BR> In source file lapw2_tmp_.F, at line number 164<BR>cp: cannot stat `.in.tmp': No such file or directory<BR>rm: cannot remove `.in.tmp': No such file or directory<BR>rm: cannot remove `.in.tmp1': No such file or directory<BR>**************************************************************************************</DIV>
<DIV>I find that when 'unknown','formatted' appears in the dayfile, the calculation stops.</DIV>
<DIV>Can you tell me why 'unknown','formatted' appears in the dayfile, and how to fix it?</DIV>
<DIV>If I use a single computer, it works very well.</DIV></DIV>
<P> </P></BLOCKQUOTE></DIV></BLOCKQUOTE></DIV>
</BLOCKQUOTE></DIV>