[Wien] .machines file for fine grained parallel executions
tom_y at livedoor.com
Mon Aug 4 09:27:55 CEST 2003
Dear WIEN2k developers and users,
I have a problem with fine-grained parallel execution.
I searched the old digests for information,
but I couldn't find a solution.
I am now trying to perform calculations for large systems
(more than 100 atoms with a few k-points).
I installed WIEN2k_03, downloaded in the middle of
July 2003, and compiled it with Intel ifc7 + mkl6.0.
The installation appears to have completed correctly: lapw0_mpi,
lapw1_mpi, lapw1c_mpi, lapw2_mpi, and lapw2c_mpi were created.
Our hardware is a homogeneous PC cluster,
i.e., Pentium 4 machines of identical performance
connected via gigabit Ethernet. Each machine has
one CPU.
When I tried k-point parallel calculations, they worked
without problems. The following is the test case
for 3 k-points on 3 machines (earth46, earth47, and earth48).
.machines file
-----------------------------------
lapw0:earth46:1 earth47:1 earth48:1
1:earth46
1:earth47
1:earth48
granularity:1
-----------------------------------
In this case, lapw0 ran as lapw0_mpi,
but lapw1 ran as lapw1 (NOT lapw1_mpi);
each k-point was calculated on its own machine.
My question is how to run the fine-grained parallel version
by modifying the .machines file.
I tried editing the .machines file as follows.
-----------------------------------
lapw0:earth46:1 earth47:1 earth48:1
3:earth46:1 earth47:1 earth48:1
granularity:1
-----------------------------------
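To make the difference between the two .machines files above easier to see, here is a minimal Python sketch that splits each line into a key and a host list. The function name parse_machines and the reading of the format (lapw0:host:n ..., weight:host[:n] ..., option:value) are my own assumptions based only on the two examples in this post; this is not WIEN2k's own parser and may not match how the scripts actually interpret the file.

```python
# Hypothetical helper: reads .machines-style text as it appears in this post.
# The format rules below are assumptions inferred from the two examples only.

KPOINT_FILE = """\
lapw0:earth46:1 earth47:1 earth48:1
1:earth46
1:earth47
1:earth48
granularity:1
"""

FINE_GRAINED_FILE = """\
lapw0:earth46:1 earth47:1 earth48:1
3:earth46:1 earth47:1 earth48:1
granularity:1
"""


def parse_machines(text):
    """Split lines into lapw0 hosts, weighted job lines, and other options."""
    result = {"lapw0": [], "jobs": [], "options": {}}

    def hosts(spec):
        # "earth46:1 earth47:1" -> [("earth46", 1), ("earth47", 1)]
        # A host without ":n" is assumed to mean one process.
        out = []
        for item in spec.split():
            name, _, count = item.partition(":")
            out.append((name, int(count) if count else 1))
        return out

    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, rest = line.partition(":")
        if key == "lapw0":
            result["lapw0"] = hosts(rest)
        elif key.isdigit():
            # A leading number is treated as the weight of one job line.
            result["jobs"].append((int(key), hosts(rest)))
        else:
            result["options"][key] = rest.strip()
    return result


if __name__ == "__main__":
    for label, text in (("k-point", KPOINT_FILE), ("fine grained", FINE_GRAINED_FILE)):
        parsed = parse_machines(text)
        print(label, "->", len(parsed["jobs"]), "job line(s)")
        for weight, host_list in parsed["jobs"]:
            print("  weight", weight, "hosts", host_list)
```

Read this way, the first file has three job lines with one host each, and the second has a single job line of weight 3 that lists all three hosts.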
After invoking the command "run_lapw -p",
I got the following messages on the screen:
FORTRAN STOP LAPW0 END
FORTRAN STOP LAPW0 END
FORTRAN STOP LAPW0 END
FORTRAN STOP LAPW0 END
cat: No match.
The following is the output of testpara.
------------------------------------------------------
#####################################################
# TESTPARA #
#####################################################
Test: LAPW1 in parallel mode (using .machines)
Granularity set to 1
Extrafine unset
klist: 1
machines: earth46
procs: 1
weigh(old): 3
sumw: 3
granularity: 1
weigh(new): 1
Distribution of k-point (under ideal conditions)
will be:
1 : earth46(1) 1k
-------------------------------------------------------
The following is the content of lapw1.error
-------------------------------------------------------
** Error in Parallel LAPW1
** LAPW1 STOPPED at Mon Aug 4 15:19:09 JST 2003
** check ERROR FILES!
-------------------------------------------------------
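Since the message says to check the error files, a small helper for collecting them may be useful. The sketch below is hypothetical (the name nonempty_error_files is my own) and assumes only that the error files sit in the case directory and end in ".error", as lapw1.error does; it simply prints every such file that is not empty.

```python
from pathlib import Path


def nonempty_error_files(case_dir="."):
    """Return (file name, content) for every non-empty *.error file in case_dir."""
    found = []
    for path in sorted(Path(case_dir).glob("*.error")):
        # errors="replace" keeps the scan going if a file holds odd bytes.
        text = path.read_text(errors="replace").strip()
        if text:
            found.append((path.name, text))
    return found


if __name__ == "__main__":
    # Run from inside the case directory (assumed to be the current directory).
    for name, text in nonempty_error_files():
        print(f"== {name} ==")
        print(text)
```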
The following is the content of case.dayfile
-------------------------------------------------------------------
Calculating mgo3 in /home/wien2k/WIEN2k_03_3/data/mgo3
on earth46
start (Mon Aug 4 15:20:55 JST 2003) with lapw0 (20/20 to go)
> lapw0 -p (15:20:55) starting parallel lapw0 at Mon Aug 4 15:20:56 JST 2003
-------- .machine1 : 3 processors
earth46:1
earth47:1
earth48:1
--------
2.100u 0.110s 0:05.56 39.7% 0+0k 0+0io 11297pf+0w
> lapw1 -p (15:21:01) starting parallel lapw1 at Mon Aug 4 15:21:01 JST 2003
-> starting parallel LAPW1 jobs at Mon Aug 4 15:21:01 JST 2003
running LAPW1 in parallel mode (using .machines)
1 number_of_parallel_jobs
** LAPW1 crashed!
0.080u 0.100s 0:03.59 5.0% 0+0k 0+0io 10665pf+0w
> stop error
---------------------------------------------------------------------
If you need more information about my settings or results,
I will send it as soon as possible.
I appreciate your help.
Sincerely,
Tom Yamamoto