[Wien] .machines file for fine grained parallel executions

tom_y at livedoor.com
Mon Aug 4 09:27:55 CEST 2003


Dear WIEN2k developers and users, 

I have a problem with fine-grained parallel execution.
I searched the old digests for information,
but I couldn't find a solution.
I am trying to perform calculations for huge systems
(more than 100 atoms with a few k-points).

I installed WIEN2k_03, downloaded in the middle of
July 2003, and compiled it with Intel ifc7 + mkl6.0.
The installation seems to have completed correctly: lapw0_mpi,
lapw1_mpi, lapw1c_mpi, lapw2_mpi and lapw2c_mpi were created.
Our hardware is a homogeneous PC cluster,
i.e., Pentium 4 machines of identical performance
connected via gigabit Ethernet. Each machine has
one CPU.

When I tried k-point parallel calculations, they worked
without problems. The following is the result of a test case

for 3 k-points with 3 machines (earth46, earth47 and earth48).
.machines file
-----------------------------------
lapw0:earth46:1 earth47:1 earth48:1
1:earth46
1:earth47
1:earth48
granularity:1
-----------------------------------

In this case, lapw0 ran as lapw0_mpi,
but lapw1 ran as lapw1 (NOT lapw1_mpi);
each k-point was calculated on its own machine.

My question is how to modify the .machines file so that
the fine-grained parallel version runs.

I tried editing the .machines file as follows.

-----------------------------------
lapw0:earth46:1 earth47:1 earth48:1
3:earth46:1 earth47:1 earth48:1
granularity:1
-----------------------------------
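To make the difference between the two .machines files explicit, here is a minimal Python sketch that summarizes such a file. It is not part of WIEN2k. The function name `summarize_machines` and the parsing rules are my own assumptions, inferred only from the two files shown in this message: a numeric key marks a job line, `lapw0` lists the hosts for lapw0, and anything else (such as `granularity`) is treated as a plain setting.

```python
# Hypothetical helper, not a WIEN2k tool: it only mirrors the layout of the
# two .machines files quoted in this message.

KPOINT_FILE = """\
lapw0:earth46:1 earth47:1 earth48:1
1:earth46
1:earth47
1:earth48
granularity:1
"""

FINE_GRAINED_ATTEMPT = """\
lapw0:earth46:1 earth47:1 earth48:1
3:earth46:1 earth47:1 earth48:1
granularity:1
"""


def summarize_machines(text):
    """Return the lapw0 hosts, the job lines and the settings of a .machines file.

    Assumed line format: 'key:host[:n] host[:n] ...'.
    Blank lines and lines starting with '#' are skipped (an assumption).
    """
    summary = {"lapw0_hosts": [], "jobs": [], "settings": {}}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, rest = line.partition(":")
        if key == "lapw0":
            # Keep only the host names, dropping the ':n' suffix.
            summary["lapw0_hosts"] = [h.split(":")[0] for h in rest.split()]
        elif key.isdigit():
            # A numeric key is taken to be the weight of one job line.
            hosts = [h.split(":")[0] for h in rest.split()]
            summary["jobs"].append({"weight": int(key), "hosts": hosts})
        else:
            summary["settings"][key] = rest.strip()
    return summary


if __name__ == "__main__":
    for name, text in (("k-point", KPOINT_FILE), ("attempt", FINE_GRAINED_ATTEMPT)):
        info = summarize_machines(text)
        print(name, "job lines:", len(info["jobs"]))
        for job in info["jobs"]:
            print("  weight", job["weight"], "hosts", job["hosts"])
```

Read this way, the first file has three job lines with one host each, while my second file has a single job line naming three hosts, which seems consistent with the "1 number_of_parallel_jobs" reported in case.dayfile.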

After invoking the command "run_lapw -p",
I got the following messages on the screen:

FORTRAN STOP  LAPW0 END
FORTRAN STOP  LAPW0 END
FORTRAN STOP  LAPW0 END
FORTRAN STOP  LAPW0 END
cat: No match.

The following is the output of testpara.
------------------------------------------------------
#####################################################
#                     TESTPARA                      #
#####################################################

Test: LAPW1 in parallel mode (using .machines)
Granularity set to 1
Extrafine unset

    klist:       1
    machines:    earth46
    procs:       1
    weigh(old):  3
    sumw:        3
    granularity: 1
    weigh(new):  1

Distribution of k-point (under ideal conditions)
will be:

1 : earth46(1) 1k 
-------------------------------------------------------

The following is the content of lapw1.error.

-------------------------------------------------------
**  Error in Parallel LAPW1
**  LAPW1 STOPPED at Mon Aug 4 15:19:09 JST 2003
**  check ERROR FILES!
-------------------------------------------------------

The following is the content of case.dayfile.

-------------------------------------------------------------------
Calculating mgo3 in /home/wien2k/WIEN2k_03_3/data/mgo3
on earth46

    start (Mon Aug  4 15:20:55 JST 2003) with lapw0 (20/20 to go)
>   lapw0 -p (15:20:55) starting parallel lapw0 at Mon Aug  4 15:20:56 JST 2003
-------- .machine1 : 3 processors
earth46:1
earth47:1
earth48:1
--------
2.100u 0.110s 0:05.56 39.7% 0+0k 0+0io 11297pf+0w
>   lapw1  -p (15:21:01) starting parallel lapw1 at Mon Aug  4 15:21:01 JST 2003
->  starting parallel LAPW1 jobs at Mon Aug  4 15:21:01 JST 2003
running LAPW1 in parallel mode (using .machines)
1 number_of_parallel_jobs
**  LAPW1 crashed!
0.080u 0.100s 0:03.59 5.0% 0+0k 0+0io 10665pf+0w

>   stop error
---------------------------------------------------------------------
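For longer dayfiles, the failing step can be located with a small Python sketch like the one below. The function name `find_failures` and the marker strings are my own assumptions, taken only from the dayfile excerpt above ("crashed" and "stop error"); other failures may be worded differently.

```python
# Hypothetical helper: scans dayfile text for lines that look like failures.
# The markers are assumptions based on the single dayfile shown above.

SAMPLE_DAYFILE = """\
>   lapw1  -p (15:21:01) starting parallel lapw1 at Mon Aug  4 15:21:01 JST 2003
running LAPW1 in parallel mode (using .machines)
1 number_of_parallel_jobs
**  LAPW1 crashed!
0.080u 0.100s 0:03.59 5.0% 0+0k 0+0io 10665pf+0w

>   stop error
"""


def find_failures(dayfile_text, markers=("crashed", "stop error")):
    """Return (line_number, stripped_line) pairs for lines containing a marker."""
    hits = []
    for number, line in enumerate(dayfile_text.splitlines(), start=1):
        lowered = line.lower()
        if any(marker in lowered for marker in markers):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    for number, line in find_failures(SAMPLE_DAYFILE):
        print(f"line {number}: {line}")
```

To use it on a real run, the text of case.dayfile would be read from disk and passed to `find_failures` in place of the sample string.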

If you need more information about my settings or results,
I'll send it as soon as possible.

I appreciate your help.

Sincerely,

Tom Yamamoto
