[Wien] MPI Problem

Paul Fons paul-fons at aist.go.jp
Mon Jan 23 06:03:12 CET 2012


Thank you very much for your suggestion.  I actually managed to figure this out myself an hour or so ago.  At the same time (usually not a good idea), I also compiled the MKL interface for FFTW2 rather than using the standalone FFTW library I had compiled myself earlier.  Thus the RP_LIBS line looks like

-lmkl_scalapack_lp64 -lmkl_solver_lp64 -lmkl_blacs_intelmpi_lp64 /opt/intel/mkl/lib/intel64/libfftw2x_cdft_DOUBLE.a $(R_LIBS)

Everything now compiles.
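For reference, I built the wrapper library itself from the MKL interface sources roughly as follows; the make options here are from memory and depend on the MKL version, so treat this as a sketch rather than a recipe:

# build MKL's FFTW2 cluster-DFT (CDFT) wrapper for an intel64 / Intel MPI setup
cd $MKLROOT/interfaces/fftw2x_cdft
make libintel64 compiler=intel mpi=intelmpi PRECISION=MKL_DOUBLE
# the resulting libfftw2x_cdft_DOUBLE.a is the file referenced on the RP_LIBS line above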

In the past ten minutes I have been running the TiC example under MPI, and I must say the MPI version seems trickier to use than I first thought, as each MPI process seems to use multiple threads and it is easy to oversubscribe the cores one has.  I have been testing the MPI version on a single node (32 cores) of a 500-core cluster to get a feel for things.
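As a first attempt at taming this, I am setting each MPI process to a single MKL/OpenMP thread before launching the run (csh syntax):

# limit MKL and OpenMP to one thread per MPI process,
# so that e.g. five MPI processes use five cores rather than ~28
setenv OMP_NUM_THREADS 1
setenv MKL_NUM_THREADS 1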

Running the TiC calculation (with 4000 k-points) in serial mode (where the MKL routines use multiple threads) took about 75 seconds.

Running the sample calculation in parallel mode (with 5 MPI processes, each of which spawns multiple threads, for an average CPU load of 28 on the 32-core machine) takes longer: about 300 seconds.

Carrying out the same calculation with "1:localhost" on six separate lines (i.e. lapw1 runs six processes locally and then aggregates the results without using MPI) finishes quickest of all, in about 60 seconds.
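To be explicit, the two .machines layouts I compared look roughly like this.  The k-point parallel version (six independent jobs on the local machine, no MPI):

granularity:1
1:localhost
1:localhost
1:localhost
1:localhost
1:localhost
1:localhost

and the MPI version (one parallel job with five processes):

granularity:1
1:localhost:5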


I don't really understand what is going on.  Obviously I don't need MPI or k-point parallelism for a small problem like the TiC example, but I would like to use it for larger supercells.  Is there some sort of guiding principle that will help me decide which approach is best?  I assume that parallelization by running multiple jobs is also limited to a single machine, so it will not scale very well.
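For a larger supercell spread across several nodes, I am guessing the MPI line in .machines simply lists more hosts and cores, something like

1:node01:16 node02:16

(the node names are placeholders; I have not tried this yet).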

Thanks for any advice.

	Paul

On Jan 23, 2012, at 8:08 AM, Laurence Marks wrote:

> A guess: you are using the wrong version of blacs. You need a
> -lmkl_blacs_intelmpi_XX
> where "XX" is the one for your system. I have seen this give the same error.
> 
> Use http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/
> 
> For reference, with openmpi it is _openmpi_ instead of _intelmpi_, and
> similarly for sgi.
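> Spelled out (check the link-line advisor for the exact names for your
> MKL version):
> 
>   Intel MPI:  -lmkl_blacs_intelmpi_lp64
>   OpenMPI:    -lmkl_blacs_openmpi_lp64
>   SGI MPT:    -lmkl_blacs_sgimpt_lp64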
> 
> 2012/1/22 Paul Fons <paul-fons at aist.go.jp>:
>> 
>> Hi,
>> I have Wien2K running on a cluster of Linux boxes, each with 32 cores and
>> connected by 10Gb Ethernet.  I compiled Wien2K with the 3.174 version of the
>> Intel compiler (I learned the hard way that bugs in the newer versions of the
>> Intel compiler lead to crashes in Wien2K).  I have also installed Intel's MPI.
>> First, single-process Wien2K, say for the TiC case, works fine.
>> It also works fine when I use a .machines file like
>> 
>> granularity:1
>> localhost:1
>> localhost:1
>> …  (24 times).
>> 
>> This file leads to parallel execution without error.  I can vary the number
>> of processes by changing the number of localhost:1 lines in the file, and
>> everything still works fine.  When I try to use MPI with a single process,
>> it works as well.
>> 
>> 1:localhost:1
>> 
>> starting parallel lapw1 at Mon Jan 23 06:49:16 JST 2012
>> 
>> ->  starting parallel LAPW1 jobs at Mon Jan 23 06:49:16 JST 2012
>> running LAPW1 in parallel mode (using .machines)
>> 1 number_of_parallel_jobs
>> [1] 22417
>> LAPW1 END
>> [1]  + Done                          ( cd $PWD; $t $exe ${def}_$loop.def; rm
>> -f .lock_$lockfile[$p] ) >> .time1_$loop
>>     localhost(111) 179.004u 4.635s 0:32.73 561.0%	0+0k 0+26392io 0pf+0w
>>   Summary of lapw1para:
>>   localhost	 k=111	 user=179.004	 wallclock=32.73
>> 179.167u 4.791s 0:35.61 516.5%	0+0k 0+26624io 0pf+0w
>> 
>> 
>> Changing the .machines file to use more than one MPI process (the same form
>> of error occurs for more than two),
>> 
>> 1:localhost:2
>> 
>> leads to a run-time error in the MPI subsystem.
>> 
>> starting parallel lapw1 at Mon Jan 23 06:51:04 JST 2012
>> ->  starting parallel LAPW1 jobs at Mon Jan 23 06:51:04 JST 2012
>> running LAPW1 in parallel mode (using .machines)
>> 1 number_of_parallel_jobs
>> [1] 22673
>> Fatal error in MPI_Comm_size: Invalid communicator, error stack:
>> MPI_Comm_size(123): MPI_Comm_size(comm=0x5b, size=0x7ed20c) failed
>> MPI_Comm_size(76).: Invalid communicator
>> Fatal error in MPI_Comm_size: Invalid communicator, error stack:
>> MPI_Comm_size(123): MPI_Comm_size(comm=0x5b, size=0x7ed20c) failed
>> MPI_Comm_size(76).: Invalid communicator
>> [1]  + Done                          ( cd $PWD; $t $ttt; rm -f
>> .lock_$lockfile[$p] ) >> .time1_$loop
>>     localhost localhost(111) APPLICATION TERMINATED WITH THE EXIT STRING:
>> Hangup (signal 1)
>> 0.037u 0.036s 0:00.06 100.0%	0+0k 0+0io 0pf+0w
>> TiC.scf1_1: No such file or directory.
>>   Summary of lapw1para:
>>   localhost	 k=0	 user=111	 wallclock=0
>> 0.105u 0.168s 0:03.21 8.0%	0+0k 0+216io 0pf+0w
>> 
>> 
>> I have properly sourced the appropriate runtime environment for the Intel
>> tools.  For example, compiling (with mpiifort) and running the Fortran 90
>> MPI test program from Intel produces:
>> 
>> 
>> 
>> mpirun -np 32 /home/paulfons/mpitest/testf90
>>  Hello world: rank            0  of           32  running on
>>  asccmp177
>> 
>> 
>>  Hello world: rank            1  of           32  running on    (32 times)
>> 
>> 
>> Does anyone have any suggestions as to what to try next?  I am not sure how
>> to debug things from here.  I have about 512 nodes that I can use for larger
>> calculations that can only be accessed via MPI (the ssh setup works fine as
>> well, by the way).  It would be great to figure out what is wrong.
>> 
>> Thanks.
>> 
> 
> 
> 
> -- 
> Professor Laurence Marks
> Department of Materials Science and Engineering
> Northwestern University
> www.numis.northwestern.edu 1-847-491-3996
> "Research is to see what everybody else has seen, and to think what
> nobody else has thought"
> Albert Szent-Gyorgi

Dr. Paul Fons
Functional Nano-phase-change Research Team
Team Leader
Nanodevice Innovation Research Center (NIRC)
National Institute for Advanced Industrial Science & Technology
METI

AIST Central 4, Higashi 1-1-1
Tsukuba, Ibaraki JAPAN 305-8568

tel. +81-298-61-5636
fax. +81-298-61-2939

email: paul-fons at aist.go.jp

The following is the Japanese form of the address:

1-1-1 Tsukuba Central East, Tsukuba, Ibaraki 305-8562
National Institute of Advanced Industrial Science and Technology (AIST)
Nanoelectronic Device Research Center
Phase-Change Novel Functional Device Research Team, Team Leader
Paul Fons



