[Wien] Problems with mpi for Wien12.1
Peter Blaha
pblaha at theochem.tuwien.ac.at
Fri Aug 24 08:05:55 CEST 2012
Hard to say.
What is in $WIENROOT/parallel_options?
MPI_REMOTE should be 0!
Otherwise, run lapw0_mpi by "hand":
mpirun -np 4 $WIENROOT/lapw0_mpi lapw0.def
(or include -machinefile .machine0)
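
A minimal sketch of the two checks above, assuming parallel_options
contains csh-style lines of the form "setenv MPI_REMOTE 0" (an assumption
about the file layout; adjust if yours differs). The helper name
mpi_remote_value is hypothetical, and the mpirun lines are left commented
out so that nothing is launched unless you run them yourself in the case
directory:

```shell
#!/bin/sh
# Hypothetical helper: print the MPI_REMOTE value found in a parallel_options file.
# Assumes csh-style lines such as:  setenv MPI_REMOTE 0
mpi_remote_value() {
    awk '$1 == "setenv" && $2 == "MPI_REMOTE" { v = $3 }
         END { if (v != "") print v }' "$1"
}

# Only inspect the real installation when WIENROOT is set and the file exists.
if [ -n "${WIENROOT:-}" ] && [ -f "$WIENROOT/parallel_options" ]; then
    echo "MPI_REMOTE = $(mpi_remote_value "$WIENROOT/parallel_options")"
fi

# Manual launch from the case directory (uncomment one of the two lines):
# mpirun -np 4 "$WIENROOT/lapw0_mpi" lapw0.def
# mpirun -np 4 -machinefile .machine0 "$WIENROOT/lapw0_mpi" lapw0.def
```

If the manual launch also ends in SIGSEGV, the problem is likely in the
lapw0_mpi binary or its linked libraries rather than in the WIEN2k
parallel scripts.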
On 24.08.2012 02:24, Paul Fons wrote:
> Greetings all,
> I have compiled Wien2K 12.1 under OpenSuse 11.4 (and OpenSuse 12.1)
> with the latest Intel compilers and get identical MPI launch problems
> on both. I am hoping for some suggestions as to where to look to fix
> things. Note that the serial and k-point parallel versions of the code
> run fine (I have optimized GaAs a lot in my troubleshooting!).
>
> Environment.
>
> I am using the latest Intel ifort, icc, and impi libraries for Linux.
>
> matstud@pyxis:~/Wien2K> ifort --version
> ifort (IFORT) 12.1.5 20120612
> Copyright (C) 1985-2012 Intel Corporation. All rights reserved.
>
> matstud@pyxis:~/Wien2K> mpirun --version
> Intel(R) MPI Library for Linux* OS, Version 4.0 Update 3 Build 20110824
> Copyright (C) 2003-2011, Intel Corporation. All rights reserved.
>
> matstud@pyxis:~/Wien2K> icc --version
> icc (ICC) 12.1.5 20120612
> Copyright (C) 1985-2012 Intel Corporation. All rights reserved.
>
>
> My OPTIONS files from /siteconfig_lapw
>
> current:FOPT:-FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -traceback
> current:FPOPT:-I$(MKLROOT)/include/intel64/lp64 -I$(MKLROOT)/include -FR
> -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -DFFTW3 -traceback
> current:LDFLAGS:$(FOPT) -L$(MKLROOT)/lib/$(MKL_TARGET_ARCH) -pthread
> current:DPARALLEL:'-DParallel'
> current:R_LIBS:-lmkl_lapack95_lp64 -lmkl_intel_lp64 -lmkl_intel_thread
> -lmkl_core -openmp -lpthread
> current:RP_LIBS:-L$(MKLROOT)/lib/intel64
> $(MKLROOT)/lib/intel64/libmkl_blas95_lp64.a
> $(MKLROOT)/lib/intel64/libmkl_lapack95_lp64.a -lmkl_scalapack_lp64
> -lmkl_cdft_core -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core
> -lmkl_blacs_intelmpi_lp64 -openmp -lpthread -lm -L/opt/local/fftw3/lib/
> -lfftw3_mpi -lfftw3 $(R_LIBS)
> current:MPIRUN:mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_
>
>
>
>
> The code compiles and links without error. It runs fine in serial mode
> and in k-point parallel mode, e.g.
>
> .machines with
>
> 1:localhost
> 1:localhost
> 1:localhost
> granularity:1
> extrafine:1
>
> This runs fine. When I attempt to run an MPI job with 12 processes
> (on a 12-core machine), I crash and burn (see below) with a SIGSEGV error
> and instructions to contact the developers.
>
> The linking options were derived from Intel's MKL link advisor (the
> version on the Intel site). I should add that the mpi-bench in fftw3
> works fine using the Intel MPI, as do commands like hostname or even
> abinit, so it would appear that the Intel MPI environment itself is
> fine. I have wasted a lot of time trying to figure out how to fix this
> before writing to the list, but at this point I feel like a monkey at a
> keyboard attempting to duplicate Shakespeare -- if you know what I mean.
> Thanks in advance for any heads up that you can offer.
>
>
>
> .machines
>
> lapw0:localhost:12
> 1:localhost:12
> granularity:1
> extrafine:1
>
>> stop error
>
> error: command /home/matstud/Wien2K/lapw0para -c lapw0.def failed
> 0.029u 0.046s 0:00.93 6.4% 0+0k 0+176io 0pf+0w
> Child id 2 SIGSEGV, contact developers
> Child id 8 SIGSEGV, contact developers
> Child id 7 SIGSEGV, contact developers
> Child id 11 SIGSEGV, contact developers
> Child id 10 SIGSEGV, contact developers
> Child id 9 SIGSEGV, contact developers
> Child id 6 SIGSEGV, contact developers
> Child id 5 SIGSEGV, contact developers
> Child id 4 SIGSEGV, contact developers
> Child id 3 SIGSEGV, contact developers
> Child id 1 SIGSEGV, contact developers
> Child id 0 SIGSEGV, contact developers
> -------- .machine0 : 12 processors
>> lapw0 -p (09:04:45) starting parallel lapw0 at Fri Aug 24 09:04:45 JST 2012
>
> cycle 1 (Fri Aug 24 09:04:45 JST 2012) (40/99 to go)
>
> start (Fri Aug 24 09:04:45 JST 2012) with lapw0 (40/99 to go)
>
>
> using WIEN2k_12.1 (Release 22/7/2012) in /home/matstud/Wien2K
> on pyxis with PID 15375
> Calculating GaAs in /usr/local/share/Wien2K/Fons/GaAs
--
Peter Blaha
Inst.Materials Chemistry
TU Vienna
Getreidemarkt 9
A-1060 Vienna
Austria
+43-1-5880115671