[Wien] Segmentation fault in Supercell Calculation
Lan, Wangwei
wl13c at my.fsu.edu
Tue Jul 28 22:41:14 CEST 2015
Dear professor:
I use Open MPI, version 1.4.5.
I added "-C -g" because some people on the mailing list said it would probably solve the problem.
Thanks for your advice, I will recompile the package soon.
Sincerely
Wangwei
________________________________
From: wien-bounces at zeus.theochem.tuwien.ac.at <wien-bounces at zeus.theochem.tuwien.ac.at> on behalf of Laurence Marks <L-marks at northwestern.edu>
Sent: Tuesday, July 28, 2015 15:36
To: A Mailing list for WIEN2k users
Subject: Re: [Wien] Segmentation fault in Supercell Calculation
N.B., unless you are a code developer, "-C -g" are a terrible idea. Remove them; they may easily cause the code to crash. Replace them with just "-O1".
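A minimal sketch of the flag change suggested above, assuming the options are held in a plain shell string for illustration (in WIEN2k they are normally edited through siteconfig). The function name `strip_debug_flags` is hypothetical. It drops the standalone tokens `-C` and `-g` and appends `-O1`:

```shell
#!/bin/sh
# Hypothetical helper: remove the standalone tokens -C and -g from a
# compiler option string and append -O1. Other options keep their order.
strip_debug_flags() {
    out=""
    for tok in $1; do
        case "$tok" in
            -C|-g) ;;                 # drop ifort run-time checks and debug info
            *) out="$out $tok" ;;
        esac
    done
    echo "${out# } -O1"
}

# Example with the option string quoted in this thread.
strip_debug_flags "-FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -traceback -assume buffered_io -C -g"
```

The loop compares whole tokens, so options such as `-DINTEL_VML` that merely contain the letters are left alone.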
On Tue, Jul 28, 2015 at 3:28 PM, Lan, Wangwei <wl13c at my.fsu.edu> wrote:
Dear Professor:
When I type "mpif90 --version", it gives me "ifort (IFORT) 12.1.3 20120212", so I thought it should work.
My library linking settings are listed below:
Parallel execution:
FFTW_LIB + FFTW_OPT : -lfftw3_mpi -lfftw3 -L/opt/fftw3.3.3/lib + -DFFTW3 -I/opt/fftw3.3.3/include (already set)
RP RP_LIB(SCALAPACK+PBLAS): -lmkl_scalapack_lp64 -lmkl_blacs_lp64 $(R_LIBS)
FP FPOPT(par.comp.options): -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -traceback -assume buffered_io
Compiler Option
O Compiler options: -FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -traceback -assume buffered_io -C -g
F FFTW options: -DFFTW3 -I/opt/fftw3.3.3/include
L Linker Flags: $(FOPT) -L$(MKLROOT)/lib/$(MKL_TARGET_ARCH) -pthread
P Preprocessor flags '-DParallel'
R R_LIB (LAPACK+BLAS): -lmkl_lapack95_lp64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lmkl_solver_lp64
FL FFTW_LIBS: -lfftw3_mpi -lfftw3 -L/opt/fftw3.3.3/lib
Sincerely
Wangwei
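One thing that can be cross-checked in a link line like the one above is whether the MKL BLACS library matches the MPI flavor. The library names in the sketch below are an assumption based on MKL releases of this period (separate BLACS variants for Open MPI, Intel MPI and MPICH-type MPIs) and should be confirmed with the MKL link line advisor. The function name `blacs_matches_mpi` is hypothetical, the check is a plain string comparison, and it is not a confirmed diagnosis of the fault in this thread:

```shell
#!/bin/sh
# Hypothetical helper: succeed when the BLACS library named in a link line
# is the one expected for the given MPI flavor (openmpi, intelmpi, mpich).
# The library names are assumptions to verify with the MKL link line advisor.
blacs_matches_mpi() {
    flavor=$1
    linkline=$2
    case "$flavor" in
        openmpi)  want="-lmkl_blacs_openmpi_lp64" ;;
        intelmpi) want="-lmkl_blacs_intelmpi_lp64" ;;
        mpich)    want="-lmkl_blacs_lp64" ;;
        *) return 2 ;;                # unknown flavor
    esac
    for tok in $linkline; do
        [ "$tok" = "$want" ] && return 0
    done
    return 1
}

# Example with the RP_LIB line quoted in this thread and Open MPI.
if blacs_matches_mpi openmpi '-lmkl_scalapack_lp64 -lmkl_blacs_lp64 $(R_LIBS)'; then
    echo "BLACS library matches Open MPI"
else
    echo "BLACS library does not look like the Open MPI variant"
fi
```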
________________________________
From: wien-bounces at zeus.theochem.tuwien.ac.at <wien-bounces at zeus.theochem.tuwien.ac.at> on behalf of Laurence Marks <L-marks at northwestern.edu>
Sent: Tuesday, July 28, 2015 14:59
To: A Mailing list for WIEN2k users
Subject: Re: [Wien] Segmentation fault in Supercell Calculation
Your options are probably wrong:
a) mpif90 is normally gfortran, the Intel version is mpiifort
b) It is easy to use the wrong linking with the Intel mkl libraries. Please provide the information I requested.
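To see which compiler an MPI wrapper actually drives, `mpif90 --version` prints the underlying compiler's banner; Open MPI wrappers also accept `--showme`, and MPICH-style wrappers `-show`, to print the full command line. The sketch below classifies a banner string. `classify_fortran` is a hypothetical helper, and the patterns are assumptions based on typical ifort and gfortran banners:

```shell
#!/bin/sh
# Hypothetical helper: guess the compiler family from a version banner.
classify_fortran() {
    case "$1" in
        *ifort*|*IFORT*)  echo intel ;;
        *"GNU Fortran"*)  echo gnu ;;
        *)                echo unknown ;;
    esac
}

# Only query the wrapper if it is installed on this machine.
if command -v mpif90 >/dev/null 2>&1; then
    banner=$(mpif90 --version 2>&1 | head -n 1)
    echo "mpif90 drives: $(classify_fortran "$banner")"
fi
```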
On Tue, Jul 28, 2015 at 2:55 PM, Lan, Wangwei <wl13c at my.fsu.edu> wrote:
Dear Professor:
Yes, "x lapw0" works without mpi.
My MPI compiler: mpif90
I use Open MPI, version 1.4.5.
The parallel compilation options are:
-FR -mp1 -w -prec_div -pc80 -pad -ip -DINTEL_VML -traceback -assume buffered_io
I use the Intel MKL libraries; that part should be fine.
Thanks very much for your help.
Sincerely
Wangwei Lan
________________________________
From: wien-bounces at zeus.theochem.tuwien.ac.at <wien-bounces at zeus.theochem.tuwien.ac.at> on behalf of Laurence Marks <L-marks at northwestern.edu>
Sent: Tuesday, July 28, 2015 14:30
To: A Mailing list for WIEN2k users
Subject: Re: [Wien] Segmentation fault in Supercell Calculation
Does a simple "x lapw0" work, i.e. without mpi, for this specific case?
If it does then there is probably an error in how you have linked/compiled the mpi versions. Please provide:
a) The mpi compiler you used.
b) Which type of mpi you are using (openmpi, mvapich, intel mpi etc)
c) The parallel compilation options.
N.B., a useful resource is https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor
N.N.B., ulimit -s is not needed, this is (now) done in the software.
On Tue, Jul 28, 2015 at 2:22 PM, Lan, Wangwei <wl13c at my.fsu.edu> wrote:
Dear Professor Marks:
I've checked everything you mentioned and it all looks fine, but it still doesn't work. I think the input files are OK since I have no problem running in non-parallel mode.
I tried making the supercell smaller (2x1x1), and then it works. However, I don't know why.
By the way, I have "ulimit -s unlimited" in my .bashrc file. I've also adjusted RKMAX and RMT before.
Sincerely
Wangwei Lan
________________________________
From: wien-bounces at zeus.theochem.tuwien.ac.at <wien-bounces at zeus.theochem.tuwien.ac.at> on behalf of Laurence Marks <L-marks at northwestern.edu>
Sent: Tuesday, July 28, 2015 13:09
To: A Mailing list for WIEN2k users
Subject: Re: [Wien] Segmentation fault in Supercell Calculation
You have what is called a "Segmentation Violation", which was detected by 4 of the nodes; they called an error handler, which stopped the MPI job on all the CPUs.
This is normally because you have an error of some sort in your input files, i.e. any of case.in0, case.clmsum (and case.clmup/dn if you are running spin-polarized).
1) Check that you do not have overlapping spheres and/or other mistakes.
2) Check your error files, e.g. "cat *.error". Are any others (e.g. dstart.error) not empty? Did you ignore an error during setup?
3) Check the lapw0 output in case.output0*; it may show what is wrong.
There are many possible sources; you have to find the specific one.
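Check 2) above can be sketched as a small shell function that lists only the non-empty error files, which is easier to scan than the output of "cat *.error". The name `nonempty_error_files` is hypothetical; the only assumption is that the error files sit in one directory and end in `.error`:

```shell
#!/bin/sh
# Hypothetical helper: print the names of non-empty *.error files in a
# directory (default: current directory), one per line.
nonempty_error_files() {
    dir=${1:-.}
    for f in "$dir"/*.error; do
        # -s is true only for an existing file larger than zero bytes,
        # so an unmatched literal pattern is skipped as well.
        [ -s "$f" ] && basename "$f"
    done
    return 0
}

# Run in the case directory; prints nothing when all error files are empty.
nonempty_error_files .
```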
On Tue, Jul 28, 2015 at 12:57 PM, Lan, Wangwei <wl13c at my.fsu.edu> wrote:
Dear WIEN2k user:
I am using wien2k_14.2 on CentOS release 5.8, with ifort version 12.1.3 and MKL.
After generating a 2x2x1 supercell with 30 atoms, I tried to run the SCF calculation but got some errors, which I've attached at the end of this email. My WIEN2k was installed correctly and works well for other calculations. The supercell also works if I run a non-parallel calculation. I've searched the mailing list but can't find a solution. Could you give me a hint on how to solve the problem? Thank you very much.
Sincerely
Wangwei Lan
lapw0.error shows:
'Unknown' - SIGSEGV
super.dayfile shows:
Child id 0 SIGSEGV
Child id 8 SIGSEGV
Child id 18 SIGSEGV
Child id 23 SIGSEGV
Child id 17 SIGSEGV
The screen shows:
w2k_dispatch_signal(): received: Segmentation fault
w2k_dispatch_signal(): received: Segmentation fault
w2k_dispatch_signal(): received: Segmentation fault
w2k_dispatch_signal(): received: Segmentation fault
w2k_dispatch_signal(): received: Segmentation fault
w2k_dispatch_signal(): received: Segmentation fault
w2k_dispatch_signal(): received: Segmentation fault
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 18 in communicator MPI_COMM_WORLD
with errorcode 451782144.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun has exited due to process rank 18 with PID 26388 on
node corfu.magnet.fsu.edu exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[corfu.magnet.fsu.edu:26369] 23 more processes have sent help message help-mpi-api.txt / mpi-abort
[corfu.magnet.fsu.edu:26369] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
> stop error
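The last Open MPI message above refers to an MCA parameter. Open MPI reads MCA parameters from environment variables prefixed with `OMPI_MCA_`, so one way to see every help/error message on the next run, assuming the variable is exported in the shell that launches the job, is:

```shell
#!/bin/sh
# Disable aggregation of Open MPI help/error messages for processes
# started from this shell (parameter name taken from the message above).
export OMPI_MCA_orte_base_help_aggregate=0

# Show the setting that child processes will inherit.
echo "orte_base_help_aggregate=$OMPI_MCA_orte_base_help_aggregate"
```

The same parameter can also be passed on the command line as `mpirun --mca orte_base_help_aggregate 0`, if the mpirun call is edited directly.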
--
Professor Laurence Marks
Department of Materials Science and Engineering
Northwestern University
www.numis.northwestern.edu
Corrosion in 4D: MURI4D.numis.northwestern.edu
Co-Editor, Acta Cryst A
"Research is to see what everybody else has seen, and to think what nobody else has thought"
Albert Szent-Gyorgi