[Wien] Segmentation fault occurred
msoumeli at physics.auth.gr
Thu Dec 16 10:10:00 CET 2010
Dear Wien2k users,
I am using the latest version of WIEN2k (WIEN2k_10.1 Release 7/6/2010).
A segmentation fault occurs during the SCF cycle in lapw1 and lapw2. In case
my own structure was causing the problem, I also tried the example structure
files, but the same error occurs in both parallel and serial calculations.
The following lines are the error output from lapw1para:
starting parallel lapw1 at Thu Dec 16 10:54:24 EET 2010
-> starting parallel LAPW1 jobs at Thu Dec 16 10:54:24 EET 2010
running LAPW1 in parallel mode (using .machines)
1 number_of_parallel_jobs
[1] 20134
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 7 in communicator MPI_COMM_WORLD
with errorcode 91.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
w2k_dispatch_signal(): received: Segmentation fault
w2k_dispatch_signal(): received: Segmentation fault
w2k_dispatch_signal(): received: Segmentation fault
w2k_dispatch_signal(): received: Segmentation fault
w2k_dispatch_signal(): received: Segmentation fault
w2k_dispatch_signal(): received: Segmentation fault
w2k_dispatch_signal(): received: Segmentation fault
w2k_dispatch_signal(): received: Segmentation fault
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 20139 on
node s-hirem1.physics.auth.gr exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[s-hirem1.physics.auth.gr:20135] 7 more processes have sent help message help-mpi-api.txt / mpi-abort
[s-hirem1.physics.auth.gr:20135] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[1] + Done ( cd $PWD; $t $ttt; rm -f .lock_$lockfile[$p] ) >> .time1_$loop
localhost localhost localhost localhost localhost localhost localhost localhost(560) Child id 0 SIGSEGV, contact developers
Child id 1 SIGSEGV, contact developers
Child id 2 SIGSEGV, contact developers
Child id 3 SIGSEGV, contact developers
Child id 4 SIGSEGV, contact developers
Child id 5 SIGSEGV, contact developers
Child id 6 SIGSEGV, contact developers
Child id 7 SIGSEGV, contact developers
0.328u 0.268s 0:01.17 49.5% 0+0k 0+0io 73pf+0w
InN.scf1_1: No such file or directory.
Summary of lapw1para:
localhost k=0 user=0 wallclock=0
0.384u 0.424s 0:03.33 24.0% 0+0k 0+0io 73pf+0w
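The Open MPI message in the log suggests setting the MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages. A minimal sketch of one way to do that, assuming Open MPI's convention of reading MCA parameters from environment variables prefixed with OMPI_MCA_ (the rerun command in the comment is only an illustration and is not taken from the log):

```shell
# Ask Open MPI to print every help/error message instead of
# aggregating them ("7 more processes have sent help message ...").
# Assumption: MCA parameters can be set via OMPI_MCA_<name> variables.
export OMPI_MCA_orte_base_help_aggregate=0

# Confirm what the MPI processes will inherit from this shell.
echo "orte_base_help_aggregate=${OMPI_MCA_orte_base_help_aggregate}"

# Then repeat the failing step from the same shell, for example
# (illustrative only): x lapw1 -p
```

This only makes the MPI diagnostics more verbose; it does not address the segmentation fault itself.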
Does anyone know how to fix this? Could the problem come from the
installation of the latest version?
Thanks in advance,
M.Soumelidou