[Wien] stubborn segmentation fault

Stefaan Cottenier Stefaan.Cottenier at UGent.be
Thu Oct 25 13:46:31 CEST 2012


Dear wien2k community,

I have not managed to get wien2k running flawlessly on our university 
cluster (Intel Xeon Harpertown (L5420)). For some cases, a reproducible 
segmentation fault appears in lapw2. Our very capable sysadmins 
gave up and blame it on 'a wien2k coding problem'. That is why I want to 
describe the problem to you:

A) Description of the problem:

* It is a "forrtl: severe (174): SIGSEGV, segmentation fault occurred" 
error, which appears in lapw2 with FOR in case.in2 (never with TOT). The 
full screen output (compiled with ifort, including -g -traceback) for 
k-point parallelization over 2 cores is:

LAPW2 - FERMI; weighs written
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source
lapw2              0000000000484D28  l2main_                   893  l2main_tmp_.F
lapw2              00000000004A1C2D  MAIN__                    564  lapw2_tmp_.F
lapw2              0000000000403C4C  Unknown               Unknown  Unknown
libc.so.6          000000300081D994  Unknown               Unknown  Unknown
lapw2              0000000000403B59  Unknown               Unknown  Unknown
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source
lapw2              0000000000484D28  l2main_                   893  l2main_tmp_.F
lapw2              00000000004A1C2D  MAIN__                    564  lapw2_tmp_.F
lapw2              0000000000403C4C  Unknown               Unknown  Unknown
libc.so.6          000000300081D994  Unknown               Unknown  Unknown
lapw2              0000000000403B59  Unknown               Unknown  Unknown

* It appears only for a limited number of cases (say 20% of all the ones 
I tried). The others run just fine.

* The problem appears only in parallel runs. If a case shows the 
problem, one additional serial iteration is sufficient to complete the 
scf-cycle.

* If the problem appears, it can be reproduced only with 'run_lapw -p'. If 
one tries a manual 'parallel' execution as shown below (which I thought 
should execute exactly the same processes), the error does not show up:

lapw0 lapw0.def
lapw1 lapw1.def [1]
lapw2 lapw2.def [1]
lapw1 lapw1.def [2]
lapw2 lapw2.def [2]
...
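
Since the manual sequence and 'run_lapw -p' are expected to run the same 
binaries, one way to narrow down the difference is to compare the resource 
limits and environment that each launch path sees. The following is only a 
minimal sketch, assuming a POSIX shell; the function name snapshot_env and 
the output file names are hypothetical and not part of wien2k. If the 
parallel path starts lapw2 through a different shell or a remote login, 
settings such as the stack limit may differ from the interactive session.

```shell
#!/bin/sh
# Record the resource limits and environment seen by the current shell,
# so that two launch paths can be compared with diff.
# The function name and the output file names are hypothetical.
snapshot_env() {
    label=$1
    {
        echo "## limits"
        ulimit -a
        echo "## environment"
        env | sort
    } > "env_snapshot_${label}.txt"
}

# Usage: call once in the interactive shell used for the manual run and
# once from the job or wrapper that starts the parallel run, then compare:
#   snapshot_env manual
#   snapshot_env parallel
#   diff env_snapshot_manual.txt env_snapshot_parallel.txt
```

An empty diff would suggest that the environment is not the cause; any 
differing line (for example the stack size) is a candidate to test.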


B) Detailed analysis

My first guess was the compiler version. Three different 
ifort versions were tested (including the celebrated 2011.3.174, which was 
reported on the wien2k mailing list to work fine for v12.1), but all 
result in the same error:

v2011.1.073
v2011.3.174
v2011.10.319

Next, I searched for the possible cause by going through all steps 
described at the following link (a very useful piece of information for 
this mailing list; I suggest mentioning it in the FAQ):

http://software.intel.com/en-us/articles/determining-root-cause-of-sigsegv-or-sigbus-errors/

None of the steps described there brings any improvement, up to and 
including the first half of "possible cause #5". The second test described 
in #5 yields something, however. When compiling with the additional options

-fp-stack-check -g -traceback -gen-interfaces -warn interfaces

the compilation of lapw2 fails with the following error:

c3fft_tmp_.F(267): error #6633: The type of the actual argument differs from the type of the dummy argument.   [WSAVE]
       CALL CFFTB1 (N,C,WSAVE,WSAVE(IW1),WSAVE(IW2))
----------------------------------------^
compilation aborted for c3fft_tmp_.F (code 1)
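
To check whether c3fft is the only routine flagged by the interface check, 
the saved compiler output can be filtered for error #6633. This is a minimal 
sketch, assuming the build messages were redirected to a file; the function 
name list_type_mismatches and the log file name build.log are hypothetical.

```shell
#!/bin/sh
# List every source location that triggered ifort error #6633 in a saved
# build log, as "file line" pairs with duplicates removed.
# The function name and the log file name are hypothetical.
list_type_mismatches() {
    # Matches lines such as:
    #   c3fft_tmp_.F(267): error #6633: The type of the actual argument ...
    grep 'error #6633' "$1" \
        | sed 's/^\([^(]*\)(\([0-9]*\)).*/\1 \2/' \
        | sort -u
}

# Usage:
#   list_type_mismatches build.log
```
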

When searching the wien2k mailing list for c3fft, it turns out that there 
had been problems with this routine before, and that an updated version was 
provided one year ago (i.e. before v12.1):

http://zeus.theochem.tuwien.ac.at/pipermail/wien/2011-April/014541.html

It seems to have been a different problem, however, and both the present 
version and that (slightly different) version of April 2011 give the 
same compilation error.

Can anyone use this information to find a solution?

Thanks !

Stefaan


