[Wien] Segmentation fault in lapw2c

Laurence Marks L-marks at northwestern.edu
Sat Nov 1 15:04:21 CET 2008


Thankfully you compiled with -traceback, because that (probably) helps
locate what is going wrong. If you look at the relevant piece of code
(in fermi.F) it reads:

!     **    NB          NUMBER OF BANDS                               *
!     **    NKP         NUMBER OF IRREDUCIBLE K-POINTS                *
(Some lines omitted)
      DIMENSION EB(NB,NKP),E(4),IKP(4)
      INTEGER W(NWX)
      CHARACTER*67 ERRMSG
      DATA NP/1000/
      CALL DEF0(NWX)
!     -----------------------------------------------------------------
!     --  FIND EMIN EMAX     (ENERGYBANDS ARE ASSUMED                 -
!     --                      TO BE ORDERED WITH RESPECT TO SYMMETRY  -
!     -----------------------------------------------------------------
      EMIN=EB(1,1)                    << This is line 812
      EMAX=EB(NB,1)

where I've added the "<< This is line 812" and condensed it slightly.
If, somehow, the number of bands (NB) or k-points (NKP) has been
corrupted, for instance is negative or zero, then the size of EB
declared by "DIMENSION EB(NB,NKP)" is wrong and the access to EB at
line 812 falls outside the array. Almost certainly this has happened
because something went wrong earlier in either lapw1c or lapwso, which
ran to completion but did not produce sensible output. You should look
at the output files they produced and check whether they are sensible;
probably they are not. Peter may have some specific ideas.
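
To illustrate the point about the array size, here is a minimal
stand-alone sketch. It is not WIEN2k code, and the program and
subroutine names are made up. It only shows that an array declared in
the same style as EB(NB,NKP) has zero elements when NB is zero or
negative, so a reference such as EB(1,1) is then outside the array:

```fortran
! Stand-alone illustration only; names here are hypothetical, not from WIEN2k.
program zero_extent_demo
  implicit none
  double precision :: storage(2, 4)

  storage = 0.0d0
  call report(storage, 2, 4)    ! sane values: 2 bands, 4 k-points
  call report(storage, 0, 4)    ! NB corrupted to zero
  call report(storage, -3, 4)   ! NB corrupted to a negative value

contains

  subroutine report(eb, nb, nkp)
    integer, intent(in) :: nb, nkp
    ! Same style of declaration as DIMENSION EB(NB,NKP) in the excerpt.
    double precision, intent(in) :: eb(nb, nkp)
    ! Only the size is queried; no element of eb is ever referenced.
    write(*,*) 'NB =', nb, ' NKP =', nkp, ' size(EB) =', size(eb)
  end subroutine report

end program zero_extent_demo
```

The first call should report a size of 8 and the other two a size of 0.
Without bounds checking, reading EB(1,1) in the last two cases is
undefined and may or may not produce a segmentation fault.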

N.B. While the problem is almost certainly not in lapw2 but in
something earlier, there are several things you could do to help sort
out what the problem is. One is to add -C to the compilation options
(for testing purposes) for lapw2. This is noticeably slower but will
give more information (it might also flag some non-bugs, so be careful
if your Fortran programming skills are weak). An alternative would be
to add a debug line, for instance

        write(*,*)'Checking Dimensions ',NB,NKP

before line 812 in fermi.F
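
If you want something slightly more defensive than a bare write, a
guard along the following lines could be used. This is only a sketch
under assumptions: dims_ok is a made-up helper, not part of WIEN2k, and
the values of NB and NKP are hard-coded here to mimic a corrupted
input. In the real code the test would sit in fermi.F before line 812.

```fortran
! Hypothetical stand-alone sketch; dims_ok is not a WIEN2k routine.
program check_dims
  implicit none
  integer :: nb, nkp

  ! Deliberately bad value for NB, mimicking output corrupted by an earlier step.
  nb = 0
  nkp = 16

  if (dims_ok(nb, nkp)) then
     write(*,*) 'Checking Dimensions: OK ', nb, nkp
  else
     ! In the real routine one would stop here instead of touching EB.
     write(*,*) 'Checking Dimensions: BAD, not touching EB ', nb, nkp
  end if

contains

  ! Returns .true. only if both extents allow at least one element of EB.
  logical function dims_ok(nb, nkp)
    integer, intent(in) :: nb, nkp
    dims_ok = (nb >= 1) .and. (nkp >= 1)
  end function dims_ok

end program check_dims
```

With the values above the program takes the BAD branch. The point of
the guard is that a bad NB or NKP is reported cleanly rather than
leading to a segmentation fault further on.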

On Sat, Nov 1, 2008 at 6:39 AM, ROBERTO LUIS IGLESIAS PASTRANA
<roberto at uniovi.es> wrote:
> Hello all!
>
> I was trying to run a runsp_lapw job for a spin-polarized 16 atom Cr supercell in our local cluster. This is a 50 double processor node Xeon system. I'm using ifort and mkl 64-bit 10.1 versions. I tried to use k-point parallelization. I flipped half the spins in case.inst before going through a complete initialization procedure, since I try to resemble antiferromagnetic alignment. Previous tests with the same supercell size in ferromagnetic Fe went OK and a complete SCF cycle finished without errors. We're using the latest WIEN2k_08.3 version.
>
> I found a crash in lapw2 -c -up with a SIGSEGV, segmentation fault error. The error file reads as follows:
>
>  LAPW0 END
>  LAPW1 END
>  LAPW1 END
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> Image              PC                Routine            Line        Source
> lapw2c             0000000000430441  efermi_                   812  fermi_tmp_.F
> lapw2c             0000000000430199  dos_                      752  fermi_tmp_.F
> lapw2c             000000000042FE3A  fermi_tetra_              556  fermi_tmp_.F
> lapw2c             000000000042D7FA  fermi_                    110  fermi_tmp_.F
> lapw2c             0000000000457EE7  MAIN__                    258  lapw2_tmp_.F
> lapw2c             000000000040FAAA  Unknown               Unknown  Unknown
> libc.so.6          000000336481C3FB  Unknown               Unknown  Unknown
> lapw2c             000000000040F9EA  Unknown               Unknown  Unknown
>
> I thought something could be wrong in my input files. I ported everything to my PC and I found the same error output to the screen, except for the line showing the "dos" routine. Of course, I tried to change from TETRA to TEMP 0.003, for instance, in case.in2c but it did not help.
> The funny thing is that I once had a similar error in running a spin-orbit plus orbital polarization correction calculation and after countless efforts from P. Blaha and L. Marks there was no conclusive workaround. I am very sorry to say I don't remember when or how I solved this problem, if I did at all. Most probably I skipped it and turned my attention to a different issue. If desired, it can be checked at:
>
> http://zeus.theochem.tuwien.ac.at/pipermail/wien/2006-October/008036.html
>
> I tried to run the sequence
>
> x lapw0
> x lapw1 -c -up
> x lapw1 -c -dn
> x lapw2 -c -up
> .....
>
> I tested
>
> lapw2c uplapw2.def
>
> as well, and in both cases I got the same error.
>
> Soon afterwards, I did a complete clean initialization in my PC and left it running. There was again a crash in lapw2c:
>
> $ runsp_lapw -it -I -i 200 -ec 0.00001 -cc 0.0001
> hup: Command not found.
> Invalid null command.
>  LAPW0 END
>  LAPW1 END
>  LAPW1 END
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> Image              PC        Routine            Line        Source
> lapw2c             082FB8D0  Unknown               Unknown  Unknown
> lapw2c             080B28B6  read_vec_                  88  read_vec_tmp_.F
> lapw2c             0809086A  l2main_                   507  l2main_tmp_.F
> lapw2c             080A4A37  MAIN__                    543  lapw2_tmp_.F
> lapw2c             0804D1F1  Unknown               Unknown  Unknown
> libc.so.6          4008A450  Unknown               Unknown  Unknown
> lapw2c             0804D151  Unknown               Unknown  Unknown
>
>>   stop error
>
> The routines have now changed: no fermi-routine related error appears any more, but something is still going wrong. I found the same problem again with the x lapw* and lapw2c uplapw2.def tests.
>
> Could it be that this is really a memory limit or system size issue?
>
> I would be very glad to welcome all possible suggestions. Please let me know if you need any extra info.
>
> Greetings
>
> Roberto
>
> Roberto Iglesias
> Departamento de Física
> Universidad de Oviedo
> Calvo Sotelo, s/n
> 33013 Oviedo SPAIN
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>



-- 
Laurence Marks
Department of Materials Science and Engineering
MSE Rm 2036 Cook Hall
2220 N Campus Drive
Northwestern University
Evanston, IL 60208, USA
Tel: (847) 491-3996 Fax: (847) 491-7820
email: L-marks at northwestern dot edu
Web: www.numis.northwestern.edu
Chair, Commission on Electron Crystallography of IUCR
www.numis.northwestern.edu/
Electron crystallography is the branch of science that uses electron
scattering to study the structure of matter.

