[Wien] segmentation fault in lapwso

Pavel Ondračka pavel.ondracka at email.cz
Wed Aug 18 09:31:34 CEST 2021


Dear Luis,

one very easy thing to try is to set the environment variable
OMP_STACKSIZE to something large like "1g", i.e., "export
OMP_STACKSIZE=1g" before run_lapw. A small OpenMP stack size caused issues
for us previously, so that could be the case here as well. The only explicit
OpenMP loop in hsocalc.F allocates all private variables on the stack,
and a few of them are arrays, so a too-small stack is a plausible cause.
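
A minimal sketch of that suggestion, assuming a bash-like shell. The
"ulimit -s" line is an extra assumption that is not part of the advice
above: OMP_STACKSIZE only affects the threads created by the OpenMP
runtime, while the initial thread uses the shell's stack limit. The
run_lapw call is left as a comment so the snippet runs on its own.

```shell
# Per-thread stack size for OpenMP worker threads (value suggested above).
export OMP_STACKSIZE=1g

# Assumption: also try to raise the stack limit of the main thread.
# The system may refuse this, so a failure is reported but not fatal.
ulimit -s unlimited 2>/dev/null || echo "could not raise the shell stack limit"

# Confirm what the job will inherit.
echo "OMP_STACKSIZE=$OMP_STACKSIZE"

# Then start the calculation as before, for example:
# run_lapw -NI -so ...
```

If the crash disappears with the larger value, the stack size was most
likely the limiting factor; if it persists, the cause is probably elsewhere.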

To Prof. Blaha:
from a very brief visual inspection of the OpenMP code in lapwso, I
believe there could be another small issue with combined MPI and OpenMP. At
lines hsocalc.F:159 and hsocalc.F:160 the variables ibf_local and
ibi_local should probably be private. This should not be the cause of
the problems reported here, though, as it would only affect
lapwso_mpi. The rest seems OK (at first glance).

Best regards
Pavel

On Tue, 2021-08-17 at 18:18 -0300, Luis Ogando wrote:
> Dear Wien2k Community,
>    Greetings!
>    This message is only to inform you that I also had a segmentation
> fault problem with lapwso and Wien2k-21.
>    It was a very strange case. After a converged SCF cycle with mBJ
> and SO, I could not run "run_lapw -NI -so ...". In this case, I
> always got the following error after lapwso:
> 
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> Image              PC                Routine            Line        Source
> lapwso             000000000046A0EA  Unknown               Unknown  Unknown
> libpthread-2.28.s  00001530B217B730  Unknown               Unknown  Unknown
> libiomp5.so        00001530B1D132FB  Unknown               Unknown  Unknown
> libiomp5.so        00001530B1D13049  Unknown               Unknown  Unknown
> libiomp5.so        00001530B1D14B59  Unknown               Unknown  Unknown
> libiomp5.so        00001530B1D161E8  Unknown               Unknown  Unknown
> libiomp5.so        00001530B1D0C926  Unknown               Unknown  Unknown
> lapwso             000000000049CA86  Unknown               Unknown  Unknown
> lapwso             000000000040D77F  hmsout_mp_finit_h         119  modules.F
> lapwso             000000000042B94E  MAIN__                    622  lapwso.F
> lapwso             0000000000404D22  Unknown               Unknown  Unknown
> libc-2.28.so       00001530A3E3609B  __libc_start_main     Unknown  Unknown
> lapwso             0000000000404C2A  Unknown               Unknown  Unknown
> 0.167u 0.051s 0:00.10 210.0% 0+0k 0+1976io 0pf+0w
> error: command   /home/ogando/Wien/Wien21/lapwso lapwso.def   failed
> 
>    The solution was to change OMP_NUM_THREADS from 4 to 1.
>    I checked and it also worked with OMP_NUM_THREADS equal to 2 but
> not 3.
>    If someone is interested in the compilation options or any other
> information, please ask.
>    All the best,
>                   Luis
> 
>    
> 
> On Thu, Jun 10, 2021 at 08:17, Fecher, Gerhard
> <fecher at uni-mainz.de> wrote:
> > Dear all,
> > while running a -so calculation I hit a segmentation fault in lapwso
> > (see below) with the latest version Wien2k21.1 that does NOT appear
> > in 19.2.
> > (appeared for two different systems in fresh directories)
> > 
> > Has anyone experienced the same, or did I miss a report and am
> > perhaps not up to date?
> > 
> > I used all settings the same (mostly default values), and the same
> > compilers and options (Intel OneAPI 2021 2.0 and Parallel Studio XE
> > 2017.4.056) for both versions, 21.1 and 19.2
> > 
> > forrtl: severe (174): SIGSEGV, segmentation fault occurred
> > Image              PC                Routine            Line        Source
> > lapwso             000000000046CE0A  Unknown               Unknown  Unknown
> > libpthread-2.22.s  00002AFBCC6DAB10  Unknown               Unknown  Unknown
> > libiomp5.so        00002AFBCCF2C8E8  Unknown               Unknown  Unknown
> > lapwso             000000000049F7A6  Unknown               Unknown  Unknown
> > lapwso             0000000000421E9E  hmsec_                    926  hmsec.F
> > 
> > Line 926 is:       deallocate(meigve)
> > (if this is indeed the correct line at all).
> > 
> > indeed in 21.2 (I have seen that hmsec.F is different in 19.2)
> > 
> > Thanks for any suggestions that help
> > 
> > Gerhard
> > 
> > DEEP THOUGHT in D. Adams; Hitchhikers Guide to the Galaxy:
> > "I think the problem, to be quite honest with you,
> > is that you have never actually known what the question is."
> > 
> > ====================================
> > Dr. Gerhard H. Fecher
> > Institut of Physics
> > Johannes Gutenberg - University
> > 55099 Mainz
> > _______________________________________________
> > Wien mailing list
> > Wien at zeus.theochem.tuwien.ac.at
> > http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
> > SEARCH the MAILING-LIST at: 
> > http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
