[Wien] New findings on the lapw0 seg fault core dump error
Michael Fechtelkord
Michael.Fechtelkord at ruhr-uni-bochum.de
Fri Jun 6 22:25:44 CEST 2025
And an additional comment.
lapw0 crashes only in the first cycle, and only with OMP_NUM_THREADS higher
than 1. When I set omp_lapw0:1 for the first cycle (using -i 1 in run_lapw)
and then set it back to omp_lapw0:8 after that first run, it runs without a
problem for the complete scf cycle. It seems that the problem lies with the
initial case.clmsum file (init_lapw -b -prec 1).
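A minimal sketch of how the switch between the two settings could be scripted, assuming the lapw0 thread count is controlled by an omp_lapw0:N line in .machines as in the quoted message below. The helper name set_omp_lapw0 is hypothetical and not part of WIEN2k, and the run_lapw calls appear only as comments.

```python
import re


def set_omp_lapw0(machines_text: str, threads: int) -> str:
    """Return .machines content with the omp_lapw0 thread count set to `threads`.

    Hypothetical helper, not part of WIEN2k. If no omp_lapw0 line is present,
    one is appended at the end.
    """
    if threads < 1:
        raise ValueError("threads must be >= 1")
    new_line = f"omp_lapw0:{threads}"
    # Match a whole "omp_lapw0:N" line; [ \t]* avoids swallowing newlines.
    pattern = re.compile(r"^omp_lapw0:[ \t]*\d+[ \t]*$", flags=re.MULTILINE)
    if pattern.search(machines_text):
        return pattern.sub(new_line, machines_text)
    sep = "" if machines_text.endswith("\n") or not machines_text else "\n"
    return machines_text + sep + new_line + "\n"


# Settings taken from this thread: omp_global:2 and omp_lapw0:8.
machines = "omp_global:2\nomp_lapw0:8\n"

# First cycle: single-threaded lapw0.
first_cycle = set_omp_lapw0(machines, 1)
# (write first_cycle to .machines, then run: run_lapw -i 1)

# Remaining cycles: back to 8 threads.
later_cycles = set_omp_lapw0(first_cycle, 8)
# (write later_cycles to .machines, then continue with run_lapw)
```

This only automates the manual edit described above; it does not address why the first cycle crashes with more than one thread.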
On 06.06.2025 at 22:07, Michael Fechtelkord via Wien wrote:
> Hello Peter,
>
>
> omp_lapw0 in .machines was 8. I reduced it from 8 to 4, then to 2 and
> finally to 1. lapw0 avoids the crash only with omp_lapw0:1.
>
> omp_global:2
>
>
> Best regards,
>
> Michael
>
>
> On 06.06.2025 at 17:59, Peter Blaha wrote:
>> What was your OMP_NUM_THREADS variable?
>>
>> Set it to 1, 2, ... and check if the error occurs again.
>>
>> On 06.06.2025 at 14:07, Michael Fechtelkord via Wien wrote:
>>> I examined the core dump file with gdb, after compiling lapw0 with
>>> debugging symbols.
>>>
>>> The debugger gave me the line that causes the core dump:
>>>
>>> ----------------------------------------
>>>
>>> Debuginfod has been enabled.
>>> To make this setting permanent, add 'set debuginfod enabled on' to
>>> .gdbinit.
>>> [Thread debugging using libthread_db enabled]
>>> Using host libthread_db library "/lib64/libthread_db.so.1".
>>> Core was generated by `/usr/local/WIEN2k/lapw0 lapw0.def'.
>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>>
>>> #0 0x000000000048b89b in
>>> MAIN__.DIR.OMP.PARALLEL.LOOP.12.split63842.split63939 () at
>>> lapw0.F:1649
>>>
>>> 1649 !$omp parallel do reduction(+:rhopw00,cwk,cvout) &
>>>
>>>
>>> [Current thread is 1 (Thread 0x14823edbe740 (LWP 339344))]
>>>
>>> ------------------------------------
>>>
>>> Maybe somebody has an idea how to fix it.
>>>
>>>
>>> Best regards
>>>
>>> Michael
>>>
>>>
>>> On 17.05.2025 at 13:48, Michael Fechtelkord via Wien wrote:
>>>> Hello everybody,
>>>>
>>>>
>>>> I have new results concerning the lapw0 crash, which happens only
>>>> some of the time (segmentation fault error - core dump).
>>>>
>>>> It seems that the crucial thing is the case.clmsum file (I am no
>>>> expert here), so this may somehow be the key. It can provoke the
>>>> lapw0 crash, so it might be that it only sometimes triggers it.
>>>>
>>>> I calculated MgF2 and replaced the newly generated clmsum with an
>>>> older one, and then there was no crash. I cannot attach the files
>>>> because they are too large.
>>>>
>>>>
>>>> I am not experienced enough in debugging to find out why and where
>>>> it happens.
>>>>
>>>>
>>>> Best regards,
>>>>
>>>> Michael
>>>>
>>>>
>>
--
Dr. Michael Fechtelkord
Institut für Geologie, Mineralogie und Geophysik
Ruhr-Universität Bochum
Universitätsstr. 150
D-44780 Bochum
Phone: +49 (234) 32-24380
Fax: +49 (234) 32-04380
Email: Michael.Fechtelkord at ruhr-uni-bochum.de
Web Page: https://www.ruhr-uni-bochum.de/kristallographie/kc/mitarbeiter/fechtelkord/