<div dir="ltr"><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">I do wonder about this. I suggest editing module.F and changing lines 118 and 119 to</div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"> DEALLOCATE(en,stat=Ien) ; if(Ien .ne. 0)write(*,*)'Err en ',Ien<br> DEALLOCATE(vnorm,stat=Ivn) ; if(Ivn .ne. 0)write(*,*)'Err vnorm ',Ivn<br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">There is every chance that the bug is not in those lines, but somewhere completely different. SIGSEGV often means that memory has been overwritten, for instance by arrays going out of bounds.</div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">You can also recompile with -g added (don't change the other options), and/or -C. Sometimes this is better. Or use other tools such as a debugger or valgrind.</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Aug 18, 2021 at 10:47 AM Pavel Ondračka <<a href="mailto:pavel.ondracka@email.cz" target="_blank">pavel.ondracka@email.cz</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">I'm CCing the list back as the crash has now been traced to a likely MKL<br>
problem, see below for more details.<br>
> <br>
> <br>
> > So just to be clear, explicitly setting OMP_STACKSIZE=1g does not<br>
> > help<br>
> > to solve the issue?<br>
> > <br>
> <br>
> <br>
> Right! OMP_STACKSIZE=1g with OMP_NUM_THREADS=4 does not solve the<br>
> problem!<br>
> <br>
> > <br>
> > The problem is that the OpenMP code in lapwso is very simple, so I'm<br>
> > having problems seeing how it could be causing the problems.<br>
> > <br>
> > Could you also try to see what happens if run with:<br>
> > OMP_NUM_THREADS=1<br>
> > MKL_NUM_THREADS=4<br>
> > <br>
> <br>
> <br>
> It does not work with these values, but I checked and it works<br>
> when they are swapped:<br>
> OMP_NUM_THREADS=4<br>
> MKL_NUM_THREADS=1<br>
<br>
This was very helpful and IMO points to a problem with MKL rather than<br>
Wien2k.<br>
<br>
Unfortunately, setting MKL_NUM_THREADS=1 globally will reduce the<br>
OpenMP performance, mostly in lapw1 but also in other places. So if you<br>
want to keep the OpenMP parallelism at the BLAS/lapack level, you have to<br>
either find some MKL version that works (if you do, please report it<br>
here), link with OpenBLAS (using it for lapwso is enough), or create a<br>
simple wrapper that sets MKL_NUM_THREADS=1 just for lapwso, i.e.,<br>
rename the lapwso binary in WIENROOT to lapwso_bin and create a new lapwso<br>
file there with:<br>
<br>
#!/bin/bash<br>
MKL_NUM_THREADS=1 lapwso_bin "$@"<br>
<br>
and make it executable with chmod +x lapwso.<br>
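<br>
The rename-and-wrap steps above can be sketched as a small shell function.<br>
This is only an illustration under assumptions: the function name<br>
install_lapwso_wrapper is hypothetical, and the generated wrapper calls<br>
lapwso_bin from its own directory instead of relying on PATH, which is a<br>
small variation on the two-line script above.<br>
<pre>
```shell
#!/bin/bash
# Sketch: make lapwso (and only lapwso) run with MKL_NUM_THREADS=1.
# Assumption: the directory passed in (e.g. $WIENROOT) contains the
# original lapwso binary. The function name is hypothetical.
install_lapwso_wrapper() {
    local root="$1"
    # Refuse to run twice, otherwise the wrapper itself would be renamed
    # over the real binary.
    if [ -e "$root/lapwso_bin" ]; then
        echo "lapwso_bin already exists in $root, nothing done"
        return 1
    fi
    mv "$root/lapwso" "$root/lapwso_bin" || return 1
    # Write the wrapper: single-threaded MKL, all arguments passed through.
    printf '%s\n' '#!/bin/bash' \
        'MKL_NUM_THREADS=1 exec "$(dirname "$0")/lapwso_bin" "$@"' \
        > "$root/lapwso"
    chmod +x "$root/lapwso"
}
```
</pre>
It would be called once as install_lapwso_wrapper "$WIENROOT"; moving<br>
lapwso_bin back over lapwso undoes it.<br>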
<br>
Or maybe MKL has a non-OpenMP version which you could link with just<br>
lapwso while using the standard one in other parts, but I don't know; I mostly use<br>
OpenBLAS. If you need any further help, let me know.<br>
<br>
Reporting the issue to Intel would also be nice; however, I never had<br>
any real luck there, and it is also a bit problematic as you can't<br>
provide a test case due to Wien2k being proprietary code...<br>
<br>
Best regards<br>
Pavel<br>
<br>
> <br>
> > <br>
> > This should disable the Wien2k-specific OpenMP parallelism but still<br>
> > keep the rest of the parallelism at the BLAS/lapack level.<br>
> > <br>
> <br>
> <br>
> So, perhaps, the problem is related to MKL!<br>
> <br>
> > <br>
> > Another option is that something is going wrong before lapwso and the<br>
> > lapwso crash is just the symptom. What happens if you run everything<br>
> > up<br>
> > to lapwso without OpenMP (OMP_NUM_THREADS=1) and then enable it just<br>
> > for lapwso?<br>
> > <br>
> <br>
> <br>
> If I run lapw0 and lapw1 with OMP_NUM_THREADS=4 and then change it to 1<br>
> just before lapwso, it works. <br>
> If I do the opposite, starting with OMP_NUM_THREADS=1 and then change<br>
> it to 4 just before lapwso, it does not work.<br>
> So I believe that the problem is really at lapwso.<br>
> <br>
> If you need more information, please, let me know!<br>
> All the best,<br>
> Luis<br>
<br>
<br>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr"><div dir="ltr">Professor Laurence Marks<br>Department of Materials Science and Engineering<br>Northwestern University<br><a href="http://www.numis.northwestern.edu/" target="_blank">www.numis.northwestern.edu</a><div>"Research is to see what everybody else has seen, and to think what nobody else has thought" Albert <span style="font-family:Arial,Helvetica,sans-serif;font-size:12.8px">Szent-</span><span style="font-family:Arial,Helvetica,sans-serif;font-size:small;color:rgb(34,34,34)">Györgyi</span></div></div></div>