[Wien] MIXER runtime error + solution on Mac OS X
Laurence Marks
L-marks at northwestern.edu
Mon Sep 1 08:23:13 CEST 2014
Dear Kevin,
No problem with your email. All large codes have bugs, and sometimes I
write sloppy code. I do try and keep mixer as free of bugs as I can since I
wrote the multisecant algorithms.
Listing the W2kutils issue - good idea, hint to Peter.
___________________________
Professor Laurence Marks
Department of Materials Science and Engineering
Northwestern University
www.numis.northwestern.edu 1-847-491-3996
Co-Editor, Acta Cryst A
"Research is to see what everybody else has seen, and to think what nobody
else has thought"
Albert Szent-Gyorgyi
On Sep 1, 2014 1:13 AM, "Kevin Jorissen" <kevinjorissenpdx at gmail.com> wrote:
> Hi Laurence,
>
> thanks for your comments.
>
> I hope I didn't call the issue we observed a code bug -- I meant to use
> unsensational language and avoid assumptions. For sure this could be a
> problem on the Mac side or in ifort (we all know these exist). I haven't
> edited the W2kutils. But didn't we fix the Mac problems with that file a
> few years ago? In any case, I'm not using MPI and stacksize is set to
> unlimited in my shell startup file, so I doubt this is the culprit. Or
> could the W2kutils somehow override my shell startup configuration?
>
> It's probably not urgent since we have a remedy that will do for now.
> If you can think of any tests you'd like to see done on Mac, let us know.
>
> By the way, this W2kutils thing is ***NOT*** on the list of known issues
> and bugs on the WIEN2k website. It would be very, very valuable and
> time-saving if that list could be updated to reflect the knowledge inside
> the experts' heads.
>
> Cheers,
>
> Kevin
>
>
>
> On Sun, Aug 31, 2014 at 10:47 PM, Laurence Marks <L-marks at northwestern.edu> wrote:
>
>> I am currently at a conference in Montenegro, so don't have enough time
>> to check properly. While this could be a code bug, I suspect an OS bug
>> connected to the known problem in W2kutils for Mac of setting the stack
>> size. Do you have this commented out?
>>
>> To expand, the reason W2kutils sets the stack size is that a stack that
>> was too small used to be a very common problem (look at the mail list
>> some years ago for ulimit): some sysadmins were setting it too low, and
>> openmpi was not by default passing ulimit values. If it is not large
>> enough, problems occur. The argument you are using, -heap-arrays, puts
>> automatic and temporary arrays on the heap instead of the stack (loosely
>> similar in effect to the Fortran SAVE attribute, which typically also
>> keeps data off the stack). This can be slower, although this does not
>> matter much in mixer.
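
The stack-size question above (what limit is actually in force, and whether a program can change it after the shell has set it) can be checked directly. The following is a minimal Python sketch, not part of WIEN2k; the helper names `describe`, `stack_limits` and `raise_soft_stack_limit` are invented for illustration. It relies only on the POSIX rule that a process may change its own soft limit up to the hard limit, so a shell startup setting is not necessarily the last word. Whether W2kutils actually does this on a given build is not established here.

```python
import resource


def describe(limit):
    """Render an rlimit value, mapping RLIM_INFINITY to 'unlimited'."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit // 1024} kB"


def stack_limits():
    """Return the (soft, hard) stack-size limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_STACK)


def raise_soft_stack_limit():
    """Try to raise the soft stack limit to the hard limit.

    Returns the (soft, hard) pair in force afterwards. The request is
    allowed to fail: the OS may refuse it, in which case the old limits
    simply remain in place.
    """
    soft, hard = stack_limits()
    if soft != hard:
        try:
            resource.setrlimit(resource.RLIMIT_STACK, (hard, hard))
        except (ValueError, OSError):
            pass  # request refused; keep the existing limits
    return stack_limits()


if __name__ == "__main__":
    soft, hard = stack_limits()
    print("stack soft limit:", describe(soft))
    print("stack hard limit:", describe(hard))
```

Running this from the same shell that launches mixer shows the limit a child process would inherit; a soft limit well below the hard limit would be the first thing to look at.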
>>
>> Unless you can identify something specific, I am not sure what I can do,
>> as I have no access to a Mac. Maybe run mixer using ddd (or gdb)? One
>> caveat: with this type of issue, the failure sometimes does not show up
>> at the place in the source where it originates.
>>
>> N.B. mixer is a bit of a memory hog, and sometime I should try and
>> clean up some of the arrays. Unfortunately this is hard with code that is
>> changing.
>>
>>
>> On Sun, Aug 31, 2014 at 6:30 PM, Kevin Jorissen <
>> kevinjorissenpdx at gmail.com> wrote:
>>
>>> Thanks, Martin, for sharing some advanced ideas.
>>>
>>> I spent a few minutes trying to find out more, throwing a diagnostic
>>> compile line at the problem:
>>>
>>> -gen-interfaces -warn interfaces -fp-stack-check -g -traceback
>>> -check arg_temp_created -check bounds
>>>
>>> trying to catch anything potentially suspicious. The problem with
>>> most codes I've worked on is that you typically catch a bunch of unrelated
>>> things that obscure the analysis :). In this case, e.g., the argument F to
>>> TrustStep (called before the NormS mentioned earlier) is an allocated array
>>> on one side and implicit on the other, and that offends the compile options
>>> above. I don't have much time for analysis right now - maybe the mixer
>>> developers will immediately spot what's going on in my earlier e-mail.
>>> "check bounds" or "check all" by themselves don't give any runtime
>>> diagnostics, so I'm guessing we're not overstepping array bounds explicitly.
>>>
>>> If you have a more specific idea for a test, I or maybe Jianxin can
>>> try to run it for you. I guess a basic one would be to just do the
>>> run_lapw calculation on Linux vs. Mac (with -heap-arrays) and see if the
>>> results are identical.
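
The Linux vs. Mac comparison suggested above could be automated along these lines. This is a rough Python sketch under stated assumptions: that the quantity of interest appears on lines starting with a label such as `:ENE` and is the last number on the line. The label, the function names and the sample text are illustrative, not taken from the thread, and should be adjusted to whatever the scf output actually contains.

```python
import re

# Matches decimal numbers, including Fortran-style exponents (1.0D-3).
_NUMBER = re.compile(r"[-+]?\d+\.\d*(?:[eEdD][-+]?\d+)?")


def labelled_values(text, label=":ENE"):
    """Collect the last number on every line that starts with `label`."""
    values = []
    for line in text.splitlines():
        if line.startswith(label):
            found = _NUMBER.findall(line)
            if found:
                # Python's float() does not accept Fortran 'D' exponents.
                token = found[-1].replace("D", "E").replace("d", "e")
                values.append(float(token))
    return values


def first_mismatch(a, b, tol=1e-6):
    """Index of the first pair differing by more than `tol`, else None.

    A difference in length counts as a mismatch at the shorter length.
    """
    for i, (x, y) in enumerate(zip(a, b)):
        if abs(x - y) > tol:
            return i
    return None if len(a) == len(b) else min(len(a), len(b))


if __name__ == "__main__":
    # Synthetic text for demonstration only, not real output.
    linux = ":ENE  : TOTAL ENERGY IN Ry =   -100.5\n"
    mac = ":ENE  : TOTAL ENERGY IN Ry =   -100.5\n"
    print(first_mismatch(labelled_values(linux), labelled_values(mac)))
```

Comparing per-iteration values rather than only the final one would show at which iteration the two platforms start to diverge, if they do.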
>>>
>>> Cheers,
>>>
>>> Kevin
>>>
>>>
>>>
>>>
>>>
>>> On Sun, Aug 31, 2014 at 4:29 PM, Martin Kroeker <
>>> martin at ruby.chemie.uni-freiburg.de> wrote:
>>>
>>>> This might warrant closer scrutiny - was it reproducible with any odd
>>>> tutorial problem, or does it require a particular case or type of
>>>> calculation ?
>>>> The "illegal instruction" abort signals that data was somehow spilling
>>>> over into the memory ranges holding the executable code. Now I would not
>>>> expect a "simple" heap-stack-collision (from an array that is simply too
>>>> big to put on the stack with impunity) to occur on any modern system
>>>> except perhaps severely constrained embedded ones. At worst, the abort
>>>> should have been accompanied by a "segmentation fault" message as the
>>>> attempt to overwrite the running program got caught. So other possible
>>>> explanations could be that the code tries to store more array elements
>>>> than the array was designed to hold, or that the indexes into the array
>>>> are miscalculated (overflowing or not clamped to positive values).
>>>> Moving data to the heap may have just changed the location of the
>>>> inadvertently overwritten memory to ranges where the effects are more
>>>> subtle (unrelated data) or not noticeable (lucky hit on unused memory).
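
To make the "miscalculated index" scenario above concrete, here is a small Python illustration of how an index computed in 32-bit integer arithmetic can overflow to a negative value, and how an explicit bounds test would catch it. It is only a sketch of the general failure mode, not a claim about what happens in mixer; `int32` and `in_bounds` are made-up helper names.

```python
import ctypes


def int32(value):
    """Wrap an integer the way 32-bit two's-complement storage would."""
    return ctypes.c_int32(value).value


def in_bounds(i, size):
    """True if i is a valid 1-based index into an array of `size` elements."""
    return 1 <= i <= size


if __name__ == "__main__":
    n = 50000
    true_index = n * n          # 2.5e9, larger than the 32-bit maximum
    wrapped = int32(true_index)  # what a 32-bit integer would actually hold
    print(true_index, wrapped, in_bounds(wrapped, true_index))
```

A wrapped index like this addresses memory far outside the array, which is why the symptom can change or disappear when the array is moved from the stack to the heap.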
>>>> --
>>>> Dr. Martin Kroeker martin at ruby.chemie.uni-freiburg.de
>>>> c/o Prof.Dr. Caroline Roehr
>>>> Institut fuer Anorganische und Analytische Chemie der Universitaet
>>>> Freiburg
>>>>
>>>> _______________________________________________
>>>> Wien mailing list
>>>> Wien at zeus.theochem.tuwien.ac.at
>>>> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>>>> SEARCH the MAILING-LIST at:
>>>> http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
>>>>
>>>
>>>
>>
>>
>> --
>> Professor Laurence Marks
>> Department of Materials Science and Engineering
>> Northwestern University
>> www.numis.northwestern.edu
>> Corrosion in 4D: MURI4D.numis.northwestern.edu
>> Co-Editor, Acta Cryst A
>> "Research is to see what everybody else has seen, and to think what
>> nobody else has thought"
>> Albert Szent-Gyorgyi
>>
>>
>