[Wien] Segmentation faults

Gerhard H Fecher fecher at uni-mainz.de
Thu Apr 28 16:03:24 CEST 2005


Dear Laurence,

In my experience, some segmentation faults arise from variables being passed 
incorrectly to subroutines. In my own programs these are usually caused by 
incomplete or hasty declarations, for example relying on the IMPLICIT 
statement or similar.

It is often suggested to go back from Intel 8.1 to 7.1; however, I have 
programs that run well with 8.1 but not with 7.1. It seems that there are 
just some small bugs (or incompatibilities) in the code (or indeed in the 
compiler, too) that one compiler (or library, depending on the Linux 
version, etc.) tolerates whereas the other does not.

We also edit /etc/ld.so.conf and then run ldconfig to overcome some problems 
with LD_LIBRARY_PATH; unfortunately, it does not help in all 
cases.
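
One way to confirm that such a change took effect is to script the ldd check 
Laurence mentions below. The following is only a rough sketch in Python 
(chosen for brevity, not because Wien2k uses it). The function names 
parse_ldd and unresolved, the library name libexample.so and the SAMPLE text 
are all invented for illustration; real ldd output differs between systems.

```python
import subprocess

# Invented sample in the usual ldd layout; not taken from a real machine.
SAMPLE = (
    "\tlinux-gate.so.1 =>  (0xffffe000)\n"
    "\tlibexample.so => not found\n"
    "\tlibm.so.6 => /lib/libm.so.6 (0x40028000)\n"
    "\t/lib/ld-linux.so.2 (0x40000000)\n"
)


def parse_ldd(output):
    """Map each library name in ldd output to its resolved path (None if not found)."""
    libs = {}
    for line in output.splitlines():
        if "=>" not in line:
            continue  # e.g. the dynamic loader line, which has no "=>"
        name, _, target = line.partition("=>")
        name, target = name.strip(), target.strip()
        if target.startswith("not found"):
            libs[name] = None
        elif target.startswith("("):
            continue  # kernel-provided object with no file on disk
        else:
            libs[name] = target.split(" (")[0]  # drop the load address
    return libs


def unresolved(binary):
    """Run ldd on a binary and list the libraries it could not resolve."""
    out = subprocess.run(
        ["ldd", binary], capture_output=True, text=True, check=True
    ).stdout
    return [name for name, path in parse_ldd(out).items() if path is None]


print(parse_ldd(SAMPLE))
```

Calling unresolved() on an executable on each node, before and after 
ldconfig, would show whether a library is still missing; comparing the 
resolved paths would show whether the intended compiler or MKL version is the 
one actually picked up.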

Concerning your last problem: did you check whether everything that is 
supposed to be written to a file was actually written, or whether it is still 
sitting somewhere in a buffer? In the latter case a FLUSH at the end of the 
subroutine in question may help to release memory, as it forces Linux to write 
to the file instead of keeping the data in memory (one of the really 
unpleasant features of Linux).
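
The buffering effect itself can be illustrated with a small sketch, again in 
Python purely for illustration; the function name is made up. In Fortran the 
counterpart would be the FLUSH statement (Fortran 2003) or a 
compiler-specific flush routine. The sketch writes a short string, then 
compares the file size on disk before and after an explicit flush.

```python
import os
import tempfile


def sizes_before_and_after_flush(payload):
    """Return the file size seen on disk before and after an explicit flush."""
    fd, path = tempfile.mkstemp(suffix=".dat")
    os.close(fd)
    try:
        with open(path, "w") as f:
            f.write(payload)                # stays in the program's own buffer
            before = os.path.getsize(path)
            f.flush()                       # hand the data over to the kernel
            os.fsync(f.fileno())            # ask the kernel to write it to disk
            after = os.path.getsize(path)
        return before, after
    finally:
        os.remove(path)


print(sizes_before_and_after_flush("x" * 100))
```

For a short write like this the first size is 0 and the second is the full 
length, which is the "still sitting in the buffer" situation. Note that a 
flush only empties the program's buffer; the kernel may still keep a copy of 
the data in its own file cache afterwards, so whether this frees any memory 
on the node would have to be checked separately, for example with vmstat.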

Did you try discussing the problems in the Intel Fortran compiler forum? 
Maybe one should ask them why many more segmentation faults (and other 
memory problems) appear with 8.1 than with 7.1 or earlier 
versions.

Ciao Gerhard

On Thursday, 28 April 2005 14:27, L. D. Marks wrote:
> Like others, we've seen some segmentation faults. I've managed to cure 95%
> of them, but there is one I've not been able to and maybe someone can make
> a suggestion.
> 
> Quick intro: in general a segmentation fault occurs when a program tries
> to access memory it should not. In most cases this is due to an error in
> the code or something similar. For instance, not every "allocate" call
> in Wien2k checks the STAT flag to see if there is enough memory
> available. They can also occur with "broken" systems; perhaps 90% of Linux
> systems (and others) are not correctly set up.
> 
> One issue we've traced and cured concerns multiple versions of ifc and the
> mkl libraries. One way to switch versions is to use the LD_LIBRARY_PATH
> variable, but this can be dangerous (see
> http://www.visi.com/~barr/ldpath.html for some interesting comments), and
> we've found that it does not always work the way we expected. Much more
> stable (needs su) is to edit /etc/ld.so.conf on every node and run
> ldconfig, although this might not be the safest method. The ldd tool is
> also useful to check that you really have the libraries set the way you
> think you do. It's also advisable to use "which ifc" to see if you are
> really using the version that you think you are. Compiling with one
> version then running with another will give problems almost impossible to
> trace.
> 
> The one problem we have not solved is that our nodes sometimes seem to be
> hogging some memory in the cache and not releasing it. In one
> case this was 1.6G out of 2G (using vmstat). When several codes (lapw1 for
> a large calculation) started they were not able to get enough memory and
> crashed with a segmentation fault. The "cure" was to reboot the node. I'd
> appreciate any suggestions; I don't think this is an intrinsic Wien
> problem, although ...
> 
> -----------------------------------------------
> Laurence Marks
> Department of Materials Science and Engineering
> MSE Rm 2036 Cook Hall
> 2220 N Campus Drive
> Northwestern University
> Evanston, IL 60201, USA
> Tel: (847) 491-3996 Fax: (847) 491-7820
> email: L - marks @ northwestern . edu
> http://www.numis.northwestern.edu
> -----------------------------------------------
> 
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
> 


