[Wien] NFS cache kernel bug can break Wien2k
L. D. Marks
L-marks at northwestern.edu
Tue Feb 7 17:38:01 CET 2006
We have been seeing irreproducible problems in Wien2k ever since we
moved to Rocks 4.0 or 4.1 (RedHat kernel 2.6.9). After many months we have
traced them to what is probably a real kernel bug
(http://bugs.centos.org/view.php?id=1039), which (according to Trond
Myklebust) should be fixed in 2.6.15-rc5 and newer kernels. The bug is
masked by the default use of automount in Rocks and probably on
other systems.
There is a very simple test you can do if you think you have it. In a
directory which is NFS mounted (not automounted) on a compute node c0-0,
create a script test.sh containing:
echo 10 > Probe
ssh -x c0-0 cat Probe
echo 11 > Probe
ssh -x c0-0 cat Probe
Then "sh test.sh" will report 10 and 11 the very first time you run it, but
on later runs probably 10 and 10.
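Because the symptom is intermittent, it can help to repeat the write/read
cycle several times and count mismatches. The following is only a sketch of
that idea, not a replacement for the test above. The function name
check_stale_reads and its two arguments (the reader command and the number
of rounds) are hypothetical, and it assumes the test directory is mounted at
the same absolute path on the compute node.

```sh
#!/bin/sh
# Sketch: repeat the Probe write/read test and count stale reads.
# check_stale_reads is a hypothetical helper, not part of Wien2k or Rocks.
#   $1  reader command, e.g. "ssh -x c0-0 cat"
#   $2  number of rounds (default 5)
check_stale_reads() {
    reader=$1
    rounds=${2:-5}
    # Absolute path; assumes the same mount point on the compute node.
    probe="$(pwd)/Probe"
    stale=0
    i=0
    while [ "$i" -lt "$rounds" ]; do
        for value in 10 11; do
            echo "$value" > "$probe"
            # $reader is left unquoted on purpose so that
            # "ssh -x c0-0 cat" splits into separate words.
            seen=$($reader "$probe")
            if [ "$seen" != "$value" ]; then
                echo "round $i: wrote $value, node read '$seen' (stale)"
                stale=$((stale + 1))
            fi
        done
        i=$((i + 1))
    done
    echo "stale reads: $stale of $((rounds * 2))"
    # Exit status 0 only if every read matched what was just written.
    [ "$stale" -eq 0 ]
}
```

After sourcing this in the NFS-mounted directory, a call such as
check_stale_reads "ssh -x c0-0 cat" 5 would run five rounds against c0-0.
A nonzero stale count suggests the node is serving cached contents.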
Cure: none yet.
-----------------------------------------------
Laurence Marks
Department of Materials Science and Engineering
MSE Rm 2036 Cook Hall
2220 N Campus Drive
Northwestern University
Evanston, IL 60201, USA
Tel: (847) 491-3996 Fax: (847) 491-7820
email: L-marks at northwestern dot edu
http://www.numis.northwestern.edu
-----------------------------------------------