[Wien] A trick for mpi debugging

Laurence Marks L-marks at northwestern.edu
Thu Jul 18 17:38:07 CEST 2013


On a cluster I am using, ssh connections started as part of
impi/mpirun go wrong about 0.1-0.2% of the time: they fail to launch
and become zombies (ps shows "[ssh] <defunct>"). Since working through
all the options within mpirun can be hard (particularly for impi which
is rather fast), I found (after a comment from someone on the openssh
list) a useful hack. I am posting it here because it is a nice way
around the problem and might be useful to others in the future.

The "trick" is to add --bootstrap-exec ~/bin/hssh or similar to the
mpirun line in $WIENROOT/parallel_options, then create the executable
~/bin/hssh with something similar to:

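#!/bin/bash
# Replace the first -q (quiet) in the arguments impi passes with -v (verbose).
a=$(echo "$@" | sed -e 's/-q/-v/')
# $a is deliberately unquoted so it splits back into separate ssh arguments.
ssh $a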


The above allows me to turn verbose output on in the ssh command since
impi insists on setting -q (quiet). For other cases something similar
can be done.
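
One limitation of the sed approach above is that it rewrites the first
"-q" found anywhere in the joined argument string, so a host called
node-q1 would become node-v1, and arguments containing spaces are
re-split. A minimal sketch of a stricter variant follows. It only
swaps a standalone -q argument and sends the debug output to a log
file with ssh's -E option. The function name rewrite_args, the array
HSSH_ARGS, the variable HSSH_LOG and the log file name are all made up
for this illustration, and -E assumes OpenSSH 6.3 or later.

```shell
#!/bin/bash
# Hypothetical variant of ~/bin/hssh. HSSH_LOG and HSSH_ARGS are names
# invented for this sketch; impi and WIEN2k know nothing about them.

# Copy the arguments into the global array HSSH_ARGS, replacing a
# standalone -q with -v and leaving every other argument untouched.
rewrite_args() {
    HSSH_ARGS=()
    local arg
    for arg in "$@"; do
        if [ "$arg" = "-q" ]; then
            HSSH_ARGS+=("-v")
        else
            HSSH_ARGS+=("$arg")
        fi
    done
}

# Only launch ssh when the wrapper is actually given arguments.
if [ "$#" -gt 0 ]; then
    rewrite_args "$@"
    # One log per host and wrapper process (assumed naming scheme).
    log="${HSSH_LOG:-$HOME/hssh.$(hostname).$$.log}"
    # -E appends ssh's debug messages to the log instead of stderr
    # (assumes OpenSSH 6.3 or later).
    exec ssh -E "$log" "${HSSH_ARGS[@]}"
fi
```

As with the original, the file has to be executable (chmod +x
~/bin/hssh). If the ssh on the cluster does not have -E, dropping that
option and appending 2>>"$log" to the ssh line is one alternative,
though that would also capture whatever the remote command writes to
stderr.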

-- 
Professor Laurence Marks
Department of Materials Science and Engineering
Northwestern University
www.numis.northwestern.edu 1-847-491-3996
"Research is to see what everybody else has seen, and to think what
nobody else has thought"
Albert Szent-Györgyi
