[Wien] A trick for mpi debugging

Luis Ogando lcodacal at gmail.com
Sat Jul 27 21:53:41 CEST 2013


Dear Prof. Marks,

   Could you please send me a template of the parallel_options file
in which this was implemented?
   I am sorry to ask, but I am really far from being an expert.
   All the best,
                    Luis


2013/7/22 Laurence Marks <L-marks at northwestern.edu>

> A brief followup which may be useful (or not) for others in the future
> with mpi problems. I have been able to work around a mysterious
> impi/ssh bug on NU's supercomputer by replacing ssh by the
> openmpi/mpirun launcher. The hack is gross, but very stable.
>
> Steps:
> 1) Add "--bootstrap-exec=$WIENROOT/hopen" to the mpirun line in
> $WIENROOT/parallel_options.
> 2) Create the executable file $WIENROOT/hopen containing
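> #!/bin/bash
> # Drop the ssh flags "-x -q" so that the host name comes first
> a=`echo $@ | sed -e 's/-x -q//'`
> # Start the remaining command on that host with Open MPI's mpirun
> $OPENMPI/bin/mpirun -np 1 --host $a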
>
> (replace $OPENMPI with the path where Open MPI has been compiled).
>
> On Thu, Jul 18, 2013 at 10:38 AM, Laurence Marks
> <L-marks at northwestern.edu> wrote:
> > On a cluster I am using, ssh connections made as part of impi/mpirun
> > fail about 0.1-0.2% of the time: they do not launch and become
> > zombies (ps shows "[ssh] <defunct>").
> > Since fiddling through all the options within mpirun can be hard
> > (particularly for impi which is rather fast), I found (after a comment
> > from someone on the openssh list) a useful hack. I am providing it
> > here as it is a nice way around things, and might be useful to others
> > in the future.
> >
> > The "trick" is to add --bootstrap-exec ~/bin/hssh or similar to the
> > mpirun line in $WIENROOT/parallel_options, then create the executable
> > ~/bin/hssh with something similar to:
> >
> > #!/bin/bash
> > # Replace the -q (quiet) flag set by impi with -v (verbose)
> > a=`echo $@ | sed -e 's/-q/-v/'`
> > ssh $a
> >
> >
> > The above allows me to turn verbose output on in the ssh command since
> > impi insists on setting -q (quiet). For other cases something similar
> > can be done.
> >
> > --
> > Professor Laurence Marks
> > Department of Materials Science and Engineering
> > Northwestern University
> > www.numis.northwestern.edu 1-847-491-3996
> > "Research is to see what everybody else has seen, and to think what
> > nobody else has thought"
> > Albert Szent-Gyorgi
>
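
For anyone wanting to see what the two wrapper scripts quoted above do to
their arguments before trying them under mpirun, here is a minimal sketch
that applies the same two sed rewrites to a made-up argument list and only
prints the result. The sample arguments (host name, proxy path, option) and
the function names are hypothetical; the real list passed by impi should be
checked on the cluster in question.

```bash
#!/bin/bash
# Sketch only: the argument list below is hypothetical, not captured from impi.

# Rewrite used in hssh: switch ssh from quiet (-q) to verbose (-v).
rewrite_verbose() {
    printf '%s\n' "$*" | sed -e 's/-q/-v/'
}

# Rewrite used in hopen: drop the leading "-x -q" so the host comes first.
rewrite_strip() {
    printf '%s\n' "$*" | sed -e 's/-x -q//'
}

# Hypothetical arguments: ssh flags, a host name, then the remote command.
sample=(-x -q node01 /path/to/proxy --some-option)

# Print the command lines the wrappers would build, without running them.
echo "hssh would run:  ssh $(rewrite_verbose "${sample[@]}")"
echo "hopen would run: mpirun -np 1 --host$(rewrite_strip "${sample[@]}")"
```

Note that sed replaces the first "-q" it finds anywhere in the line, so a
host name or path containing "-q" ahead of the flag would be altered; a
dry run like this is a cheap way to check for that.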