[Wien] Error in parallel execution

Marcos Veríssimo Alves marcos.verissimo.alves at gmail.com
Tue Jul 27 17:47:22 CEST 2010


Worst of all, the disks are correctly mounted, and from the command line I
can do things like ls, and even create and remove files. Only from within
lapw1para do I get an error. I am starting to insert lines with calls to
unix utilities such as whoami in order to see what weird thing is going on
there...
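
A minimal sketch of that kind of check, assuming bash is available on the nodes. The name check_dir is a hypothetical helper, not part of WIEN2k; it only reports the effective user, the host, and whether a given directory can be entered and written to:

```shell
#!/bin/bash
# check_dir is a hypothetical helper (not part of WIEN2k).
# It prints the effective user and host, then tests whether DIR can be
# entered and written to. The body runs in a subshell, so the caller's
# working directory is left unchanged.
check_dir() (
    dir="$1"
    echo "user: $(id -un)  host: $(uname -n)"

    # Can we enter the directory at all?
    if ! cd "$dir" 2>/dev/null; then
        echo "cd FAILED: $dir"
        return 1
    fi

    # Can we create (and then remove) a file in it?
    probe=".write_probe_$$"
    if touch "$probe" 2>/dev/null; then
        rm -f "$probe"
        echo "write OK: $dir"
    else
        echo "write FAILED: $dir"
        return 2
    fi
)
```

Calling check_dir "$PWD" from the job script, and again through ssh on one of the nodes, should show whether the two environments differ in user, host, or access to the case directory.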

Thanks all for the suggestions. If I track this bug down I'll let you know.

Cheers,

Marcos

On Tue, Jul 27, 2010 at 5:26 PM, Laurence Marks <L-marks at northwestern.edu> wrote:

> It is a system problem. Maybe the relevant disc is not mounted on the
> remote node or something? Try doing a simple ssh to the node and test
> things like ls, cd etc. Too many possibilities to list here. Good
> luck, just try computer experiments until you track it down.....
>
> 2010/7/27 Marcos Veríssimo Alves <marcos.verissimo.alves at gmail.com>:
> > Hi Laurence,
> > I am not running mpi, only using rsh/ssh for the plain k-point
> > parallelization. I couldn't really figure out how to make a .machines
> > file to run parallel over k-points on mpi, with one processor per
> > machine. However, I think Stefaan's tip has gone right to the point: in
> > my job error file I get the following errors:
> >  LAPW0 END
> > .machinetmp222: No such file or directory
> > bash: line 0: cd: /afs/atc.unican.es/u/m/mverissi/WIEN2k/sro1sto6: Permission denied
> >  Cannot open error-file
> > ERRFLG - couldn't open errorflag-file.
> > The fact that from inside lapw1para the ssh command cannot cd to my home
> > directory puzzles me... it seems to be a system problem, then. However,
> > if you have any suggestions, they will be more than welcome!
> > Thanks,
> > Marcos
> >
> > On Tue, Jul 27, 2010 at 4:27 PM, Laurence Marks <L-marks at northwestern.edu> wrote:
> >>
> >> I doubt (although I may be wrong) that this has anything to do with
> >> the OS. Do you have -traceback in your compile options? This will give
> >> information as to which program this is happening in. Also, are you
> >> running mpi or not?
> >>
> >> 2010/7/27 Marcos Veríssimo Alves <marcos.verissimo.alves at gmail.com>:
> >> > Hi Stefaan and Laurence,
> >> > @Stefaan: I will try it.
> >> > @Laurence: it's the latest version, which I downloaded about two
> >> > weeks ago. Hope this helps.
> >> > Thanks,
> >> > Marcos
> >> > On Tue, Jul 27, 2010 at 3:47 PM, Laurence Marks <L-marks at northwestern.edu> wrote:
> >> >>
> >> >> Is this the latest version, or an older one? Some changes were made
> >> >> in the error file access in the latest version for mpi reasons.
> >> >>
> >> >> 2010/7/27 Marcos Veríssimo Alves <marcos.verissimo.alves at gmail.com>:
> >> >> > Hi all,
> >> >> >
> >> >> > I am experiencing a problem with parallel execution over k-points.
> >> >> >
> >> >> > I have compiled the code successfully on a cluster running Debian
> >> >> > Linux, with SGEEE as the queue system, using ssh as the means to
> >> >> > launch the instances on the remote nodes, with /bin/bash as the
> >> >> > shell. My script successfully creates a .machines file, but when I
> >> >> > run runsp_lapw -p -NI -cc 0.0001, the process dies. This is because,
> >> >> > for some reason, lapw1para is unable to write to the
> >> >> > up(dn)lapw1_*.error files:
> >> >> >
> >> >> > forrtl: severe (47): write to READONLY file, unit 99, file
> >> >> > /afs/atc.unican.es/u/m/mverissi/WIEN2k/sro1sto6/uplapw1_1.error
> >> >> >
> >> >> > And the same happens to the dnlapw1_*.error files.
> >> >> >
> >> >> > lapw0, on the other hand, runs fine. I have set up parallel
> >> >> > execution successfully on my dual-core desktop using ssh, using
> >> >> > pretty much the same stuff, and it runs perfectly well.
> >> >> >
> >> >> > Now, I have changed the write permissions of the directory (and all
> >> >> > the files) with chmod -R ugo+rw /afs/atc.unican.es/u..., but to no
> >> >> > avail. Has anyone experienced any problem like this before? Could
> >> >> > there be any known (but obscure) reason why lapw1para would not be
> >> >> > able to write to its files, but lapw0para would?
> >> >> >
> >> >> > Best regards,
> >> >> >
> >> >> > Marcos
> >> >> >
> >> >> > _______________________________________________
> >> >> > Wien mailing list
> >> >> > Wien at zeus.theochem.tuwien.ac.at
> >> >> > http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
> >> >> >
> >> >> >
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Laurence Marks
> >> >> Department of Materials Science and Engineering
> >> >> MSE Rm 2036 Cook Hall
> >> >> 2220 N Campus Drive
> >> >> Northwestern University
> >> >> Evanston, IL 60208, USA
> >> >> Tel: (847) 491-3996 Fax: (847) 491-7820
> >> >> email: L-marks at northwestern dot edu
> >> >> Web: www.numis.northwestern.edu
> >> >> Chair, Commission on Electron Crystallography of IUCR
> >> >> www.numis.northwestern.edu/
> >> >> Electron crystallography is the branch of science that uses electron
> >> >> scattering and imaging to study the structure of matter.