[Wien] Error in parallel execution
Marcos Veríssimo Alves
marcos.verissimo.alves at gmail.com
Tue Jul 27 17:17:58 CEST 2010
Hi Laurence,
I am not running mpi, only using rsh/ssh for plain k-point
parallelization. I couldn't figure out how to write a .machines file
that runs parallel over k-points under mpi, with one processor per machine.
However, I think Stefaan's tip was right on target: my job error file
contains the following errors:
LAPW0 END
.machinetmp222: No such file or directory
bash: line 0: cd: /afs/atc.unican.es/u/m/mverissi/WIEN2k/sro1sto6: Permission denied
Cannot open error-file
ERRFLG - couldn't open errorflag-file.
It puzzles me that the ssh command launched from inside lapw1para cannot cd
to my home directory, so this looks like a system problem. If you have any
suggestions, though, they will be more than welcome!
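One way to narrow this down is to repeat by hand what the parallel scripts do on a remote node: change into the case directory and create a file there. Below is a minimal sketch, not part of WIEN2k; the function name check_workdir and the host name node01 are hypothetical, and it assumes bash is the login shell on the remote node. Since the path lives under /afs, it may also be worth checking whether the non-interactive ssh session holds a valid AFS token, because AFS decides access through tokens and directory ACLs rather than through the mode bits that chmod changes.

```shell
#!/bin/bash
# Hypothetical helper (not part of WIEN2k): report whether a directory can be
# entered and written to by the current user in the current session.
check_workdir() {
    local dir="$1"
    # Run in a subshell so the caller's working directory is left unchanged.
    (
        cd "$dir" 2>/dev/null || { echo "cannot cd to $dir"; exit 1; }
        probe=".write_probe.$$"
        # Try to create and then remove a small probe file.
        if touch "$probe" 2>/dev/null; then
            rm -f "$probe"
            echo "ok: $dir is writable"
        else
            echo "cannot write in $dir"
            exit 2
        fi
    )
}

# Example: run the same check locally and on a remote node
# (node01 is an assumed host name; replace it with a real compute node).
# check_workdir "$PWD"
# ssh node01 "$(declare -f check_workdir); check_workdir '$PWD'"
```

If the local call succeeds and the ssh call reports that it cannot cd, the problem lies in what the remote session is allowed to see rather than in lapw1para itself.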
Thanks,
Marcos
On Tue, Jul 27, 2010 at 4:27 PM, Laurence Marks <L-marks at northwestern.edu> wrote:
> I doubt (although I may be wrong) that this has anything to do with
> the OS. Do you have -traceback in your compile options? This will give
> information as to which program this is happening in. Also, are you
> running mpi or not?
>
> 2010/7/27 Marcos Veríssimo Alves <marcos.verissimo.alves at gmail.com>:
> > Hi Stefaan and Laurence,
> > @Stefaan: I will try it.
> > @Laurence: it's the latest version, which I downloaded about two
> > weeks ago. Hope this helps.
> > Thanks,
> > Marcos
> > On Tue, Jul 27, 2010 at 3:47 PM, Laurence Marks
> > <L-marks at northwestern.edu> wrote:
> >>
> >> Is this the latest version, or an older one? Some changes were made in
> >> the error file access in the latest version for mpi reasons.
> >>
> >> 2010/7/27 Marcos Veríssimo Alves <marcos.verissimo.alves at gmail.com>:
> >> > Hi all,
> >> >
> >> > I am experiencing a problem when running in parallel over k-points.
> >> >
> >> > I have compiled the code successfully on a cluster running Debian
> >> > Linux with SGEEE as the queue system, using ssh to launch the
> >> > instances on the remote nodes and /bin/bash as the shell. My script
> >> > successfully creates a .machines file, but when I run
> >> > runsp_lapw -p -NI -cc 0.0001, the process dies. This is because, for
> >> > some reason, lapw1para is unable to write to the
> >> > up(dn)lapw1_*.error files:
> >> >
> >> > forrtl: severe (47): write to READONLY file, unit 99, file
> >> > /afs/atc.unican.es/u/m/mverissi/WIEN2k/sro1sto6/uplapw1_1.error
> >> >
> >> > The same happens with the dnlapw1_*.error files.
> >> >
> >> > lapw0, on the other hand, runs fine. I have set up parallel execution
> >> > successfully on my dual-core desktop using ssh with pretty much the
> >> > same setup, and it runs perfectly well.
> >> >
> >> > I have changed the write permissions of the directory (and all the
> >> > files) with chmod -R ugo+rw /afs/atc.unican.es/u..., but to no avail.
> >> > Has anyone experienced a problem like this before? Could there be any
> >> > known (but obscure) reason why lapw1para would not be able to write
> >> > to its files, but lapw0para would?
> >> >
> >> > Best regards,
> >> >
> >> > Marcos
> >> >
> >> > _______________________________________________
> >> > Wien mailing list
> >> > Wien at zeus.theochem.tuwien.ac.at
> >> > http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Laurence Marks
> >> Department of Materials Science and Engineering
> >> MSE Rm 2036 Cook Hall
> >> 2220 N Campus Drive
> >> Northwestern University
> >> Evanston, IL 60208, USA
> >> Tel: (847) 491-3996 Fax: (847) 491-7820
> >> email: L-marks at northwestern dot edu
> >> Web: www.numis.northwestern.edu
> >> Chair, Commission on Electron Crystallography of IUCR
> >> www.numis.northwestern.edu/
> >> Electron crystallography is the branch of science that uses electron
> >> scattering and imaging to study the structure of matter.