[Wien] error in k-point parallel execution
Torsten Andersen
thor at physik.uni-kl.de
Thu Jul 29 09:27:19 CEST 2004
Stefaan Cottenier wrote:
>>>What I mean is: When you log in on condmat2, do you have the same home
>>>directory as on localhost? You can test this with "touch file_of_zero"
>>>on localhost, then see if it is also present on condmat2.
>>
>>I executed the command "touch file_of_zero" on both "localhost" and
>>"condmat2" and no output resulted but the shell prompt.
>
>
> What Torsten meant is this: the command 'touch file_of_zero' will create a
> file with name 'file_of_zero' and no content (this gives you indeed no
> output but the prompt). Do 'ls -l file_of_zero', to see whether this file is
> created on localhost (it must). Then go to condmat2, and do 'ls -l
> file_of_zero' again (without touch first). Does that same file also exist
> there? If so, then localhost and condmat2 share the same directory, as they
> should. If not, then wien2k will not work in parallel.
>
>
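The test described above can be sketched as a small shell function. This is only an illustration, assuming passwordless ssh to the other machine and that the working directory has the same path on both; the function name shared_dir_check and the RSH_CMD variable are made up for this sketch and are not part of wien2k.

```shell
#!/bin/sh
# Sketch: is the current directory shared with a remote host?
# RSH_CMD is a hypothetical override (e.g. rsh instead of ssh).
RSH_CMD=${RSH_CMD:-ssh}

shared_dir_check() {
    host=$1
    marker="file_of_zero.$$"      # unique marker name for this run
    dir=$(pwd)
    touch "$marker" || return 2   # create an empty file locally
    # Look for the same file, under the same path, on the remote host.
    if $RSH_CMD "$host" "ls -l '$dir/$marker'" >/dev/null 2>&1; then
        result=0                  # file is visible remotely: shared
    else
        result=1                  # file is missing remotely: not shared
    fi
    rm -f "$marker"               # clean up the marker file
    return $result
}

# Usage (left commented out so that nothing runs by accident):
# shared_dir_check condmat2 && echo "shared" || echo "NOT shared"
```

A return value of 0 means the marker file was visible on the remote host, which is the situation wien2k needs for k-point parallel runs.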
>>>What do you get from "echo $SCRATCH" on localhost and condmat2?
>>
>>I get "./" from both pc's.
>
>
> If the directories are indeed shared, ./ should be OK. But for this type of
> cluster, it is more efficient to set $SCRATCH to /tmp: the large vector
> files will then be stored in the local /tmp of each machine, and will not
> fill the single disk of localhost (but don't forget to clean the /tmp of
> each machine from time to time...)
>
>
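As a rough sketch of that setting in bash syntax (for example in ~/.bashrc on each machine; csh/tcsh users would write "setenv SCRATCH /tmp" instead). The file pattern in the last command is an assumption about how the vector files are named, so check it against your own case before deleting anything.

```shell
# Point SCRATCH at the local /tmp of this machine (bash/sh syntax).
export SCRATCH=/tmp

# Confirm the value and that the directory is usable here.
echo "SCRATCH=$SCRATCH"
[ -d "$SCRATCH" ] && [ -w "$SCRATCH" ] && echo "scratch directory is writable"

# From time to time, look at what has accumulated there.
# The "*.vector*" pattern is an assumption about the file names.
ls -l "$SCRATCH"/*.vector* 2>/dev/null || true
```

The same lines have to take effect on every machine in the cluster, since each one writes to its own /tmp.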
>>>What do you get from "echo $path" on localhost and condmat2?
>>
>>I get nothing using "echo $path" but using "env" I get:
>
>
> For your bash, you need 'echo $PATH'.
>
>
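For reference, a short sketch of the difference: in bash the search path lives in the upper-case variable PATH, while the lower-case $path is a csh/tcsh variable and is normally empty in bash.

```shell
# bash/sh: the search path is a colon-separated list in PATH.
echo "$PATH"

# One directory per line, which is easier to scan for the wien2k directory.
echo "$PATH" | tr ':' '\n'

# csh/tcsh only: the lower-case variable exists there as well.
#   echo $path
```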
>>>When you log in on condmat2, can you execute lapw1 manually? Try, in a
>>>shell: "which lapw1", and if it gives you a response, try "lapw1". What
>>>is the result?
>>
>>I tried "which lapw1" on both PC's and obtained:
>>"/home/wien2k/.WIEN_ROOT/lapw1".
>>
>>I tried "run lapw1" on both PC's in the directory TiC which I had already
>>worked with, and obtained:
>>------------
>>ERROR: option lapw1 does not exist !
>
>
> 'run lapw1' does not exist. Use 'x lapw1' instead (but with 'run lapw1' you
> effectively invoke the 'run' command, as your output shows).
>
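A minimal sketch of the check on each machine, using the POSIX "command -v" in place of "which". The helper name find_prog is made up for this illustration.

```shell
# Report whether a program can be found on the search path.
find_prog() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at: $(command -v "$1")"
    else
        echo "$1 is not on PATH"
        return 1
    fi
}

find_prog lapw1 || true

# If it is found, the program is started through the 'x' script,
# from inside the case directory (TiC in this thread):
#   x lapw1
```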
> Conclusion: it seems you have installed wien2k on all machines separately,
> and did not properly connect their disks. Therefore serial calculations
> work, parallel ones don't. In a good set-up, you have to install wien2k only
> ONCE on a parallel cluster, namely on the disk that is NFS-shared by all
> machines (but I can't help you with the technical details about how to do
> that).
Right! Sorry for repeating it before I read your mail, Stefaan. I seem
to be a bit late in the forwarding stream after I moved to Kaiserslautern.
I would recommend searching the Linux HOWTOs to resolve the problem.
Or contact the local Linux user group. Or ask an electronic engineering
student at your university.
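As a starting point before digging into the HOWTOs, one way to see whether a directory sits on a local disk or on a network mount is to look at the first column of "df". This is only a sketch; the helper name fs_source is made up here, and the path in the commented usage line is the one quoted earlier in this thread.

```shell
# Print the device or export that holds a given directory.
# For an NFS mount this is typically "server:/exported/path";
# for a local disk it is a device name such as /dev/sda1.
fs_source() {
    df -P "$1" | awk 'NR==2 {print $1}'
}

fs_source "$HOME"

# Usage for the wien2k installation directory mentioned above:
#   fs_source /home/wien2k
```

If the output on condmat2 names a local device rather than a "server:/path" export, that machine is using its own copy of the directory.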
>
> Stefaan
>
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>
--
Dr. Torsten Andersen TA-web: http://deep.at/myspace/
AG Hübner, Department of Physics, Kaiserslautern University
http://cmt.physik.uni-kl.de http://www.physik.uni-kl.de/