[Wien] error in k-point parallel execution

Torsten Andersen thor at physik.uni-kl.de
Thu Jul 29 09:27:19 CEST 2004


Stefaan Cottenier wrote:
>>>What I mean is: When you log in on condmat2, do you have the same home
>>>directory as on localhost? You can test this with "touch file_of_zero"
>>>on localhost, then see if it is also present on condmat2.
>>
>>I executed the command "touch file_of_zero" on both "localhost" and
>>"condmat2" and no output resulted but the shell prompt.
> 
> 
> What Torsten meant is this: the command 'touch file_of_zero' will create a
> file with name 'file_of_zero' and no content (this gives you indeed no
> output but the prompt). Do 'ls -l file_of_zero', to see whether this file is
> created on localhost (it must). Then go to condmat2, and do 'ls -l
> file_of_zero' again (without touch first). Does that same file also exist
> there? If so, then localhost and condmat2 share the same directory, as it
> should. If not, then wien2k will not work in parallel.
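
A minimal sketch of that check as a single shell function, assuming
passwordless ssh to the other node and the same absolute path to the
directory on both machines. The names check_shared_dir and REMOTE_SH are
made up for illustration; they are not part of wien2k.

```shell
# Hypothetical helper, not part of wien2k.
# REMOTE_SH is an assumed override; it defaults to ssh (use rsh if that is
# what your cluster is configured for).
REMOTE_SH=${REMOTE_SH:-ssh}

check_shared_dir() {
    host=$1
    dir=${2:-$HOME}                 # directory to test, home by default
    marker="$dir/file_of_zero.$$"   # unique empty marker file

    touch "$marker" || return 2     # create it locally

    # Look for the same file from the other machine (no touch there).
    if $REMOTE_SH "$host" ls -l "$marker" >/dev/null 2>&1; then
        echo "$host: shared"
        status=0
    else
        echo "$host: NOT shared"
        status=1
    fi

    rm -f "$marker"                 # clean up the marker
    return $status
}
```

Called on localhost as "check_shared_dir condmat2", it reports whether
condmat2 sees the file that was just created locally.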
> 
> 
>>>What do you get from "echo $SCRATCH" on localhost and condmat2?
>>
>>I get "./" from both pc's.
> 
> 
> If the directories are indeed shared, ./ should be OK. But for this type of
> cluster, it is more efficient to set $SCRATCH to /tmp: the large vector
> files will then be stored in the local /tmp of each machine, and will not
> fill the single disk of localhost (but don't forget to clean the /tmp of
> each machine from time to time...)
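
For a bash login shell this could look like the sketch below (for example
in ~/.bashrc, which is an assumption about your setup). The helper
list_old_vectors is hypothetical, and the '*.vector*' pattern assumes the
usual case.vector* naming of the wien2k vector files; check it against
what actually sits in your /tmp before deleting anything.

```shell
# Assumption: bash is the login shell on every node.
export SCRATCH=/tmp

# Hypothetical helper: list vector files older than a given number of days
# in a scratch directory, so they can be reviewed and removed by hand.
list_old_vectors() {
    dir=${1:-/tmp}      # scratch directory to scan
    days=${2:-7}        # age threshold in days
    # Errors from unreadable subdirectories of /tmp are silenced.
    find "$dir" -type f -name '*.vector*' -mtime +"$days" -print 2>/dev/null
}
```

Running "list_old_vectors /tmp 7" on each machine shows the candidates for
cleanup; nothing is deleted by the function itself.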
> 
> 
>>>What do you get from "echo $path" on localhost and condmat2?
>>
>>I get nothing using "echo $path" but using "env" I get:
> 
> 
> For your bash, you need 'echo $PATH'.
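
The difference is the shell: $path (lower case) is the csh/tcsh list
variable, while sh and bash only know $PATH. A small sketch for bash, with
a made-up helper path_contains, to test whether the directory reported by
"which lapw1" is on the search path:

```shell
# Hypothetical helper: succeed if directory $1 is an entry of $PATH.
path_contains() {
    # Wrap both sides in colons so only whole entries match.
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

For example, "path_contains /home/wien2k/.WIEN_ROOT && echo found". Keep in
mind that a non-interactive remote shell may set up a different PATH than
the one you see after logging in.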
> 
> 
>>>When you log in on condmat2, can you execute lapw1 manually? Try, in a
>>>shell: "which lapw1", and if it gives you a response, try "lapw1". What
>>>is the result?
>>
>>I tried "which lapw1" on both PC's and obtained:
>>"/home/wien2k/.WIEN_ROOT/lapw1".
>>
>>I tried "run lapw1" on both PC's in the directory TiC which I had already
>>worked with, and obtained:
>>------------
>>ERROR: option lapw1 does not exist !
> 
> 
> 'run lapw1' does not exist. Use 'x lapw1' instead. (But with 'run lapw1' you
> effectively use the 'run' command, as your output shows.)
> 
> Conclusion: it seems you have installed wien2k on all machines separately,
> and did not properly connect their disks. Therefore serial calculations
> work, parallel ones don't. In a good set-up, you have to install wien2k only
> ONCE on a parallel cluster, namely on the disk that is NFS-shared by all
> machines (but I can't help you with the technical details about how to do
> that).

Right! Sorry for repeating it before I read your mail, Stefaan. I seem 
to be a bit late in the forwarding stream after I moved to Kaiserslautern.

I would recommend searching the Linux HOWTOs to resolve the problem. 
Or contact the local Linux UG. Or ask an electronic engineering student 
at your university.

> 
> Stefaan
> 
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
> 

-- 
Dr. Torsten Andersen        TA-web: http://deep.at/myspace/
AG Hübner, Department of Physics, Kaiserslautern University
http://cmt.physik.uni-kl.de    http://www.physik.uni-kl.de/



