[Wien] problems in wien2k 10 run
saed alazar
q_saed74 at yahoo.com
Thu May 19 17:37:12 CEST 2011
The code works just fine on the student cluster where we have compiled
it using the extra '-assu buff' flag to alleviate NFS problems. I think
we have seen that the optimisation jobs work well there.
There are still problems with the optimisation jobs running on the planck cluster though.
It seems that there are 2 main problems:
1. One node appears to be doing nearly all the work while the others do
little, as seen in the dayfile. However, if we log in to the job nodes
while the job is running and run the top command, all cores seem to be
using ~100% CPU and the load is normal. Also, 'cat /proc/meminfo' shows
there is plenty of free memory (as there should be, since these nodes each
have 32GB RAM).
2. After some time it becomes impossible to log in to your home
directory, and I cannot even log in to the job node from the console on
the machine. I also cannot delete the job from the queues. This means I
then have to turn the queuing system off (qterm -t quick), remove the job
files from the jobs directory (rm
-rf /usr/local/torque/server_priv/jobs/2627.planck.*) and then restart
the queuing system (/usr/local/torque/sbin/pbs_server -t warm). I then
still have to reboot the nodes that were involved with that job. This is a problem.
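The recovery steps for the second problem can be sketched as a small script. This is only a rough sketch: the helper names (run, recover_stuck_job) and the DRY_RUN and TORQUE_HOME variables are made up for illustration, and only the three commands themselves come from the steps above. By default it prints the commands rather than running them.

```shell
#!/bin/sh
# Sketch of the manual recovery sequence for a job that cannot be deleted.
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"
# Install prefix as it appears in the paths above; adjust if yours differs.
TORQUE_HOME="${TORQUE_HOME:-/usr/local/torque}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical helper: $1 is the job file prefix, e.g. 2627.planck
recover_stuck_job() {
    job_prefix="$1"
    # 1. Stop the queuing system.
    run qterm -t quick
    # 2. Remove the files belonging to the stuck job.
    run rm -rf "$TORQUE_HOME/server_priv/jobs/$job_prefix".*
    # 3. Restart the queuing system.
    run "$TORQUE_HOME/sbin/pbs_server" -t warm
}

recover_stuck_job "${1:-2627.planck}"
```

The nodes that ran the job still have to be rebooted by hand afterwards; the script does not attempt that.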
The main difference between the planck cluster and the student cluster is
that the planck cluster has a GPFS parallel file system and does not use
NFS (well actually, GPFS uses something like NFS). The problems we were
seeing on the student cluster disappeared when we recompiled with the extra
'-assu buff' flag. I have recompiled wien2k on the planck cluster with this
flag, but it does not fix the problems.
Other than that, both machines are running the RHEL 5.3 operating system,
and openmpi and fftw have been compiled the same way, as has wien2k.
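For the first problem, the per-node check (top and 'cat /proc/meminfo') can be repeated across all job nodes in one go. The following is a minimal sketch, assuming passwordless ssh to the nodes and a file listing one node name per line (for example a copy of the job's node file); the function names are hypothetical.

```shell
#!/bin/sh
# Summarise MemFree and MemTotal (converted from kB to MiB) from
# /proc/meminfo-formatted text read on stdin.
meminfo_summary() {
    awk '/^MemTotal:/ {total=$2}
         /^MemFree:/  {free=$2}
         END {printf "free=%d MiB total=%d MiB\n", free/1024, total/1024}'
}

# Hypothetical helper: $1 is a file with one node name per line.
# Prints the load averages (uptime) and a memory summary for each node.
check_nodes() {
    for node in $(sort -u "$1"); do
        echo "== $node"
        ssh "$node" uptime
        ssh "$node" cat /proc/meminfo | meminfo_summary
    done
}
```

Calling check_nodes with the node list while the job is running gives a quick side-by-side view to compare against what the dayfile reports.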
Thanks
Said