[Wien] Suse 9.2 problem

Alex Scherbatey alex_sch at imp.kiev.ua
Wed Jun 1 18:22:35 CEST 2005


Dear Stefaan,

On Tue, 31 May 2005 17:08:06 +0300, Stefaan Cottenier
<Stefaan.Cottenier at fys.kuleuven.be> wrote:

>
> Dear all,
>
> While setting up a new PC cluster using P4 machines and Suse 9.2, we
> have trouble with k-point parallel execution. lapw1 is executed
> correctly on all machines in a parallel run (the 'fortran stop' statement
> appears, all scf1 files are complete and correct, all error files are
> empty), but nevertheless lapw1para keeps running forever. The last lines
> in case.dayfile are:
>
> [1] 4479
> [2] 4512
> [3] 4545
> [1]    Done                          ( $remote $machine[$p]  ...
> waiting for all processes to complete
>
> which shows that one of the three machines reported that its process had
> completed, while the other two never did. In other words, lapw1para is
> stuck in the 'wait' statement:
>
>
> endkloop:
> if ($debug > 0) echo waiting for all processes to complete
> wait         <========================================
>
> if ($debug > 0) echo `date`" ->" "all processes done."
> sleep $sleepy
>
> We verified that exactly the same code works fine on another PC cluster
> with Suse 9.0, and that code copied from that other cluster has the same
> problem when run on this new cluster. The NFS setup is as basic as could
> be, and is identical on both clusters. We therefore suspect the problem
> is related to Suse 9.2. I'm pretty sure many of you are running wien2k
> with Suse 9.2. Has anyone experienced a similar problem and found a fix?

I have also encountered this bug. I think the problem lies in some
SUSE shared library that is used by the C shell. Run the following test
script on SUSE and you will see the same behavior:

#!/bin/csh -xf
sleep 10 &
wait

The same script run under bash works properly, and the csh script also
works on every other Linux distribution I have used. Updating tcsh
changes nothing, and neither does updating the Linux kernel.
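
For comparison, a bash counterpart of the csh test above might look like
the sketch below. The shorter sleep and the elapsed-time report are
assumptions added for illustration and are not part of the original
test; on a healthy shell, wait should return as soon as the background
sleep exits.

```shell
#!/bin/bash
# Bash counterpart of the csh test (illustrative sketch).
# Assumption: a 2-second sleep is used so the check finishes quickly.
start=$(date +%s)
sleep 2 &
wait            # blocks until the background sleep has exited
end=$(date +%s)
elapsed=$((end - start))
echo "wait returned after ${elapsed}s"
```

If wait never returns here either, the problem is not specific to csh.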

I have posted bug reports to several Linux forums and also to the SUSE
developers, but nobody has answered.

So, I'm going to change the Linux distribution :(

-- 
Шурик


