[Wien] lapw1 hangs over nfs

Peter Blaha pblaha at theochem.tuwien.ac.at
Fri Dec 20 08:44:05 CET 2013


This can happen on slow networks or poorly configured ones (too few 
NFS daemons, ...).

a) k-parallelization only makes sense up to a certain granularity: 
beyond a certain number of processors, adding more does not make the 
run any faster. For instance, suppose you have 100 k-points and lapw1 
takes 20 seconds when parallelized over 20 cores (i.e. each core 
handles 5 k-points in just 20 seconds). In most setups, parallelization 
over 50 cores will then be slower, or will even fail from time to time 
because of network problems.
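
As a rough illustration of the numbers in (a), here is a small Python 
sketch. The function names are made up for this example and are not 
part of WIEN2k, and the model deliberately ignores all I/O and startup 
costs, which are exactly what dominates at this granularity:

```python
import math


def kpoints_per_core(n_kpoints: int, n_cores: int) -> int:
    """Largest number of k-points that any single core has to process."""
    return math.ceil(n_kpoints / n_cores)


def compute_seconds(n_kpoints: int, n_cores: int, sec_per_kpoint: float) -> float:
    """Pure compute time of the busiest core (no network or startup cost)."""
    return kpoints_per_core(n_kpoints, n_cores) * sec_per_kpoint


# From the example above: 5 k-points per core take 20 s, i.e. 4 s per k-point.
SEC_PER_KPOINT = 20 / 5

if __name__ == "__main__":
    for cores in (20, 50):
        print(cores, "cores:",
              kpoints_per_core(100, cores), "k-points/core,",
              compute_seconds(100, cores, SEC_PER_KPOINT), "s compute")
```

Under these assumptions, going from 20 to 50 cores can save at most 12 
seconds of compute per lapw1 step, while 50 instead of 20 processes 
open their files over the network at the same time.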

b) One can always reduce network load by defining a SCRATCH directory on 
the local nodes. These directories must exist, and in that case your 
k-list and processor list must be "compatible" (e.g. 100 k-points and 20 
cores, but not 16).
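
A quick way to check (b) before submitting a job, sketched in Python. 
It assumes that "compatible" means the number of k-points is an integer 
multiple of the number of cores, as the example (100 and 20, but not 16) 
suggests; the helper names are hypothetical and not part of WIEN2k:

```python
def is_compatible(n_kpoints: int, n_cores: int) -> bool:
    """True if the k-points can be split evenly over the cores (assumed rule)."""
    return n_cores > 0 and n_kpoints % n_cores == 0


def compatible_core_counts(n_kpoints: int, max_cores: int) -> list[int]:
    """All core counts up to max_cores that divide the k-list evenly."""
    return [n for n in range(1, max_cores + 1) if n_kpoints % n == 0]


if __name__ == "__main__":
    print(is_compatible(100, 20))           # 100 k-points on 20 cores
    print(is_compatible(100, 16))           # 100 k-points on 16 cores
    print(compatible_core_counts(100, 20))  # even splits for up to 20 cores
```

With 16 cores, 100 k-points leave a remainder of 4, so the split is 
uneven; 10 or 20 cores would divide the k-list evenly.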

On 12/19/2013 11:16 PM, Oliver Albertini wrote:
> Hello,
>
> I am running k-point parallel over nfs, and every few iterations, a
> k-point process will hang, leaving 'ghost processes' visible under the
> top command. These processes have 0% cpu utilization.
>
> Looking at the error files, the k-point in question will have this type
> of error:
>
> $ cat dnlapw1_22.error
> Error in LAPW1
>   'INILPW' - can't open unit:  11
>   'INILPW' -        filename: AgMgOCo.energydn_22
>   'INILPW' -          status: unknown      form: formatted
>   'LAPW1' - INILPW aborted unsuccessfully.
>   'Unknow' - Unknown signal received
>
>
> However, case.energydn_22 is present, but empty.
>
> I suspect that this could be related to network speed. Has anyone had a
> similar experience?
>
> Sincerely,
>
> Oliver  Albertini

-- 

                                       P.Blaha
--------------------------------------------------------------------------
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300             FAX: +43-1-58801-165982
Email: blaha at theochem.tuwien.ac.at    WWW: 
http://info.tuwien.ac.at/theochem/
--------------------------------------------------------------------------