[Wien] problem with running several simultaneous sessions

Laurence Marks L-marks at northwestern.edu
Fri Oct 8 16:18:17 CEST 2010


In general one cannot efficiently run more than one MPI task on a
given node and/or core without running into problems with memory,
swapping, and communications; MPI is not designed for this. No code,
WIEN2k or otherwise, will handle it well; just live with one MPI task
at a time.
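
If two jobs really have to run at the same time, the workaround is to
give each one a completely disjoint set of nodes in its .machines file,
so that no node appears in both. A minimal sketch (the node names here
are made up; substitute whatever your queue assigns):

Task1's .machines:
  granularity:1
  1:node101:1 node102:1
  1:node103:1 node104:1
  lapw0: node101:1 node102:1 node103:1 node104:1

Task2's .machines:
  granularity:1
  1:node105:1 node106:1
  1:node107:1 node108:1
  lapw0: node105:1 node106:1 node107:1 node108:1

As a quick sanity check you can log into a node and look at the running
processes, e.g.

  ssh node105 'ps -ef | grep lapw'

If processes belonging to more than one job show up on the same node,
the jobs are oversubscribed and will fight over memory and the
interconnect.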

On Fri, Oct 8, 2010 at 9:23 AM, Kakhaber Jandieri
<Kakhaber.Jandieri at physik.uni-marburg.de> wrote:
> Dear WIEN2K users
>
> I use WIEN2K_09.1, compiled with /opt/intel/mpich-1.2.5.3/bin/mpif90
> and /opt/intel/mkl/10.0.1.014/lib/em64t, on a cluster of dual-core
> AMD Opteron processors.
>
> Recently I was faced with the following problem.
>
> I ran a task with MPI parallelization. It executed normally, with the
> usual sequence of SCF cycles.
> Then I started a second task, also with MPI parallelization.
> As a result, Task2 executed normally, while Task1 became
> completely idle.
>
> These are the corresponding .machines files:
>
> For the Task1:
> granularity:1
> 1:node106:1 node111:1 node141:1 node105:1
> 1:node113:1 node109:1 node104:1 node114:1
> 1:node123:1 node118:1 node129:1 node132:1
> 1:node140:1 node130:1 node120:1 node134:1
> 1:node122:1 node126:1 node131:1 node135:1
> lapw0: node106:1 node111:1 node141:1 node105:1 node113:1 node109:1
> node104:1 node114:1 node123:1 node118:1 node129:1 node132:1 node140:1
> node130:1 node120:1 node134:1 node122:1 node126:1 node131:1 node135:1
>
> For the Task2:
> granularity:1
> 1:node105:1 node113:1 node109:1 node116:1
> 1:node114:1 node110:1 node111:1 node141:1
> 1:node130:1 node120:1 node134:1 node122:1
> 1:node126:1 node131:1 node135:1 node124:1
> 1:node140:1 node123:1 node118:1 node129:1
> lapw0: node105:1 node113:1 node109:1 node116:1 node114:1 node110:1
> node111:1 node141:1 node130:1 node120:1 node134:1 node122:1 node126:1
> node131:1 node135:1 node124:1 node140:1 node123:1 node118:1 node129:1
>
>
> One can see from the .machines files that some nodes are shared
> between the two tasks (for example node105). While Task1 was running,
> I saw its process on node105. When Task2 was added, the Task1 process
> became idle and I was left with only the Task2 process running.
> Furthermore, even for a single task, if several cores of the same
> node are to be used (for example .... node105:2 .... in the .machines
> file), I see a single process (instead of several) on that node.
>
> In summary: in most cases I cannot run several tasks simultaneously;
> only the last task is executed, and the others become idle.
>
> I could not find the reason for this behavior and would be extremely
> thankful for any suggestions and advice.
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>



-- 
Laurence Marks
Department of Materials Science and Engineering
MSE Rm 2036 Cook Hall
2220 N Campus Drive
Northwestern University
Evanston, IL 60208, USA
Tel: (847) 491-3996 Fax: (847) 491-7820
email: L-marks at northwestern dot edu
Web: www.numis.northwestern.edu
Chair, Commission on Electron Crystallography of IUCR
www.numis.northwestern.edu/
Electron crystallography is the branch of science that uses electron
scattering and imaging to study the structure of matter.

