[Wien] problem with running of several simultaneous sessions

Kakhaber Jandieri Kakhaber.Jandieri at physik.uni-marburg.de
Fri Oct 8 16:46:53 CEST 2010


Laurence Marks wrote:
> In general one cannot efficiently run more than one MPI task on a
> given node and/or core without running into problems with memory,
> swapping and communication; MPI is not designed for this. No code,
> Wien2k or any other, will handle it well; just live with one MPI
> task at a time.
>
> On Fri, Oct 8, 2010 at 9:23 AM, Kakhaber Jandieri
> <Kakhaber.Jandieri at physik.uni-marburg.de> wrote:
>   
>> Dear WIEN2K users
>>
>> I use WIEN2K_09.1, compiled with /opt/intel/mpich-1.2.5.3/bin/mpif90
>> and /opt/intel/mkl/10.0.1.014/lib/em64t, on a cluster of dual-core
>> AMD Opteron processors.
>>
>> Recently I was faced with the following problem.
>>
>> I ran a task (Task1) with MPI parallelization. Its execution was
>> quite normal, with the usual sequence of SCF cycles.
>> Then I started a second task (Task2), also with MPI parallelization.
>> As a result, Task2 executed normally, while Task1 became completely
>> idle.
>>
>> These are the corresponding .machines files:
>>
>> For the Task1:
>> granularity:1
>> 1:node106:1 node111:1 node141:1 node105:1
>> 1:node113:1 node109:1 node104:1 node114:1
>> 1:node123:1 node118:1 node129:1 node132:1
>> 1:node140:1 node130:1 node120:1 node134:1
>> 1:node122:1 node126:1 node131:1 node135:1
>> lapw0: node106:1 node111:1 node141:1 node105:1 node113:1 node109:1
>> node104:1 node114:1 node123:1 node118:1 node129:1 node132:1 node140:1
>> node130:1 node120:1 node134:1 node122:1 node126:1 node131:1 node135:1
>>
>> For the Task2:
>> granularity:1
>> 1:node105:1 node113:1 node109:1 node116:1
>> 1:node114:1 node110:1 node111:1 node141:1
>> 1:node130:1 node120:1 node134:1 node122:1
>> 1:node126:1 node131:1 node135:1 node124:1
>> 1:node140:1 node123:1 node118:1 node129:1
>> lapw0: node105:1 node113:1 node109:1 node116:1 node114:1 node110:1
>> node111:1 node141:1 node130:1 node120:1 node134:1 node122:1 node126:1
>> node131:1 node135:1 node124:1 node140:1 node123:1 node118:1 node129:1
>>
>>
>> One can see from the .machines files that some nodes are shared
>> between the two tasks (for example node105). While Task1 was running
>> alone, I saw the corresponding process on node105. When Task2 was
>> added, the process for Task1 became idle and I was left with only the
>> process running for Task2.
>> Furthermore, even in the case of a single task, if several cores of
>> the same node are to be used (for example .... node105:2 .... in the
>> .machines file), I see a single process (instead of several) on that
>> node.
>>
>> In summary: in most cases I cannot run several tasks simultaneously;
>> only the most recently started task executes, and the others become
>> idle.
>>
>> I could not find the reason for this behavior and would be extremely
>> thankful for any suggestions or advice.
>> _______________________________________________
>> Wien mailing list
>> Wien at zeus.theochem.tuwien.ac.at
>> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>>
>>     
>
>
>
>   
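The node overlap described in the question can be checked mechanically before submitting a second job. A minimal shell sketch, assuming the two jobs' .machines files live in per-task directories (the paths and the trimmed file contents below are illustrative, taken from the fragments quoted above):

```shell
# Recreate two trimmed .machines fragments (illustrative contents).
mkdir -p task1 task2
cat > task1/.machines <<'EOF'
granularity:1
1:node106:1 node111:1 node141:1 node105:1
EOF
cat > task2/.machines <<'EOF'
granularity:1
1:node105:1 node113:1 node109:1 node116:1
EOF

# Extract the unique node names from each file, then print the names
# common to both: any host listed here would be asked to run two MPI
# jobs at once.
nodes() { grep -oE 'node[0-9]+' "$1" | sort -u; }
nodes task1/.machines > t1.nodes
nodes task2/.machines > t2.nodes
comm -12 t1.nodes t2.nodes
```

For the two fragments above this prints node105, the shared host mentioned in the question; an empty output would mean the node lists are disjoint.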
Dear Laurence,

Thank you very much for the information.


