[Wien] problem with running of several simultaneous sessions
Kakhaber Jandieri
Kakhaber.Jandieri at physik.uni-marburg.de
Fri Oct 8 16:23:02 CEST 2010
Dear WIEN2K users,
I use WIEN2K_09.1, compiled with /opt/intel/mpich-1.2.5.3/bin/mpif90 and
/opt/intel/mkl/10.0.1.014/lib/em64t, on a cluster of dual-core AMD
Opteron processors.
Recently I ran into the following problem.
I started a task (Task1) with MPI parallelization. It ran normally, with
the usual sequence of SCF cycles.
Then I started a second task (Task2), also with MPI parallelization.
As a result, Task2 ran normally, while Task1 became completely idle.
These are the corresponding .machines files:
For the Task1:
granularity:1
1:node106:1 node111:1 node141:1 node105:1
1:node113:1 node109:1 node104:1 node114:1
1:node123:1 node118:1 node129:1 node132:1
1:node140:1 node130:1 node120:1 node134:1
1:node122:1 node126:1 node131:1 node135:1
lapw0: node106:1 node111:1 node141:1 node105:1 node113:1 node109:1
node104:1 node114:1 node123:1 node118:1 node129:1 node132:1 node140:1
node130:1 node120:1 node134:1 node122:1 node126:1 node131:1 node135:1
For the Task2:
granularity:1
1:node105:1 node113:1 node109:1 node116:1
1:node114:1 node110:1 node111:1 node141:1
1:node130:1 node120:1 node134:1 node122:1
1:node126:1 node131:1 node135:1 node124:1
1:node140:1 node123:1 node118:1 node129:1
lapw0: node105:1 node113:1 node109:1 node116:1 node114:1 node110:1
node111:1 node141:1 node130:1 node120:1 node134:1 node122:1 node126:1
node131:1 node135:1 node124:1 node140:1 node123:1 node118:1 node129:1
One can see from the .machines files that some nodes are shared between
the two tasks (for example, node105). While only Task1 was running, I saw
the corresponding process on node105. When Task2 was added, the Task1
process became idle, and only the Task2 process kept running.
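
The overlap between the two node lists can be checked mechanically. The
following Python sketch is illustrative only and is not part of WIEN2k: the
function name hosts_in_machines and the simplified handling of .machines
lines (skipping "granularity" and "extrafine", stripping a leading "1:" or
"lapw0:" label, treating every remaining token as host:count) are
assumptions. The sample strings are just the first host line of each file
quoted above.

```python
import re


def hosts_in_machines(text):
    """Return the set of host names found in the body of a .machines file.

    Assumption: every token on a host line looks like 'host' or 'host:count'.
    """
    hosts = set()
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments and keyword lines that carry no host names.
        if not line or line.startswith(("#", "granularity", "extrafine")):
            continue
        # Drop a leading weight ("1:") or the "lapw0:" label, if present.
        match = re.match(r"^(\d+|lapw0)\s*:\s*(.*)$", line)
        body = match.group(2) if match else line
        for token in body.split():
            hosts.add(token.split(":")[0])
    return hosts


# First host line of each .machines file from this post.
task1 = "granularity:1\n1:node106:1 node111:1 node141:1 node105:1\n"
task2 = "granularity:1\n1:node105:1 node113:1 node109:1 node116:1\n"

shared = hosts_in_machines(task1) & hosts_in_machines(task2)
print(sorted(shared))  # prints ['node105']
```

Running the same function on the complete files would list every node that
both tasks request.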
Furthermore, even with just a single task, if several cores of the same
node are requested (for example .... node105:2 .... in the .machines
file), I see a single process on this node instead of several.
In summary: in most cases I cannot run several tasks simultaneously;
only the last task started is executed. The others become idle.
I could not find the reason for this behavior and would be extremely
thankful for any suggestions and advice.