[Wien] lapw2 QTL-B crash with MPI, but not with k-parallel
Johan Eriksson
joher at ifm.liu.se
Fri Jun 13 09:47:56 CEST 2008
Dear Wien community,
I'm running the latest Wien2k release on a Linux cluster (IFORT 10.1,
cmkl 9.1, openmpi 1.2.5).
The cases are running fine with k-point parallelization + MPI lapw0.
However, since there are many more cpus than k-points and infiniband
interconnects I want to use full MPI parallelization. First I ran my
case with k-point parallel for a few cycles, stopped, ran clean_lapw and
then switched to MPI. After a few iterations I started getting QTL-B
warnings and the run crashed. If I switch back to k-point parallel it runs just
fine again.
What am I doing wrong here? Could it be that I'm using the iterative
diagonalization scheme (-it switch)? Should I try some other MKL or MPI
implementation?
Also, why is it that the serial benchmark 'x lapw1 -c' is so unstable
with mkl 10 when using OMP_NUM_THREADS>=4? With cmkl 9.1 it works fine
with 1,2,4 and 8 threads. When mkl 10 works it is however faster than
cmkl 9.1.
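
The thread-count comparison above can be scripted roughly as follows. This is only a minimal sketch, assuming a bash shell and that the benchmark is started from the case directory. The function name run_thread_sweep and the BENCH_CMD variable are hypothetical helpers introduced for illustration; only 'x lapw1 -c' and OMP_NUM_THREADS come from the setup described here.

```shell
#!/bin/bash
# Hypothetical helper: run the serial benchmark once per thread count.
# BENCH_CMD is an assumed override hook; it defaults to the benchmark command.
BENCH_CMD="${BENCH_CMD:-x lapw1 -c}"

run_thread_sweep() {
    local n
    for n in 1 2 4 8; do
        echo "OMP_NUM_THREADS=$n"
        # Set the thread count for this run only; report a failure and go on.
        OMP_NUM_THREADS="$n" $BENCH_CMD || echo "run with $n threads failed"
    done
}
```

Calling run_thread_sweep once per MKL build should make it easier to see at which thread count the runs start to fail.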
/Johan Eriksson