[Wien] lapw2 mpi parallelization limits
Peter Blaha
pblaha at theochem.tuwien.ac.at
Wed Mar 18 07:38:24 CET 2009
Your fix is a good choice, since lapw2 is fast anyway.
I rather suspect it has to do with the specific test case.
Do you have the same problems with another test (you may need to ask
your research group for a big one)?
If need be, you can send me the basic input data (*.in*, *.struct, *.klist,
case.clmsum/up/dn (if spin-polarized)) as a tar file (to my private email)
and I'll try this case myself. One never knows which strange bugs are present.
Reading and interpreting the strace output is not straightforward for me. It
seems to happen in subroutine atpar (while reading...)? I prefer good old
"print*, ..." statements; after a couple of test runs I can usually trace the
line where the problem appears.
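As an aside, the k-point and atom counts asked about below can usually be read straight from the case files. A rough sketch (not part of the original thread), assuming the usual WIEN2k conventions — case.klist terminated by an "END" line, and the nonequivalent-atom count as the last field of the second line of case.struct; toy files are generated here for illustration, so verify against your real case directory:

```shell
# Toy case.klist: three k-points followed by the END marker
# (a real file comes from kgen and lives in the case directory).
printf '  1  0 0 0 1  1.0\n  2  1 0 0 2  1.0\n  3  1 1 0 2  1.0\nEND\n' > case.klist
# Toy case.struct: title line, then the lattice/atom-count line.
printf 'CeCoIn5\nP   LATTICE,NONEQUIV.ATOMS:  4\n' > case.struct

# k-points: count the lines before the END marker
awk '/^END/{exit} {n++} END{print n}' case.klist   # -> 3

# nonequivalent atoms: last field of line 2 of case.struct
awk 'NR==2{print $NF; exit}' case.struct           # -> 4
```

With the researcher's actual files, only the two awk lines are needed.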
>
>> How many different k-points and how many different atoms are you using?
>
> How do I find this? I'm the sysadmin, not the researcher. I'm using a
> config given to me by the researcher. I think it is 4 atoms (Ce, Co, In,
> In) and 84 k-points, but I'm not positive. Only one parallel job (i.e.,
> strictly MPI parallelization). I believe it was taken from an example in
> a presentation.
>
>> No ideas at the moment beyond turn on all plausible debug flags (not
>> fun); someone like Peter Blaha may have some suggestions tomorrow.
>
> See the strace I sent before. I enabled debugging and csh tracing (-xv), but
> it takes me straight to mpirun, which then crashes.
>
> Incidentally, I used the attached patch to lure WIEN into running. It
> forces a maximum of 4 CPUs during the lapw2 stage. Ugly, but it will work
> for us until the underlying problem gets solved.
>
> Scott
>
>
>
> _______________________________________________
> Wien mailing list
> Wien at zeus.theochem.tuwien.ac.at
> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien