[Wien] Which mpi?

Masato YOSHIYA yoshiya at ams.eng.osaka-u.ac.jp
Thu Jan 17 17:13:22 CET 2008


Dear all:

My comments below are based not only on Wien2k but also on other
ab-initio/first-principles codes, which consume relatively more memory
and run fewer iterations than, say, molecular dynamics codes.

> With the introduction of an mpi benchmark (thanks Peter), I would like
> to start a thread on this which would help me and perhaps others. Some
> questions:
> 1) Has anyone tested Intel's mpi to see how much better (if at all) it is?

In our case, three points need to be evaluated for this question: (1)
stability, (2) commanding procedure (how a parallel run is launched and
stopped), and (3) speed.

As far as I have tried Intel's MPI with another pseudo-potential code, I
saw no remarkable difference in any of the three points above, and I
have never heard opinions strongly in favor of Intel's compared with
MPICH1, MPICH2, LAM, or OpenMPI. One advantage of Intel's over the free
ones is that if you buy it you receive support.

> 2) Is there much difference between mpich-1 and mpich-2?

In my experience, the difference between mpich-1 and mpich-2 is smaller
than the statistical error, as long as you do not hit a bug specific to
a particular version. The exception is (2) commanding procedure: MPICH-1
and OpenMPI do not require any daemon to be started prior to an actual
parallel run, while MPICH-2 and LAM require a daemon to be booted before
a parallel run is executed. And, if you use one that requires the
daemon, you can kill all the processes of a parallel run safely.
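
To make the "difference versus statistical error" comparison concrete,
here is a minimal Python sketch. It assumes you have repeated wall-clock
timings (in seconds) of the same job under each MPI. The function name,
the n_sigma threshold, and the use of the standard error of the mean are
illustrative assumptions, not part of any Wien2k benchmark.

```python
import math
import statistics


def difference_within_noise(times_a, times_b, n_sigma=2.0):
    """Compare two sets of repeated timings (seconds).

    Returns (diff, noise, within), where diff is the absolute difference
    of the mean times, noise is n_sigma times the combined standard error
    of the two means, and within is True when diff <= noise.
    Each list needs at least two timings.
    """
    mean_a = statistics.mean(times_a)
    mean_b = statistics.mean(times_b)
    # Standard error of each mean, from the sample standard deviation.
    se_a = statistics.stdev(times_a) / math.sqrt(len(times_a))
    se_b = statistics.stdev(times_b) / math.sqrt(len(times_b))
    diff = abs(mean_a - mean_b)
    noise = n_sigma * math.sqrt(se_a ** 2 + se_b ** 2)
    return diff, noise, diff <= noise
```

Call it with the lists of timings from several runs of the same case
under each MPI. If within is True, the measured difference cannot be
separated from run-to-run scatter at the chosen threshold.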

I'm (still) using both mpich-1 (ver. 1.2.6) and mpich-2 (ver. 1.0.5p4),
depending on, what? The weather of the day.

> 3) Is there much effect for 1) and 2) with ethernet versus myrinet or
> infiniband?

Sorry, I have no idea.

> 4) Should one use rsh/ssh or something different for multiple CPU's on
> one computer?

If you execute a parallel run using mpiXX, you need to use either rsh
or ssh, even if you are only using other cores/CPUs in the same
computer. But configuring the routing table to use the loopback
interface rather than the NIC to reach the machine's own other CPUs
greatly reduces the loss in communication speed.
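
As a rough check related to the loopback point, the Python sketch below
reports whether a host name resolves only to loopback addresses. It is an
illustration under assumptions: it inspects name resolution through the
system resolver, not the routing table itself, and the function name is
hypothetical.

```python
import ipaddress
import socket


def resolves_to_loopback(hostname):
    """Return True if every address the name resolves to is a loopback address."""
    infos = socket.getaddrinfo(hostname, None)
    # The sockaddr tuple is the last field; its first element is the address.
    addrs = {info[4][0] for info in infos}
    # Strip any IPv6 scope suffix such as "%eth0" before parsing.
    return all(ipaddress.ip_address(a.split("%")[0]).is_loopback for a in addrs)
```

For example, resolves_to_loopback(socket.gethostname()) tells you whether
the machine's own name points at the loopback interface or at a NIC
address.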

Hope this helps.

Looking forward to hearing others' opinions on Wien2k, since I have only
a little experience with parallel Wien2k.

Masato

