[Wien] MPI Problem

Oliver Albertini ora at georgetown.edu
Sat May 4 01:46:43 CEST 2013


Thanks to you both for the suggestions. The OS was recently updated beyond
those versions mentioned in the link (now 6100-08).

Adding the iostat statement to all the errclr.f files prevents the program
from stopping altogether, although error messages still appear in the output:

STOP  LAPW0 END
STOP  LAPW0 END
STOP  LAPW0 END
STOP  LAPW0 END
STOP  LAPW0 END
STOP LAPW1 - Error
STOP  LAPW1 END
STOP  LAPW1 END
STOP  LAPW1 END
STOP  LAPW1 END
STOP LAPW1 - Error
STOP  LAPW1 END
STOP  LAPW1 END
STOP  LAPW1 END
STOP  LAPW1 END
STOP LAPW2 - FERMI; weighs written
STOP  LAPW2 END
STOP  LAPW2 END
STOP  LAPW2 END
STOP  LAPW2 END
STOP  LAPW2 END
STOP  SUMPARA END
STOP LAPW2 - FERMI; weighs written
STOP  LAPW2 END
STOP  LAPW2 END
STOP  LAPW2 END
STOP  LAPW2 END
STOP  LAPW2 END
STOP  SUMPARA END
STOP  CORE  END
STOP  CORE  END
STOP  MIXER END


These errors are more prevalent at higher processor counts. After
completing a few runs with more processors, I also see that the run times
increase steadily with the processor count:

          real        user       sys
serial    6m43.33s    6m19.18s   0m13.59s
2proc     10m36.03s   1m4.68s    0m47.79s
4proc     11m11.25s   1m5.24s    0m52.17s
8proc     11m39.17s   1m6.18s    1m10.65s
16proc    14m31.16s   1m7.95s    2m7.63s
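
To make the trend in these timings easier to read, here is a minimal Python sketch that converts them to seconds and derives three ratios: wall-clock time relative to the serial run, CPU time (user + sys) as a fraction of wall-clock time, and the share of CPU time spent in sys. The numbers are copied from the runs above; the names TIMINGS, to_seconds and summarize are my own, purely for illustration. One assumption to keep in mind: `time` only accounts for the launching process and the children it waits for, so under poe the user/sys figures may not include all of the parallel tasks.

```python
import re

# Timings copied from the `time` output above: label -> (real, user, sys).
TIMINGS = {
    "serial": ("6m43.33s", "6m19.18s", "0m13.59s"),
    "2proc": ("10m36.03s", "1m4.68s", "0m47.79s"),
    "4proc": ("11m11.25s", "1m5.24s", "0m52.17s"),
    "8proc": ("11m39.17s", "1m6.18s", "1m10.65s"),
    "16proc": ("14m31.16s", "1m7.95s", "2m7.63s"),
}


def to_seconds(stamp):
    """Convert a `time`-style string such as '6m43.33s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time format: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def summarize(timings, baseline="serial"):
    """Return per-run ratios derived from the real/user/sys timings."""
    base_real = to_seconds(timings[baseline][0])
    rows = {}
    for label, (real, user, sys_) in timings.items():
        real_s = to_seconds(real)
        sys_s = to_seconds(sys_)
        cpu_s = to_seconds(user) + sys_s
        rows[label] = {
            "real_s": real_s,                 # wall-clock time in seconds
            "vs_serial": real_s / base_real,  # > 1 means slower than serial
            "cpu_fraction": cpu_s / real_s,   # (user + sys) / real
            "sys_share": sys_s / cpu_s,       # sys / (user + sys)
        }
    return rows


if __name__ == "__main__":
    for label, row in summarize(TIMINGS).items():
        print(
            f"{label:>7}  real={row['real_s']:7.2f}s  "
            f"x{row['vs_serial']:.2f} vs serial  "
            f"cpu/real={row['cpu_fraction']:.2f}  "
            f"sys/cpu={row['sys_share']:.2f}"
        )
```

Read this way, the 16-processor run takes a bit more than twice the serial wall-clock time, the timed process is on the CPU for less than a quarter of the wall-clock time in every parallel run, and the sys share of that CPU time grows from a few percent in the serial run to roughly two thirds at 16 processors. That would be consistent with time going into waiting or system-level overhead rather than computation, although these figures alone cannot say which.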

I looked into several IBM Parallel Operating Environment (poe)
environment variables (MP_SHARED_MEMORY, MP_IO_BUFFER_SIZE, MP_EAGER_LIMIT),
but none of them seems to improve performance. Any ideas why this is
getting slower?


On Thu, May 2, 2013 at 8:49 PM, Gavin Abo <gsabo at crimson.ua.edu> wrote:

>
>> STOP  LAPW0 END
>> "inilpw.f", line 233: 1525-142 The CLOSE statement on unit 200 cannot be
>> completed because an errno value of 2 (A file or directory in the path name
>> does not exist.) was received while closing the file.  The program will
>> stop.
>> STOP  LAPW1 END
>>
> If this is on operating system AIX 6.1
> [http://zeus.theochem.tuwien.ac.at/pipermail/wien/2013-March/018560.html],
> the following link mentions that a fix might be needed for some release
> levels:
>
> http://www-01.ibm.com/support/docview.wss?uid=isg1IZ23555