Revision 4 as of 2010-05-11 16:47:46
Line 10: | Line 10: |
This is not strictly a magpar bug but a problem with MPICH2 (version 1.2 or later), which affects parallel runs on multiple machines. | This is not strictly a magpar bug but a problem with MPICH2 (version 1.2 and 1.2.1), which affects parallel runs on multiple machines. |
Line 14: | Line 14: |
1. Prepare parallel magpar run on Linux cluster using MPICH2 >= 1.2 | 1. Prepare parallel magpar run on Linux cluster using MPICH2 1.2 or 1.2.1 |
Line 26: | Line 26: |
* Follow the instructions at the bottom of [[https://trac.mcs.anl.gov/projects/mpich2/ticket/963|this page]] in the MPICH2 trac system. | * Use MPICH2 version [[http://www.mcs.anl.gov/research/projects/mpich2/downloads/tarballs/1.2.1p1/mpich2-1.2.1p1.tar.gz|1.2.1p1]] or later * or follow the instructions at the bottom of [[https://trac.mcs.anl.gov/projects/mpich2/ticket/963|this page]] in the MPICH2 trac system. |
Line 29: | Line 30: |
* or use older MPICH2 version [[http://www.mcs.anl.gov/research/projects/mpich2/downloads/tarballs/1.0.8p1/mpich2-1.0.8p1.tar.gz|1.0.8p]] | * or use older MPICH2 version [[http://www.mcs.anl.gov/research/projects/mpich2/downloads/tarballs/1.0.8p1/mpich2-1.0.8p1.tar.gz|1.0.8p1]] |
Line 36: | Line 37: |
* Status: fixed in MPICH2 revision [[https://trac.mcs.anl.gov/projects/mpich2/changeset/5923|5923]] Wait for new MPICH2 release, then update magpar's Makefile.libs to default to new (fixed) MPICH2 version. |
* Status: fixed in MPICH2 version [[http://www.mcs.anl.gov/research/projects/mpich2/downloads/tarballs/1.2.1p1/mpich2-1.2.1p1.tar.gz|1.2.1p1]]: revision [[https://trac.mcs.anl.gov/projects/mpich2/changeset/5923|5923]] |
Line 41: | Line 40: |
Category MagparBugConfirmed | CategoryMagparBugFixed |
Description
This is not strictly a magpar bug but a problem with MPICH2 (versions 1.2 and 1.2.1) that affects parallel runs on multiple machines.
Steps to reproduce
- Prepare a parallel magpar run on a Linux cluster using MPICH2 1.2 or 1.2.1
- Start an mpd ring spanning several machines using the "mpdboot" command
- Result: the mpdboot command does not return (hangs)
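Because the symptom is a command that never returns, it can help to run it under a timeout so that a test script does not hang with it. The following is a minimal sketch, not part of magpar or MPICH2: the helper name run_with_timeout and the status strings are hypothetical, and any mpdboot arguments passed to it depend on the local cluster setup.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and classify the outcome.

    Returns "ok" (exit code 0), "failed" (non-zero exit code),
    "hang" (no return within timeout_s seconds) or
    "missing" (executable not found).
    Hypothetical helper for illustration only.
    """
    try:
        result = subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return "hang"
    except FileNotFoundError:
        return "missing"
    return "ok" if result.returncode == 0 else "failed"
```

A call such as `run_with_timeout(["mpdboot", "-n", "4", "-f", "mpd.hosts"], 60)` would then be expected to report "hang" on an affected installation; the host count, the host file name and the 60 second limit are assumptions. Only the mpdboot process itself is killed on timeout, so any mpd daemons it already started on remote machines may need to be cleaned up separately.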
Example and Details
Workaround
- Use MPICH2 version 1.2.1p1 or later,
- or follow the instructions at the bottom of MPICH2 trac ticket 963 (https://trac.mcs.anl.gov/projects/mpich2/ticket/963),
- or edit/patch mpd.py directly according to this changeset,
- or download this version of mpd.py and use it instead of the mpd.py installed by MPICH2 >= 1.2,
- or use the older MPICH2 version 1.0.8p1.
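To decide whether a given installation needs one of these workarounds, the version string can be compared against the two releases named in this report. This is a sketch under that assumption only: the function name is_affected is hypothetical, and versions not mentioned on this page are simply treated as unaffected rather than verified.

```python
# Releases this report names as affected. 1.2.1p1 contains the fix and
# 1.0.8p1 predates the problem, according to the workaround list.
AFFECTED_VERSIONS = {"1.2", "1.2.1"}


def is_affected(version):
    """Return True if version is one of the MPICH2 releases named as affected.

    Hypothetical helper: it performs an exact match only and does not
    attempt to order or interpret other version strings.
    """
    return version.strip() in AFFECTED_VERSIONS
```

The version string itself has to come from the local installation, for example from the output of the MPICH2 version tool, which this sketch does not parse.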
Plan
- Priority: Medium
- Assigned to: MPICH2 developers