===============================================================================
Changes in 4.3
===============================================================================

# Support MPI memory allocation kinds side document.

# Support MPI ABI Proposal. Configure with --enable-mpi-abi and build with mpicc_abi. By default, mpicc still builds and links with MPICH ABI.

# Experimental API MPIX_Op_create_x. It supports user callback functions with an extra_state context and an op destructor callback. It lets language bindings use a proxy function for language-specific user callbacks.

# Experimental API MPIX_{Comm,File,Session,Win}_create_errhandler_x. They allow user error handlers to have an extra_state context and a corresponding destructor. This allows language bindings to implement user error handlers via proxy.

# Experimental API MPIX_Request_is_complete. This is a pure request state query function that neither invokes progress nor frees the request. This should help applications that want to separate task dependency checking from the progress engine to avoid progress contention, especially in multi-threaded contexts. It is also useful for tools to profile non-deterministic calls such as MPI_Test.

# Experimental API MPIX_Async_start. This function lets applications inject progress hooks into MPI progress. It allows applications to implement custom asynchronous operations that will be progressed by MPI. It avoids having to implement separate progress mechanisms that may either take additional resources or contend with MPI progress and negatively impact performance. It also allows applications to create custom MPI operations, such as MPI nonblocking collectives, and achieve near-native performance.

# Added benchmark tests test/mpi/bench/p2p_{latency,bw}.

# Added CMA support in CH4 IPC.

# Added IPC read algorithm for intranode Allgather and Allgatherv.

# Added CVAR MPIR_CVAR_CH4_SHM_POSIX_TOPO_ENABLE to enable non-temporal memcpy for inter-numa shm communication.

# Added CVAR MPIR_CVAR_DEBUG_PROGRESS_TIMEOUT for debugging MPI deadlock issues.

# ch4:ucx now supports dynamic processes. MPI_Comm_spawn{_multiple} will work. MPI_Open_port will fail because the ucx port name exceeds the current MPI_MAX_PORT_NAME of 256. One can work around this by using the info hint "port_name_size" with a larger port name buffer.

# PMI-1 defines PMI_MAX_PORT_NAME, which may be different from MPI_MAX_PORT_NAME. This is used by "PMI_Lookup_name". Consequently, MPI_Lookup_name accepts info hint "port_name_size" that may be larger than MPI_MAX_PORT_NAME. If the port name does not fit in "port_name_size", it will return a truncation error.

# Autogen defaults to -yaksa-depth=2.

# Default MPIR_CVAR_CH4_ROOTS_ONLY_PMI to on.

# Added ch4 netmod API am_tag_send and am_tag_recv.

# Added MPIR_CVAR_CH4_OFI_EAGER_THRESHOLD to force RNDV send mode.

# The make check target now runs ROMIO tests.

===============================================================================
Changes in 4.2
===============================================================================

# Complete support for the MPI 4.1 specification.

# Experimental thread communicator feature (e.g. MPIX_Threadcomm_init). See paper "Frustrated With MPI+Threads? Try MPIxThreads!", https://doi.org/10.1145/3615318.3615320.

# Experimental datatype functions MPIX_Type_iov_len and MPIX_Type_iov.

# Experimental op MPIX_EQUAL for MPI_Reduce and MPI_Allreduce (intra communicator only).

# Use --with-{pmi,pmi2,pmix}=[path] to configure external PMI library. Convenience options for Slurm and Cray are deprecated. Use --with-pmi=oldcray for older Cray environment.

# Error checking default changed to runtime (used to be all).

# Use the error handler bound to MPI_COMM_SELF as the default error handler.

# Use ierror instead of ierr in "use mpi" Fortran interface. This affects user code that passes the argument by explicit keyword, e.g. call MPI_Init(ierr=arg). "ierror" is the correct name specified in the MPI specification.
We only added the subroutine interface in "mpi.mod" since 4.1.

# Handle conversion functions, such as MPI_Comm_c2f, MPI_Comm_f2c, etc., are no longer macros. MPI-4.1 requires these to be actual functions.

# Yaksa updated to auto detect the GPU architecture and only build for the detected arch. This applies to CUDA and HIP support.

# MPI_Win_shared_query can be used on windows created by MPI_Win_create, MPI_Win_allocate, in addition to windows created by MPI_Win_allocate_shared. MPI_Win_allocate will create shared memory whenever feasible, including between spawned processes on the same node.

# Fortran mpi.mod supports Type(c_ptr) buffer output for MPI_Alloc_mem, MPI_Win_allocate, and MPI_Win_allocate_shared.

# New functions added in MPI-4.1: MPI_Remove_error_string, MPI_Remove_error_code, and MPI_Remove_error_class.

# New functions added in MPI-4.1: MPI_Request_get_status_all, MPI_Request_get_status_any, and MPI_Request_get_status_some.

# New function added in MPI-4.1: MPI_Type_get_value_index.

# New functions added in MPI-4.1: MPI_Comm_attach_buffer, MPI_Session_attach_buffer, MPI_Comm_detach_buffer, MPI_Session_detach_buffer, MPI_Buffer_flush, MPI_Comm_flush_buffer, MPI_Session_flush_buffer, MPI_Buffer_iflush, MPI_Comm_iflush_buffer, and MPI_Session_iflush_buffer. Also added constant MPI_BUFFER_AUTOMATIC to allow automatic buffers.

# Support for "mpi_memory_alloc_kinds" info key. Memory allocation kind requests can be made via argument to mpiexec, or as info during session creation. Kinds supported are "mpi" (with standard defined restrictors) and "system". Queries for supported kinds can be made on MPI objects such as sessions, comms, windows, or files. MPI 4.1 states that supported kinds can also be found in MPI_INFO_ENV, but it was decided at the October 2023 meeting that this was a mistake and will be removed in an erratum.

===============================================================================
Changes in 4.1
===============================================================================

# Thread-cs in ch4 changed to per-vci.

# Testsuite (test/mpi) is configured separately from mpich configure.

# Added options in autogen to accelerate CI builds, including using pre-built sub-modules. Added -yaksa-depth option to generate shallower yaksa pup code for faster build and smaller binaries.

# Support singleton init using hydra.

# On OSX, link option flat_namespace is no longer turned on by default.

# Generate mpi.mod Fortran interfaces using Python 3. For many compilers, including gfortran, flags such as -fallow-argument-mismatch are no longer necessary.

# Fixed message queue debugger interface in ch4.

# PMI (src/pmi) is refactored as a subdir and can be separately distributed.

# Added MPIX_Comm_get_failed.

# Experimental MPIX stream API to enable explicit thread contexts.

# Experimental MPIX gpu enqueue API. It currently only supports CUDA streams.

# Delays GPU resource allocation in yaksa.

# CH3 nemesis ofi netmod is removed.

# New collective algorithms. All collective algorithms are listed in src/mpi/coll/coll_algorithms.txt

# Removed hydra2. We will port unique features of hydra2, including tree-launching, to hydra in a future release.

# Added in-repository wiki documentation.

# Added stream workq to support optimizations for enqueue operations.

# Better support for large count APIs by eliminating type conversion issues.

# Hydra now uses libpmi (src/pmi) for handling PMI messages.

# Many bug fixes and enhancements.

===============================================================================
Changes in 4.0
===============================================================================

# All MPI-4 APIs have been implemented.
Major MPI-4 features include MPI sessions, partitioned point-to-point communications, events in the MPI tool information interface, large-count functions, persistent collectives, MPI_Comm_idup_with_info, MPI_Isendrecv and MPI_Isendrecv_replace, MPI_Info_get_string, MPI_Comm_split_type with new split_type -- MPI_COMM_TYPE_HW_GUIDED and MPI_COMM_TYPE_HW_UNGUIDED.

# Add QMPI (experimental) support.

# Add MPIX_Delete_error_{class,code,string}.

# MPI_Info objects can be accessed before MPI_Init{_thread}.

# Generate C API interface functions including man page notes and error checking using Python scripts.

# Generate Fortran bindings using Python scripts.

# Generate collective entrance functions and generate per-algorithm tests.

# Support explicit --without-cuda configure option.

# Drop support for UCX version < 1.7.0.

# Configure now optionally requires Python 3 (when F08 is enabled).

# Multi-NIC support in ch4:ofi.

# Default to ch4:ofi when configure doesn't have a clear choice. Add message block at the end of configure to advise user.

# Multiple VCI is fully implemented including the active message fallback paths.

# Extend IPC to support non-contig datatypes.

# Add AMD GPU support using HIP.

# Add generic RNDV callback mechanism with active messages.

# Refactor ch4 dynamic process functions.

# Avoid building MPL and hwloc multiple times.

# Many bug fixes and code clean-ups.

===============================================================================
Changes in 3.4
===============================================================================

# ch4 replaces ch3 as the default device configuration. If no network module is specified at configuration-time, MPICH will search the user environment in order to select one to build. The user will be prompted to choose if no preferred network library is detected.

# Add support for Yaksa datatype engine (default in ch4).

# Add support for GPU buffers (CUDA, Level Zero) in pt2pt, collectives, and one-sided communication.

# Add support for XPMEM.

# Add support for multiple virtual communication interfaces for more efficient MPI_THREAD_MULTIPLE (experimental).

# Add DAOS ADIO driver to ROMIO (contributed by Intel).

# Add Quobyte ADIO driver to ROMIO (contributed by Quobyte).

# Add support for Arm compiler toolchain

# Add support for NVIDIA HPC compilers

# Add support for flang/f18 Fortran compiler

# Add support for AddressSanitizer and UndefinedBehaviorSanitizer to debug configuration

# Remove mxm, llc, and portals4 netmods from ch3.

# Remove support for logical reduction operations on floating point types.

# Remove MPIX_Mutex interfaces.

# Further improvements to ch4 business card exchange: extra long address support and fixes for PMIx integration.

# Un-inline non-critical ch4 code for improved build times.

# Fix several test program bugs.

# Fix several static analysis and compiler warnings.

# Change the signature of MPID_Init to include requested and provided thread levels.

===============================================================================
Changes in 3.3.2
===============================================================================

# Add support for struct sockaddr in MPICH, Hydra, and PMI socket code. Works with both IPv4 and IPv6 addresses.

# Fix localhost detection on FreeBSD and macOS, avoiding long delay during startup.

# Fix thread-local storage detection.

# Fix several test program bugs.

# Fix several static analysis and compiler warnings.

===============================================================================
Changes in 3.3.1
===============================================================================

# Fix bug in MPI_Testany/MPI_Waitany that could cause deadlock

# Add missing functionality in Argobots library support

# Fix configure-time detection for thread local storage support

# Better support for reproducible builds.
Thanks to Bernhard Wiedemann for the report and fixes

# Fix support for XL compiler toolchain

# Add support for -static-intel linking option

# Fix building on systems without weak symbols

# Fix several static analysis and compiler warnings

===============================================================================
Changes in 3.3
===============================================================================

# CH4 Device: A new device layer implementation designed for low software overheads. CH4 has experimental support for OFI and UCX network libraries, and POSIX shared memory. Thanks to Intel, Mellanox, and RIKEN AICS for participating in the CH4 coding effort.

# Fixed SLURM integration in Hydra for new node list format.

# Added support for PMIx (https://pmix.github.io/pmix/) client library in CH4 netmods. Note that you must use a compatible PMIx server in this configuration.

# Better organization of collectives in the MPI layer. The new scheme, which de-couples implementation from selection logic, enables easier integration of additional algorithms.

# TSP collectives framework: A C++-template style framework for collective algorithms is added to allow single collective implementation to move data over generic or device-specific transport functions.

# Improvements to derived datatype testing (DTPools - https://github.com/pmodels/mpich/blob/main/doc/wiki/design/DTPools.md).

# Added new "non-catastrophic" error codes to expose internal resource exhaustion.

# Added info hints to MPI_Comm_split_type to support splitting communicators by machine topology. Both on-node (socket, core, etc.) and off-node (switch-level) hints are defined.

# Improvements to MPI_THREAD_MULTIPLE in CH4 through new thread safety models at the Virtual Network Interface (VNI) level. This introduces two new models that leverage work-queues to offload operations and improve scalability under contention.

# Message Driven Thread Activation (MDTA).
An alternative locking model is defined for MPI_THREAD_MULTIPLE in CH4.

# Added PMI usage optimizations for business card exchange in CH4 netmods.

# Improvements on MPI_Abort. MPI_Abort invoked on subcommunicators will only abort the connected processes within that communicator.

# Cleanup of whitespace (ch3 excluded) using the maint/code-cleanup.sh script. For instructions on how to update PRs/branches based on MPICH before the cleanup, see https://github.com/pmodels/mpich/wiki/Code-Cleanup-Procedure.

# Removed the PAMI device and poe PMI client.

# C99 compiler support is now required to build MPICH.

# Several other minor bug fixes, memory leak fixes, and code cleanup.

A full list of changes is available at the following link:
http://git.mpich.org/mpich.git/shortlog/v3.2..v3.3

A list of bugs that have been fixed is available at the following link:
https://github.com/pmodels/mpich/milestone/25?closed=1

===============================================================================
Changes in 3.2
===============================================================================

# Added support for MPI-3.1 features including nonblocking collective I/O, address manipulation routines, thread-safety for MPI initialization, pre-init functionality, and new MPI_T routines to look up variables by name.

# Fortran 2008 bindings are enabled by default and fully supported.

# Added support for the Mellanox MXM InfiniBand interface. (thanks to Mellanox for the code contribution).

# Added support for the Mellanox HCOLL interface for collectives. (thanks to Mellanox for the code contribution).

# Significant stability improvements to the MPICH/portals4 implementation.

# Completely revamped RMA infrastructure including several scalability improvements, performance improvements, and bug fixes.

# Added experimental support for Open Fabrics Interfaces (OFI) version 1.0.0.
https://github.com/ofiwg/libfabric (thanks to Intel for code contribution)

# The Myrinet MX network module, which had a life cycle from 1.1 till 3.1.2, has now been deleted.

# Several other minor bug fixes, memory leak fixes, and code cleanup.

A full list of changes is available at the following link:
http://git.mpich.org/mpich.git/shortlog/v3.1.3..v3.2

A full list of bugs that have been fixed is available at the following link:
https://trac.mpich.org/projects/mpich/query?status=closed&group=resolution&milestone=mpich-3.2

===============================================================================
Changes in 3.1.3
===============================================================================

# Several enhancements to Portals4 support.

# Several enhancements to PAMI (thanks to IBM for the code contribution).

# Several enhancements to the CH3 RMA implementation.

# Several enhancements to ROMIO.

# Fixed deadlock in multi-threaded MPI_Comm_idup.

# Several other minor bug fixes, memory leak fixes, and code cleanup.

A full list of changes is available at the following link:
http://git.mpich.org/mpich.git/shortlog/v3.1.2..v3.1.3

A full list of bugs that have been fixed is available at the following link:
https://trac.mpich.org/projects/mpich/query?status=closed&group=resolution&milestone=mpich-3.1.3

===============================================================================
Changes in 3.1.2
===============================================================================

# Significant enhancements to the BG/Q device, especially for RMA and shared memory functionality.

# Several enhancements to ROMIO.

# Upgraded to hwloc-1.9.

# Added more Fortran 2008 (F08) tests and fixed a few F08 binding bugs. Now all MPICH F90 tests have been ported to F08.

# Updated weak alias support to align with gcc-4.x

# Minor enhancements to the CH3 RMA implementation.

# Better implementation of MPI_Allreduce for intercommunicator.

# Added environment variables to control memory tracing overhead.

# Added flags to enable C99 mode with Solaris compilers.

# Updated implementation of MPI-T CVARs of type MPI_CHAR, as interpreted in MPI-3.0 Errata.

# Several other minor bug fixes, memory leak fixes, and code cleanup.

A full list of changes is available at the following link:
http://git.mpich.org/mpich.git/shortlog/v3.1.1..v3.1.2

A full list of bugs that have been fixed is available at the following link:
https://trac.mpich.org/projects/mpich/query?status=closed&group=resolution&milestone=mpich-3.1.2

===============================================================================
Changes in 3.1.1
===============================================================================

# Blue Gene/Q implementation supports MPI-3. This release contains a functional and compliant Blue Gene/Q implementation of the MPI-3 standard. Instructions to build on Blue Gene/Q are on the mpich.org wiki: https://github.com/pmodels/mpich/blob/main/doc/wiki/source_code/BGQ.md

# Fortran 2008 bindings (experimental). Build with --enable-fortran=all. Must have a Fortran 2008 + TS 29113 capable compiler.

# Significant rework of MPICH library management and which symbols go into which libraries. Also updated MPICH library names to make them consistent with Intel MPI, Cray MPI and IBM PE MPI. Backward compatibility links are provided for older mpich-based build systems.

# The ROMIO "Blue Gene" driver has seen significant rework. We have separated "file system" features from "platform" features, since GPFS shows up in more places than just Blue Gene.

# New ROMIO options for aggregator selection and placement on Blue Gene

# Optional new ROMIO two-phase algorithm requiring less communication for certain workloads

# The old ROMIO optimization "deferred open" either stopped working or was disabled on several platforms.

# Added support for powerpcle compiler. Patched libtool in MPICH to support little-endian powerpc linux host.

# Fixed the prototype of the Reduce_local C++ binding. The previous prototype was completely incorrect. Thanks to Jeff Squyres for reporting the issue.

# The mpd process manager, which was deprecated and unsupported for the past four major release series (1.3.x till 3.1), has now been deleted. RIP.

# Several other minor bug fixes, memory leak fixes, and code cleanup.

A full list of changes is available at the following link:
http://git.mpich.org/mpich.git/shortlog/v3.1..v3.1.1

A full list of bugs that have been fixed is available at the following link:
https://trac.mpich.org/projects/mpich/query?status=closed&group=resolution&milestone=mpich-3.1.1

===============================================================================
Changes in 3.1
===============================================================================

# Implement runtime compatibility with MPICH-derived implementations as per the ABI Compatibility Initiative (see www.mpich.org/abi for more information).

# Integrated MPICH-PAMI code base for Blue Gene/Q and other IBM platforms.

# Several improvements to the SCIF netmod. (code contribution from Intel).

# Major revamp of the MPI_T interface added in MPI-3.

# Added environment variables to control a lot more capabilities for collectives. See the README.envvar file for more information.

# Allow non-blocking collectives and fault tolerance at the same time. The option MPIR_PARAM_ENABLE_COLL_FT_RET has been deprecated as it is no longer necessary.

# Improvements to MPI_WIN_ALLOCATE to internally allocate shared memory between processes on the same node.

# Performance improvements for MPI RMA operations on shared memory for MPI_WIN_ALLOCATE and MPI_WIN_ALLOCATE_SHARED.

# Enable shared library builds by default.

# Upgraded hwloc to 1.8.

# Several improvements to the Hydra-SLURM integration.

# Several improvements to the Hydra process binding code.
See the Hydra wiki page for more information: https://github.com/pmodels/mpich/blob/main/doc/wiki/how_to/Using_the_Hydra_Process_Manager.md

# MPICH now supports operations on very large datatypes (those that describe more than 32 bits of data). This work also allows MPICH to fully support MPI-3's introduction of MPI_Count.

# Several other minor bug fixes, memory leak fixes, and code cleanup.

A full list of changes is available at the following link:
http://git.mpich.org/mpich.git/shortlog/v3.0.4..v3.1

A full list of bugs that have been fixed is available at the following link:
https://trac.mpich.org/projects/mpich/query?status=closed&group=resolution&milestone=mpich-3.1

===============================================================================
Changes in 3.0.4
===============================================================================

# BUILD SYSTEM: Reordered the default compiler search to prefer Intel and PG compilers over GNU compilers because of the performance difference. WARNING: If you do not explicitly specify the compiler you want through CC and friends, this might break ABI for you relative to the previous 3.0.x release.

# OVERALL: Added support to manage per-communicator eager-rendezvous thresholds.

# PM/PMI: Performance improvements to the Hydra process manager on large-scale systems by allowing for key/value caching.

# Several other minor bug fixes, memory leak fixes, and code cleanup.

A full list of changes is available at the following link:
http://git.mpich.org/mpich.git/shortlog/v3.0.3..v3.0.4

===============================================================================
Changes in 3.0.3
===============================================================================

# RMA: Added a new mechanism for piggybacking RMA synchronization operations, which improves the performance of several synchronization operations, including Flush.

# RMA: Added an optimization to utilize the MPI_MODE_NOCHECK assertion in passive target RMA to improve performance by eliminating a lock request message.

# RMA: Added a default implementation of shared memory windows to CH3. This adds support for this MPI 3.0 feature to the ch3:sock device.

# RMA: Fix a bug that resulted in an error when RMA operation request handles were completed outside of a synchronization epoch.

# PM/PMI: Upgraded to hwloc-1.6.2rc1. This version uses libpciaccess instead of libpci, to work around the GPL license used by libpci.

# PM/PMI: Added support for the Cobalt process manager.

# BUILD SYSTEM: allow MPI_LONG_DOUBLE_SUPPORT to be disabled with a configure option.

# FORTRAN: fix MPI_WEIGHTS_EMPTY in the Fortran bindings

# MISC: fix a bug in MPI_Get_elements where it could return incorrect values

# Several other minor bug fixes, memory leak fixes, and code cleanup.

A full list of changes is available at the following link:
http://git.mpich.org/mpich.git/shortlog/v3.0.2..v3.0.3

===============================================================================
Changes in 3.0.2
===============================================================================

# PM/PMI: Upgrade to hwloc-1.6.1

# RMA: Performance enhancements for shared memory windows.

# COMPILER INTEGRATION: minor improvements and fixes to the clang static type checking annotation macros.

# MPI-IO (ROMIO): improved error checking for user errors, contributed by IBM.

# MPI-3 TOOLS INTERFACE: new MPI_T performance variables providing information about nemesis communication behavior and CH3 message matching queues.

# TEST SUITE: "make testing" now also outputs a "summary.tap" file that can be interpreted with standard TAP consumer libraries and tools. The "summary.xml" format remains unchanged.

# GIT: This is the first release built from the new git repository at git.mpich.org. A few build system mechanisms have changed because of this switch.

# BUG FIX: resolved a compilation error related to LLONG_MAX that affected several users (ticket #1776).

# BUG FIX: nonblocking collectives now properly make progress when MPICH is configured with the ch3:sock channel (ticket #1785).

# Several other minor bug fixes, memory leak fixes, and code cleanup.

A full list of changes is available at the following link:
http://git.mpich.org/mpich.git/shortlog/v3.0.1..v3.0.2

===============================================================================
Changes in 3.0.1
===============================================================================

# PM/PMI: Critical bug-fix in Hydra to work correctly in multi-node tests.

# A full list of changes is available using:
svn log -r10790:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich-3.0.1

... or at the following link:
https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich-3.0.1?action=follow_copy&rev=HEAD&stop_rev=10790&mode=follow_copy

===============================================================================
Changes in 3.0
===============================================================================

# MPI-3: All MPI-3 features are now implemented and the MPI_VERSION bumped up to 3.0.

# OVERALL: Added support for ARM-v7 native atomics

# MPE: MPE is now separated out of MPICH and can be downloaded/used as a separate package.

# PM/PMI: Upgraded to hwloc-1.6

# Several other minor bug fixes, memory leak fixes, and code cleanup.

A full list of changes is available using:
svn log -r10344:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich-3.0

... or at the following link:
https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich-3.0?action=follow_copy&rev=HEAD&stop_rev=10344&mode=follow_copy

===============================================================================
Changes in 1.5
===============================================================================

# OVERALL: Nemesis now supports an "--enable-yield=..."
configure option for better performance/behavior when oversubscribing processes to cores. Some form of this option is enabled by default on Linux, Darwin, and systems that support sched_yield().

# OVERALL: Added support for Intel Many Integrated Core (MIC) architecture: shared memory, TCP/IP, and SCIF based communication.

# OVERALL: Added support for IBM BG/Q architecture. Thanks to IBM for the contribution.

# MPI-3: const support has been added to mpi.h, although it is disabled by default. It can be enabled on a per-translation unit basis with "#define MPICH2_CONST const".

# MPI-3: Added support for MPIX_Type_create_hindexed_block.

# MPI-3: The new MPI-3 nonblocking collective functions are now available as "MPIX_" functions (e.g., "MPIX_Ibcast").

# MPI-3: The new MPI-3 neighborhood collective routines are now available as "MPIX_" functions (e.g., "MPIX_Neighbor_allgather").

# MPI-3: The new MPI-3 MPI_Comm_split_type function is now available as an "MPIX_" function.

# MPI-3: The new MPI-3 tools interface is now available as "MPIX_T_" functions. This is a beta implementation right now with several limitations, including no support for multithreading. Several performance variables related to CH3's message matching are exposed through this interface.

# MPI-3: The new MPI-3 matched probe functionality is supported via the new routines MPIX_Mprobe, MPIX_Improbe, MPIX_Mrecv, and MPIX_Imrecv.

# MPI-3: The new MPI-3 nonblocking communicator duplication routine, MPIX_Comm_idup, is now supported. It will only work for single-threaded programs at this time.

# MPI-3: MPIX_Comm_reenable_anysource support

# MPI-3: Native MPIX_Comm_create_group support (updated version of the prior MPIX_Group_comm_create routine).

# MPI-3: MPI_Intercomm_create's internal communication no longer interferes with point-to-point communication, even if point-to-point operations on the parent communicator use the same tag or MPI_ANY_TAG.

# MPI-3: Eliminated the possibility of interference between MPI_Intercomm_create and point-to-point messaging operations.

# Build system: Completely revamped build system to rely fully on autotools. Parallel builds ("make -j8" and similar) are now supported.

# Build system: rename "./maint/updatefiles" --> "./autogen.sh" and "configure.in" --> "configure.ac"

# JUMPSHOT: Improvements to Jumpshot to handle thousands of timelines, including performance improvements to slog2 in such cases.

# JUMPSHOT: Added navigation support to locate chosen drawable's ends when viewport has been scrolled far from the drawable.

# PM/PMI: Added support for memory binding policies.

# PM/PMI: Various improvements to the process binding support in Hydra. Several new pre-defined binding options are provided.

# PM/PMI: Upgraded to hwloc-1.5

# PM/PMI: Several improvements to PBS support to natively use the PBS launcher.

# Several other minor bug fixes, memory leak fixes, and code cleanup.

A full list of changes is available using:
svn log -r8478:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich2-1.5

... or at the following link:
https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich2-1.5?action=follow_copy&rev=HEAD&stop_rev=8478&mode=follow_copy

===============================================================================
Changes in 1.4.1
===============================================================================

# OVERALL: Several improvements to the ARMCI API implementation within MPICH2.

# Build system: Added beta support for DESTDIR while installing MPICH2.

# PM/PMI: Upgrade hwloc to 1.2.1rc2.

# PM/PMI: Initial support for the PBS launcher.

# Several other minor bug fixes, memory leak fixes, and code cleanup.

A full list of changes is available using:
svn log -r8675:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich2-1.4.1

...
or at the following link:
https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich2-1.4.1?action=follow_copy&rev=HEAD&stop_rev=8675&mode=follow_copy

===============================================================================
Changes in 1.4
===============================================================================

# OVERALL: Improvements to fault tolerance for collective operations. Thanks to Rui Wang @ ICT for reporting several of these issues.

# OVERALL: Improvements to the universe size detection. Thanks to Yauheni Zelenko for reporting this issue.

# OVERALL: Bug fixes for Fortran attributes on some systems. Thanks to Nicolai Stange for reporting this issue.

# OVERALL: Added new ARMCI API implementation (experimental).

# OVERALL: Added new MPIX_Group_comm_create function to allow non-collective creation of sub-communicators.

# FORTRAN: Bug fixes in the MPI_DIST_GRAPH_ Fortran bindings.

# PM/PMI: Support for a manual "none" launcher in Hydra to allow for higher-level tools to be built on top of Hydra. Thanks to Justin Wozniak for reporting this issue, for providing several patches for the fix, and testing it.

# PM/PMI: Bug fixes in Hydra to handle non-uniform layouts of hosts better. Thanks to the MVAPICH group at OSU for reporting this issue and testing it.

# PM/PMI: Bug fixes in Hydra to handle cases where only a subset of the available launchers or resource managers are compiled in. Thanks to Satish Balay @ Argonne for reporting this issue.

# PM/PMI: Support for a different username to be provided for each host; this only works for launchers that support this (such as SSH).

# PM/PMI: Bug fixes for using Hydra on AIX machines. Thanks to Kitrick Sheets @ NCSA for reporting this issue and providing the first draft of the patch.

# PM/PMI: Bug fixes in memory allocation/management for environment variables that were showing up on older platforms.
Thanks to Steven Sutphen for reporting the issue and providing detailed analysis to track down the bug. # PM/PMI: Added support for providing a configuration file to pick the default options for Hydra. Thanks to Saurabh T. for reporting the issues with the current implementation and working with us to improve this option. # PM/PMI: Improvements to the error code returned by Hydra. # PM/PMI: Bug fixes for handling "=" in environment variable values in hydra. # PM/PMI: Upgrade the hwloc version to 1.2. # COLLECTIVES: Performance and memory usage improvements for MPI_Bcast in certain cases. # VALGRIND: Fix incorrect Valgrind client request usage when MPICH2 is built for memory debugging. # BUILD SYSTEM: "--enable-fast" and "--disable-error-checking" are once again valid simultaneous options to configure. # TEST SUITE: Several new tests for MPI RMA operations. # Several other minor bug fixes, memory leak fixes, and code cleanup. A full list of changes is available using: svn log -r7838:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich2-1.4 ... or at the following link: https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich2-1.4?action=follow_copy&rev=HEAD&stop_rev=7838&mode=follow_copy =============================================================================== Changes in 1.3.2 =============================================================================== # OVERALL: MPICH2 now recognizes the OSX mach_absolute_time as a native timer type. # OVERALL: Performance improvements to MPI_Comm_split on large systems. # OVERALL: Several improvements to error returns capabilities in the presence of faults. # PM/PMI: Several fixes and improvements to Hydra's process binding capability. # PM/PMI: Upgrade the hwloc version to 1.1.1. # PM/PMI: Allow users to sort node lists allocated by resource managers in Hydra. # PM/PMI: Improvements to signal handling. Now Hydra respects Ctrl-Z signals and passes on the signal to the application. 
# PM/PMI: Improvements to STDOUT/STDERR handling including improved support for rank prepending on output. Improvements to STDIN handling for applications being run in the background. # PM/PMI: Split the bootstrap servers into "launchers" and "resource managers", allowing the user to pick a different resource manager from the launcher. For example, the user can now pick the "SLURM" resource manager and "SSH" as the launcher. # PM/PMI: The MPD process manager is deprecated. # PM/PMI: The PLPA process binding library support is deprecated. # WINDOWS: Adding support for gfortran and 64-bit gcc libs. # Several other minor bug fixes, memory leak fixes, and code cleanup. A full list of changes is available using: svn log -r7457:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich2-1.3.2 ... or at the following link: https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich2-1.3.2?action=follow_copy&rev=HEAD&stop_rev=7457&mode=follow_copy =============================================================================== Changes in 1.3.1 =============================================================================== # OVERALL: MPICH2 is now fully compliant with the CIFTS FTB standard MPI events (based on the draft standard). # OVERALL: Major improvements to RMA performance for long lists of RMA operations. # OVERALL: Performance improvements for Group_translate_ranks. # COLLECTIVES: Collective algorithm selection thresholds can now be controlled at runtime via environment variables. # ROMIO: PVFS error codes are now mapped to MPI error codes. # Several other minor bug fixes, memory leak fixes, and code cleanup. A full list of changes is available using: svn log -r7350:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich2-1.3.1 ... 
or at the following link: https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich2-1.3.1?action=follow_copy&rev=HEAD&stop_rev=7350&mode=follow_copy =============================================================================== Changes in 1.3 =============================================================================== # OVERALL: Initial support for fine-grained threading in ch3:nemesis:tcp. # OVERALL: Support for Asynchronous Communication Progress. # OVERALL: The ssm and shm channels have been removed. # OVERALL: Checkpoint/restart support using BLCR. # OVERALL: Improved tolerance to process and communication failures when the error handler is set to MPI_ERRORS_RETURN. If a communication operation fails (e.g., due to a process failure) MPICH2 will return an error, and further communication to that process is not possible. However, communication with other processes will still proceed normally. Note, however, that the behavior of collective operations on communicators containing the failed process is undefined, and may give incorrect results or hang some processes. # OVERALL: Experimental support for inter-library dependencies. # PM/PMI: Hydra is now the default process management framework replacing MPD. # PM/PMI: Added dynamic process support for Hydra. # PM/PMI: Added support for LSF, SGE and POE in Hydra. # PM/PMI: Added support for CPU and memory/cache topology aware process-core binding. # DEBUGGER: Improved support and bug fixes in the Totalview support. # Build system: Replaced F90/F90FLAGS by FC/FCFLAGS. F90/F90FLAGS are no longer supported by configure. # Multi-compiler support: On systems where the C compiler used to build the mpich2 libraries supports multiple weak symbols and multiple aliases, the Fortran binding built in the mpich2 libraries can handle different Fortran compilers (than the one used to build mpich2). Details in README. # Several other minor bug fixes, memory leak fixes, and code cleanup.
A full list of changes is available using: svn log -r5762:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich2-1.3 ... or at the following link: https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich2-1.3?action=follow_copy&rev=HEAD&stop_rev=5762&mode=follow_copy =============================================================================== Changes in 1.2.1 =============================================================================== # OVERALL: Improved support for fine-grained multithreading. # OVERALL: Improved integration with Valgrind for debugging builds of MPICH2. # PM/PMI: Initial support for hwloc process-core binding library in Hydra. # PM/PMI: Updates to the PMI-2 code to match the PMI-2 API and wire-protocol draft. # Several other minor bug fixes, memory leak fixes, and code cleanup. A full list of changes is available using: svn log -r5425:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich2-1.2.1 ... or at the following link: https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich2-1.2.1?action=follow_copy&rev=HEAD&stop_rev=5425&mode=follow_copy =============================================================================== Changes in 1.2 =============================================================================== # OVERALL: Support for MPI-2.2 # OVERALL: Several fixes to Nemesis/MX. # WINDOWS: Performance improvements to Nemesis/windows. # PM/PMI: Scalability and performance improvements to Hydra using PMI-1.1 process-mapping features. # PM/PMI: Support for process-binding for hyperthreading enabled systems in Hydra. # PM/PMI: Initial support for PBS as a resource management kernel in Hydra. # PM/PMI: PMI2 client code is now officially included in the release. # TEST SUITE: Support to run the MPICH2 test suite through valgrind. # Several other minor bug fixes, memory leak fixes, and code cleanup. 
A full list of changes is available using: svn log -r5025:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich2-1.2 ... or at the following link: https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich2-1.2?action=follow_copy&rev=HEAD&stop_rev=5025&mode=follow_copy =============================================================================== Changes in 1.1.1p1 =============================================================================== - OVERALL: Fixed an invalid read in the dataloop code for zero count types. - OVERALL: Fixed several bugs in ch3:nemesis:mx (tickets #744,#760; also change r5126). - BUILD SYSTEM: Several fixes for functionality broken in 1.1.1 release, including MPICH2LIB_xFLAGS and extra libraries living in $LIBS instead of $LDFLAGS. Also, '-lpthread' should no longer be duplicated in link lines. - BUILD SYSTEM: MPICH2 shared libraries are now compatible with glibc versioned symbols on Linux, such as those present in the MX shared libraries. - BUILD SYSTEM: Minor tweaks to improve compilation under the nvcc CUDA compiler. - PM/PMI: Fix mpd incompatibility with python2.3 introduced in mpich2-1.1.1. - PM/PMI: Several fixes to hydra, including memory leak fixes and process binding issues. - TEST SUITE: Correct invalid arguments in the coll2 and coll3 tests. - Several other minor bug fixes, memory leak fixes, and code cleanup. A full list of changes is available using: svn log -r5032:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich2-1.1.1p1 ... or at the following link: https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich2-1.1.1p1?action=follow_copy&rev=HEAD&stop_rev=5032&mode=follow_copy =============================================================================== Changes in 1.1.1 =============================================================================== # OVERALL: Improved support for Boost MPI. 
# PM/PMI: Significantly improved time taken by MPI_Init with Nemesis and MPD on large numbers of processes. # PM/PMI: Improved support for hybrid MPI-UPC program launching with Hydra. # PM/PMI: Improved support for process-core binding with Hydra. # PM/PMI: Preliminary support for PMI-2. Currently supported only with Hydra. # Many other bug fixes, memory leak fixes and code cleanup. A full list of changes is available using: svn log -r4655:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich2-1.1.1 ... or at the following link: https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich2-1.1.1?action=follow_copy&rev=HEAD&stop_rev=4655&mode=follow_copy =============================================================================== Changes in 1.1 =============================================================================== - OVERALL: Added MPI 2.1 support. - OVERALL: Nemesis is now the default configuration channel with a completely new TCP communication module. - OVERALL: Windows support for nemesis. - OVERALL: Added a new Myrinet MX network module for nemesis. - OVERALL: Initial support for shared-memory aware collective communication operations. Currently MPI_Bcast, MPI_Reduce, MPI_Allreduce, and MPI_Scan. - OVERALL: Improved handling of MPI Attributes. - OVERALL: Support for BlueGene/P through the DCMF library (thanks to IBM for the patch). - OVERALL: Experimental support for fine-grained multithreading - OVERALL: Added dynamic processes support for Nemesis. - OVERALL: Added automatic as well as statically runtime configurable receive timeout variation for MPD (thanks to OSU for the patch). - OVERALL: Improved performance for MPI_Allgatherv, MPI_Gatherv, and MPI_Alltoall. - PM/PMI: Initial support for the new Hydra process management framework (current support is for ssh, rsh, fork and a preliminary version of slurm). - ROMIO: Added support for MPI_Type_create_resized and MPI_Type_create_indexed_block datatypes in ROMIO. 
- ROMIO: Optimized Lustre ADIO driver (thanks to Weikuan Yu for initial work and Sun for further improvements). - Many other bug fixes, memory leak fixes and code cleanup. A full list of changes is available using: svn log -r813:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/tags/release/mpich2-1.1 ... or at the following link: https://trac.mcs.anl.gov/projects/mpich2/log/mpich2/tags/release/mpich2-1.1?action=follow_copy&rev=HEAD&stop_rev=813&mode=follow_copy =============================================================================== Changes in 1.0.7 =============================================================================== - OVERALL: Initial ROMIO device for BlueGene/P (the ADI device is also added but is not configurable at this time). - OVERALL: Major clean up for the propagation of user-defined and other MPICH2 flags throughout the code. - OVERALL: Support for STI Cell Broadband Engine. - OVERALL: Added datatype free hooks to be used by devices independently. - OVERALL: Added device-specific timer support. - OVERALL: make uninstall works cleanly now. - ROMIO: Support to take hints from a config file - ROMIO: more tests and bug fixes for nonblocking I/O - PM/PMI: Added support to use PMI Clique functionality for process managers that support it. - PM/PMI: Added SLURM support to configure to make it transparent to users. - PM/PMI: SMPD Singleton Init support. - WINDOWS: Fortran 90 support added. - SCTP: Added MPICH_SCTP_NAGLE_ON support. - MPE: Updated MPE logging API so that it is thread-safe (through global mutex). - MPE: Added infrastructure to piggyback argument data to MPI states. - DOCS: Documentation creation now works correctly for VPATH builds. - Many other bug fixes, memory leak fixes and code cleanup. 
A full list of changes is available using: svn log -r100:HEAD https://svn.mcs.anl.gov/repos/mpi/mpich2/branches/release/MPICH2_1_0_7 =============================================================================== Changes in 1.0.6 =============================================================================== - Updates to the ch3:nemesis channel including preliminary support for thread safety. - Preliminary support for dynamic loading of ch3 channels (sock, ssm, shm). See the README file for details. - Singleton init now works with the MPD process manager. - Fixes in MPD related to MPI-2 connect-accept. - Improved support for MPI-2 generalized requests that allows true nonblocking I/O in ROMIO. - MPE changes: * Enabled thread-safe MPI logging through global mutex. * Enhanced Jumpshot to be more thread friendly + added simple statistics in the Legend windows. * Added backtrace support to MPE on Solaris and glibc based systems, e.g. Linux. This improves the output error message from the Collective/Datatype checking library. * Fixed the CLOG2 format so it can be used in serial (non-MPI) logging. - Performance improvements for derived datatypes (including packing and communication) through in-built loop-unrolling and buffer alignment. - Performance improvements for MPI_Gather when non-power-of-two processes are used, and when a non-zero ranked root is performing the gather. - MPI_Comm_create works for intercommunicators. - Enabled -O2 and equivalent compiler optimizations for supported compilers by default (including GNU, Intel, Portland, Sun, Absoft, IBM). - Many other bug fixes, memory leak fixes and code cleanup. A full list of changes is available at www.mcs.anl.gov/mpi/mpich2/mpich2_1_0_6changes.htm. =============================================================================== Changes in 1.0.5 =============================================================================== - An SCTP channel has been added to the CH3 device. 
This was implemented by Brad Penoff and Mike Tsai, Univ. of British Columbia. Their group's webpage is located at http://www.cs.ubc.ca/labs/dsg/mpi-sctp/ . - Bugs related to dynamic processes have been fixed. - Performance-related fixes have been added to derived datatypes and collective communication. - Updates to the Nemesis channel - Fixes to thread safety for the ch3:sock channel - Many other bug fixes and code cleanup. A full list of changes is available at www.mcs.anl.gov/mpi/mpich2/mpich2_1_0_5changes.htm . =============================================================================== Changes in 1.0.4 =============================================================================== - For the ch3:sock channel, the default build of MPICH2 supports thread safety. A separate build is not needed as before. However, thread safety is enabled only if the user calls MPI_Init_thread with MPI_THREAD_MULTIPLE. If not, no thread locks are called, so there is no penalty. - A new low-latency channel called Nemesis has been added. It can be selected by specifying the option --with-device=ch3:nemesis. Nemesis uses shared memory for intranode communication and various networks for internode communication. Currently available networks are TCP, GM and MX. Nemesis is still a work in progress. See the README for more information about the channel. - Support has been added for providing message queues to debuggers. Configure with --enable-debuginfo to make this information available. This is still a "beta" test version and has not been extensively tested. - For systems with firewalls, the environment variable MPICH_PORT_RANGE can be used to restrict the range of ports used by MPICH2. See the documentation for more details. - Withdrew obsolete modules, including the ib and rdma communication layers. For Infiniband and MPICH2, please see http://nowlab.cse.ohio-state.edu/projects/mpi-iba/ For other interconnects, please contact us at mpich2-maint@mcs.anl.gov . 
- Numerous bug fixes and code cleanup. A full list of changes is available at www.mcs.anl.gov/mpi/mpich2/mpich2_1_0_4changes.htm . - Numerous new tests in the MPICH2 test suite. - For developers, the way in which information is passed between the top level configure and configures in the device, process management, and related modules has been cleaned up. See the comments at the beginning of the top-level configure.in for details. This change makes it easier to interface other modules to MPICH2. =============================================================================== Changes in 1.0.3 =============================================================================== - There are major changes to the ch3 device implementation. Old and unsupported channels (essm, rdma) have been removed. The internal interface between ch3 and the channels has been improved to simplify the process of adding a new channel (sharing existing code where possible) and to improve performance. Further changes in this internal interface are expected. - Numerous bug fixes and code cleanup: creation of intercommunicators and intracommunicators from the intercommunicators created with Spawn and Connect/Accept; the computation of the alignment and padding of items within structures now handles additional cases, including systems where the alignment and padding depend on the type of the first item in the structure; MPD recognizes the wdir info keyword; gforker's mpiexec supports -env and -genv arguments for controlling which environment variables are delivered to created processes. - While not a bug, to aid in the use of memory trace packages, MPICH2 tries to free all allocated data no later than when MPI_Finalize returns.
- Support for DESTDIR in install targets - Enhancements to SMPD - In order to support special compiler flags for users that may be different from those used to build MPICH2, the environment variables MPI_CFLAGS, MPI_FFLAGS, MPI_CXXFLAGS, and MPI_F90FLAGS may be used to specify the flags used in mpicc, mpif77, mpicxx, and mpif90 respectively. The flags CFLAGS, FFLAGS, CXXFLAGS, and F90FLAGS are used in the building of MPICH2. - Many enhancements to MPE - Enhanced support for features and idiosyncrasies of Fortran 77 and Fortran 90 compilers, including gfortran, g95, and xlf - Enhanced support for C++ compilers that do not fully support abstract base classes - Additional tests in the mpich2/tests/mpi - New FAQ included (also available at http://www.mcs.anl.gov/mpi/mpich2/faq.htm) - Man pages for mpiexec and mpif90 - Enhancements for developers, including a more flexible and general mechanism for inserting logging and information messages, controllable with --mpich-dbg-xxx command line arguments or MPICH_DBG_XXX environment variables. - Note to developers: This release contains many changes to the structure of the CH3 device implementation (in src/mpid/ch3), including significant reworking of the files (many files have been combined into fewer files representing logical grouping of functions). The next release of MPICH2 will contain even more significant changes to the device structure as we introduce a new communication implementation. =============================================================================== Changes in 1.0.2 =============================================================================== - Optimizations to the MPI-2 one-sided communication functions for the sshm (scalable shared memory) channel when window memory is allocated with MPI_Alloc_mem (for all three synchronization methods). - Numerous bug fixes and code cleanup. - Fixed memory leaks. - Fixed shared library builds. 
- Fixed performance problems with MPI_Type_create_subarray/darray. - The following changes have been made to MPE2: - MPE2 now builds the MPI collective and datatype checking library by default. - The SLOG-2 format has been upgraded to 2.0.6, which supports event drawables and provides a count of real drawables in preview drawables. - New slog2 tools, slog2filter and slog2updater, both of which are logfile format converters. slog2filter removes undesirable categories of drawables and alters the slog2 file structure. slog2updater is a slog2filter that reads in the older logfile format, 2.0.5, and writes out the latest format, 2.0.6. - The following changes have been made to MPD: - Nearly all code has been replaced by new code that follows a more object-oriented approach than before. This has not changed any fundamental behavior or interfaces. - There is info support in spawn and spawn_multiple for providing parts of the environment for spawned processes such as search-path and current working directory. See the Standard for the required fields. - mpdcheck has been enhanced to help users debug their cluster and network configurations. - cPickle has replaced marshal as the source module for dumps and loads. - The mpigdb command has been replaced by mpiexec -gdb. - Alternate interfaces can be used. See the Installer's Guide. =============================================================================== Changes in 1.0.1 =============================================================================== - Copyright statements have been added to all code files, clearly identifying that all code in the distribution is covered by the extremely flexible copyright described in the COPYRIGHT file. - The MPICH2 test suite (mpich2/test) can now be run against any MPI implementation, not just MPICH2. - The send and receive socket buffer sizes may now be changed by setting MPICH_SOCKET_BUFFER_SIZE.
Note: the operating system may impose a maximum socket buffer size that prohibits MPICH2 from increasing the buffers to the desired size. To raise the maximum allowable buffer size, please contact your system administrator. - Error handling throughout the MPI routines has been improved. The error handling in some internal routines has been simplified as well, making the routines easier to read. - MPE (Jumpshot and CLOG logging) is now supported on Microsoft Windows. - C applications built for Microsoft Windows may select the desired channels at runtime. - A program not started with mpiexec may become an MPI program by calling MPI_Init. It will have an MPI_COMM_WORLD of size one. It may then call other MPI routines, including MPI_COMM_SPAWN, to become a truly parallel program. At present, the use of MPI_COMM_SPAWN and MPI_COMM_SPAWN_MULTIPLE by such a process is only supported by the MPD process manager. - Memory leaks in communicator allocation and the C++ binding have been fixed. - Following GNU guidelines, the parts of the install step that checked the installation have been moved to an installcheck target. Much of the installation now supports the DESTDIR prefix. - Microsoft Visual Studio projects have been added to make it possible to build an x86-64 version. - Problems with compilers and linkers that do not support weak symbols, which are used to support the PMPI profiling interface, have been corrected. - Handling of Fortran 77 and Fortran 90 compilers has been improved, including support for g95. - The Fortran stdcall interface on Microsoft Windows now supports character*. - A bug in the OS X implementation of poll() caused the sock channel to hang. A workaround has been put in place. - Problems with installation under OS X are now detected and corrected. (Install breaks libraries that are more than 10 seconds old!)
- The following changes have been made to MPD: - Sending a SIGINT to mpiexec/mpdrun, such as by typing control-C, now causes SIGINT to be sent to the processes within the job. Previously, SIGKILL was sent to the processes, preventing applications from catching the signal and performing their own signal processing. - The process for merging output has been improved. - A new option, -ifhn, has been added to the machine file, allowing the user to select the destination interface to be used for TCP communication. See the User's Manual for details. - The user may now select, via the "-s" option to mpiexec/mpdrun, which processes receive input through stdin. stdin is immediately closed for all processes not in the set receiving input. This prevents processes not in the set from hanging should they attempt to read from stdin. - The MPICH2 Installer's Guide now contains an appendix on troubleshooting problems with MPD. - The following changes have been made to SMPD: - On Windows machines, passwordless authentication (via SSPI) can now be used to start processes on machines within a domain. This feature is a recent addition, and should be considered experimental. - On Windows machines, the -localroot option was added to mpiexec, allowing processes on the local machines to perform GUI operations on the local desktop. - On Windows machines, network drive mapping is now supported via the -map option to mpiexec. - Three new GUI tools have been added for Microsoft Windows. These tools are wrappers to the command line tools, mpiexec.exe and smpd.exe. wmpiexec allows the user to run a job much in the way they would with mpiexec. wmpiconfig provides a means of setting various global options to the SMPD process manager environment. wmpiregister encrypts the user's credentials and saves them to the Windows Registry. - The following changes have been made to MPE2: - MPE2 no longer attempts to compile or link code during 'make install' to validate the installation.
Instead, 'make installcheck' may now be used to verify the MPE installation. - MPE2 now supports DESTDIR. - The sock channel now has preliminary support for MPI_THREAD_SERIALIZED and MPI_THREAD_MULTIPLE on both UNIX and Microsoft Windows. We have performed rudimentary testing, and while overall the results were very positive, known issues do exist. ROMIO in particular experiences hangs in several places. We plan to correct that in the next release. As always, please report any difficulties you encounter. - Another channel capable of communicating over both sockets and shared memory has been added. Unlike the ssm channel, which waits for new data to arrive by continuously polling the system in a busy loop, the essm channel waits by blocking on an operating system event object. This channel is experimental, and is only available for Microsoft Windows. - The topology routines have been modified to allow the device to override the default implementation. This allows the device to export knowledge of the underlying physical topology to the MPI routines (Dims_create and the reorder == true cases in Cart_create and Graph_create). - New memory allocation macros, MPIU_CHK[PL]MEM_*(), have been added to help prevent memory leaks. See mpich2/src/include/mpir_mem.h. - New error reporting macros, MPIU_ERR_*, have been added to simplify the error handling throughout the code, making the code easier to read. See mpich2/src/include/mpir_err.h. - Interprocess communication using the Sock interface (sock and ssm channels) may now be bound to a particular destination interface using the environment variable MPICH_INTERFACE_HOSTNAME. The variable needs to be set for each process for which the destination interface is not the default interface. (Other mechanisms for destination interface selection will be provided in future releases.) Both MPD and SMPD provide a simpler mechanism for specifying the interface. See the user documentation.
- Too many bug fixes to describe. Many thanks go to the users who reported bugs. Their patience and understanding as we attempted to recreate the problems and solve them are greatly appreciated. =============================================================================== Changes in 1.0 =============================================================================== - MPICH2 now works on Solaris. - The User's Guide has been expanded considerably. The Installation Guide has been expanded some as well. - MPI_COMM_JOIN has been implemented, although, like the other dynamic process routines, it is only supported by the Sock channel. - MPI_COMM_CONNECT and MPI_COMM_ACCEPT are now allowed to connect with remote processes to which they are already connected. - Shared libraries can now be built (and used) on IA32 Linux with the GNU compilers (--enable-sharedlibs=gcc), and on Solaris with the native Sun Workshop compilers (--enable-sharedlibs=solaris). They may also work on other operating systems with GCC, but that has not been tested. Previous restrictions disallowing C++ and Fortran bindings when building shared libraries have been removed. - The dataloop and datatype contents code has been improved to address alignment issues on all platforms. - A bug in the datatype code, which handled zero block length cases incorrectly, has been fixed. - A segmentation fault in the datatype memory management, resulting from freeing memory twice, has been fixed. - The following changes were made to the MPD process manager: - MPI_SPAWN_MULTIPLE now works with MPD. - The arguments to the 'mpiexec' command supplied by the MPD have changed. First, the -default option has been removed. Second, more flexible ways to pass environment variables have been added. - The commands 'mpdcheck' and 'testconfig' have been added to installations using MPD. These commands test the setup of the machines on which you wish to run MPICH2 jobs.
They help to identify misconfiguration, firewall issues, and other communication problems. - Support for MPI_APPNUM and MPI_UNIVERSE_SIZE has been added to the Simple implementation of PMI and the MPD process manager. - In general, error detection and recovery in MPD have improved. - A new process manager, gforker, is now available. Like the forker process manager, gforker spawns processes using fork(), and thus is quite useful on SMP machines. However, unlike forker, gforker supports all of the features of a standard mpiexec, plus some. Therefore, it should be used in place of the previous forker process manager, which is now deprecated. - The following changes were made to ROMIO: - The amount of duplicated ROMIO code in the close, resize, preallocate, read, write, asynchronous I/O, and sync routines has been substantially reduced. - A bug in the flattening code, triggered by nested datatypes, has been fixed. - Some small memory leaks have been fixed. - The error handling has been abstracted, allowing different MPI implementations to handle and report error conditions in their own way. Using this abstraction, the error handling routines have been made consistent with the rest of MPICH2. - AIO support has been cleaned up and unified. It now works correctly on Linux, and is properly detected on old versions of AIX. - A bug in MPI_File_seek code, and underlying support code, has been fixed. - Support for PVFS2 has improved. - Several dead file systems have been removed. Others, including HFS, SFS, PIOFS, and Paragon, have been deprecated. - MPE and CLOG have been updated to version 2.1. For more details, please see src/mpe2/README. - New macros for memory management were added to support function-local allocations (alloca), to roll back pending allocations when error conditions are detected to avoid memory leaks, and to improve the conciseness of code performing memory allocations. - New error handling macros were added to make internal error handling code more concise.
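The "roll back pending allocations on error" idea behind the memory management macros noted above can be sketched in plain C. This is only an illustration of the general technique, not the MPICH2 macros themselves: every name here (chk_list, CHK_DECL, chk_malloc, chk_rollback, chk_commit, make_buffers) and the fixed CHK_MAX capacity are hypothetical and chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; these are NOT the MPICH2 macros. */
#define CHK_MAX 8               /* assumed upper bound on tracked allocations */

typedef struct {
    void *ptrs[CHK_MAX];        /* allocations made so far in this function */
    int count;                  /* number of valid entries in ptrs */
} chk_list;

/* Declare an empty tracking list local to the calling function. */
#define CHK_DECL(list) chk_list list = { {0}, 0 }

/* Allocate and record the pointer; returns NULL on failure or if the
 * tracking list is already full. */
static void *chk_malloc(chk_list *list, size_t nbytes)
{
    void *p;
    assert(list != NULL);
    if (list->count >= CHK_MAX)
        return NULL;
    p = malloc(nbytes);
    if (p != NULL)
        list->ptrs[list->count++] = p;
    return p;
}

/* Error path: free everything recorded so far, newest first. */
static void chk_rollback(chk_list *list)
{
    while (list->count > 0)
        free(list->ptrs[--list->count]);
}

/* Success path: forget the pointers; ownership passes to the caller. */
static void chk_commit(chk_list *list)
{
    list->count = 0;
}

/* Example user: allocate two buffers as a unit. If fail_second is nonzero,
 * the second allocation is treated as failed to exercise the error path. */
static int make_buffers(size_t n, int fail_second, int **a_out, double **b_out)
{
    CHK_DECL(pending);
    int *a;
    double *b;

    a = chk_malloc(&pending, n * sizeof *a);
    if (a == NULL)
        goto fn_fail;

    b = fail_second ? NULL : chk_malloc(&pending, n * sizeof *b);
    if (b == NULL)
        goto fn_fail;

    chk_commit(&pending);
    *a_out = a;
    *b_out = b;
    return 0;

  fn_fail:
    chk_rollback(&pending);     /* releases a if it was already allocated */
    *a_out = NULL;
    *b_out = NULL;
    return -1;
}
```

A caller checks the return value of make_buffers and frees both buffers itself on success; on failure nothing is left allocated, so the single error label replaces a separate cleanup branch for each allocation.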

===============================================================================
                               Changes in 0.971
===============================================================================

- Code restricted by copyrights less flexible than the one described in the
  COPYRIGHT file has been removed.

- Installation and User Guides have been added.

- The SMPD PMI Wire Protocol Reference Manual has been updated.

- To eliminate portability problems, common blocks in mpif.h that spanned
  multiple lines were broken up into multiple common blocks, each described
  on a single line.

- A new command, mpich2version, was added to allow the user to obtain
  information about the MPICH2 installation. This command is currently a
  simple shell script. We anticipate that the mpich2version command will
  eventually provide additional information such as the patches applied and
  the date of the release.

- The following changes were made to MPD2:

  - Support was added for MPI's "singleton init", in which a single process
    started in the normal way (i.e., not by mpiexec or mpirun) becomes an
    MPI process with an MPI_COMM_WORLD of size one by calling MPI_Init.
    After this the process can call other MPI functions, including
    MPI_Comm_spawn.

  - The format for some of the arguments to mpiexec has changed, especially
    for passing environment variables to MPI processes.

  - In addition to miscellaneous hardening, better error checking and
    messages have been added.

  - The install process has been improved. In particular, configure has been
    updated to check for a working install program and supply its own
    installation script (install.sh) if necessary.

  - A new program, mpdcheck, has been added to help diagnose machine
    configurations that might be erroneous or at least confusing to mpd.

  - Runtime version checking has been added to ensure that the Simple
    implementation of PMI linked into the application and the MPD process
    manager being used to run that application are compatible.

  - Minor improvements have been made to mpdboot.

  - Support for the (now deprecated) BNR interface has been added to allow
    MPICH1 programs to also be run via MPD2.

- Shared libraries are now supported on Linux systems using the GNU
  compilers, with the caveat that C++ support must be disabled
  (--disable-cxx).

- The CH3 interface and device now provide a mechanism for using RDMA
  (remote direct memory access) to transfer data between processes.

- Logging capabilities for MPI and internal routines have been re-added. See
  the documentation in doc/logging for details.

- A "meminit" option was added to --enable-g to force all bytes associated
  with a structure or union to be initialized prior to use. This prevents
  programs like Valgrind from complaining about uninitialized accesses.

- The dist-with-version and snap targets in the top-level Makefile.sm now
  properly produce mpich2-/maint/Version instead of mpich2-/Version. In
  addition, they now properly update the VERSION variable in Makefile.sm
  without clobbering the sed line that performs the update.

- The dist and snap targets in the top-level Makefile.sm now both use the
  dist-with-version target to avoid inconsistencies.

- The following changes were made to simplemake:

  - The environment variables DEBUG, DEBUG_DIRS, and DEBUG_CONFDIR can now
    be used to control debugging output.

  - Many fixes were made to simplemake so that it would run cleanly with
    perl -w.

  - Installation of *all* files from a directory is now possible (for
    example, installing all of the man pages).

  - The clean targets now remove the cache files produced by newer versions
    of autoconf.

  - For files that are created by configure, the determination of the
    location of that configure has been improved, so that make of those
    files (e.g., make Makefile) is more likely to work. There is still more
    to do here.

  - Short loops over subdirectories are now unrolled.

  - The maintainerclean target has been renamed to maintainer-clean to match
    GNU guidelines.

  - The distclean and maintainer-clean targets have been improved.

  - An option was added to perform one ar command per directory instead of
    one per file when creating the profiling version of routines (needed
    only for systems that do not support weak symbols).

===============================================================================
                               Changes in 0.97
===============================================================================

- MPI-2 one-sided communication has been implemented in the CH3 device.

- mpigdb works as a simple parallel debugger for MPI programs started with
  mpd. New since MPICH1 is the ability to attach to running parallel
  programs. See the README in mpich2/src/pm/mpd for details.

- MPI_Type_create_darray() and MPI_Type_create_subarray() have been
  implemented, including the right contents and envelope data.

- ROMIO flattening code now supports subarray and darray combiners.

- The scalability and performance of some ROMIO PVFS and PVFS2 routines have
  been improved.

- An error message string parameter was added to MPID_Abort(). If the
  parameter is non-NULL, this string will be used as the message with the
  abort output. Otherwise, the output message will be based on the error
  message associated with the mpi_errno parameter.

- MPID_Segment_init() now takes an additional boolean parameter that
  specifies if the segment processing code is to produce/consume homogeneous
  (FALSE) or heterogeneous (TRUE) data.

- The definitions of MPID_VCR and MPID_VCRT are now defined by the device.

- The semantics of MPID_Progress_{start,wait,end}() have changed. A typical
  blocking progress loop now looks like the following.

    if (req->cc != 0)
    {
        /* The request is still pending; progress state is thread local. */
        MPID_Progress_state progress_state;

        MPID_Progress_start(&progress_state);
        while (req->cc != 0)
        {
            mpi_errno = MPID_Progress_wait(&progress_state);
            if (mpi_errno != MPI_SUCCESS)
            {
                /* --BEGIN ERROR HANDLING-- */
                MPID_Progress_end(&progress_state);
                goto fn_fail;
                /* --END ERROR HANDLING-- */
            }
        }
        MPID_Progress_end(&progress_state);
    }

  NOTE: each of these routines now takes a single parameter, a pointer to a
  thread local state variable.

- The CH3 device and interface have been modified to better support
  MPI_COMM_{SPAWN,SPAWN_MULTIPLE,CONNECT,ACCEPT,DISCONNECT}. Channel writers
  will notice the following. (This is still a work in progress. See the note
  below.)

  - The introduction of a process group object (MPIDI_PG_t) and a new set of
    routines to manipulate that object.

  - The renaming of the MPIDI_VC object to MPIDI_VC_t to make it more
    consistent with the naming of other objects in the device.

  - The process group information in the MPIDI_VC_t moved from the channel
    specific portion to the device layer.

  - MPIDI_CH3_Connection_terminate() was added to the CH3 interface to allow
    the channel to properly shut down a connection before the device deletes
    all associated data structures.

  - A new upcall routine, MPIDI_CH3_Handle_connection(), was added to allow
    the channel to notify the device when a connection related event has
    completed. At present the only event is MPIDI_CH3_VC_EVENT_TERMINATED,
    which notifies the device that the underlying connection associated with
    a VC has been properly shut down. For every call to
    MPIDI_CH3_Connection_terminate() that the device makes, the channel must
    make a corresponding upcall to MPIDI_CH3_Handle_connection().
    MPID_Finalize() will likely hang if this rule is not followed.

  - MPIDI_CH3_Get_parent_port() was added to provide MPID_Init() with the
    port name of the parent (spawner). This port name is used by MPID_Init()
    and MPID_Comm_connect() to create an intercommunicator between the
    parent (spawner) and child (spawnee).
    Eventually, MPID_Comm_spawn_multiple() will be updated to perform the
    reverse logic; however, the logic is presently still in the sock
    channel.

  Note: the changes noted are relatively fresh and are the beginning of a
  set of future changes. The goal is to minimize the amount of code required
  by a channel to support MPI dynamic process functionality. As such,
  portions of the device will change dramatically in a future release. A few
  more changes to the CH3 interface are also quite likely.

- MPIDI_CH3_{iRead,iWrite}() have been removed from the CH3 interface.
  MPIDI_CH3U_Handle_recv_pkt() now returns a receive request with a
  populated iovec to receive data associated with the request.
  MPIDI_CH3U_Handle_{recv,send}_req() reload the iovec in the request,
  return, and set the complete argument to TRUE if more data is to be read
  or written. If data transfer for the request is complete, the complete
  argument must be set to FALSE.

===============================================================================
                               Changes in 0.96p2
===============================================================================

The shm and ssm channels have been added back into the distribution.
Officially, these channels are supported only on x86 platforms using the gcc
compiler. The necessary assembly instructions to guarantee proper ordering
of memory operations are lacking for other platforms and compilers. That
said, we have seen a high success rate when testing these channels on
unsupported systems.

This patch release also includes a new unsupported channel. The scalable
shared memory, or sshm, channel is similar to the shm channel except that it
allocates shared memory communication queues only when necessary instead of
preallocating N-squared queues.
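
The difference between preallocating every queue and allocating on demand
can be sketched as follows. This is an illustration of the general
lazy-allocation idea only, not the sshm channel's code: the names
(queue_t, queue_table_t, table_init, table_get, table_free) are hypothetical,
and ordinary heap memory stands in for shared memory segments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define QUEUE_SLOTS 16

/* Hypothetical per-peer message queue. */
typedef struct {
    int head, tail;
    int slots[QUEUE_SLOTS];
} queue_t;

/* Per-process table: one pointer per peer, NULL until first use. */
typedef struct {
    int       nprocs;
    int       allocated;   /* number of queues actually created */
    queue_t **queues;
} queue_table_t;

/* Creates the pointer table only; no queue memory is allocated yet. */
int table_init(queue_table_t *t, int nprocs)
{
    int i;

    t->nprocs = nprocs;
    t->allocated = 0;
    t->queues = malloc((size_t) nprocs * sizeof(queue_t *));
    if (t->queues == NULL) return -1;
    for (i = 0; i < nprocs; i++) t->queues[i] = NULL;
    return 0;
}

/* Returns the queue for a peer, creating it on the first request. */
queue_t *table_get(queue_table_t *t, int peer)
{
    if (peer < 0 || peer >= t->nprocs) return NULL;
    if (t->queues[peer] == NULL) {
        t->queues[peer] = calloc(1, sizeof(queue_t));
        if (t->queues[peer] != NULL) t->allocated++;
    }
    return t->queues[peer];
}

/* Releases every queue that was created, then the table itself. */
void table_free(queue_table_t *t)
{
    int i;

    for (i = 0; i < t->nprocs; i++) free(t->queues[i]);
    free(t->queues);
    t->queues = NULL;
    t->allocated = 0;
}
```

With N processes that each talk to only a few peers, the memory used grows
with the number of active pairs rather than with N squared.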

===============================================================================
                               Changes in 0.96p1
===============================================================================

This patch release fixes a problem with building MPICH2 on Microsoft Windows
platforms. It also corrects a serious bug in the poll implementation of the
Sock interface.

===============================================================================
                               Changes in 0.96
===============================================================================

The 0.96 distribution is largely a bug fix release. In addition to the many
bug fixes, major improvements have been made to the code that supports the
dynamic process management routines (MPI_Comm_{connect,accept,spawn,...}()).
Additional changes are still required to support MPI_Comm_disconnect().

We also added an experimental (and thus completely unsupported) rdma device.
The internal interface is similar to the CH3 interface except that it
contains a couple of extra routines to inform the device about data
transfers using the rendezvous protocol. The channel can use this extra
information to pin memory and perform a zero-copy transfer. If all goes
well, the results will be rolled back into the CH3 device.

Due to last minute difficulties, this release does not contain the shm or
ssm channels. These channels will be included in a subsequent patch release.

===============================================================================
                               Changes in 0.94
===============================================================================

Active target one-sided communication is now available for the ch3:sock
channel. This new functionality has undergone some correctness testing but
has not been optimized in terms of performance. Future releases will include
performance enhancements, passive target communication, and availability in
channels other than just ch3:sock.

The shared memory channel (ch3:shm), which performs communication using
shared memory on a single machine, is now complete and has been extensively
tested. At present, this channel only supports IA32 based machines
(excluding the Pentium Pro, which has a memory ordering bug). In addition,
this channel must be compiled with gcc. Future releases will support
additional architectures and compilers.

A new channel has been added that performs inter-node communication using
sockets (TCP/IP) and intra-node communication using shared memory. This
channel, ch3:ssm, is ideal for clusters of SMPs. Like the shared memory
channel (ch3:shm), this channel only supports IA32 based machines and must
be compiled with gcc. In future releases, the ch3:ssm channel will support
additional architectures and compilers.

The two channels that perform communication using shared memory, ch3:shm and
ch3:ssm, now support the allocation of shared memory using both the POSIX
and System V interfaces. The POSIX interface will be used if available;
otherwise, the System V interface is used.

In the interest of increasing portability, many enhancements have been made
to both the code and the configure scripts.

And, as always, many bugs have been fixed :-).

***** INTERFACE CHANGES ****

The parameters to MPID_Abort() have changed. MPID_Abort() now takes a
pointer to a communicator object, an MPI error code, and an exit code.

MPIDI_CH3_Progress() has been split into two functions:
MPIDI_CH3_Progress_wait() and MPIDI_CH3_Progress_test().

===============================================================================
                               Changes in 0.93
===============================================================================

Version 0.93 has undergone extensive changes to provide better error
reporting. Part of these changes involved modifications to the ADI3 and CH3
interfaces.
The following routines now return MPI error codes:

    MPID_Cancel_send()
    MPID_Cancel_recv()
    MPID_Progress_poke()
    MPID_Progress_test()
    MPID_Progress_wait()
    MPIDI_CH3_Cancel_send()
    MPIDI_CH3_Progress()
    MPIDI_CH3_Progress_poke()
    MPIDI_CH3_iRead()
    MPIDI_CH3_iSend()
    MPIDI_CH3_iSendv()
    MPIDI_CH3_iStartmsg()
    MPIDI_CH3_iStartmsgv()
    MPIDI_CH3_iWrite()
    MPIDI_CH3U_Handle_recv_pkt()
    MPIDI_CH3U_Handle_recv_req()
    MPIDI_CH3U_Handle_send_req()

*******************************************************************************
Of special note are MPID_Progress_test(), MPID_Progress_wait() and
MPIDI_CH3_Progress(), which previously returned an integer value indicating
if one or more requests had completed. They no longer return this value and
instead return an MPI error code (also an integer). The implication is that
while the semantics changed, the type signatures did not, so a compiler will
not flag callers that still interpret the return value as a completion
indicator.
*******************************************************************************

The function used to create error codes, MPIR_Err_create_code(), has also
changed. It now takes additional parameters, allowing it to create a stack
of errors and making it possible for the reporting function to indicate in
which function and on which line the error occurred. It also allows an error
to be designated as fatal or recoverable. Fatal errors always result in
program termination regardless of the error handler installed by the
application.

An RDMA channel has been added and includes communication methods for shared
memory and shmem. This is a recent development and the RDMA interface is
still in flux.
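
The idea of an error code that carries a stack of function/line frames and a
fatal flag, as described above for MPIR_Err_create_code(), can be sketched in
a few lines. This is a simplified illustration under stated assumptions, not
the MPICH2 implementation: all names (err_create_code, err_depth,
err_is_fatal, err_print, low_level, high_level) and the 1-based index
encoding are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

#define ERR_SUCCESS    0
#define ERR_MAX_FRAMES 16

/* One frame per error: where it was raised and which error it wraps. */
typedef struct {
    const char *fcname;
    int         line;
    int         fatal;
    int         prev;   /* wrapped error code, or ERR_SUCCESS */
} err_frame_t;

static err_frame_t err_frames[ERR_MAX_FRAMES];
static int err_nframes = 0;

/* Pushes a frame on top of lastcode. Codes are 1-based frame indices.
 * If the table is full, the new frame is dropped and the previous code
 * (or -1 when there is none) is returned so the error is not lost. */
int err_create_code(int lastcode, int fatal, const char *fcname, int line)
{
    err_frame_t *f;

    if (err_nframes >= ERR_MAX_FRAMES)
        return lastcode > 0 ? lastcode : -1;
    f = &err_frames[err_nframes++];
    f->fcname = fcname;
    f->line = line;
    f->fatal = fatal;
    f->prev = lastcode;
    return err_nframes;
}

/* Number of frames reachable from code. */
int err_depth(int code)
{
    int n = 0;

    for (; code > 0; code = err_frames[code - 1].prev) n++;
    return n;
}

/* An error is treated as fatal if any frame in its chain is fatal. */
int err_is_fatal(int code)
{
    for (; code > 0; code = err_frames[code - 1].prev)
        if (err_frames[code - 1].fatal) return 1;
    return 0;
}

/* Prints the chain, most recent frame first. */
void err_print(int code, FILE *out)
{
    for (; code > 0; code = err_frames[code - 1].prev)
        fprintf(out, "%s(%d)%s\n", err_frames[code - 1].fcname,
                err_frames[code - 1].line,
                err_frames[code - 1].fatal ? " [fatal]" : "");
}

/* A low-level routine raises a recoverable error ... */
int low_level(void)
{
    return err_create_code(ERR_SUCCESS, 0, "low_level", __LINE__);
}

/* ... and its caller wraps it, marking the result fatal. */
int high_level(void)
{
    int e = low_level();

    if (e != ERR_SUCCESS)
        e = err_create_code(e, 1, "high_level", __LINE__);
    return e;
}
```

Calling err_print(high_level(), stderr) would list the high_level frame
followed by the low_level frame it wraps.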