State-of-the-Art High Performance MPI Libraries and Slingshot Networking

Monday, May 22, 2023 3:00 PM to Wednesday, May 24, 2023 5:00 PM · 2 days 2 hr. (Europe/Berlin)
Foyer D-G - 2nd Floor
Research Poster
Emerging HPC Processors and Accelerators · Exascale Systems · HPC Workflows · Managing Extreme-Scale Parallelism

Information

Many top supercomputers use InfiniBand networking across nodes to scale out performance. The underlying interconnect technology is critical to achieving high performance, low latency, and high throughput at scale on next-generation exascale systems. The deployment of Slingshot networking on new exascale systems, such as Frontier at OLCF and the upcoming El Capitan at LLNL, poses several challenges. State-of-the-art MPI libraries for GPU-aware and CPU-based communication must be adapted and optimized for Slingshot networking, in particular by supporting the underlying HPE Cray fabric and adapter so that they function over the Slingshot-11 interconnect. This creates a need for a thorough evaluation and understanding of Slingshot networking with regard to MPI-level performance, in order to deliver efficient performance and scalability on exascale systems. In this work, we present a comprehensive evaluation of Slingshot-10 and Slingshot-11 networking with state-of-the-art MPI libraries and examine the challenges this newer ecosystem poses.
Contributors:
Format: On-site
Beginner Level: 10%
Intermediate Level: 30%
Advanced Level: 60%