Characterizing Containerized HPC Application Performance at Petascale on CPU and GPU Architectures
Monday, June 28, 2021 1:40 PM to 2:00 PM · 20 min. (Africa/Abidjan)
Stream#4
Exascale Systems
Information
Contributors:
- John Cazes (Texas Advanced Computing Center)
- Richard Todd Evans (Texas Advanced Computing Center)
- John Fonner (Texas Advanced Computing Center)
- Stephen Harrell (Texas Advanced Computing Center)
- Tommy Minyard (Texas Advanced Computing Center)
- Matt Vaughn (Texas Advanced Computing Center)
- Gregory Zynda (Texas Advanced Computing Center)
- Amit Ruhela (Texas Advanced Computing Center)
Abstract:
Containerization technologies provide a mechanism to encapsulate applications and many of their dependencies, facilitating software portability and reproducibility on HPC systems. However, accessing many of the architectural features that enable HPC system performance requires compatibility between certain components of the container and the host, resulting in a trade-off between portability and performance. In this work, we discuss our experiences running three state-of-the-art containerization technologies on five leading petascale systems. We describe how we build the containers to ensure performance and security, and we report their performance at scale. We ran microbenchmarks on 6,144 nodes with 0.35M MPI processes to baseline the performance of the container technologies. We establish the near-native performance and minimal memory overhead of the containerized environments using MILC, a lattice quantum chromodynamics code, at 139,968 processes and VPIC, a 3D electromagnetic relativistic Vector Particle-In-Cell code for modeling kinetic plasmas, at 32,768 processes. We demonstrate on-par performance trends at large scale on Intel, AMD, and three NVIDIA architectures for both HPC applications.
Speakers
Amit Ruhela
Manager HPC Tools, TACC