TinyProf: Towards Continuous Performance Introspection through Scalable Parallel I/O

Wednesday, May 15, 2024 1:25 PM to 1:50 PM · 25 min. (Europe/Berlin)
Hall F - 2nd floor
Research Paper
Performance Measurement · Performance Tools and Simulators · System and Performance Monitoring

Information

Performance profiling tools are crucial for HPC specialists to identify performance bottlenecks in parallel codes at various levels of granularity (i.e., across nodes, ranks, and threads). Although numerous sophisticated profiling tools have been developed, scalable performance introspection at large scale remains a challenge. The difficulty is most evident in two places: writing profiles to disk efficiently at runtime, and subsequently reading them for post-hoc analysis with constrained computing resources. In this paper, we present TinyProf, a performance introspection framework that tackles the I/O-related challenges of profiling performance data at scale. TinyProf's scalability is attributed to an optimal runtime that consists of three key components: (1) an efficient in-memory data structure that minimizes memory consumption and decreases communication overhead during parallel file I/O; (2) a customizable three-phase I/O scheme that generates optimal I/O patterns capable of scaling to high core counts; and (3) a streamlined data format for profiles, which guarantees minimal profile file sizes.
Format
On-site · On Demand