Determining Parallel Application Execution Efficiency and Scaling using the POP Methodology

Information

HPC application developers face significant challenges getting their codes to run correctly on leadership computer systems, which consist of large numbers of interconnected multi-socket, multicore processor nodes, often with attached accelerator devices. They also need effective tools and methods to track and assess the execution performance of their codes as they prepare for production on current or prospective exascale computer systems.

This tutorial presents the methodology developed and applied over several years within the EU HPC Centre of Excellence Performance Optimisation and Productivity (POP). Its focus is the hierarchy of execution efficiency and scaling metrics that identify the most critical issues and quantify the potential benefits of remedies. The metrics can be readily determined by a variety of tools, and compared, for applications written in any language that employ standard MPI, OpenMP, and other multi-threading and offload paradigms. Widely deployed open-source tools will be used to demonstrate this process with provided performance measurements of actual HPC application executions (ranging from CFD to neuroscience). Tutorial participants can repeat the analysis on their own computers, which prepares them to locate and diagnose efficiency and scalability issues in their own parallel application codes.
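As a rough illustration of the top of such a metric hierarchy, the sketch below computes three multiplicative efficiencies from per-process timings. It assumes the commonly published POP definitions (load balance as average over maximum useful computation time, communication efficiency as maximum useful computation time over total runtime, and parallel efficiency as their product). The function name `pop_metrics`, the `PopMetrics` container, and the sample timings are hypothetical and are not taken from the tutorial material or from any specific tool.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class PopMetrics:
    """Top-level efficiencies, each a fraction between 0 and 1."""
    load_balance: float
    communication_efficiency: float
    parallel_efficiency: float


def pop_metrics(useful_times: Sequence[float], runtime: float) -> PopMetrics:
    """Compute efficiencies from per-process useful computation times.

    useful_times: time each process spent in useful computation
                  (outside MPI/OpenMP runtime calls), in seconds.
    runtime:      total wall-clock time of the measured region, in seconds.
    """
    if not useful_times:
        raise ValueError("need at least one process")
    if runtime <= 0:
        raise ValueError("runtime must be positive")

    avg_useful = sum(useful_times) / len(useful_times)
    max_useful = max(useful_times)
    if max_useful <= 0:
        raise ValueError("useful computation time must be positive")

    # Load balance: how evenly useful work is spread over processes.
    load_balance = avg_useful / max_useful
    # Communication efficiency: share of the runtime that the busiest
    # process spends in useful computation.
    communication_efficiency = max_useful / runtime
    # Parallel efficiency is the product, equal to avg_useful / runtime.
    parallel_efficiency = load_balance * communication_efficiency

    return PopMetrics(load_balance, communication_efficiency, parallel_efficiency)


# Hypothetical timings for four processes in a 10-second run.
metrics = pop_metrics([8.0, 6.0, 7.0, 7.0], runtime=10.0)
print(f"Load balance:             {metrics.load_balance:.3f}")              # 0.875
print(f"Communication efficiency: {metrics.communication_efficiency:.3f}")  # 0.800
print(f"Parallel efficiency:      {metrics.parallel_efficiency:.3f}")       # 0.700
```

Because the factors multiply, a low parallel efficiency can be traced to whichever factor is smallest. In this made-up case communication efficiency (0.800) is lower than load balance (0.875), so it would be the first candidate for closer inspection.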