Power Consumption Trends in Supercomputers: A Study of NERSC's Cori and Perlmutter Machines
Wednesday, May 15, 2024 10:45 AM to 11:10 AM · 25 min. (Europe/Berlin)
Hall F - 2nd floor
Research Paper
Energy Management · Extreme-scale Systems · Heterogeneous System Architectures · Sustainability and Energy Efficiency · System and Performance Monitoring
Information
The rising power demands of supercomputers make it increasingly important to understand the underlying sources of power use.
We compare a comprehensive set of power measurements covering six months from two supercomputers, the Cori and Perlmutter machines at the National Energy Research Scientific Computing Center (NERSC).
We show that power usage varies considerably, and is always significantly below the peak provisioned power.
Several factors cause this: the machine may not be fully utilized, applications' computational characteristics are not those that maximize power usage, and/or applications can be waiting on resources external to the node.
Our analysis shows that while the power usage of applications in the same science domain is similar, the power usage of the same application run by different users is even more similar.
As NERSC transitioned to GPU accelerated nodes, the peak power capabilities increased, but the production workload's power demands did not increase at the same rate, further decreasing the fraction of thermal design power (TDP) used.
These results indicate that future machines could be power capped and over-provisioned, and that a metric other than thermal peak design, aligned with the actual power needs of production workloads, is needed for future procurement.
These results suggest that with appropriate technologies, such as power-aware scheduling or dynamic power management, future HPC systems could be operated with power caps well below TDP, avoiding the high cost of over-provisioned infrastructure.
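As a rough illustration of the quantity discussed above, the fraction of TDP that a workload actually draws, the following Python sketch computes mean and peak power as a share of TDP and the share of samples that would exceed a candidate power cap. It is not the authors' analysis code: the function names, the sample values, the TDP, and the cap are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import mean


def tdp_fraction(samples_watts, tdp_watts):
    """Return (mean, peak) power as fractions of the thermal design power."""
    if tdp_watts <= 0:
        raise ValueError("tdp_watts must be positive")
    if not samples_watts:
        raise ValueError("need at least one power sample")
    return mean(samples_watts) / tdp_watts, max(samples_watts) / tdp_watts


def share_above_cap(samples_watts, cap_watts):
    """Return the fraction of samples that exceed a candidate power cap."""
    if not samples_watts:
        raise ValueError("need at least one power sample")
    over = sum(1 for p in samples_watts if p > cap_watts)
    return over / len(samples_watts)


# Hypothetical node power samples in watts; NOT measurements from Cori or Perlmutter.
samples = [400.0, 500.0, 600.0, 500.0]
tdp = 1000.0  # hypothetical node TDP in watts
cap = 550.0   # hypothetical power cap in watts

mean_frac, peak_frac = tdp_fraction(samples, tdp)
above = share_above_cap(samples, cap)
print(f"mean {mean_frac:.2f} of TDP, peak {peak_frac:.2f} of TDP, "
      f"{above:.0%} of samples above the cap")
```

With real telemetry, the same two numbers could be computed per node or per job to judge how far below TDP a cap could sit before it affects a meaningful share of the workload.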
Contributors:
Format
On-site · On Demand
Documents & Links
Read the Full Paper Open Access at IEEE Xplore!