EASEY Deployment Analysis of containerized applications with BPF
Exascale Systems, HPC Workflows
Information
Choosing a likely optimal deployment configuration for containerized high-performance applications is an important building block towards optimal usage of supercomputing systems. Investigating the scalability and performance of such applications raises questions about the number of tasks and nodes and, overall, about the tradeoff between hardware usage and time-to-solution. So far, only isolated solutions are available, and they do not fit systems that are almost Exascale-ready.
Therefore, we propose a comprehensive benchmarking framework based on EASEY (Enable exAScale for EverYone) to support the decision process. Our system is capable of preparing and deploying containerized applications, applying auto-tuning mechanisms of the target systems, and analyzing the performance of different execution settings. This automatic procedure assists the application owner in determining the likely optimal deployment configuration. In addition, the user or the system owner can balance and shift the usage of the target hardware.
Profiling the runtime behavior of an unknown application is still a subject of ongoing research, and adding a container framework makes it even more complex. With the Berkeley Packet Filter (BPF) we found a low-level tap point to extract the necessary data while executing only small chunks of the application. This data trains a regression model that predicts the characterization of the containerized application over its complete runtime until solution.
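The idea of extrapolating from short executions can be illustrated with a minimal sketch. It assumes, purely for illustration, that each short run yields a pair (fraction of the total work executed, observed value of some counter-derived quantity such as elapsed seconds) and that a simple least-squares line is an adequate model. The function names, the sample layout, and the choice of a linear model are hypothetical and are not taken from the EASEY framework.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_full_run(samples, full_work=1.0):
    """Extrapolate the observed quantity to the complete run.

    samples: list of (work_fraction, observed_value) pairs gathered
             from short partial executions (hypothetical layout).
    """
    xs = [fraction for fraction, _ in samples]
    ys = [value for _, value in samples]
    slope, intercept = fit_line(xs, ys)
    return slope * full_work + intercept


if __name__ == "__main__":
    # Three short runs covering 1 %, 2 % and 4 % of the work (made-up numbers).
    short_runs = [(0.01, 1.2), (0.02, 2.2), (0.04, 4.2)]
    print(predict_full_run(short_runs))
```

A linear model is only the simplest possible choice here; applications with distinct phases would need more samples or a richer model.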
In this first stage of the analysis framework we focused only on CPU- and memory-heavy micro benchmarks. On our available target systems, we executed two benchmarks based on the seven dwarfs. We decided to use a simple metric to indicate how well a target system suits either CPU-bound or memory-bound applications. Each benchmark was executed on a single dedicated node only, since communication across nodes and islands will be part of a future version of this logic.
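The text does not name the metric that was used. As one possible illustration, the sketch below uses last-level cache misses per thousand instructions (MPKI) with a fixed threshold to label a measurement as CPU-bound or memory-bound. The metric, the threshold value, and all names are assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CounterSample:
    """Hardware counter readings from one sampling interval (hypothetical layout)."""
    instructions: int
    llc_misses: int


def misses_per_kilo_instruction(samples):
    """Aggregate last-level cache misses per 1000 retired instructions."""
    total_instructions = sum(s.instructions for s in samples)
    total_misses = sum(s.llc_misses for s in samples)
    if total_instructions == 0:
        raise ValueError("no instructions recorded")
    return 1000.0 * total_misses / total_instructions


def classify_bound(samples, threshold_mpki=10.0):
    """Label the workload; the threshold of 10 MPKI is an assumed placeholder."""
    mpki = misses_per_kilo_instruction(samples)
    return "memory-bound" if mpki >= threshold_mpki else "cpu-bound"


if __name__ == "__main__":
    compute_heavy = [CounterSample(1_000_000, 500), CounterSample(2_000_000, 1_000)]
    memory_heavy = [CounterSample(1_000_000, 50_000)]
    print(classify_bound(compute_heavy))
    print(classify_bound(memory_heavy))
```

In practice the threshold would have to be calibrated per target system, for example from the micro benchmark results themselves.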
By determining which of the two bound conditions applies, we are able to predict and select the computing system that is most likely optimal for this container execution, without knowing which exact applications and computations are inside. We treat the container as a black box while investigating its characteristics. The functions used are independent of the actual container and produce objective measurements. In this first stage, the predicted characterization is only used to determine the bound condition of the black-box container. In the next stages we aim to include communication patterns in particular to add more granularity. This will enable more complex characterizations and deployment decisions that also take into account the interconnects of the target systems between nodes and islands.
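The selection step can be sketched as a lookup over per-system benchmark scores. The example assumes that each target system has one normalized score per bound condition, derived from the CPU- and memory-heavy micro benchmarks, and that a higher score is better. The system names, score values, and dictionary layout are hypothetical.

```python
# Maps the bound condition of the black-box container to the score that matters.
SCORE_KEY = {"cpu-bound": "cpu_score", "memory-bound": "mem_score"}


def select_system(bound, system_scores):
    """Return the name of the system with the best score for the given bound.

    system_scores: {system_name: {"cpu_score": float, "mem_score": float}}
    """
    if bound not in SCORE_KEY:
        raise ValueError(f"unknown bound condition: {bound!r}")
    if not system_scores:
        raise ValueError("no target systems available")
    key = SCORE_KEY[bound]
    return max(system_scores, key=lambda name: system_scores[name][key])


if __name__ == "__main__":
    # Made-up scores for two hypothetical clusters.
    systems = {
        "cluster_a": {"cpu_score": 0.9, "mem_score": 0.4},
        "cluster_b": {"cpu_score": 0.6, "mem_score": 0.8},
    }
    print(select_system("cpu-bound", systems))
    print(select_system("memory-bound", systems))
```

Keeping the container opaque means the function only needs the bound label, not any knowledge of the application inside.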
The EASEY deployment recommendation offers, on the one hand, the possibility for system owners to optimize the usage of their clusters by knowing a priori where the bottleneck of an application is. On the other hand, it enables application owners to characterize their container applications for any deployment without prior knowledge of such a characterization. With a rising number of HPC users who might have no long-term experience in application optimization, our framework offers assistance and support at a certain level to characterize potential bottlenecks.