An IO500-based Workflow For User-centric I/O Performance Management
HPC Workflows
Information
Contributors:
Abstract:
I/O performance in a multi-user environment is challenging to predict. It is hard for users to know what to expect when running and tuning their application for better I/O performance. In this project, we evaluate IO500 as the basis of a user-centric workflow that helps users manage their expectations of their application’s I/O performance and devise an optimization strategy specific to the target cluster’s capabilities.
The IO500 benchmark is a standard benchmark for HPC storage systems and is designed to give a balanced view of a system’s performance. In our workflow, we use the IO500 ‘easy’ and ‘hard’ scenarios to obtain the best- and worst-case performance of the cluster’s bandwidth and metadata rate. We then create a bounding box of user expectations from these scenarios and map the application’s I/O performance within this box. With the mapped I/O performance, we can identify which part of the application needs improvement and the extent to which it can be improved.
Our experiments confirm that such a bounding box of user expectations can be constructed. This project is therefore a promising first step towards mapping and improving an application’s I/O performance. Our experiments also show that the IO500 benchmark omits the I/O library and middleware aspects needed to represent the actual worst-case scenario, and we observed tail latency that needs to be addressed.
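To illustrate the mapping step described in the abstract, the following is a minimal sketch, assuming the IO500 ‘easy’ scores act as the per-metric upper bound and the ‘hard’ scores as the lower bound; an application’s measured bandwidth and metadata rate are then placed within that box. The class, function names, and all numbers below are hypothetical placeholders, not the project’s actual tooling.

```python
from dataclasses import dataclass


@dataclass
class IOPerf:
    """A pair of I/O metrics: aggregate bandwidth and metadata rate."""
    bandwidth_gib_s: float   # aggregate bandwidth in GiB/s
    metadata_kiops: float    # metadata rate in kIOPS


def position_in_box(app: IOPerf, hard: IOPerf, easy: IOPerf) -> dict:
    """Map each metric to [0, 1]: 0 = worst case ('hard'), 1 = best case ('easy')."""
    def normalize(value: float, lo: float, hi: float) -> float:
        return (value - lo) / (hi - lo)

    return {
        "bandwidth": normalize(app.bandwidth_gib_s,
                               hard.bandwidth_gib_s, easy.bandwidth_gib_s),
        "metadata": normalize(app.metadata_kiops,
                              hard.metadata_kiops, easy.metadata_kiops),
    }


if __name__ == "__main__":
    # Hypothetical cluster bounds taken from IO500 'hard' and 'easy' runs.
    hard = IOPerf(bandwidth_gib_s=0.5, metadata_kiops=5.0)
    easy = IOPerf(bandwidth_gib_s=12.0, metadata_kiops=150.0)
    # Hypothetical application measurement (e.g. from an I/O profiler).
    app = IOPerf(bandwidth_gib_s=3.1, metadata_kiops=20.0)

    for metric, score in position_in_box(app, hard, easy).items():
        print(f"{metric}: {score:.2f} of the way from worst to best case")
```

A score near 0 on one axis (here, metadata) would suggest that this aspect of the application has the most headroom for tuning on the target cluster.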