Exploring and addressing non-deterministic floating point rounding errors in an industrial CFD application

Wednesday, June 1, 2022 1:24 PM to 1:28 PM · 4 min. (Europe/Berlin)
Hall D - 2nd Floor

Information

The exact results of most engineering simulations that use aggressive optimizations are affected by the non-associativity of IEEE floating-point operations. The effect is amplified in a parallel environment, where execution order is often left non-deterministic to improve parallel efficiency. As a result, runs can fail to reproduce each other's results even when the same compiler and the same parallel setup are used, which makes it harder to track down numerical issues or to adhere to strict validation requirements. In this work, we explore this effect on a full-scale industrial CFD application, Rolls-Royce Hydra. We demonstrate how different configurations, such as the use of different turbulence models or unsteady solvers, affect numerical results across different distributed runs. Using the OP2 DSL, we demonstrate how a deterministic order of execution can be enforced in distributed environments for the Hydra code, achieving bitwise reproducibility at a runtime overhead of 1.03x to 2.79x.
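
The underlying effect can be illustrated with a small sketch in plain Python. This is not OP2 or Hydra code, and it does not show the method used in this work: the function names and the list of per-rank partial sums are hypothetical, chosen only to show that the same values summed in a different arrival order can give a different result, while summing in a fixed order (here, by rank) gives the same bits every time.

```python
# Illustrative only: hypothetical per-rank partial sums, not OP2/Hydra code.

def sum_in_arrival_order(partials):
    """Accumulate (rank, value) pairs in whatever order they arrive."""
    total = 0.0
    for _rank, value in partials:
        total += value
    return total


def sum_in_rank_order(partials):
    """Accumulate (rank, value) pairs in a fixed order: ascending rank."""
    total = 0.0
    for _rank, value in sorted(partials, key=lambda p: p[0]):
        total += value
    return total


# Floating-point addition is not associative.
a, b, c = 0.1, 0.2, 0.3
print((a + b) + c == a + (b + c))  # False

# The same four partial sums, arriving in two different orders.
partials = [(0, 1e16), (1, 1.0), (2, -1e16), (3, 1.0)]
arrival_a = partials
arrival_b = [partials[0], partials[2], partials[1], partials[3]]

print(sum_in_arrival_order(arrival_a))  # 1.0
print(sum_in_arrival_order(arrival_b))  # 2.0

# A fixed summation order removes the dependence on arrival order.
print(sum_in_rank_order(arrival_a) == sum_in_rank_order(arrival_b))  # True
```

Fixing the order makes the result reproducible, not more accurate: the rank-ordered sum here is 1.0, while the exact sum of the four values is 2.0.
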
Contributors:

  • Bálint Siklósi (Pázmány Péter Catholic University - Hungary)
  • Gihan Mudalige (University of Warwick)
  • István Reguly (Pázmány Péter Catholic University - Hungary)
Format
On-site