Compression for Scientific & Engineering Data

Exascale Systems

Information

Large-scale numerical simulations, observations, and experiments generate very large datasets that are difficult to analyze, store, and transfer. Data compression is an attractive and efficient way to significantly reduce the size of scientific datasets. This tutorial reviews the state of the art in lossy compression of scientific datasets, covering the most effective decorrelation, approximation, and coding techniques. It details the two leading compressors (SZ and ZFP), which offer both lossless and lossy compression. It also introduces compression error assessment metrics and the Z-checker tool for analyzing the difference between original and decompressed datasets. The tutorial offers hands-on exercises using SZ, ZFP, and Z-checker. It addresses the following questions: Why compression? How does compression work? How can the error introduced by lossy compressors be measured and controlled? Examples drawn from real-world compressors and scientific and engineering datasets illustrate the different compression techniques and their performance. The tutorial is given by two of the leading teams in this domain and is aimed primarily at beginners interested in learning about lossy compression for scientific data. This half-day tutorial has been refined based on evaluations of the highly rated tutorials given on this topic at ISC17, SC17, SC18, ISC19, and SC19.
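To give a flavor of the question of how compression error can be measured and controlled, here is a minimal sketch in Python. It is not the SZ or ZFP algorithm and does not use Z-checker: it assumes a plain uniform quantizer with an absolute error bound, uses zlib as a stand-in lossless coder, and computes three commonly used assessment metrics (maximum absolute error, PSNR, and compression ratio). All function names and the synthetic sine-wave data are hypothetical and chosen only for illustration.

```python
import math
import struct
import zlib


def quantize(values, error_bound):
    """Map each value to an integer code using bins of width 2 * error_bound."""
    step = 2.0 * error_bound
    return [round(v / step) for v in values]


def dequantize(codes, error_bound):
    """Reconstruct each value as the center of its quantization bin."""
    step = 2.0 * error_bound
    return [c * step for c in codes]


def max_abs_error(original, reconstructed):
    """Largest pointwise difference between original and reconstructed data."""
    return max(abs(a - b) for a, b in zip(original, reconstructed))


def psnr(original, reconstructed):
    """Peak signal-to-noise ratio in dB, using the data's value range as the peak."""
    value_range = max(original) - min(original)
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf
    return 20 * math.log10(value_range) - 10 * math.log10(mse)


def compression_ratio(original, codes):
    """Size of the raw float64 data divided by the size of the zlib-coded integers."""
    raw = struct.pack(f"<{len(original)}d", *original)
    packed = zlib.compress(struct.pack(f"<{len(codes)}i", *codes), 9)
    return len(raw) / len(packed)


# Synthetic smooth signal standing in for a scientific dataset (assumption).
data = [math.sin(0.01 * i) for i in range(10000)]
error_bound = 1e-3

codes = quantize(data, error_bound)
recon = dequantize(codes, error_bound)

print("max abs error:    ", max_abs_error(data, recon))
print("PSNR (dB):        ", psnr(data, recon))
print("compression ratio:", compression_ratio(data, codes))
```

The quantizer keeps every reconstructed value within the requested bound of the original (up to floating-point rounding), which is the basic idea behind an absolute-error-bounded mode. Real compressors add prediction or transform stages before coding to reach much higher ratios.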