Analyzing the huge amounts of data produced by high-throughput genomics technologies often requires expensive computational resources. Exploring the solution space for single-cell genomics applications on large parallel systems to meet user expectations is a challenging task. The performance of genomics applications depends on many factors, which are determined by the underlying workload characteristics, computer architecture features, and the algorithmic complexity of the applications.
We focus on Kallisto, a genomics application that implements the method introduced as near-optimal probabilistic RNA-seq quantification. Kallisto pseudo-aligns reads from an RNA-seq experiment to the transcripts of a reference genome.
To discover potential application and architecture optimization options, the performance impact of different parameters and factors has been investigated by means of a performance model. The sensitivity of performance to a broad range of architectural choices, including different platforms (such as Intel Xeon, IBM POWER8 and Huawei HiSilicon 1616) as well as I/O devices and parallel file systems (such as NVMe SSD and BeeGPFS), has been investigated to construct a performance model that sets realistic expectations on performance based on system capabilities and application characteristics.
First, the fundamental operations performed by Kallisto are captured independently of the target architecture to identify the application signature and application requirements. Then, parts of the application requirements are implemented as micro-kernels and micro-benchmarks to motivate application optimizations and identify potential performance bottlenecks. These are used to estimate performance relative to machine capabilities (especially I/O capabilities), which reduces the complexity of understanding and modeling performance.
The developed micro-benchmarks can be used to identify the part of the application or the target architecture that limits scalability and efficiency. The observed performance results show that there are still optimization opportunities for Kallisto, some of which require algorithmic changes. Kallisto is a data-intensive application, but its ability to access data in parallel is limited because it uses serial POSIX I/O calls. Therefore, parallel access to data using non-sequential I/O mechanisms, such as the pread system call and memory-mapped I/O, has been investigated; this improves performance by eliminating the serialization bottleneck.
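As an illustration of the pread-based access pattern mentioned above, the following is a minimal sketch, not code taken from Kallisto. It assumes a POSIX system, and the helper names read_chunk and parallel_read, the chunk size and the worker count are hypothetical choices made for the example. Each worker reads its own byte range at an explicit offset, so no shared file position has to be serialized between threads.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def read_chunk(fd, offset, length):
    """Read up to `length` bytes starting at `offset`.

    os.pread takes an explicit offset and does not move the file
    descriptor's shared position, so concurrent calls do not interfere.
    """
    parts = []
    while length > 0:
        data = os.pread(fd, length, offset)
        if not data:  # reached end of file
            break
        parts.append(data)
        offset += len(data)
        length -= len(data)
    return b"".join(parts)


def parallel_read(path, chunk_size=1 << 20, workers=4):
    """Read a whole file by issuing pread calls from several threads.

    Hypothetical helper for illustration; chunk_size and workers are
    arbitrary defaults, not tuned values.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offsets = range(0, size, chunk_size)
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # map() yields results in offset order, so the chunks can be
            # concatenated directly.
            chunks = pool.map(lambda off: read_chunk(fd, off, chunk_size), offsets)
            return b"".join(chunks)
    finally:
        os.close(fd)
```

A real reader would typically hand each chunk to a worker for processing rather than concatenating everything in memory; the concatenation here only makes it easy to check the result against a serial read.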
Due to the complex control flow and memory access patterns of Kallisto, precise prediction by the performance model might not be achievable; however, the model is able to estimate the baseline performance and identify the critical resources.
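The text does not specify the form of the performance model. One simple way to obtain a baseline estimate and a critical resource is a bottleneck-style bound, sketched below as an assumption rather than as the model actually used. The class names Machine and Signature, the function baseline_estimate, and the three resources considered are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    """Sustained machine capabilities (hypothetical fields)."""
    io_bandwidth: float   # bytes/s from the storage device or file system
    mem_bandwidth: float  # bytes/s from main memory
    compute_rate: float   # operations/s


@dataclass
class Signature:
    """Architecture-independent application requirements (hypothetical fields)."""
    io_bytes: float    # bytes read from and written to storage
    mem_bytes: float   # bytes moved to and from main memory
    operations: float  # operations executed


def baseline_estimate(sig, machine):
    """Return (lower bound on run time in seconds, name of the critical resource).

    Assumes the resources overlap perfectly, so the slowest one alone
    determines the bound; real runs can only be slower than this.
    """
    times = {
        "io": sig.io_bytes / machine.io_bandwidth,
        "memory": sig.mem_bytes / machine.mem_bandwidth,
        "compute": sig.operations / machine.compute_rate,
    }
    critical = max(times, key=times.get)
    return times[critical], critical
```

Comparing the bound against a measured run time indicates how far the application is from the machine's capabilities, and the returned resource name points at where optimization effort is most likely to pay off.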