iPuma: High-Performance Sequence Alignment on the Graphcore IPU

Wednesday, May 15, 2024 1:50 PM to 2:15 PM · 25 min. (Europe/Berlin)

Hall F - 2nd floor

Research Paper

Bioinformatics and Life Sciences

Information

String alignment algorithms are an essential tool for understanding DNA and protein sequences. They demand substantial computation in real-world applications, making them a prime target for hardware acceleration. While GPUs struggle to provide sufficient acceleration, recent MIMD-capable AI accelerators such as the Graphcore Intelligence Processing Unit (IPU) have become technologically viable. We present iPuma, a new implementation of Smith-Waterman sequence alignment for the IPU, which offers generalized one-to-one and many-to-many high-throughput alignments of short and medium-length DNA and protein sequences. iPuma is also integrated into two bioinformatics pipelines, MetaHipMer2 and PASTIS. On protein datasets, iPuma shows speedups of 2.7× and 1.6× over state-of-the-art GPU and CPU implementations, respectively. We test the scalability on up to 64 IPUs, attaining a peak scoring performance of 1763 GCUPS for protein and 1168 GCUPS for DNA sequences.
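For readers unfamiliar with the terms in the abstract, the following is a minimal sketch of Smith-Waterman local alignment scoring and of the GCUPS metric (giga cell updates per second, where a cell is one entry of the dynamic-programming matrix). It is not iPuma's implementation: the function names, the linear gap penalty, and the match/mismatch/gap values are illustrative assumptions, and a protein aligner would typically use a substitution matrix instead.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of strings a and b.

    Assumption: linear gap penalty and simple match/mismatch scoring,
    chosen only for illustration.
    """
    prev = [0] * (len(b) + 1)  # previous row of the DP matrix
    best = 0
    for ca in a:
        curr = [0]  # first column is always 0 for local alignment
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # Clamping at 0 lets an alignment restart anywhere (local alignment)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def gcups(cell_updates, seconds):
    """Giga cell updates per second: DP cells computed / time / 1e9."""
    return cell_updates / seconds / 1e9


if __name__ == "__main__":
    a, b = "TTACGTTT", "GGACGGG"
    print(smith_waterman_score(a, b))
    # One pairwise alignment updates len(a) * len(b) cells
    print(len(a) * len(b))
```

Each pairwise alignment fills len(a) × len(b) cells, so throughput in GCUPS is the total number of cells across all alignments divided by the wall-clock time and by 10^9.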

Format

On-site, On Demand