Optimizing Matrix Math: Batch-Reduced GEMM (BRGEMM) for Accelerated Deep Learning on Arm HPC Systems

Tuesday, June 10, 2025 3:00 PM to Thursday, June 12, 2025 4:00 PM · 2 days 1 hr. (Europe/Berlin)
Foyer D-G - 2nd floor
Project Poster
AI Applications powered by HPC Technologies, Numerical Libraries, Optimizing for Energy and Performance

Information

Poster is on display.
Matrix multiplications are a basic building block of models such as Transformers and large language models (LLMs), and therefore account for a major share of the performance of deep learning workloads. Among matrix multiplication algorithms, BRGEMM (Batch-Reduced GEMM) stands out as a highly efficient kernel that can also serve as the basis for other essential kernels such as convolutions, significantly accelerating a range of deep learning language and vision models.
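As a rough illustration of the operation named above (a plain reference sketch, not the oneDNN or SVE kernel from this work), a batch-reduced GEMM accumulates the products of a batch of block pairs into a single output block, C += A_0·B_0 + A_1·B_1 + ... The function names `brgemm` and `conv1d_via_brgemm` are hypothetical and chosen only for this example. The second function assumes a stride-1, unpadded 1D convolution with one weight matrix per filter tap.

```python
def brgemm(a_blocks, b_blocks, c):
    """Accumulate sum_i A_i @ B_i into c (in place) and return c.

    a_blocks: list of M x K matrices, b_blocks: list of K x N matrices,
    c: M x N accumulator. All matrices are plain lists of lists.
    """
    if len(a_blocks) != len(b_blocks):
        raise ValueError("need the same number of A and B blocks")
    for i in range(len(c)):
        for j in range(len(c[0])):
            # One accumulator per output element lives across the whole
            # batch; an optimized kernel keeps it in vector registers.
            acc = c[i][j]
            for a, b in zip(a_blocks, b_blocks):
                for p in range(len(b)):
                    acc += a[i][p] * b[p][j]
            c[i][j] = acc
    return c


def conv1d_via_brgemm(weights, inp):
    """Stride-1, unpadded 1D convolution expressed as one brgemm call.

    weights[r] is the C_out x C_in matrix for filter tap r;
    inp is a C_in x width input. The batch dimension is the filter tap.
    """
    taps = len(weights)
    out_w = len(inp[0]) - taps + 1
    # B_r is the input shifted by r positions (C_in x out_w).
    b_blocks = [[row[r:r + out_w] for row in inp] for r in range(taps)]
    c = [[0] * out_w for _ in range(len(weights[0]))]
    return brgemm(weights, b_blocks, c)
```

Calling `conv1d_via_brgemm` with a two-tap filter shows how each tap contributes one block pair to the reduction. This is the sense in which a convolution can be built on the same kernel.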
In this work, we developed a BRGEMM kernel that uses the SVE (Scalable Vector Extension) vector registers to maximize vectorization on Arm, addressing the growing need for efficient computation on Arm architectures. We chose oneDNN (Deep Neural Network Library) to implement this kernel because it is an open-source performance library that serves as the backend for many popular deep learning frameworks, including PyTorch, TensorFlow, and JAX, making it an ideal platform for implementing optimized algorithms. Our contributions to oneDNN provide a 1.2x to 1.4x performance improvement at the kernel level for various LLM shapes and up to 3x faster inference in PyTorch on Arm platforms for deep learning language and vision models such as Whisper, ResNet50, Llama, and T5.
This work accelerates high-performance deep learning workloads on Arm HPC systems, improving scalability and efficiency for applications ranging from computer vision to NLP and recommendation systems. By developing BRGEMM for Arm, we advance the adoption of the Arm architecture in AI workloads.
Contributors:
Format
On Demand, On Site
