Developing Efficient Scientific Gateways for Bioinformatics in Supercomputer Environments Supported by Artificial Intelligence

Wednesday, June 30, 2021 3:00 PM to 4:00 PM · 1 hr. (Africa/Abidjan)
HPC Workflows

Information

Contributors:
Abstract:

This project aims to develop green and intelligent scientific gateways for bioinformatics, supported by high-performance computing (HPC) environments and specialized technologies such as scientific workflows, data mining, machine learning, and deep learning. The efficient analysis and interpretation of Big Data open new challenges in exploring molecular biology, genetics, biomedicine, and healthcare to improve personalized diagnostics and therapeutics; it therefore becomes necessary to find new avenues for dealing with this massive amount of information. New paradigms in Bioinformatics and Computational Biology drive the storing, managing, and accessing of data. HPC and Big Data advances in this domain represent both a vast new field of opportunities for bioinformatics researchers and a significant challenge.

The Bioinfo-Portal (https://bioinfo.lncc.br/) science gateway is a multiuser Brazilian infrastructure for bioinformatics applications that benefits from the HPC infrastructure. We present several challenges in efficiently executing applications and discuss our findings on how to improve the use of computational resources. We performed several large-scale bioinformatics experiments that are considered computationally intensive and time-consuming. We are currently incorporating artificial intelligence to generate models that analyze computational and bioinformatics metadata, in order to understand how machine learning can predict the efficient use of computational resources. The computational executions are carried out on Santos Dumont (SDumont, https://sdumont.lncc.br/), the largest supercomputer in Latin America, which has a processing capacity of 5.1 petaflops and 36,472 computational cores distributed across 1,134 computational nodes.