BioCloud 2 (CNPq/AWS)

BioCloud 2: Exploring the DNA of the AWS Cloud to Optimize Execution of Biotechnology Applications

Coordinator: Maria Cristina Silva Boeres

The importance of bioinformatics in today’s world is undeniable, as shown by the global body of knowledge needed to combat the pandemic caused by the SARS-CoV-2 virus. For this and other diseases, scientists in the field advance their research using sophisticated comparative genomics algorithms, which require high computational power and manipulate immense volumes of data. Over the last three decades, genome sequencing technology has evolved considerably, allowing an unprecedented number of DNA, RNA and protein sequences to be determined. The vast majority of these biological sequences are held in public databases, which make more than 240 million sequences available. Furthermore, in the search for organ donor compatibility, analyzing organ donor records in Brazil is crucial because a large part of the records stored in a public bank of more than 5 million records is incomplete.

Given the difficulty of maintaining laboratories with cutting-edge infrastructure to run such applications, cloud computing can meet the processing and storage requirements of vast amounts of data at an attractive cost. However, for applications to run in the cloud in a timely manner and within a given budget, it is imperative to know how to select cloud resources, configure them and monitor them, and to offer ways of analyzing the results of bioinformatics applications that help the scientists themselves.

To explore the relationship between cost and performance of bioinformatics applications in the AWS cloud, this project will focus on specifying methodologies for efficient application execution through performance analysis, scaling mechanisms, resource elasticity and fault tolerance, aiming for better efficiency, robustness and lower monetary costs for users. For scientists, the users of the applications, the objective is to facilitate efficient use of the cloud to solve their problems; for the provider, the project offers benefits in reduced cost and energy consumption in the use of its computational infrastructure and services.


Team
Universidade Federal Fluminense (UFF)
Maria Cristina Silva Boeres    
Daniel Cardoso Moraes de Oliveira    
Eugene Francis Vinod Rebello    
Lúcia Maria de Assumpção Drummond    
Alan Lira Nunes    
Universidade de Brasília (UnB)
Alba Cristina Magalhães Alves de Melo    
Aletéia Patrícia Favacho de Araújo von Paumgartten    
Universidade do Estado do Rio de Janeiro (UERJ)
Fundação Oswaldo Cruz (Fiocruz)

Publications