IBM Functional Genomics Platform, A Cloud-Based Platform for Studying Microbial Life at Scale
The rapid growth in biological sequence data is revolutionizing our
understanding of genotypic diversity and challenging conventional approaches to
informatics. With the increasing availability of genomic data, traditional
bioinformatic tools require substantial computational time and the creation of
ever-larger indices each time a researcher seeks to gain insight from the data.
To address these challenges, we pre-computed important relationships between
biological entities spanning the Central Dogma of Molecular Biology and
captured this information in a relational database. The database can be queried
across hundreds of millions of entities and returns results in a fraction of
the time required by traditional methods. In this paper, we describe
IBM Functional Genomics Platform (formerly known as OMXWare), a
comprehensive database relating genotype to phenotype for bacterial life.
Continually updated, IBM Functional Genomics Platform today contains data
derived from 200,000 curated, self-consistently assembled genomes. The database
stores functional data for over 68 million genes, 52 million proteins, and 239
million domains with associated biological activity annotations from Gene
Ontology, KEGG, MetaCyc, and Reactome. IBM Functional Genomics Platform maps
the many-to-many connections among these biological entities, including
the originating genome, gene, protein, and protein domain. Various microbial
studies, from infectious disease to environmental health, can benefit from the
rich data and connections. We describe the data selection, the pipeline to
create and update the IBM Functional Genomics Platform, and the developer tools
(Python SDK and REST APIs) which allow researchers to efficiently study
microbial life at scale.
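
To make the idea of pre-computed many-to-many relationships in a relational database concrete, the following is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and sample rows are hypothetical and chosen only for illustration; they are not the platform's actual schema, and the sketch does not use the platform's Python SDK or REST APIs. It shows how junction tables let a single join query go from a protein back to every genome that encodes it.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical genome/gene/protein schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE genome  (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE gene    (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE protein (id INTEGER PRIMARY KEY, name TEXT);
        -- Junction tables hold the pre-computed many-to-many links.
        CREATE TABLE genome_gene (
            genome_id INTEGER, gene_id INTEGER,
            PRIMARY KEY (genome_id, gene_id)
        );
        CREATE TABLE gene_protein (
            gene_id INTEGER, protein_id INTEGER,
            PRIMARY KEY (gene_id, protein_id)
        );
        """
    )
    # Made-up sample data: gene_x occurs in both genomes, gene_y only in genome_B.
    conn.executemany("INSERT INTO genome VALUES (?, ?)",
                     [(1, "genome_A"), (2, "genome_B")])
    conn.executemany("INSERT INTO gene VALUES (?, ?)",
                     [(10, "gene_x"), (11, "gene_y")])
    conn.executemany("INSERT INTO protein VALUES (?, ?)",
                     [(100, "protein_p")])
    conn.executemany("INSERT INTO genome_gene VALUES (?, ?)",
                     [(1, 10), (2, 10), (2, 11)])
    conn.executemany("INSERT INTO gene_protein VALUES (?, ?)",
                     [(10, 100), (11, 100)])
    conn.commit()
    return conn


def genomes_with_protein(conn, protein_name):
    """Return the sorted names of genomes linked to the given protein."""
    rows = conn.execute(
        """
        SELECT DISTINCT g.name
        FROM protein p
        JOIN gene_protein gp ON gp.protein_id = p.id
        JOIN genome_gene  gg ON gg.gene_id = gp.gene_id
        JOIN genome       g  ON g.id = gg.genome_id
        WHERE p.name = ?
        ORDER BY g.name
        """,
        (protein_name,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_demo_db()
    print(genomes_with_protein(conn, "protein_p"))
```

Because the links are stored once in the junction tables, the lookup is an indexed join rather than a fresh sequence comparison, which is the general reason a pre-computed relational design can answer such questions quickly.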