Linking genomics and metabolomics to chart specialized metabolic diversity
Microbial and plant specialized metabolites constitute an immense chemical diversity and play key roles in mediating ecological interactions between organisms. Also referred to as natural products, they have been widely applied in medicine, agriculture, and the cosmetic and food industries. Traditionally, the main discovery strategies have centered on activity-guided fractionation of metabolite extracts. Increasingly, omics data is being used to complement this, as it has the potential to reduce rediscovery rates, guide experimental work towards the most promising metabolites, and identify enzymatic pathways that enable their biosynthetic production. In recent years, genomic and metabolomic analyses of specialized metabolic diversity have been scaled up to study thousands of samples simultaneously. Here, we survey data analysis technologies that facilitate the effective exploration of large genomic and metabolomic datasets, and discuss various emerging strategies to integrate these two types of omics data in order to further accelerate discovery.
HypoRiPPAtlas as an Atlas of hypothetical natural products for mass spectrometry database search
Recent analyses of public microbial genomes have found over a million biosynthetic gene clusters, most of whose natural products remain unknown. Additionally, GNPS harbors billions of mass spectra of natural products without known structures and biosynthetic genes. We bridge the gap between large-scale genome mining and mass spectral datasets for natural product discovery by developing HypoRiPPAtlas, an Atlas of hypothetical natural product structures, which is ready-to-use for in silico database search of tandem mass spectra. HypoRiPPAtlas is constructed by mining genomes using seq2ripp, a machine-learning tool for the prediction of ribosomally synthesized and post-translationally modified peptides (RiPPs). In HypoRiPPAtlas, we identify RiPPs in microbes and plants. HypoRiPPAtlas could be extended to other natural product classes in the future by implementing corresponding biosynthetic logic. This study paves the way for large-scale explorations of biosynthetic pathways and chemical structures of microbial and plant RiPP classes.
Synthetic Biology: Mapping the Scientific Landscape
This article uses data from Thomson Reuters Web of Science to map and analyse the scientific landscape for synthetic biology. The article draws on recent advances in data visualisation and analytics with the aim of informing upcoming international policy debates on the governance of synthetic biology by the Subsidiary Body on Scientific, Technical and Technological Advice (SBSTTA) of the United Nations Convention on Biological Diversity. We use mapping techniques to identify how synthetic biology can best be understood and the range of institutions, researchers and funding agencies involved. Debates under the Convention are likely to focus on a possible moratorium on the field release of synthetic organisms, cells or genomes. Based on the empirical evidence, we propose that guidance could be provided to funding agencies to respect the letter and spirit of the Convention on Biological Diversity in making research investments. Building on the recommendations of the United States Presidential Commission for the Study of Bioethical Issues, we demonstrate that it is possible to promote independent and transparent monitoring of developments in synthetic biology using modern information tools. In particular, public and policy understanding and engagement with synthetic biology can be enhanced through the use of online interactive tools. As a step forward in this process, we make existing data on the scientific literature on synthetic biology available in an online interactive workbook so that researchers, policy makers and civil society can explore the data and draw conclusions for themselves.
A community resource for paired genomic and metabolomic data mining
Genomics and metabolomics are widely used to explore specialized metabolite diversity. The Paired Omics Data Platform is a community initiative to systematically document links between metabolome and (meta)genome data, aiding identification of natural product biosynthetic origins and metabolite structures.
A novel underdetermined source recovery algorithm based on k-sparse component analysis
Sparse component analysis (SCA) is a popular method for addressing underdetermined blind source separation in array signal processing applications. We are motivated by problems that arise in applications where the sources are densely sparse (i.e. the number of active sources is high and very close to the number of sensors). The separation performance of current underdetermined source recovery (USR) solutions, including the relaxation and greedy families, degrades as the mixing system dimension decreases and the sparsity level (k) increases. In this paper, we present a k-SCA-based algorithm that is suitable for USR in low-dimensional mixing systems. Assuming the sources are at most (m−1)-sparse, where m is the number of mixtures, the proposed method is capable of recovering the sources from the mixtures, given the mixing matrix, using a subspace detection framework. Simulation results show that the proposed algorithm achieves better separation performance in k-SCA conditions than state-of-the-art USR algorithms such as basis pursuit, L1-norm minimization, smoothed L0, the focal underdetermined system solver and orthogonal matching pursuit.
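To make the subspace-detection idea concrete, here is a minimal sketch. It is not the algorithm proposed in the paper, only a brute-force baseline built on assumptions: the mixing matrix is known, at most k sources are active, and the mixture is noise-free or nearly so. It tries every set of k columns of the mixing matrix, keeps the set whose span fits the mixture best in the least-squares sense, and reads the source values off that fit. The names `solve_linear` and `recover_sources` are hypothetical.

```python
from itertools import combinations


def solve_linear(M, b):
    """Solve M y = b by Gaussian elimination with partial pivoting.

    Returns None when M is (numerically) singular.
    """
    n = len(b)
    aug = [list(row) + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            return None  # candidate columns are linearly dependent
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    y = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (aug[i][n] - tail) / aug[i][i]
    return y


def recover_sources(A, x, k):
    """Recover a k-sparse source vector s with A s ~= x.

    A: mixing matrix as a list of m rows with n entries each (assumed known).
    x: one mixture sample of length m.
    k: assumed maximum number of active sources (k < m).
    """
    m, n = len(A), len(A[0])
    best_residual, best_support, best_coeffs = float("inf"), None, None
    for support in combinations(range(n), k):
        cols = [[A[i][j] for j in support] for i in range(m)]  # m x k submatrix
        # Normal equations (cols^T cols) c = cols^T x for the least-squares fit.
        gram = [[sum(cols[i][a] * cols[i][b] for i in range(m)) for b in range(k)]
                for a in range(k)]
        rhs = [sum(cols[i][a] * x[i] for i in range(m)) for a in range(k)]
        coeffs = solve_linear(gram, rhs)
        if coeffs is None:
            continue
        residual = sum(
            (x[i] - sum(cols[i][a] * coeffs[a] for a in range(k))) ** 2
            for i in range(m)
        )
        if residual < best_residual:
            best_residual, best_support, best_coeffs = residual, support, coeffs
    s = [0.0] * n
    for j, c in zip(best_support, best_coeffs):
        s[j] = c
    return s
```

The search visits every size-k subset of the n columns, so its cost grows combinatorially with n; it is meant only to illustrate the idea of detecting the subspace a mixture lies in, not to be efficient.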
American Gut: an Open Platform for Citizen Science Microbiome Research
McDonald D, Hyde E, Debelius JW, et al. American Gut: an Open Platform for Citizen Science Microbiome Research. mSystems. 2018;3(3):e00031-18