HypoRiPPAtlas as an Atlas of hypothetical natural products for mass spectrometry database search
Recent analyses of public microbial genomes have found over a million biosynthetic gene clusters, most of whose natural products remain
unknown. Additionally, GNPS harbors billions of mass spectra of natural products without known structures and biosynthetic genes. We bridge the gap
between large-scale genome mining and mass spectral datasets for natural
product discovery by developing HypoRiPPAtlas, an Atlas of hypothetical
natural product structures, which is ready-to-use for in silico database search
of tandem mass spectra. HypoRiPPAtlas is constructed by mining genomes
using seq2ripp, a machine-learning tool for the prediction of ribosomally
synthesized and post-translationally modified peptides (RiPPs). In HypoRiPPAtlas, we identify RiPPs in microbes and plants. HypoRiPPAtlas could be
extended to other natural product classes in the future by implementing
corresponding biosynthetic logic. This study paves the way for large-scale
explorations of biosynthetic pathways and chemical structures of microbial
and plant RiPP classes.
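
To make the idea of searching spectra against hypothetical structures concrete, the sketch below shows only the simplest ingredient of such a search: filtering candidate peptides by precursor mass within a ppm tolerance. It is an illustration under stated assumptions, not the HypoRiPPAtlas or seq2ripp implementation, which the abstract does not describe in detail. The function names, the candidate tuple layout, the single dehydration modification, and the 10 ppm default are all hypothetical choices made for this example.

```python
# Minimal precursor-mass filter for hypothetical peptide candidates.
# Assumption: illustrative only; a real search would also score fragment ions.

# Standard monoisotopic residue masses (Da) for the 20 proteinogenic amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # monoisotopic mass of H2O
PROTON = 1.007276   # mass of a proton


def peptide_mass(seq, mod_delta=0.0):
    """Neutral monoisotopic mass of a linear peptide plus a modification shift."""
    return sum(MONO[aa] for aa in seq) + WATER + mod_delta


def precursor_mz(mass, charge):
    """m/z of the protonated ion [M + zH]^z+."""
    return (mass + charge * PROTON) / charge


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def search(observed_mz, charge, candidates, tol_ppm=10.0):
    """Return (name, ppm error) for candidates within tolerance, best first.

    candidates: iterable of (name, sequence, mod_delta) tuples (hypothetical layout).
    """
    hits = []
    for name, seq, delta in candidates:
        theoretical = precursor_mz(peptide_mass(seq, delta), charge)
        err = ppm_error(observed_mz, theoretical)
        if abs(err) <= tol_ppm:
            hits.append((name, err))
    return sorted(hits, key=lambda h: abs(h[1]))


# Toy candidates: the same core peptide, unmodified and with one dehydration
# (loss of water, as occurs on Ser/Thr in some RiPP classes).
candidates = [
    ("core_unmodified", "GASP", 0.0),
    ("core_dehydrated", "GASP", -WATER),
]
observed = precursor_mz(peptide_mass("GASP"), 1)
hits = search(observed, 1, candidates)
```

Here `hits` contains only the unmodified candidate, because the dehydrated form differs by about 18 Da and falls far outside the tolerance. A precursor filter like this only narrows the candidate list; ranking the survivors requires comparing predicted and observed fragment ions.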