15 research outputs found
Open Access Repository-Scale Propagated Nearest Neighbor Suspect Spectral Library for Untargeted Metabolomics
Despite the increasing availability of tandem mass spectrometry (MS/MS) community spectral libraries for untargeted metabolomics over the past decade, the majority of acquired MS/MS spectra remain uninterpreted. To further aid in interpreting unannotated spectra, we created a nearest neighbor suspect spectral library, consisting of 87,916 annotated MS/MS spectra derived from hundreds of millions of public MS/MS spectra. Annotations were propagated based on structural relationships to reference molecules using MS/MS-based spectrum alignment. We demonstrate the broad relevance of the nearest neighbor suspect spectral library through representative examples of propagation-based annotation of acylcarnitines, bacterial and plant natural products, and drug metabolism. Our results also highlight how the library can help to better understand an Alzheimer’s brain phenotype. The nearest neighbor suspect spectral library is openly available through the GNPS platform to help investigators hypothesize candidate structures for unknown MS/MS spectra in untargeted metabolomics data.
26th Annual Computational Neuroscience Meeting (CNS*2017): Part 3 - Meeting Abstracts - Antwerp, Belgium. 15–20 July 2017
This work was produced as part of the activities of FAPESP Research, Disseminations and Innovation Center for Neuromathematics (grant 2013/07699-0, S. Paulo Research Foundation). NLK is supported by a FAPESP postdoctoral fellowship (grant 2016/03855-5). ACR is partially supported by a CNPq fellowship (grant 306251/2014-0).
Sensory neurons and peripheral pathways in Drosophila embryos
Reproducible Molecular Networking Of Untargeted Mass Spectrometry Data Using GNPS.
Herein, we present a protocol for the use of Global Natural Products Social (GNPS) Molecular Networking, an interactive online chemistry-focused mass spectrometry data curation and analysis infrastructure. The goal of GNPS is to provide as much chemical insight for an untargeted tandem mass spectrometry data set as possible and to connect this chemical insight to the underlying biological questions a user wishes to address. This can be performed within one experiment or at the repository scale. GNPS not only serves as a public data repository for untargeted tandem mass spectrometry data with the sample information (metadata), it also captures community knowledge that is disseminated via living data across all public data. One of the main analysis tools used by the GNPS community is molecular networking. Molecular networking creates a structured data table that reflects the chemical space from tandem mass spectrometry experiments by computing the relationships between the tandem mass spectra through spectral similarity. This protocol provides step-by-step instructions for creating reproducible high-quality molecular networks. For training purposes, the reader is led through the protocol from recalling a public data set and its sample information to creating and interpreting a molecular network. Each data analysis job can be shared or cloned to disseminate the knowledge gained, thus propagating information that can lead to the discovery of molecules, metabolic pathways, and ecosystem/community interactions.
Untargeted mass spectrometry-based metabolomics approach unveils molecular changes in raw and processed foods and beverages
In our daily lives, we consume foods that have been transported, stored, prepared, cooked, or otherwise processed by ourselves or others. Food storage and preparation have drastic effects on the chemical composition of foods. Untargeted mass spectrometry analysis of food samples has the potential to increase our chemical understanding of these processes by detecting a broad spectrum of chemicals. We performed a time-based analysis of the chemical changes in foods during common preparations, such as fermentation, brewing, and ripening, using untargeted mass spectrometry and molecular networking. The data analysis workflow presented implements an approach to study changes in food chemistry that can reveal global alterations in chemical profiles, identify changes in abundance, as well as identify specific chemicals and their transformation products. The data generated in this study are publicly available, enabling the replication and re-analysis of these data in isolation, and serve as a baseline dataset for future investigations.
Reproducible molecular networking of untargeted mass spectrometry data using GNPS
Global Natural Product Social Molecular Networking (GNPS) is an interactive online small molecule–focused tandem mass spectrometry (MS2) data curation and analysis infrastructure. It is intended to provide as much chemical insight as possible into an untargeted MS2 dataset and to connect this chemical insight to the user’s underlying biological questions. This can be performed within one liquid chromatography (LC)-MS2 experiment or at the repository scale. GNPS-MassIVE is a public data repository for untargeted MS2 data with sample information (metadata) and annotated MS2 spectra. These publicly accessible data can be annotated and updated with the GNPS infrastructure keeping a continuous record of all changes. This knowledge is disseminated across all public data; it is a living dataset. Molecular networking—one of the main analysis tools used within the GNPS platform—creates a structured data table that reflects the molecular diversity captured in tandem mass spectrometry experiments by computing the relationships of the MS2 spectra as spectral similarity. This protocol provides step-by-step instructions for creating reproducible, high-quality molecular networks. For training purposes, the reader is led through a 90- to 120-min procedure that starts by recalling an example public dataset and its sample information and proceeds to creating and interpreting a molecular network. Each data analysis job can be shared or cloned to disseminate the knowledge gained, thus propagating information that can lead to the discovery of molecules, metabolic pathways, and ecosystem/community interactions.
ReDU: a framework to find and reanalyze public mass spectrometry data
We present ReDU (https://redu.ucsd.edu/), a system for metadata capture of public mass spectrometry-based metabolomics data, with validated controlled vocabularies. Systematic capture of knowledge enables the reanalysis of public data and/or co-analysis of one’s own data. ReDU enables multiple types of analyses, including finding chemicals and associated metadata, comparing the shared and different chemicals between groups of samples, and metadata-filtered, repository-scale molecular networking. © 2020, The Author(s), under exclusive licence to Springer Nature America, Inc
Enhancing untargeted metabolomics using metadata-based source annotation
Human untargeted metabolomics studies annotate only ~10% of molecular features. We introduce reference-data-driven analysis to match metabolomics tandem mass spectrometry (MS/MS) data against metadata-annotated source data as a pseudo-MS/MS reference library. Applying this approach to food source data, we show that it increases MS/MS spectral usage 5.1-fold over conventional structural MS/MS library matches and allows empirical assessment of dietary patterns from untargeted data
Wastewater sequencing reveals early cryptic SARS-CoV-2 variant transmission.
As SARS-CoV-2 continues to spread and evolve, detecting emerging variants early is critical for public health interventions. Inferring lineage prevalence by clinical testing is infeasible at scale, especially in areas with limited resources, participation, or testing and/or sequencing capacity, which can also introduce biases [1-3]. SARS-CoV-2 RNA concentration in wastewater successfully tracks regional infection dynamics and provides less biased abundance estimates than clinical testing [4,5]. Tracking virus genomic sequences in wastewater would improve community prevalence estimates and detect emerging variants. However, two factors limit wastewater-based genomic surveillance: low-quality sequence data and inability to estimate relative lineage abundance in mixed samples. Here we resolve these critical issues to perform a high-resolution, 295-day wastewater and clinical sequencing effort, in the controlled environment of a large university campus and the broader context of the surrounding county. We developed and deployed improved virus concentration protocols and deconvolution software that fully resolve multiple virus strains from wastewater. We detected emerging variants of concern up to 14 days earlier in wastewater samples, and identified multiple instances of virus spread not captured by clinical genomic surveillance. Our study provides a scalable solution for wastewater genomic surveillance that allows early detection of SARS-CoV-2 variants and identification of cryptic transmission.
Genomic surveillance reveals dynamic shifts in the connectivity of COVID-19 epidemics
The maturation of genomic surveillance in the past decade has enabled tracking of the emergence and spread of epidemics at an unprecedented level. During the COVID-19 pandemic, for example, genomic data revealed that local epidemics varied considerably in the frequency of SARS-CoV-2 lineage importation and persistence, likely due to a combination of COVID-19 restrictions and changing connectivity. Here, we show that local COVID-19 epidemics are driven by regional transmission, including across international boundaries, but can become increasingly connected to distant locations following the relaxation of public health interventions. By integrating genomic, mobility, and epidemiological data, we find abundant transmission occurring between both adjacent and distant locations, supported by dynamic mobility patterns. We find that changing connectivity significantly influences local COVID-19 incidence. Our findings demonstrate a complex meaning of ‘local’ when investigating connected epidemics and emphasize the importance of collaborative interventions for pandemic prevention and mitigation