43 research outputs found

    Quality Control Analysis in Real-time (QC-ART) : A Tool for Real-time Quality Control Assessment of Mass Spectrometry-based Proteomics Data

    Liquid chromatography-mass spectrometry (LC-MS)-based proteomics studies of large sample cohorts can easily require months to years to complete. Acquiring consistent, high-quality data in such large-scale studies is challenging because of normal variation in instrument performance over time, as well as artifacts introduced by the samples themselves, such as those arising from collection, storage and processing. Existing quality control methods for proteomics data primarily focus on post-hoc analysis to remove low-quality data that would degrade downstream statistics; they are not designed to evaluate the data in near real-time, which would allow for interventions as soon as deviations in data quality are detected. In addition to flagging analyses that demonstrate outlier behavior, evaluating how the data structure changes over time can aid in understanding typical instrument performance and in identifying issues such as degradation in data quality due to the need for instrument cleaning and/or re-calibration. To address this gap for proteomics, we developed Quality Control Analysis in Real-Time (QC-ART), a tool for evaluating data as they are acquired to dynamically flag potential issues with instrument performance or sample quality. QC-ART has accuracy similar to that of standard post-hoc analysis methods, with the additional benefit of real-time analysis. We demonstrate the utility and performance of QC-ART in identifying deviations in data quality caused by both instrument and sample issues in near real-time for LC-MS-based plasma proteomics analyses of a sample subset of The Environmental Determinants of Diabetes in the Young cohort. We also present a case where QC-ART facilitated the identification of oxidative modifications, which are often underappreciated in proteomic experiments.
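    The abstract does not specify QC-ART's algorithm, so the sketch below only illustrates the general idea of real-time quality control: each incoming QC metric value is checked against a rolling baseline of recent runs as it arrives, rather than after the study ends. The modified z-score with median absolute deviation, the window size, and the threshold are all hypothetical choices for this sketch, not QC-ART's actual method.

```python
from collections import deque
import statistics

def make_rt_flagger(window=20, threshold=3.5):
    """Return a closure that flags each new QC metric value as it
    arrives, using a modified z-score against a rolling baseline of
    recent runs (a toy stand-in for a real-time QC rule)."""
    history = deque(maxlen=window)

    def flag(value):
        if len(history) < 5:            # build a minimal baseline first
            history.append(value)
            return False
        med = statistics.median(history)
        mad = statistics.median(abs(v - med) for v in history) or 1e-9
        z = 0.6745 * (value - med) / mad    # modified z-score (MAD-based)
        is_outlier = abs(z) > threshold
        if not is_outlier:              # only extend baseline with good runs
            history.append(value)
        return is_outlier

    return flag
```

    Because the baseline is updated only with non-flagged runs, a sudden instrument drift keeps being flagged instead of silently absorbed into the baseline, which is the practical benefit of evaluating data as they are acquired.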

    The role of EGFR in influenza pathogenicity: Multiple network-based approaches to identify a key regulator of non-lethal infections

    Despite high sequence similarity between pandemic and seasonal influenza viruses, there is extreme variation in host pathogenicity from one viral strain to the next. Identifying the underlying mechanisms of this variability is a critical task for understanding influenza virus infection and for effectively managing highly pathogenic influenza virus disease. We applied a network-based modeling approach to identify critical functions related to influenza virus pathogenicity using large transcriptomic and proteomic datasets from mice infected with six influenza virus strains or mutants. Our analysis revealed two pathogenicity-related gene expression clusters; these results were corroborated by matching proteomics data. We also identified parallel downstream processes that were altered during influenza pathogenesis. We found that network bottlenecks (nodes that bridge different network regions) were highly enriched in pathogenicity-related genes, while network hubs (highly connected network nodes) were significantly depleted in these genes. We confirmed that this trend persisted in a distinct virus: Severe Acute Respiratory Syndrome coronavirus (SARS-CoV). The role of epidermal growth factor receptor (EGFR) in influenza pathogenesis, one of the bottleneck regulators with corroborating signals across transcript and protein expression data, was tested and validated in additional mouse infection experiments. We demonstrate that EGFR is important during influenza infection, but that the role it plays changes for lethal versus non-lethal infections. Our results show that by using association networks, bottleneck genes that lack hub characteristics can be used to predict a gene's involvement in influenza virus pathogenicity. We also demonstrate the utility of employing multiple network approaches for analyzing host response data from viral infections.
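    The hub-versus-bottleneck distinction described above can be sketched on a toy graph: hubs are high-degree nodes, while bottlenecks are high-betweenness nodes that bridge network regions without necessarily being highly connected. The simple top-fraction thresholding below is a hypothetical simplification for illustration (using networkx), not the paper's enrichment analysis.

```python
import networkx as nx

def classify_nodes(g, top_frac=0.2):
    """Split nodes into hubs (top fraction by degree) and bottlenecks
    (top fraction by betweenness centrality that are NOT hubs),
    mirroring the bridge-vs-hub distinction."""
    n_top = max(1, int(top_frac * g.number_of_nodes()))
    by_degree = sorted(g.nodes, key=g.degree, reverse=True)
    bc = nx.betweenness_centrality(g)
    by_btw = sorted(g.nodes, key=bc.get, reverse=True)
    hubs = set(by_degree[:n_top])
    bottlenecks = {v for v in by_btw[:n_top] if v not in hubs}
    return hubs, bottlenecks

# Two dense communities joined by a single bridge node: the bridge has
# low degree but lies on every cross-community shortest path.
g = nx.barbell_graph(5, 1)
hubs, bottlenecks = classify_nodes(g, top_frac=0.1)
```

    In the barbell example, node 5 (the bridge) ends up in `bottlenecks` rather than `hubs`: it touches only two edges, yet every shortest path between the two cliques runs through it, which is exactly the low-degree, high-betweenness profile the paper associates with pathogenicity-related genes.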

    Hypergraph models of biological networks to identify genes critical to pathogenic viral response

    Background: Representing biological networks as graphs is a powerful approach to reveal underlying patterns, signatures, and critical components from high-throughput biomolecular data. However, graphs do not natively capture the multi-way relationships present among genes and proteins in biological systems. Hypergraphs are generalizations of graphs that naturally model multi-way relationships and have shown promise in modeling systems such as protein complexes and metabolic reactions. In this paper, we seek to understand how hypergraphs can more faithfully identify, and potentially predict, important genes based on complex relationships inferred from genomic expression data sets. Results: We compiled a novel data set of transcriptional host response to pathogenic viral infections and formulated relationships between genes as a hypergraph, where hyperedges represent significantly perturbed genes and vertices represent individual biological samples with specific experimental conditions. We find that hypergraph betweenness centrality is superior to standard graph centrality for identifying genes important to viral response. Conclusions: Our results demonstrate the utility of using hypergraphs to represent complex biological systems and highlight important responses common to a variety of highly pathogenic viruses.
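    As a hedged illustration of the representation described above (hyperedges as perturbed genes, vertices as samples), the sketch below expands a small hypergraph into its bipartite incidence graph and computes betweenness centrality there, then reads off scores for the hyperedge nodes. This is one common way to define a hypergraph betweenness, not necessarily the exact centrality used in the paper; the gene and sample labels are invented.

```python
import networkx as nx

def hyperedge_betweenness(hyperedges):
    """hyperedges: dict mapping a hyperedge label (e.g. a gene) to the
    set of vertices (e.g. samples) it contains.  Builds the bipartite
    incidence graph, computes betweenness centrality on it, and returns
    scores for the hyperedge nodes, giving one way to rank genes by
    their 'bridging' role across samples."""
    b = nx.Graph()
    for edge, verts in hyperedges.items():
        for v in verts:
            # tag nodes so gene and sample labels can never collide
            b.add_edge(("edge", edge), ("vertex", v))
    bc = nx.betweenness_centrality(b)
    return {name: c for (kind, name), c in bc.items() if kind == "edge"}
```

    For example, a gene perturbed in two otherwise-unconnected samples sits on every path between them and scores higher than a gene confined to a single sample.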

    ftmsRanalysis: An R package for exploratory data analysis and interactive visualization of FT-MS data.

    The high resolution and mass accuracy of Fourier transform mass spectrometry (FT-MS) have made it an increasingly popular technique for discerning the composition of soil, plant and aquatic samples containing complex mixtures of proteins, carbohydrates, lipids, lignins, hydrocarbons, phytochemicals and other compounds. Thus, there is a growing demand for informatics tools to analyze FT-MS data that will aid investigators seeking to understand the availability of carbon compounds to biotic and abiotic oxidation and to compare fundamental chemical properties of complex samples across groups. We present ftmsRanalysis, an R package which provides an extensive collection of data formatting, processing, filtering, visualization, and sample and group comparison functionalities. The package provides a suite of plotting methods and enables expedient, flexible and interactive visualization of complex datasets through functions that link to Trelliscope, a powerful interactive visualization user interface. An example analysis using FT-MS data from a soil microbiology study demonstrates the core functionality of the package and highlights its capabilities for producing interactive visualizations.

    PM2.5 Is Insufficient to Explain Personal PAH Exposure

    To understand how chemical exposure can impact health, researchers need tools that capture the complexities of personal chemical exposure. In practice, fine particulate matter (PM2.5) air quality index (AQI) data from outdoor stationary monitors and Hazard Mapping System (HMS) smoke density data from satellites are often used as proxies for personal chemical exposure, but they do not capture total chemical exposure. Silicone wristbands can quantify more individualized exposure data than stationary air monitors or smoke satellites. However, it is not understood how these proxy measurements compare to chemical data measured from wristbands. In this study, participants wore daily wristbands, carried a phone that recorded locations, and answered daily questionnaires for a 7-day period in multiple seasons. We gathered publicly available daily PM2.5 AQI data and HMS data. We analyzed wristbands for 94 organic chemicals, including 53 polycyclic aromatic hydrocarbons. Wristband chemical detections and concentrations, behavioral variables (e.g., time spent indoors), and environmental conditions (e.g., PM2.5 AQI) significantly differed between seasons. Machine learning models were fit to predict personal chemical exposure using PM2.5 AQI only, HMS only, and a multivariate feature set including PM2.5 AQI, HMS, and other environmental and behavioral information. On average, the multivariate models increased predictive accuracy by approximately 70% compared to either the AQI model or the HMS model for all chemicals modeled. This study provides evidence that neither PM2.5 AQI data alone nor HMS data alone are sufficient to explain personal chemical exposures. Our results identify additional key predictors of personal chemical exposure.
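    The modeling comparison described above (a single-proxy model versus a multivariate model) can be sketched on synthetic data. Everything below is fabricated for illustration: the coefficients, the noise level, and the use of ordinary least squares are hypothetical stand-ins, and the study's actual models, features and accuracy gains are only those reported in the abstract.

```python
import numpy as np

def r2_ols(X, y):
    """Fit ordinary least squares with an intercept and return the
    in-sample R^2 (coefficient of determination)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - np.sum(resid ** 2) / ss_tot

rng = np.random.default_rng(0)
n = 500
aqi = rng.uniform(0, 200, n)        # stand-in for daily PM2.5 AQI
smoke = rng.uniform(0, 5, n)        # stand-in for HMS smoke density
indoors = rng.uniform(0, 24, n)     # stand-in for hours spent indoors
# synthetic "exposure" driven by all three predictors plus noise
exposure = 0.02 * aqi + 1.5 * smoke - 0.3 * indoors + rng.normal(0, 1.0, n)

r2_aqi_only = r2_ols(aqi.reshape(-1, 1), exposure)
r2_multivariate = r2_ols(np.column_stack([aqi, smoke, indoors]), exposure)
```

    Because the synthetic target depends on smoke and behavior as well as AQI, the AQI-only model leaves most of the variance unexplained while the multivariate model recovers it, which mirrors the comparison pattern the study reports.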

    Soil Metabolomics Predict Microbial Taxa as Biomarkers of Moisture Status in Soils from a Tidal Wetland

    We present observations from a laboratory-controlled study on the impacts of extreme wetting and drying on a wetland soil microbiome. Our approach was to experimentally challenge the soil microbiome to understand impacts on anaerobic carbon cycling processes as the system transitions from dryness to saturation and vice versa. Specifically, we tested for impacts on stress responses related to shifts from wet to drought conditions. We used a combination of high-resolution data for small organic chemical compounds (metabolites) and biological features (community structure based on 16S rRNA gene sequencing). Using a robust, correlation-independent data approach, we further tested the predictive power of soil metabolites for the presence or absence of taxa. Here, we demonstrate that taking an untargeted, multidimensional data approach to the interpretation of metabolomics has the potential to indicate the causative pathways selecting for the observed bacterial community structure in soils.