Deep neural network improves the estimation of polygenic risk scores for breast cancer
Polygenic risk scores (PRS) estimate the genetic risk of an individual for a
complex disease based on many genetic variants across the whole genome. In this
study, we compared a series of computational models for estimation of breast
cancer PRS. A deep neural network (DNN) was found to outperform alternative
machine learning techniques and established statistical algorithms, including
BLUP, BayesA and LDpred. In the test cohort with 50% prevalence, the area under
the receiver operating characteristic curve (AUC) was 67.4% for DNN, 64.2% for
BLUP, 64.5% for BayesA, and 62.4% for LDpred. BLUP, BayesA, and LDpred all
generated PRS that followed a normal distribution in the case population.
However, the PRS generated by DNN in the case population followed a bi-modal
distribution composed of two normal distributions with distinctly different
means. This suggests that DNN was able to separate the case population into a
high-genetic-risk case sub-population with an average PRS significantly higher
than the control population and a normal-genetic-risk case sub-population with
an average PRS similar to the control population. This allowed DNN to achieve
18.8% recall at 90% precision in the test cohort with 50% prevalence, which can
be extrapolated to 65.4% recall at 20% precision in a general population with
12% prevalence. Interpretation of the DNN model identified salient variants
that were assigned insignificant p-values by association studies, but were
important for DNN prediction. These variants may be associated with the
phenotype through non-linear relationships.
Comment: 28 pages, 7 figures, 2 tables
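The dependence of precision on prevalence in the abstract above follows from Bayes' rule: at a fixed score threshold, recall and the false positive rate do not change with prevalence, but precision does. The sketch below is a minimal illustration under that assumption, not the authors' extrapolation procedure; the function names are hypothetical. It carries the reported operating point (18.8% recall at 90% precision, 50% prevalence) over to a population with 12% prevalence.

```python
def false_positive_rate(recall, precision, prevalence):
    """Solve precision = TP / (TP + FP) for the false positive rate."""
    tp = recall * prevalence                 # true positives per individual screened
    fp = tp * (1.0 - precision) / precision  # false positives implied by the precision
    return fp / (1.0 - prevalence)


def precision_at_prevalence(recall, fpr, prevalence):
    """Precision of the same threshold (same recall and FPR) at a new prevalence."""
    tp = recall * prevalence
    fp = fpr * (1.0 - prevalence)
    return tp / (tp + fp)


# Operating point reported for the 50%-prevalence test cohort.
fpr = false_positive_rate(recall=0.188, precision=0.90, prevalence=0.50)

# Same threshold applied to a population with 12% prevalence.
precision_12 = precision_at_prevalence(recall=0.188, fpr=fpr, prevalence=0.12)
print(f"FPR = {fpr:.4f}, precision at 12% prevalence = {precision_12:.3f}")
```

Under these assumptions the same threshold yields a precision of roughly 55% at 12% prevalence. The 65.4% recall at 20% precision quoted in the abstract corresponds to a different, lower threshold and cannot be derived from this single operating point.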
Genome-Resolved Proteomic Stable Isotope Probing of Soil Microbial Communities Using 13CO2 and 13C-Methanol.
Stable isotope probing (SIP) enables tracking the nutrient flows from isotopically labeled substrates to specific microorganisms in microbial communities. In proteomic SIP, labeled proteins synthesized by the microbial consumers of labeled substrates are identified with a shotgun proteomics approach. Here, proteomic SIP was combined with targeted metagenomic binning to reconstruct metagenome-assembled genomes (MAGs) of the microorganisms producing labeled proteins. This approach was used to track carbon flows from 13CO2 to the rhizosphere communities of Zea mays, Triticum aestivum, and Arabidopsis thaliana. Rhizosphere microorganisms that assimilated plant-derived 13C were capable of metabolic and signaling interactions with their plant hosts, as shown by their MAGs containing genes for phytohormone modulation, quorum sensing, and transport and metabolism of nutrients typical of those found in root exudates. XoxF-type methanol dehydrogenases were among the most abundant proteins identified in the rhizosphere metaproteomes. 13C-methanol proteomic SIP was used to test the hypothesis that XoxF was used to metabolize and assimilate methanol in the rhizosphere. We detected 7 13C-labeled XoxF proteins and identified methylotrophic pathways in the MAGs of 8 13C-labeled microorganisms, which supported the hypothesis. These two studies demonstrated the capability of proteomic SIP for functional characterization of active microorganisms in complex microbial communities
Impact of Pretreated Switchgrass and Biomass Carbohydrates on Clostridium thermocellum ATCC 27405 Cellulosome Composition: A Quantitative Proteomic Analysis
Background: Economic feasibility and sustainability of lignocellulosic ethanol production requires the development of robust microorganisms that can efficiently degrade and convert plant biomass to ethanol. The anaerobic thermophilic bacterium Clostridium thermocellum is a candidate microorganism as it is capable of hydrolyzing cellulose and fermenting the hydrolysis products to ethanol and other metabolites. C. thermocellum achieves efficient cellulose hydrolysis using multiprotein extracellular enzymatic complexes, termed cellulosomes. Methodology/Principal Findings: In this study, we used quantitative proteomics (multidimensional LC-MS/MS and 15N-metabolic labeling) to measure relative changes in levels of cellulosomal subunit proteins (per CipA scaffoldin basis) when C. thermocellum ATCC 27405 was grown on a variety of carbon sources [dilute-acid pretreated switchgrass, cellobiose, amorphous cellulose, crystalline cellulose (Avicel) and combinations of crystalline cellulose with pectin or xylan or both]. Cellulosome samples isolated from cultures grown on these carbon sources were compared to 15N labeled cellulosome samples isolated from crystalline cellulose-grown cultures. In total from all samples, proteomic analysis identified 59 dockerin- and 8 cohesin-module containing components, including 16 previously undetected cellulosomal subunits. Many cellulosomal components showed differential protein abundance in the presence of non-cellulose substrates in the growt
A high-throughput de novo sequencing approach for shotgun proteomics using high-resolution tandem mass spectrometry
Background: High-resolution tandem mass spectra can now be readily acquired with hybrid instruments, such as LTQ-Orbitrap and LTQ-FT, in high-throughput shotgun proteomics workflows. The improved spectral quality enables more accurate de novo sequencing for identification of post-translational modifications and amino acid polymorphisms.
Results: In this study, a new de novo sequencing algorithm, called Vonode, has been developed specifically for analysis of such high-resolution tandem mass spectra. To fully exploit the high mass accuracy of these spectra, a unique scoring system is proposed to evaluate sequence tags based primarily on mass accuracy information of fragment ions. Consensus sequence tags were inferred for 11,422 spectra with an average peptide length of 5.5 residues from a total of 40,297 input spectra acquired in a 24-hour proteomics measurement of Rhodopseudomonas palustris. The accuracy of inferred consensus sequence tags was 84%. According to our comparison, the performance of Vonode was shown to be superior to the PepNovo v2.0 algorithm, in terms of the number of de novo sequenced spectra and the sequencing accuracy.
Conclusions: Here, we improved de novo sequencing performance by developing a new algorithm specifically for high-resolution tandem mass spectral data. The Vonode algorithm is freely available for download at http://compbio.ornl.gov/Vonode
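To illustrate why mass accuracy matters for sequence tags, the sketch below matches the gap between two fragment peaks against amino acid residue masses within a ppm tolerance. It is not the Vonode scoring system; the function name, the reduced residue table, the example peaks and the tolerance rule are assumptions chosen for illustration. Glutamine and lysine differ by only about 0.036 Da, so they are separable only at high mass accuracy.

```python
# Monoisotopic residue masses in Da (subset; "L" stands for the isobaric Leu/Ile).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259,
    "M": 131.04049, "F": 147.06841,
}


def residues_for_gap(mz_low, mz_high, tol_ppm):
    """Residues whose mass explains the gap between two singly charged peaks.

    The tolerance on the gap is taken as the worst-case sum of the
    ppm errors of the two peaks (an illustrative assumption).
    """
    gap = mz_high - mz_low
    tol_da = tol_ppm * 1e-6 * (mz_low + mz_high)
    return sorted(aa for aa, mass in RESIDUE_MASS.items()
                  if abs(mass - gap) <= tol_da)


# Two hypothetical fragment peaks separated by exactly one glutamine residue.
high_res = residues_for_gap(500.0, 628.05858, tol_ppm=10)   # Orbitrap-like accuracy
low_res = residues_for_gap(500.0, 628.05858, tol_ppm=500)   # ion-trap-like accuracy
print(high_res, low_res)
```

At 10 ppm only glutamine fits the gap, while at 500 ppm lysine fits as well, so the tag becomes ambiguous.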
Geochemical, metagenomic and metaproteomic insights into trace metal utilization by methane-oxidizing microbial consortia in sulphidic marine sediments
Microbes have obligate requirements for trace metals in metalloenzymes that catalyse important biogeochemical reactions. In anoxic methane- and sulphide-rich environments, microbes may have unique adaptations for metal acquisition and utilization because of decreased bioavailability as a result of metal sulphide precipitation. However, micronutrient cycling is largely unexplored in cold (≤ 10°C) and sulphidic (> 1 mM ΣH2S) deep-sea methane seep ecosystems. We investigated trace metal geochemistry and microbial metal utilization in methane seeps offshore Oregon and California, USA, and report dissolved concentrations of nickel (0.5–270 nM), cobalt (0.5–6 nM), molybdenum (10–5600 nM) and tungsten (0.3–8 nM) in Hydrate Ridge sediment porewaters. Despite low levels of cobalt and tungsten, metagenomic and metaproteomic data suggest that microbial consortia catalysing anaerobic oxidation of methane (AOM) utilize both scarce micronutrients in addition to nickel and molybdenum. Genetic machinery for cobalt-containing vitamin B12 biosynthesis was present in both anaerobic methanotrophic archaea (ANME) and sulphate-reducing bacteria. Proteins affiliated with the tungsten-containing form of formylmethanofuran dehydrogenase were expressed in ANME from two seep ecosystems, the first evidence for expression of a tungstoenzyme in psychrophilic microorganisms. Overall, our data suggest that AOM consortia use specialized biochemical strategies to overcome the challenges of metal availability in sulphidic environments
Measuring Dissociation Rate Constants of Protein Complexes through Subunit Exchange: Experimental Design and Theoretical Modeling
Protein complexes are dynamic macromolecules that constantly dissociate into, and simultaneously are assembled from, free subunits. Dissociation rate constants, koff, provide structural and functional information on protein complexes. However, because all existing methods for measuring koff require high-quality purification and specific modifications of protein complexes, dissociation kinetics has only been studied for a small set of model complexes. Here, we propose a new method, called Metabolically-labeled Affinity-tagged Subunit Exchange (MASE), to measure koff using metabolic stable isotope labeling, affinity purification and mass spectrometry. MASE is based on a subunit exchange process between an unlabeled affinity-tagged variant and a metabolically-labeled untagged variant of a complex. The subunit exchange process was modeled theoretically for a heterodimeric complex. The results showed that koff determines, and hence can be estimated from, the observed rate of subunit exchange. This study provided the theoretical foundation for future experiments that can validate and apply the MASE method
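The statement that koff determines the observed rate of subunit exchange can be sketched with a deliberately simplified model. This is not the model derived in the paper. It assumes that the free subunit pool is negligible, that re-association is fast and random, and that the labeled fraction of affinity-purified complexes therefore relaxes as f(t) = f_eq (1 - exp(-koff t)). The function names and all numerical values are hypothetical.

```python
import math


def labeled_fraction(t, koff, f_eq=0.5):
    """Fraction of tagged complexes carrying a labeled partner at time t."""
    return f_eq * (1.0 - math.exp(-koff * t))


def estimate_koff(times, fractions, f_eq=0.5):
    """Least-squares slope through the origin of -ln(1 - f/f_eq) versus t."""
    num = sum(t * -math.log(1.0 - f / f_eq) for t, f in zip(times, fractions))
    den = sum(t * t for t in times)
    return num / den


true_koff = 0.02                      # per minute, hypothetical value
times = [10.0, 30.0, 60.0, 120.0]     # sampling times in minutes, hypothetical
observed = [labeled_fraction(t, true_koff) for t in times]  # noise-free synthetic data

koff_hat = estimate_koff(times, observed)
print(f"estimated koff = {koff_hat:.4f} per minute")
```

With noise-free synthetic data the estimate recovers the input koff. Real measurements would need replicate time points and an error model.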
Islet autoantibody seroconversion in type-1 diabetes is associated with metagenome-assembled genomes in infant gut microbiomes
The immune system of some genetically susceptible children can be triggered by certain environmental factors to produce islet autoantibodies (IA) against pancreatic β cells, which greatly increases their risk for Type-1 diabetes. An environmental factor under active investigation is the gut microbiome due to its important role in immune system education. Here, we study gut metagenomes that are de-novo-assembled in 887 at-risk children in the Environmental Determinants of Diabetes in the Young (TEDDY) project. Our results reveal a small set of core protein families, present in >50% of the subjects, which account for 64% of the sequencing reads. Time-series binning generates 21,536 high-quality metagenome-assembled genomes (MAGs) from 883 species, including 176 species that hitherto have no MAG representation in previous comprehensive human microbiome surveys. IA seroconversion is positively associated with 2373 MAGs and negatively with 1549 MAGs. Comparative genomics analysis identifies lipopolysaccharide biosynthesis in Bacteroides MAGs and sulfate reduction in Anaerostipes MAGs as functional signatures of MAGs with positive IA-association. The functional signatures in the MAGs with negative IA-association include carbohydrate degradation in lactic acid bacteria MAGs and nitrate reduction in Escherichia MAGs. Overall, our results show a distinct set of gut microorganisms associated with IA seroconversion and uncover the functional genomics signatures of these IA-associated microorganisms.
We appreciate the data and technical support provided by the TEDDY project, which is supported by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK). This work is supported by an R01 grant (R01AT011618) to C.P. and R.S.M. from National Center for Complementary & Integrative Health and National Institute of General Medical Sciences and a Team Science grant to K.R.J. and C.P. from Presbyterian Health Foundation of Oklahoma City and Harold Hamm Diabetes Center. The high-performance computing was provided by the OU Supercomputing Center for Education & Research (OSCER). Financial support was provided by the University of Oklahoma Libraries’ Open Access Fund.
Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work
Effects of diet on resource utilization by a model human gut microbiota containing Bacteroides cellulosilyticus WH2, a symbiont with an extensive glycobiome
The human gut microbiota is an important metabolic organ, yet little is known about how its individual species interact, establish dominant positions, and respond to changes in environmental factors such as diet. In this study, gnotobiotic mice were colonized with an artificial microbiota comprising 12 sequenced human gut bacterial species and fed oscillating diets of disparate composition. Rapid, reproducible, and reversible changes in the structure of this assemblage were observed. Time-series microbial RNA-Seq analyses revealed staggered functional responses to diet shifts throughout the assemblage that were heavily focused on carbohydrate and amino acid metabolism. High-resolution shotgun metaproteomics confirmed many of these responses at a protein level. One member, Bacteroides cellulosilyticus WH2, proved exceptionally fit regardless of diet. Its genome encoded more carbohydrate active enzymes than any previously sequenced member of the Bacteroidetes. Transcriptional profiling indicated that B. cellulosilyticus WH2 is an adaptive forager that tailors its versatile carbohydrate utilization strategy to available dietary polysaccharides, with a strong emphasis on plant-derived xylans abundant in dietary staples like cereal grains. Two highly expressed, diet-specific polysaccharide utilization loci (PULs) in B. cellulosilyticus WH2 were identified, one with characteristics of xylan utilization systems. Introduction of a B. cellulosilyticus WH2 library comprising >90,000 isogenic transposon mutants into gnotobiotic mice, along with the other artificial community members, confirmed that these loci represent critical diet-specific fitness determinants. Carbohydrates that trigger dramatic increases in expression of these two loci and many of the organism's 111 other predicted PULs were identified by RNA-Seq during in vitro growth on 31 distinct carbohydrate substrates, allowing us to better interpret in vivo RNA-Seq and proteomics data. 
These results offer insight into how gut microbes adapt to dietary perturbations at both a community level and from the perspective of a well-adapted symbiont with exceptional saccharolytic capabilities, and illustrate the value of artificial communities
Community proteogenomics reveals the systemic impact of phosphorus availability on microbial functions in tropical soil.
Phosphorus is a scarce nutrient in many tropical ecosystems, yet how soil microbial communities cope with growth-limiting phosphorus deficiency at the gene and protein levels remains unknown. Here, we report a metagenomic and metaproteomic comparison of microbial communities in phosphorus-deficient and phosphorus-rich soils in a 17-year fertilization experiment in a tropical forest. The large-scale proteogenomics analyses provided extensive coverage of many microbial functions and taxa in the complex soil communities. A greater than fourfold increase in the gene abundance of 3-phytase was the strongest response of soil communities to phosphorus deficiency. Phytase catalyses the release of phosphate from phytate, the most recalcitrant phosphorus-containing compound in soil organic matter. Genes and proteins for the degradation of phosphorus-containing nucleic acids and phospholipids, as well as the decomposition of labile carbon and nitrogen, were also enhanced in the phosphorus-deficient soils. In contrast, microbial communities in the phosphorus-rich soils showed increased gene abundances for the degradation of recalcitrant aromatic compounds, transformation of nitrogenous compounds and assimilation of sulfur. Overall, these results demonstrate the adaptive allocation of genes and proteins in soil microbial communities in response to shifting nutrient constraints
Microbial Community Functional Change during Vertebrate Carrion Decomposition
Microorganisms play a critical role in the decomposition of organic matter, which contributes to energy and nutrient transformation in every ecosystem. Yet, little is known about the functional activity of epinecrotic microbial communities associated with carrion. The objective of this study was to provide a description of the carrion associated microbial community functional activity using differential carbon source use throughout decomposition over seasons, between years and when microbial communities were isolated from eukaryotic colonizers (e.g., necrophagous insects). Additionally, microbial communities were identified at the phyletic level using high throughput sequencing during a single study. We hypothesized that carrion microbial community functional profiles would change over the duration of decomposition, and that this change would depend on season, year and presence of necrophagous insect colonization. Biolog EcoPlates™ were used to measure the variation in epinecrotic microbial community function by the differential use of 29 carbon sources throughout vertebrate carrion decomposition. Pyrosequencing was used to describe the bacterial community composition in one experiment to identify key phyla associated with community functional changes. Overall, microbial functional activity increased throughout decomposition in spring, summer and winter while it decreased in autumn. Additionally, microbial functional activity was higher in 2011 when necrophagous arthropod colonizer effects were tested. There were inconsistent trends in the microbial function of communities isolated from remains colonized by necrophagous insects between 2010 and 2011, suggesting a greater need for a mechanistic understanding of the process. These data indicate that functional analyses can be implemented in carrion studies and will be important in understanding the influence of microbial communities on an essential ecosystem process, carrion decomposition