15 research outputs found

    Longitudinal data reveal strong genetic and weak non-genetic components of ethnicity-dependent blood DNA methylation levels

    Epigenetic architecture is influenced by genetic and environmental factors, but little is known about their relative contributions or longitudinal dynamics. Here, we studied DNA methylation (DNAm) at over 750,000 CpG sites in mononuclear blood cells collected at birth and age 7 from 196 children of primarily self-reported Black and Hispanic ethnicities to study race-associated DNAm patterns. We developed a novel Bayesian method for high-dimensional longitudinal data and showed that race-associated DNAm patterns at birth and age 7 are nearly identical. Additionally, we estimated that up to 51% of all self-reported race-associated CpGs had race-dependent DNAm levels that were mediated through local genotype and, quite surprisingly, found that genetic factors explained an overwhelming majority of the variation in DNAm levels at other, previously identified, environmentally-associated CpGs. These results indicate that race-associated blood DNAm patterns in particular, and blood DNAm levels in general, are primarily driven by genetic factors, and are not as sensitive to environmental exposures as previously suggested, at least during the first 7 years of life.

    From differential abundance to mtGWAS: accurate and scalable methodology for metabolomics data with non-ignorable missing observations and latent factors

    Metabolomics is the high-throughput study of small molecule metabolites. Besides offering novel biological insights, these data contain unique statistical challenges, the most glaring of which is the many non-ignorable missing metabolite observations. To address this issue, nearly all analysis pipelines first impute missing observations, and subsequently perform analyses with methods designed for complete data. While clearly erroneous, these pipelines provide key practical advantages not present in existing statistically rigorous methods, including using both observed and missing data to increase power, fast computation to support phenome- and genome-wide analyses, and streamlined estimates for factor models. To bridge this gap between statistical fidelity and practical utility, we developed MS-NIMBLE, a statistically rigorous and powerful suite of methods that offers all the practical benefits of imputation pipelines to perform phenome-wide differential abundance analyses, metabolite genome-wide association studies (mtGWAS), and factor analysis with non-ignorable missing data. Critically, we tailor MS-NIMBLE to perform differential abundance and mtGWAS in the presence of latent factors, which reduces biases and improves power. In addition to proving its statistical and computational efficiency, we demonstrate its superior performance using three real metabolomic datasets. Comment: 19 pages of main text; 89 pages with supplement; 3 figures and 2 tables.

    Robust and Accurate Estimation of Cellular Fraction from Tissue Omics Data Via Ensemble Deconvolution

    Motivation: Tissue-level omics data such as transcriptomics and epigenomics are an average across diverse cell types. To extract cell-type-specific (CTS) signals, dozens of cellular deconvolution methods have been proposed to infer cell-type fractions from tissue-level data. However, these methods produce vastly different results under various real data settings. Simulation-based benchmarking studies showed no universally best deconvolution approaches. There have been attempts at ensemble methods, but they only aggregate multiple single-cell references or reference-free deconvolution methods. Results: To achieve a robust estimation of cellular fractions, we proposed EnsDeconv (Ensemble Deconvolution), which adopts CTS robust regression to synthesize the results from 11 single deconvolution methods, 10 reference datasets, 5 marker gene selection procedures, 5 data normalizations and 2 transformations. Unlike most benchmarking studies based on simulations, we compiled four large real datasets of 4937 tissue samples in total with measured cellular fractions and bulk gene expression from different tissues. Comprehensive evaluations demonstrated that EnsDeconv yields more stable, robust and accurate fractions than existing methods. We illustrated that EnsDeconv estimated cellular fractions enable various CTS downstream analyses such as differential fractions associated with clinical variables. We further extended EnsDeconv to analyze bulk DNA methylation data.
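
    As a rough illustration of the ensemble idea in the abstract above, the sketch below combines cell-type fraction estimates from several deconvolution runs for one sample. It is not EnsDeconv: a per-cell-type median is swapped in for the CTS robust regression the abstract describes, and the function name, the input layout, and the choice to treat a missing cell type as zero are all assumptions made for this example.

```python
from statistics import median


def ensemble_fractions(estimates):
    """Combine per-method cell-type fraction estimates for one sample.

    estimates: list of dicts mapping cell type -> estimated fraction,
    one dict per deconvolution run (hypothetical input layout).
    Returns per-cell-type medians rescaled so the fractions sum to 1.
    """
    # Collect every cell type reported by at least one run.
    cell_types = sorted(set().union(*estimates))
    # The median across runs is a simple robust summary; it stands in for
    # robust regression here. A cell type missing from a run counts as 0.0
    # (an assumption of this sketch).
    med = {ct: median(e.get(ct, 0.0) for e in estimates) for ct in cell_types}
    total = sum(med.values())
    if total == 0:
        raise ValueError("all median fractions are zero")
    # Rescale so the combined fractions form a valid composition.
    return {ct: value / total for ct, value in med.items()}


# Three hypothetical runs; the third disagrees strongly with the other two.
runs = [
    {"neuron": 0.6, "glia": 0.4},
    {"neuron": 0.5, "glia": 0.5},
    {"neuron": 0.1, "glia": 0.9},
]
combined = ensemble_fractions(runs)
```

    The median keeps a single discordant run from dominating the combined estimate, which is the kind of stability an ensemble aims for.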

    Induction of the Immunoproteasome Subunit Lmp7 Links Proteostasis and Immunity in α-Synuclein Aggregation Disorders

    Accumulation of aggregated α-synuclein into Lewy bodies is thought to contribute to the onset and progression of dopaminergic neuron degeneration in Parkinson's disease (PD) and related disorders. Although protein aggregation is associated with perturbation of proteostasis, how α-synuclein aggregation affects the brain proteome and signaling remains uncertain. In a mouse model of α-synuclein aggregation, 6% of 6215 proteins and 1.6% of 8183 phosphopeptides changed in abundance, indicating conservation of proteostasis and phosphorylation signaling. The proteomic analysis confirmed changes in abundance of proteins that regulate dopamine synthesis and transport, synaptic activity and integrity, and unearthed changes in mRNA binding, processing and protein translation. Phosphorylation signaling changes centered on axonal and synaptic cytoskeletal organization and structural integrity. Proteostatic responses included a significant increase in the levels of Lmp7, a component of the immunoproteasome. Increased Lmp7 levels and activity were also quantified in postmortem human brains with PD and dementia with Lewy bodies. Functionally, the immunoproteasome degrades α-synuclein aggregates and generates potentially antigenic peptides. Expression and activity of the immunoproteasome may represent testable targets to induce adaptive responses that maintain proteome integrity and modulate immune responses in protein aggregation disorders. Keywords: Neurodegeneration, Parkinson's disease, Dopaminergic neurons, Immunoproteasome, Proteostasis

    Epigenetic landscape links upper airway microbiota in infancy with allergic rhinitis at 6 years of age

    BACKGROUND: The upper airways present a barrier to inhaled allergens and microbes, which alter immune responses and subsequent risk for diseases, such as allergic rhinitis (AR). OBJECTIVE: We tested the hypothesis that early life microbial exposures leave a lasting signature in DNA methylation that ultimately influences the development of AR in children. METHODS: We studied upper airway microbiota at 1 week, 1 month and 3 months of life, measured DNA methylation (DNAm) and gene expression profiles in upper airway mucosal cells, and assessed AR at age 6 in children in the Copenhagen Prospective Studies on Asthma in Childhood 2010 (COPSAC2010) birth cohort. RESULTS: We identified 956 AR-associated differentially methylated CpGs (DMCs) in upper airway mucosal cells at age 6, 792 of which formed three modules of correlated DMCs. The eigenvector of one module was correlated with the expression of genes enriched for lysosome and bacterial invasion of epithelial cell pathways. Early life microbial diversity was lower at 1 week (richness p=0.0079) in children with AR at age 6, and reduced diversity at 1 week was also correlated with the same module's eigenvector (rho=−0.25, p=3.3×10⁻⁵). We show that the effect of microbiota richness at 1 week on risk for AR at age 6 was mediated in part by the epigenetic signature of this module. CONCLUSION: Our results suggest that upper airway microbial composition in infancy contributes to the development of AR during childhood, and this trajectory is mediated, at least in part, through altered DNAm patterns in upper airway mucosal cells.