13 research outputs found

    Mendelian inheritance of trimodal CpG methylation sites suggests distal cis-acting genetic effects.

    Environmentally influenced phenotypes, such as obesity and insulin resistance, can be transmitted over multiple generations. Epigenetic modifications, such as methylation of DNA cytosine-guanine (CpG) pairs, may be carriers of inherited information. At the population level, the methylation state of such "heritable" CpG sites is expected to follow a trimodal distribution, and their mode of inheritance should be Mendelian. Using the Illumina Infinium 450K DNA methylation array, we determined DNA CpG methylation in blood cells from a family cohort of 123 individuals of Arab ethnicity, including 18 elementary father-mother-child trios. We asked whether Mendelian inheritance of CpG methylation is observed and, most importantly, whether it is independent of any genetic signals. Using 40× whole genome sequencing, we therefore excluded all CpG sites with possibly confounding genetic variants (SNPs) within the binding regions of the Illumina probes. We identified a total of 955 CpG sites that displayed a trimodal distribution and confirmed trimodality in a study of 1805 unrelated Caucasians. Of the 955 CpG sites, 99.9% followed a strict Mendelian pattern of inheritance and, by design, had no SNP within ±110 nucleotides of the CpG site. However, in 97% of these cases a distal cis-acting SNP within a ±1 Mbp window was found that explained the observed CpG distribution, excluding the hypothesis of epigenetic inheritance for these clear-cut trimodal sites. Using power analysis, we showed that in 46% of all cases, the closest CpG-associated SNP was located more than 1000 bp from the CpG site. Our findings suggest that CpG methylation is maintained over larger genomic distances. Furthermore, nearly half of the SNPs associated with these trimodal sites were also associated with the expression of nearby genes (P = 4.08 × 10^-6), implying a regulatory effect of these trimodal CpG sites.
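    The abstract above describes testing whether trimodal CpG sites follow Mendelian inheritance in father-mother-child trios. A minimal sketch of such a consistency check follows. It is not the authors' pipeline: the beta-value cutoffs (0.25 and 0.75) and all function names are hypothetical, and the sketch assumes the three modes correspond to 0, 1, or 2 methylated alleles.

```python
from itertools import product

# Hypothetical cutoffs for calling the three modes; not taken from the study.
LOW_CUTOFF = 0.25
HIGH_CUTOFF = 0.75

# Number of methylated alleles a parent in each state can pass on.
TRANSMITTABLE = {0: {0}, 1: {0, 1}, 2: {1}}


def call_state(beta, low=LOW_CUTOFF, high=HIGH_CUTOFF):
    """Map a methylation beta value to 0, 1, or 2 methylated alleles."""
    if beta < low:
        return 0
    if beta > high:
        return 2
    return 1


def mendelian_consistent(father, mother, child):
    """Return True if the child's state can arise from one allele per parent."""
    possible = {
        a + b for a, b in product(TRANSMITTABLE[father], TRANSMITTABLE[mother])
    }
    return child in possible


def trio_consistent(father_beta, mother_beta, child_beta):
    """Call states from beta values, then test Mendelian consistency."""
    return mendelian_consistent(
        call_state(father_beta), call_state(mother_beta), call_state(child_beta)
    )
```

    For example, under these assumed cutoffs, a trio with an unmethylated father (beta 0.05), a fully methylated mother (beta 0.95), and an intermediate child (beta 0.5) would be counted as consistent, while two unmethylated parents with a fully methylated child would not.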

    Epigenetic scores for the circulating proteome as tools for disease prediction

    Protein biomarkers have been identified across many age-related morbidities. However, characterising epigenetic influences could further inform disease predictions. Here, we leverage epigenome-wide data to study links between the DNA methylation (DNAm) signatures of the circulating proteome and incident diseases. Using data from four cohorts, we trained and tested epigenetic scores (EpiScores) for 953 plasma proteins, identifying 109 scores that explained between 1% and 58% of the variance in protein levels after adjusting for known protein quantitative trait loci (pQTL) genetic effects. By projecting these EpiScores into an independent sample (Generation Scotland; n = 9537) and relating them to incident morbidities over a follow-up of 14 years, we uncovered 137 EpiScore-disease associations. These associations were largely independent of immune cell proportions, common lifestyle and health factors, and biological aging. Notably, we found that our diabetes-associated EpiScores highlighted previous top biomarker associations from proteome-wide assessments of diabetes. These EpiScores for protein levels can therefore be a valuable resource for disease prediction and risk stratification.

    Plasma Proteomics of Renal Function: A Transethnic Meta-Analysis and Mendelian Randomization Study.

    BACKGROUND: Studies on the relationship between renal function and the human plasma proteome have identified several potential biomarkers. However, investigations have been conducted largely in European populations, and causality of the associations between plasma proteins and kidney function has never been addressed. METHODS: A cross-sectional study of 993 plasma proteins among 2882 participants in four studies of European and admixed ancestries (KORA, INTERVAL, HUNT, QMDiab) identified transethnic associations between eGFR/CKD and proteomic biomarkers. For the replicated associations, two-sample bidirectional Mendelian randomization (MR) was used to investigate potential causal relationships. Publicly available datasets and transcriptomic data from independent studies were used to examine the association between gene expression in kidney tissue and eGFR. RESULTS: In total, 57 plasma proteins were associated with eGFR, including one novel protein. Of these, 23 were additionally associated with CKD. The strongest inferred causal effect was the positive effect of eGFR on testican-2, in line with the known biological role of this protein and the expression of its protein-coding gene (SPOCK2) in renal tissue. We also observed suggestive evidence of an effect of melanoma inhibitory activity (MIA), carbonic anhydrase III, and cystatin-M on eGFR. CONCLUSIONS: In a discovery-replication setting, we identified 57 proteins transethnically associated with eGFR. The revealed causal relationships are an important stepping stone in establishing testican-2 as a clinically relevant physiological marker of kidney disease progression, and point to additional proteins warranting further investigation.
    The KORA study was initiated and financed by the Helmholtz Zentrum München – German Research Center for Environmental Health, which is funded by the German Federal Ministry of Education and Research (BMBF) and by the State of Bavaria. This work was also supported by the Biomedical Research Program at Weill Cornell Medicine in Qatar, a program funded by the Qatar Foundation. K.S. is supported by Qatar National Research Fund (QNRF) grant no. NPRPC11-0115-180010. The Nord-Trøndelag Health Study (The HUNT Study) is a collaboration between HUNT Research Centre (Faculty of Medicine, Norwegian University of Science and Technology NTNU), Nord-Trøndelag County Council, Central Norway Health Authority, and the Norwegian Institute of Public Health. The HUNT part of the project re-used protein data that was originally analysed and paid for by Somalogic Inc, CO, USA. Somalogic had no role in the design and conduct of the study; collection of phenotypic data, statistical analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication. Professor John Danesh is funded by the National Institute for Health Research [Senior Investigator Award]. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care. RNA-sequencing experiments and kidney gene expression studies were supported by British Heart Foundation project grants [PG/17/35/33001 and PG/19/16/34270] and Kidney Research UK grants [RP_017_20180302 and RP_013_20190305] to M.T. The German Diabetes Center is funded by the German Federal Ministry of Health (Berlin, Germany), the Ministry of Culture and Science of the state North Rhine-Westphalia (Düsseldorf, Germany), and grants from the German Federal Ministry of Education and Research (Berlin, Germany) to the German Center for Diabetes Research e.V. (DZD).

    Missing data estimation in fMRI dynamic causal modeling

    Dynamic Causal Modeling (DCM) can be used to quantify cognitive function in individuals as effective connectivity. However, ambiguity among subjects in the number and location of discernible active regions prevents all candidate models from being compared in all subjects, which precludes the use of DCM as an individual cognitive phenotyping tool. This paper proposes a solution to this problem by treating missing regions in the first-level analysis as missing data, and estimating the time course associated with any missing region using one of four candidate methods: zero-filling, average-filling, noise-filling using a fixed stochastic process, or noise-filling using a process estimated by expectation-maximization. The effect of this estimation scheme was analyzed by treating it as a preprocessing step to DCM and observing the resulting effects on model evidence. Simulation studies show that estimation using expectation-maximization yields the highest classification accuracy under a simple loss function and the highest model evidence, relative to the other methods. This result held for various dataset sizes and varying numbers of model choices. In real data, application to Go/No-Go and Simon tasks allowed computation of signals from the missing nodes and the consequent computation of model evidence in all subjects, compared with 62 and 48 percent, respectively, when no preprocessing was performed. These results demonstrate the face validity of the preprocessing scheme and open the possibility of using single-subject DCM as an individual cognitive phenotyping tool.
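    The three simpler filling strategies named in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, average-filling is assumed here to mean the pointwise mean over the subject's observed regions, and the fixed stochastic process is assumed to be zero-mean white Gaussian noise. The expectation-maximization variant is omitted because the abstract does not specify its model.

```python
import random


def zero_fill(n_timepoints):
    """Replace the missing region's time course with zeros."""
    return [0.0] * n_timepoints


def average_fill(observed_regions):
    """Replace the missing time course with the pointwise mean of the
    observed regions (assumed meaning of average-filling)."""
    n_regions = len(observed_regions)
    return [sum(values) / n_regions for values in zip(*observed_regions)]


def noise_fill(n_timepoints, sigma=1.0, seed=0):
    """Replace the missing time course with white Gaussian noise of fixed
    standard deviation (assumed form of the fixed stochastic process)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n_timepoints)]
```

    Each function returns a time course of the same length as the observed ones, so the filled region could be passed to a downstream model comparison alongside the real regions.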

    Metabolic and proteomic signatures of type 2 diabetes subtypes in an Arab population.

    Type 2 diabetes (T2D) has a heterogeneous etiology influencing its progression, treatment, and complications. A data-driven cluster analysis in European individuals with T2D previously identified four subtypes: severe insulin-deficient (SIDD), severe insulin-resistant (SIRD), mild obesity-related (MOD), and mild age-related (MARD) diabetes. Here, the clustering approach was applied to individuals with T2D from the Qatar Biobank and validated in an independent set. Cluster-specific signatures of circulating metabolites and proteins were established, revealing subtype-specific molecular mechanisms, including activation of the complement system with features of autoimmune diabetes and reduced 1,5-anhydroglucitol in SIDD, impaired insulin signaling in SIRD, and elevated leptin and fatty acid binding protein levels in MOD. The MARD cluster was the healthiest, with metabolomic and proteomic profiles most similar to the controls. We have translated the T2D subtypes to an Arab population and identified distinct molecular signatures to further our understanding of the etiology of these subtypes.

    PopPAnTe: population and pedigree association testing for quantitative data.

    Family-based designs, from twin studies to isolated populations with their complex genealogical data, are a valuable resource for genetic studies of heritable molecular biomarkers. Existing software for family-based studies has mainly focused on facilitating association between response phenotypes and genetic markers, and no user-friendly tools are at present available to straightforwardly extend association studies in related samples to large datasets of generic quantitative data, such as those generated by current -omics technologies. We developed PopPAnTe, a user-friendly Java program, which evaluates the association of quantitative data in related samples. Additionally, PopPAnTe implements data pre- and post-processing, region-based testing, and empirical assessment of associations. PopPAnTe is an integrated and flexible framework for pairwise association testing in related samples with a large number of predictors and response variables. It works with family data of any size and complexity or, when the genealogical information is unknown, uses genetic similarity information between individuals, such as that inferred from genome-wide genetic data. It can therefore be particularly useful in facilitating usage of biobank data collections from population isolates when extensive genealogical information is missing.

    Deep molecular phenotypes link complex disorders and physiological insult to CpG methylation

    Epigenetic regulation of cellular function provides a mechanism for rapid organismal adaptation to changes in health, lifestyle and environment. Associations of cytosine-guanine di-nucleotide (CpG) methylation with clinical endpoints that overlap with metabolic phenotypes suggest a regulatory role for these CpG sites in the body's response to disease or environmental stress. We previously identified 20 CpG sites in an epigenome-wide association study (EWAS) with metabolomics that were also associated in recent EWASs with diabetes-, obesity-, and smoking-related endpoints. To elucidate the molecular pathways that connect these potentially regulatory CpG sites to the associated disease or lifestyle factors, we conducted a multi-omics association study including 2474 mass-spectrometry-based metabolites in plasma, urine and saliva, 225 NMR-based lipid and metabolite measures in blood, 1124 blood-circulating proteins using aptamer technology, 113 plasma protein N-glycans and 60 IgG-glycans, using 359 samples from the multi-ethnic Qatar Metabolomics Study on Diabetes (QMDiab). We report 138 multi-omics associations at these CpG sites, including diabetes biomarkers at the diabetes-associated TXNIP locus, and smoking-specific metabolites and proteins at multiple smoking-associated loci, including AHRR. Mendelian randomization suggests a causal effect of metabolite levels on methylation of obesity-associated CpG sites, i.e. of glycerophospholipid PC(O-36:5), glycine and a very low-density lipoprotein (VLDL-A) on the methylation of the obesity-associated CpG loci DHCR24, MYO5C and CPT1A, respectively. Taken together, our study suggests that multi-omics-associated CpG methylation can provide functional read-outs for the underlying regulatory response mechanisms to disease or environmental insults.