10 research outputs found

    Genomic architecture and prediction of censored time-to-event phenotypes with a Bayesian genome-wide analysis

    While recent advancements in computation and modelling have improved the analysis of complex traits, our understanding of the genetic basis of the time at symptom onset remains limited. Here, we develop a Bayesian approach (BayesW) that provides probabilistic inference of the genetic architecture of age-at-onset phenotypes in a sampling scheme that facilitates biobank-scale time-to-event analyses. We show in extensive simulation work the benefits BayesW provides in terms of number of discoveries, model performance and genomic prediction. In the UK Biobank, we find many thousands of common genomic regions underlying the age-at-onset of high blood pressure (HBP), cardiac disease (CAD), and type-2 diabetes (T2D), and for the genetic basis of onset reflecting the underlying genetic liability to disease. Age-at-menopause and age-at-menarche are also highly polygenic, but with higher variance contributed by low frequency variants. Genomic prediction into the Estonian Biobank data shows that BayesW gives higher prediction accuracy than other approaches

    Multi-method genome- and epigenome-wide studies of inflammatory protein levels in healthy older adults

    The molecular factors which control circulating levels of inflammatory proteins are not well understood. Furthermore, association studies between molecular probes and human traits are often performed by linear model-based methods which may fail to account for complex structure and interrelationships within molecular datasets. In this study, we perform genome- and epigenome-wide association studies (GWAS/EWAS) on the levels of 70 plasma-derived inflammatory protein biomarkers in healthy older adults (Lothian Birth Cohort 1936; n = 876; Olink® inflammation panel). We employ a Bayesian framework (BayesR+) which can account for issues pertaining to data structure and unknown confounding variables (with sensitivity analyses using ordinary least squares- (OLS) and mixed model-based approaches). We identified 13 SNPs associated with 13 proteins (n = 1 SNP each) concordant across OLS and Bayesian methods. We identified 3 CpG sites spread across 3 proteins (n = 1 CpG each) that were concordant across OLS, mixed-model and Bayesian analyses. Tagged genetic variants accounted for up to 45% of variance in protein levels (for MCP2, 36% of variance alone attributable to 1 polymorphism). Methylation data accounted for up to 46% of variation in protein levels (for CXCL10). Up to 66% of variation in protein levels (for VEGFA) was explained using genetic and epigenetic data combined. We demonstrated putative causal relationships between CD6 and IL18R1 with inflammatory bowel disease and between IL12B and Crohn’s disease. Our data may aid understanding of the molecular regulation of the circulating inflammatory proteome as well as causal relationships between inflammatory mediators and disease

    Genetic basis of common complex traits


    Haematological changes from conception to childbirth: An indicator of major pregnancy complications

    Background: About 800 women die every day worldwide from pregnancy-related complications, including excessive blood loss, infections and high blood pressure (World Health Organization, 2019). To improve screening for high-risk pregnancies, we set out to identify patterns of maternal hematological changes associated with future pregnancy complications. Methods: Using mixed effects models, we established changes in 14 complete blood count (CBC) parameters for 1710 healthy pregnancies and compared them to measurements from 98 pregnancy-induced hypertension, 106 gestational diabetes and 339 postpartum hemorrhage cases. Results: Results show interindividual variations, but good individual repeatability in CBC values during physiological pregnancies, allowing the identification of specific alterations in women with obstetric complications. For example, in women with uncomplicated pregnancies, haemoglobin count decreases significantly, by 0.12 g/L per gestation week (95% CI −0.16, −0.09; p value <.001). Interestingly, this decrease is three times more pronounced in women who will develop pregnancy-induced hypertension, with an additional decrease of 0.39 g/L (95% CI −0.51, −0.26). We also confirm that obstetric complications and white CBC predict the likelihood of giving birth earlier during pregnancy. Conclusion: We provide a comprehensive description of the associations between haematological changes through pregnancy and three major obstetric complications to support strategies for prevention, early diagnosis and maternal care

    The individual and global impact of copy-number variants on complex human traits.

    The impact of copy-number variations (CNVs) on complex human traits remains understudied. We called CNVs in 331,522 UK Biobank participants and performed genome-wide association studies (GWASs) between the copy number of CNV-proxy probes and 57 continuous traits, revealing 131 signals spanning 47 phenotypes. Our analysis recapitulated well-known associations (e.g., 1q21 and height), revealed the pleiotropy of recurrent CNVs (e.g., 26 and 16 traits for 16p11.2-BP4-BP5 and 22q11.21, respectively), and suggested gene functionalities (e.g., MARF1 in female reproduction). Forty-eight CNV signals (38%) overlapped with single-nucleotide polymorphism (SNP)-GWASs signals for the same trait. For instance, deletion of PDZK1, which encodes a urate transporter scaffold protein, decreased serum urate levels, while deletion of RHD, which encodes the Rhesus blood group D antigen, associated with hematological traits. Other signals overlapped Mendelian disorder regions, suggesting variable expressivity and broad impact of these loci, as illustrated by signals mapping to Rotor syndrome (SLCO1B1/3), renal cysts and diabetes syndrome (HNF1B), or Charcot-Marie-Tooth (PMP22) loci. Total CNV burden negatively impacted 35 traits, leading to increased adiposity, liver/kidney damage, and decreased intelligence and physical capacity. Thirty traits remained burden associated after correcting for CNV-GWAS signals, pointing to a polygenic CNV architecture. The burden negatively correlated with socio-economic indicators, parental lifespan, and age (survivorship proxy), suggesting a contribution to decreased longevity. Together, our results showcase how studying CNVs can expand biological insights, emphasizing the critical role of this mutational class in shaping human traits and arguing in favor of a continuum between Mendelian and complex diseases

    Identification and Characterization of Mediators of Fluconazole Tolerance in Candida albicans

    Candida albicans is an important human pathogen and a major concern in intensive care units around the world. C. albicans infections are associated with a high mortality despite the use of antifungal treatments. One of the causes of therapeutic failures is the acquisition of antifungal resistance by mutations in the C. albicans genome. Fluconazole (FLC) is one of the most widely used antifungals, and mechanisms of FLC resistance occurring by mutations have been extensively investigated. However, some clinical isolates are known to be able to survive at high FLC concentrations without acquiring resistance mutations, a phenotype known as tolerance. Mechanisms behind FLC tolerance are not well studied, mainly due to the lack of a proper way to identify and quantify tolerance in clinical isolates. We propose here culture conditions to investigate FLC tolerance as well as an easy and efficient method to identify and quantify tolerance to FLC. The screening of C. albicans strain collections revealed that FLC tolerance is pH- and strain-dependent, suggesting the involvement of multiple mechanisms. Here, we addressed the identification of FLC tolerance mediators in C. albicans by an overexpression strategy focusing on 572 C. albicans genes. This strategy led to the identification of two transcription factors, CRZ1 and GZF3. CRZ1 is a C2H2-type transcription factor that is part of the calcineurin-dependent pathway in C. albicans, while GZF3 is a GATA-type transcription factor of unknown function in C. albicans. Overexpression of each gene resulted in an increase of FLC tolerance; however, only the deletion of CRZ1 in clinical FLC-tolerant strains consistently decreased their FLC tolerance. Transcription profiling of clinical isolates with variable levels of FLC tolerance confirmed a calcineurin-dependent signature in these isolates when exposed to FLC

    Probabilistic inference of the genetic architecture underlying functional enrichment of complex traits

    We develop a Bayesian model (BayesRR-RC) that provides robust SNP-heritability estimation, an alternative to marker discovery, and accurate genomic prediction, taking 22 seconds per iteration to estimate 8.4 million SNP-effects and 78 SNP-heritability parameters in the UK Biobank. We find that only ≤10% of the genetic variation captured for height, body mass index, cardiovascular disease, and type 2 diabetes is attributable to proximal regulatory regions within 10 kb upstream of genes, while 12–25% is attributed to coding regions, 32–44% to introns, and 22–28% to distal 10–500 kb upstream regions. Up to 24% of all cis and coding regions of each chromosome are associated with each trait, with over 3,100 independent exonic and intronic regions and over 5,400 independent regulatory regions having ≥95% probability of contributing ≥0.001% to the genetic variance of these four traits. Our open-source software (GMRM) provides a scalable alternative to current approaches for biobank data

    Additional file 2 of multi-method genome- and epigenome-wide studies of inflammatory protein levels in healthy older adults

    Additional file 2: Supplementary Tables. The association of pre-adjusted protein levels with biological and technical covariates. Protein levels were adjusted for age, sex, array plate and four genetic principal components (population structure) prior to analyses. Significant associations are emboldened. (Table S1). pQTLs associated with inflammatory biomarker levels from Bayesian penalised regression model (Posterior Inclusion Probability > 95%). (Table S2). All pQTLs associated with inflammatory biomarker levels from ordinary least squares regression model (P  95%). (Table S12). CpGs associated with inflammatory protein biomarkers as identified by linear model (limma) at P < 5.14 × 10⁻¹⁰. (Table S13). CpGs associated with inflammatory protein biomarkers as identified by mixed linear model (OSCA) at P < 5.14 × 10⁻¹⁰. (Table S14). Estimate of variance explained for blood protein levels by DNA methylation as well as proportion of explained attributable to different prior mixtures - BayesR+. (Table S15). Comparison of variance in protein levels explained by genome-wide DNA methylation data by mixed linear model (OSCA) and Bayesian penalised regression model (BayesR+). (Table S16). Variance in circulating inflammatory protein biomarker levels explained by common genetic and methylation data (joint and conditional estimates from BayesR+). Ordered by combined variance explained by genetic and epigenetic data - smallest to largest. Significant results from t-tests comparing distributions for variance explained by methylation or genetics alone versus combined estimate are emboldened. (Table S17). Genetic and epigenetic factors identified by BayesR+ when conditioning on all SNPs and CpGs together. (Table S18). Mendelian Randomisation analyses to assess whether proteins with concordantly identified genetic signals are causally associated with Alzheimer’s disease risk. (Table S19)

    Multi-method genome and epigenome wide studies of inflammatory protein levels in healthy older adults - Linear Regression EWAS Proteins

    This dataset represents one of five datasets which correspond to the study: "Multi-method genome and epigenome wide studies of inflammatory protein levels in healthy older adults". These datasets represent association studies on the levels of the same set of 70 inflammatory proteins. Each dataset represents one of five distinct methods used to perform genome-wide and epigenome-wide association studies on these protein levels. These methods are: Linear Regression GWAS, Linear Regression EWAS, OSCA EWAS, BayesR+ GWAS and BayesR+ EWAS. These analyses were performed as part of the Lothian Birth Cohort 1936 Study. This data relates to summary statistics for EWAS of 70 Olink inflammation proteins - performed by OLS regression EWAS. Hillary, Robert; Trejo Banos, Daniel; Kousathanas, Athanasios; McCartney, Daniel; Harris, Sarah; Stevenson, Anna; Patxot, Marion; Ojavee, Sven Erik; Zhang, Qian; Liewald, David; Ritchie, Craig; Evans, Kathryn; Tucker-Drob, Elliot; Wray, Naomi; McRae, Allan; Visscher, Peter; Deary, Ian; Robinson, Matthew; Marioni, Riccardo. (2020). Multi-method genome and epigenome wide studies of inflammatory protein levels in healthy older adults - Linear Regression EWAS Proteins, [dataset]. University of Edinburgh. Centre for Cognitive Ageing and Cognitive Epidemiology. https://doi.org/10.7488/ds/281