19 research outputs found

    Uncertainty quantification of reference-based cellular deconvolution algorithms

    This is the final version. Available on open access from Routledge via the DOI in this record. Data and code availability: The DNAm data used in this study are available as R packages or via GEO (see Supplementary Table 2 for details). We have provided the code for calculating the CETYGO score as an R package available via GitHub (https://github.com/ds420/CETYGO). The code to reproduce the analyses in this manuscript using our R package is also available via GitHub (https://github.com/ejh243/CETYGOAnalyses). The majority of epigenetic epidemiology studies to date have generated genome-wide profiles from bulk tissues (e.g., whole blood); however, these are vulnerable to confounding from variation in cellular composition. Proxies for cellular composition can be mathematically derived from the bulk tissue profiles using a deconvolution algorithm; however, there is no method to assess the validity of these estimates for a dataset where the true cellular proportions are unknown. In this study, we describe, validate and characterize a sample-level accuracy metric for derived cellular heterogeneity variables. The CETYGO score captures the deviation between a sample's DNA methylation profile and its expected profile given the estimated cellular proportions and cell type reference profiles. We demonstrate that the CETYGO score consistently distinguishes inaccurate and incomplete deconvolutions when applied to reconstructed whole blood profiles. By applying our novel metric to >6,300 empirical whole blood profiles, we find that estimating accurate cellular composition is influenced by both technical and biological variation. In particular, we show that when using a common reference panel for whole blood, less accurate estimates are generated for females, neonates, older individuals and smokers. Our results highlight the utility of a metric to assess the accuracy of cellular deconvolution, and describe how it can enhance studies of DNA methylation that are reliant on statistical proxies for cellular heterogeneity. To facilitate incorporating our methodology into existing pipelines, we have made it freely available as an R package (https://github.com/ds420/CETYGO).
    Biotechnology and Biological Sciences Research Council (BBSRC); Engineering and Physical Sciences Research Council (EPSRC); Medical Research Council (MRC); Alzheimer’s Society
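
The idea of scoring the deviation between an observed methylation profile and the profile expected from estimated cell proportions can be illustrated with a minimal sketch. This is not the CETYGO package's code: the function names, the toy reference matrix, and the choice of root-mean-square error as the deviation measure are all assumptions made here for illustration, since the abstract does not specify how the deviation is computed.

```python
import math

def expected_profile(proportions, reference):
    """Expected bulk methylation at each site.

    reference[i][k] is the (hypothetical) methylation level of site i
    in cell type k; proportions[k] is the estimated fraction of type k.
    """
    return [sum(p * r for p, r in zip(proportions, row)) for row in reference]

def deviation_score(observed, proportions, reference):
    """Root-mean-square deviation between observed and expected profiles.

    RMSE is an assumed choice of deviation measure, used only to make
    the concept concrete.
    """
    expected = expected_profile(proportions, reference)
    squared = [(o - e) ** 2 for o, e in zip(observed, expected)]
    return math.sqrt(sum(squared) / len(squared))

# Toy data: three sites, two cell types (illustrative values only).
reference = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.5, 0.5],
]
proportions = [0.75, 0.25]

# A sample that matches the reference mixture exactly scores zero.
observed_good = expected_profile(proportions, reference)
# A sample the reference panel cannot explain scores higher.
observed_bad = [0.1, 0.9, 0.5]

score_good = deviation_score(observed_good, proportions, reference)
score_bad = deviation_score(observed_bad, proportions, reference)
print(score_good, score_bad)
```

A low score suggests the reference panel and estimated proportions reproduce the sample well; a high score flags a deconvolution that may be inaccurate or incomplete.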

    Recalibrating the epigenetic clock: implications for assessing biological age in the human cortex.

    Human DNA methylation data have been used to develop biomarkers of ageing, referred to as 'epigenetic clocks', which have been widely used to identify differences between chronological age and biological age in health and disease including neurodegeneration, dementia and other brain phenotypes. Existing DNA methylation clocks have been shown to be highly accurate in blood but are less precise when used in older samples or in tissue types not included in training the model, including brain. We aimed to develop a novel epigenetic clock that performs optimally in human cortex tissue and has the potential to identify phenotypes associated with biological ageing in the brain. We generated an extensive dataset of human cortex DNA methylation data spanning the life course (n = 1397, ages = 1 to 108 years). This dataset was split into 'training' and 'testing' samples (training: n = 1047; testing: n = 350). DNA methylation age estimators were derived using a transformed version of chronological age on DNA methylation at specific sites using elastic net regression, a supervised machine learning method. The cortical clock was subsequently validated in a novel independent human cortex dataset (n = 1221, ages = 41 to 104 years) and tested for specificity in a large whole blood dataset (n = 1175, ages = 28 to 98 years). We identified a set of 347 DNA methylation sites that, in combination, optimally predict age in the human cortex. The sum of DNA methylation levels at these sites weighted by their regression coefficients provides the cortical DNA methylation clock age estimate. The novel clock dramatically outperformed previously reported clocks in additional cortical datasets. Our findings suggest that previous associations between predicted DNA methylation age and neurodegenerative phenotypes might represent false positives resulting from clocks not robustly calibrated to the tissue being tested and for phenotypes that become manifest in older ages. The age distribution and tissue type of samples included in training datasets need to be considered when building and applying epigenetic clock algorithms to human epidemiological or disease cohorts.
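
The weighted-sum step described in the abstract above can be sketched as follows. The site identifiers, coefficient values, and function name are hypothetical placeholders, not the published clock. The abstract also states that the model was fitted to a transformed version of age. The sketch stops at the linear predictor because the abstract describes neither an intercept nor an inverse transform.

```python
def clock_linear_predictor(methylation, coefficients):
    """Sum of methylation levels weighted by regression coefficients.

    methylation:  dict mapping site ID -> methylation level (0 to 1)
    coefficients: dict mapping site ID -> regression coefficient
    Both the site IDs and the values used below are made up.
    """
    missing = [site for site in coefficients if site not in methylation]
    if missing:
        raise KeyError(f"sample lacks clock sites: {missing}")
    return sum(coefficients[site] * methylation[site] for site in coefficients)

# Hypothetical two-site "clock"; the cortical clock uses 347 sites.
coefficients = {"site_a": 8.0, "site_b": -4.0}
# Sites in the sample that are not part of the clock are ignored.
sample = {"site_a": 0.5, "site_b": 0.25, "site_c": 0.9}

predictor = clock_linear_predictor(sample, coefficients)
print(predictor)
```

Raising an error on missing sites, rather than silently skipping them, is a design choice made here so that an incomplete array does not yield a quietly biased estimate.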

    Genomic and phenotypic insights from an atlas of genetic effects on DNA methylation

    DNA methylation quantitative trait locus (mQTL) analyses on 32,851 participants identify genetic variants associated with DNA methylation at 420,509 sites in blood, resulting in a database of >270,000 independent mQTLs. Characterizing genetic influences on DNA methylation (DNAm) provides an opportunity to understand mechanisms underpinning gene regulation and disease. In the present study, we describe results of DNAm quantitative trait locus (mQTL) analyses on 32,851 participants, identifying genetic variants associated with DNAm at 420,509 DNAm sites in blood. We present a database of >270,000 independent mQTLs, of which 8.5% comprise long-range (trans) associations. Identified mQTL associations explain 15-17% of the additive genetic variance of DNAm. We show that the genetic architecture of DNAm levels is highly polygenic. Using shared genetic control between distal DNAm sites, we constructed networks, identifying 405 discrete genomic communities enriched for genomic annotations and complex traits. Shared genetic variants are associated with both DNAm levels and complex diseases, but only in a minority of cases do these associations reflect causal relationships from DNAm to trait or vice versa, indicating a more complex genotype-phenotype map than previously anticipated.
    Molecular Epidemiology

    SARS-CoV-2 infects the human kidney and drives fibrosis in kidney organoids

    Kidney failure is frequently observed during and after COVID-19, but it remains unclear whether this is a direct effect of the virus. Here, we report that SARS-CoV-2 directly infects kidney cells and is associated with increased tubule-interstitial kidney fibrosis in patient autopsy samples. To study direct effects of the virus on the kidney independent of systemic effects of COVID-19, we infected human-induced pluripotent stem-cell-derived kidney organoids with SARS-CoV-2. Single-cell RNA sequencing indicated injury and dedifferentiation of infected cells with activation of profibrotic signaling pathways. Importantly, SARS-CoV-2 infection also led to increased collagen 1 protein expression in organoids. A SARS-CoV-2 protease inhibitor was able to ameliorate the infection of kidney cells by SARS-CoV-2. Our results suggest that SARS-CoV-2 can directly infect kidney cells and induce cell injury with subsequent fibrosis. These data could explain both acute kidney injury in COVID-19 patients and the development of chronic kidney disease in long COVID.

    Structure, mechanism and crystallographic fragment screening of the SARS-CoV-2 NSP13 helicase

    There is currently a lack of effective drugs to treat people infected with SARS-CoV-2, the cause of the global COVID-19 pandemic. The SARS-CoV-2 non-structural protein 13 (NSP13) has been identified as a target for antivirals due to its high sequence conservation and essential role in viral replication. Structural analysis reveals two “druggable” pockets on NSP13 that are among the most conserved sites in the entire SARS-CoV-2 proteome. Here we present crystal structures of SARS-CoV-2 NSP13 solved in the apo form and in the presence of both phosphate and a non-hydrolysable ATP analog. Comparisons of these structures reveal details of conformational changes that provide insights into the helicase mechanism and possible modes of inhibition. To identify starting points for drug development, we have performed a crystallographic fragment screen against NSP13. The screen reveals 65 fragment hits across 52 datasets, opening the way to structure-guided development of novel antiviral agents.

    Crystal Structures of SARS-CoV-2 main protease with screening fragments and COVID Moonshot compounds from the XChem facility at Diamond Light Source

    Bulk repository of structures of SARS-CoV-2 main protease in complex with fragment molecules from the initial XChem screen and designed COVID Moonshot inhibitor compounds. Each structure has a PDB ID, coordinate file, structure factor file, ligand restraint (cif) and PANDDA event maps (as appropriate).
    2023-10-26 - updated to include all initial fragment screening hits alongside follow-up compounds.

    Fragment binding to the Nsp3 macrodomain of SARS-CoV-2 identified through crystallographic screening and computational docking

    The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) macrodomain within the nonstructural protein 3 counteracts host-mediated antiviral adenosine diphosphate–ribosylation signaling. This enzyme is a promising antiviral target because catalytic mutations render viruses nonpathogenic. Here, we report a massive crystallographic screening and computational docking effort, identifying new chemical matter primarily targeting the active site of the macrodomain. Crystallographic screening of 2533 diverse fragments resulted in 214 unique macrodomain binders. An additional 60 molecules were selected from docking more than 20 million fragments, of which 20 were crystallographically confirmed. X-ray data collection to ultra-high resolution and at physiological temperature enabled assessment of the conformational heterogeneity around the active site. Several fragment hits were confirmed by solution binding using three biophysical techniques (differential scanning fluorimetry, homogeneous time-resolved fluorescence, and isothermal titration calorimetry). The 234 fragment structures explore a wide range of chemotypes and provide starting points for development of potent SARS-CoV-2 macrodomain inhibitors.