75 research outputs found

    A method for identifying genetic heterogeneity within phenotypically defined disease subgroups.

    Many common diseases show wide phenotypic variation. We present a statistical method for determining whether phenotypically defined subgroups of disease cases represent different genetic architectures, in which disease-associated variants have different effect sizes in two subgroups. Our method models the genome-wide distributions of genetic association statistics with mixture Gaussians. We apply a global test without requiring explicit identification of disease-associated variants, thus maximizing power in comparison to standard variant-by-variant subgroup analysis. Where evidence for genetic subgrouping is found, we present methods for post hoc identification of the contributing genetic variants. We demonstrate the method on a range of simulated and test data sets, for which expected results are already known. We investigate subgroups of individuals with type 1 diabetes (T1D) defined by autoantibody positivity, establishing evidence for differential genetic architecture with positivity for thyroid-peroxidase-specific antibody, driven generally by variants in known T1D-associated genomic regions.
    We acknowledge the help of the Diabetes and Inflammation Laboratory Data Service for access and quality control procedures on the data sets used in this study. The JDRF/Wellcome Trust Diabetes and Inflammation Laboratory is in receipt of a Wellcome Trust Strategic Award (107212; J.A.T.) and receives funding from the NIHR Cambridge Biomedical Research Centre. J.L. is funded by the NIHR Cambridge Biomedical Research Centre and is on the Wellcome Trust PhD program in Mathematical Genomics and Medicine at the University of Cambridge. C.W. is funded by the MRC (grant MC_UP_1302/5). We thank M. Simmonds, S. Gough, J. Franklyn, and O. Brand for sharing their AITD genetic association data set and all patients with AITD and control subjects for participating in this study. The AITD UK national collection was funded by the Wellcome Trust. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

    Identification of Type 1 Diabetes-Associated DNA Methylation Variable Positions That Precede Disease Diagnosis

    Monozygotic (MZ) twin pair discordance for childhood-onset Type 1 Diabetes (T1D) is approximately 50%, implicating roles for genetic and non-genetic factors in the aetiology of this complex autoimmune disease. Although significant progress has been made in elucidating the genetics of T1D in recent years, the non-genetic component has remained poorly defined. We hypothesized that epigenetic variation could underlie some of the non-genetic component of T1D aetiology and, thus, performed an epigenome-wide association study (EWAS) for this disease. We generated genome-wide DNA methylation profiles of purified CD14+ monocytes (an immune effector cell type relevant to T1D pathogenesis) from 15 T1D-discordant MZ twin pairs. This identified 132 different CpG sites at which the direction of the intra-MZ pair DNA methylation difference significantly correlated with the diabetic state, i.e. T1D-associated methylation variable positions (T1D-MVPs). We confirmed these T1D-MVPs display statistically significant intra-MZ pair DNA methylation differences in the expected direction in an independent set of T1D-discordant MZ pairs (P = 0.035). Then, to establish the temporal origins of the T1D-MVPs, we generated two further genome-wide datasets and established that, when compared with controls, T1D-MVPs are enriched in singletons both before (P = 0.001) and at (P = 0.015) disease diagnosis, and also in singletons positive for diabetes-associated autoantibodies but disease-free even after 12 years follow-up (P = 0.0023). Combined, these results suggest that T1D-MVPs arise very early in the etiological process that leads to overt T1D. Our EWAS of T1D represents an important contribution toward understanding the etiological role of epigenetic variation in type 1 diabetes, and it is also the first systematic analysis of the temporal origins of disease-associated epigenetic variation for any human complex disease.

    The Genetic Interpretation of Area under the ROC Curve in Genomic Profiling

    Genome-wide association studies in human populations have facilitated the creation of genomic profiles which combine the effects of many associated genetic variants to predict risk of disease. The area under the receiver operator characteristic (ROC) curve is a well established measure for determining the efficacy of tests in correctly classifying diseased and non-diseased individuals. We use quantitative genetics theory to provide insight into the genetic interpretation of the area under the ROC curve (AUC) when the test classifier is a predictor of genetic risk. Even when the proportion of genetic variance explained by the test is 100%, there is a maximum value for AUC that depends on the genetic epidemiology of the disease, i.e. either the sibling recurrence risk or heritability and disease prevalence. We derive an equation relating maximum AUC to heritability and disease prevalence. The expression can be reversed to calculate the proportion of genetic variance explained given AUC, disease prevalence, and heritability. We use published estimates of disease prevalence and sibling recurrence risk for 17 complex genetic diseases to calculate the proportion of genetic variance that a test must explain to achieve AUC = 0.75; this varied from 0.10 to 0.74. We provide a genetic interpretation of AUC for use with predictors of genetic risk based on genomic profiles. We provide a strategy to estimate proportion of genetic variance explained on the liability scale from estimates of AUC, disease prevalence, and heritability (or sibling recurrence risk), which is available as an online calculator.
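
    The idea that AUC has a ceiling set by heritability and prevalence can be illustrated with a small simulation. The sketch below is not the equation derived in the paper; it is a Monte Carlo stand-in that assumes a standard liability-threshold model (total liability = genetic + environmental, both normal, disease when liability exceeds a threshold fixed by prevalence). The function name `simulate_max_auc` and its parameters are hypothetical, chosen only for illustration.

```python
import random
from statistics import NormalDist


def simulate_max_auc(h2, prevalence, n=20000, seed=0):
    """Estimate the AUC of a perfect genetic predictor by simulation.

    Assumption (not from the abstract): a liability-threshold model in which
    liability = g + e, g ~ N(0, h2), e ~ N(0, 1 - h2), and an individual is
    affected when liability exceeds the threshold implied by the prevalence.
    The predictor is the true genetic value g, i.e. a test that explains
    100% of the genetic variance.
    """
    if not 0.0 < h2 <= 1.0:
        raise ValueError("h2 must be in (0, 1]")
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be in (0, 1)")

    rng = random.Random(seed)
    threshold = NormalDist().inv_cdf(1.0 - prevalence)
    g_sd = h2 ** 0.5
    e_sd = (1.0 - h2) ** 0.5

    # Each entry is (genetic value, affected?).
    people = []
    for _ in range(n):
        g = rng.gauss(0.0, g_sd)
        e = rng.gauss(0.0, e_sd)
        people.append((g, g + e > threshold))

    n_cases = sum(1 for _, affected in people if affected)
    n_controls = n - n_cases
    if n_cases == 0 or n_controls == 0:
        raise ValueError("sample contains only cases or only controls")

    # AUC from the Mann-Whitney U statistic: rank everyone by genetic value
    # and sum the ranks of the cases (ties are negligible for continuous g).
    people.sort(key=lambda person: person[0])
    rank_sum = sum(rank for rank, (_, affected) in enumerate(people, start=1)
                   if affected)
    u = rank_sum - n_cases * (n_cases + 1) / 2
    return u / (n_cases * n_controls)


if __name__ == "__main__":
    for h2 in (0.2, 0.5, 0.8):
        print(h2, round(simulate_max_auc(h2, prevalence=0.05), 3))
```

    Under these assumptions the estimated AUC rises with heritability but stays below 1 whenever some liability is non-genetic, which is the qualitative point the abstract makes.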

    Genome-Wide Analysis of Copy Number Variation in Type 1 Diabetes

    Type 1 diabetes (T1D) tends to cluster in families, suggesting there may be a genetic component predisposing to disease. However, a recent large-scale genome-wide association study concluded that identified genetic factors, single nucleotide polymorphisms, do not account for overall familiality. Another class of genetic variation is the amplification or deletion of >1 kilobase segments of the genome, also termed copy number variations (CNVs). We performed genome-wide CNV analysis on a cohort of 20 unrelated adults with T1D and a control (Ctrl) cohort of 20 subjects using the Affymetrix SNP Array 6.0 in combination with the Birdsuite copy number calling software. We identified 39 CNVs as enriched or depleted in T1D versus Ctrl. Additionally, we performed CNV analysis in a group of 10 monozygotic twin pairs discordant for T1D. Eleven of these 39 CNVs were also respectively enriched or depleted in the Twin cohort, suggesting that these variants may be involved in the development of islet autoimmunity, as the presently unaffected twin is at high risk for developing islet autoimmunity and T1D in his or her lifetime. These CNVs include a deletion on chromosome 6p21, near an HLA-DQ allele. CNVs were found that were either enriched or depleted in patients with or at high risk for developing T1D. These regions may represent genetic variants contributing to development of islet autoimmunity in T1D.

    Dietary iron intake in the first 4 months of infancy and the development of type 1 diabetes: a pilot study

    Aims: To investigate the impact of iron intake on the development of type 1 diabetes (T1DM). Methods: Case-control study with self-administered questionnaire among families of children with T1DM who were less than 10 years old at the time of the survey and developed diabetes between age 1 and 6 years. Data on the types of infant feeding in the first 4 months of life was collected from parents of children with T1DM (n = 128) and controls (n = 67) <10 years old. Because some cases had sibling controls, we used conditional logistic regression models to analyze the data in two ways. First we performed a case-control analysis of all 128 cases and 67 controls. Next, we performed a case-control analysis restricted to cases (n = 59) that had a sibling without diabetes (n = 59). Total iron intake was modeled as one standard deviation (SD) increase in iron intake. The SD for iron intake was 540 mg in the total sample and 539 mg in the restricted sample as defined above. Results: The median (min, max) total iron intake in the first 4 months of life was 1159 (50, 2399) mg in T1DM cases and 466 (50, 1224) mg among controls (P < 0.001). For each one standard deviation increase in iron intake, the odds ratio (95% confidence interval) for type 1 diabetes was 2.01 (1.183, 3.41) among all participants (128 cases and 67 controls) while it was 2.26 (1.27, 4.03) in a restricted sample of T1D cases with a control sibling (59 cases and 59 controls) in models adjusted for birth weight, age at the time of the survey, and birth order. Conclusion: In this pilot study, high iron intake in the first 4 months of infancy is associated with T1DM. Whether iron intake is causal or a marker of another risk factor warrants further investigation.

    The local and systemic response to SARS-CoV-2 infection in children and adults

    While a substantial proportion of adults infected with SARS-CoV-2 progress to develop severe disease, children rarely manifest respiratory complications. Therefore, understanding differences in the local and systemic response to SARS-CoV-2 infection between children and adults may provide important clues about the pathogenesis of SARS-CoV-2 infection. To address this, we first generated a healthy reference multi-omics single cell data set from children (n=30) in whom we have profiled triple matched samples: nasal and tracheal brushings and PBMCs, where we track the developmental changes for 42 airway and 31 blood cell populations from infancy, through childhood to adolescence. This has revealed the presence of naive B and T lymphocytes in neonates and infants with a unique gene expression signature bearing hallmarks of innate immunity. We then contrast the healthy reference with equivalent data from severe paediatric and adult COVID-19 patients (total n=27), from the same three types of samples: upper and lower airways and blood. We found striking differences: children with COVID-19 as opposed to adults had a higher proportion of innate lymphoid and non-clonally expanded naive T cells in peripheral blood, and a limited interferon-response signature. In the airway epithelium, we found the highest viral load in goblet and ciliated cells and describe a novel inflammatory epithelial cell population. These cells represent a transitional regenerative state between secretory and ciliated cells; they were found in healthy children and were enriched in paediatric and adult COVID-19 patients. Epithelial cells display an antiviral and neutrophil-recruiting gene signature that is weaker in severe paediatric versus adult COVID-19. Our matched blood and airway samples allowed us to study the spatial dynamics of infection. 
    Lastly, we provide a user-friendly interface for this data as a highly granular reference for the study of immune responses in airways and blood in children.

    Local and systemic responses to SARS-CoV-2 infection in children and adults

    It is not fully understood why COVID-19 is typically milder in children [1–3]. To examine differences in response to SARS-CoV-2 infection in children and adults, we analysed paediatric and adult COVID-19 patients and healthy controls (total n=93) using single-cell multi-omic profiling of matched nasal, tracheal, bronchial and blood samples. In healthy paediatric airways, we observed cells already in an interferon-activated state, which was further induced upon SARS-CoV-2 infection, especially in airway immune cells. We postulate that higher paediatric innate interferon-responses restrict viral replication and disease progression. The systemic response in children was characterised by increases in naive lymphocytes and a depletion of natural killer cells, while in adults cytotoxic T cells and interferon-stimulated subpopulations were significantly increased. We provide evidence that dendritic cells initiate interferon signaling in early infection, and identify novel epithelial cell states that associate with COVID-19 and age. Our matching nasal and blood data showed a strong interferon response in the airways with the induction of systemic interferon-stimulated populations, which were massively reduced in paediatric patients. Together, we provide several mechanisms that explain the milder clinical syndrome observed in children.

    From Disease Association to Risk Assessment: An Optimistic View from Genome-Wide Association Studies on Type 1 Diabetes

    Genome-wide association studies (GWAS) have been fruitful in identifying disease susceptibility loci for common and complex diseases. A remaining question is whether we can quantify individual disease risk based on genotype data, in order to facilitate personalized prevention and treatment for complex diseases. Previous studies have typically failed to achieve satisfactory performance, primarily due to the use of only a limited number of confirmed susceptibility loci. Here we propose that sophisticated machine-learning approaches with a large ensemble of markers may improve the performance of disease risk assessment. We applied a Support Vector Machine (SVM) algorithm to a GWAS dataset generated on the Affymetrix genotyping platform for type 1 diabetes (T1D) and optimized a risk assessment model with hundreds of markers. We subsequently tested this model on an independent Illumina-genotyped dataset with imputed genotypes (1,008 cases and 1,000 controls), as well as a separate Affymetrix-genotyped dataset (1,529 cases and 1,458 controls), resulting in area under ROC curve (AUC) of ∼0.84 in both datasets. In contrast, poor performance was achieved when limited to dozens of known susceptibility loci in the SVM model or logistic regression model. Our study suggests that improved disease risk assessment can be achieved by using algorithms that take into account interactions between a large ensemble of markers. We are optimistic that genotype-based disease risk assessment may be feasible for diseases where a notable proportion of the risk has already been captured by SNP arrays.