13 research outputs found

    LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysis

    Motivation: LD score regression is a reliable and efficient method of using genome-wide association study (GWAS) summary-level results data to estimate the SNP heritability of complex traits and diseases, partition this heritability into functional categories, and estimate the genetic correlation between different phenotypes. Because the method relies on summary-level results data, LD score regression is computationally tractable even for very large sample sizes. However, publicly available GWAS summary-level data are typically stored in different databases and have different formats, making it difficult to apply LD score regression to estimate genetic correlations across many different traits simultaneously. Results: In this manuscript, we describe LD Hub - a centralized database of summary-level GWAS results for 173 diseases/traits from different publicly available resources/consortia and a web interface that automates the LD score regression analysis pipeline. To demonstrate functionality and validate our software, we replicated previously reported LD score regression analyses of 49 traits/diseases using LD Hub; and estimated SNP heritability and the genetic correlation across the different phenotypes. We also present new results obtained by uploading a recent atopic dermatitis GWAS meta-analysis to examine the genetic correlation between the condition and other potentially related traits. In response to the growing availability of publicly accessible GWAS summary-level results data, our database and the accompanying web interface will ensure maximal uptake of the LD score regression methodology, provide a useful database for the public dissemination of GWAS results, and provide a method for easily screening hundreds of traits for overlapping genetic aetiologies. Availability and implementation: The web interface and instructions for using LD Hub are available at http://ldsc.broadinstitute.org/
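
    As a rough illustration of the regression this abstract refers to, the sketch below assumes the commonly stated single-trait model, in which the expected chi-square statistic of SNP j is N * h2 * l_j / M + intercept (N samples, M SNPs, LD score l_j). It is not the LD Hub pipeline: it uses plain unweighted least squares on simulated data, without the regression weights and standard-error estimation used in practice, and the function name ldsc_h2 and all parameter values are hypothetical.

```python
import numpy as np


def ldsc_h2(chi2, ld_scores, n_samples, n_snps):
    """Estimate SNP heritability by unweighted OLS of chi-square on LD score.

    Assumes E[chi2_j] = n_samples * h2 * ld_j / n_snps + intercept.
    Returns (h2, intercept).
    """
    # np.polyfit returns coefficients from highest degree to lowest.
    slope, intercept = np.polyfit(ld_scores, chi2, 1)
    h2 = slope * n_snps / n_samples
    return h2, intercept


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_snps, n_samples, true_h2 = 50_000, 100_000, 0.4

    # Simulated LD scores and chi-square statistics (illustrative values only).
    ld = rng.gamma(shape=2.0, scale=50.0, size=n_snps)
    expected = n_samples * true_h2 * ld / n_snps + 1.0
    chi2 = expected * rng.chisquare(1, size=n_snps)

    h2_hat, intercept_hat = ldsc_h2(chi2, ld, n_samples, n_snps)
    print(f"estimated h2: {h2_hat:.3f}, intercept: {intercept_hat:.3f}")
```

    Because only per-SNP summary statistics and LD scores enter the fit, the cost scales with the number of SNPs rather than the number of individuals, which is the tractability point made in the abstract.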

    Importance of Genetic Studies in Consanguineous Populations for the Characterization of Novel Human Gene Functions.

    Consanguineous offspring have elevated levels of homozygosity. Autozygous stretches within their genome are likely to harbour loss of function (LoF) mutations which will lead to complete inactivation or dysfunction of genes. Studying consanguineous offspring with clinical phenotypes has been very useful for identifying disease causal mutations. However, at present, most of the genes in the human genome have no disorder associated with them or have unknown function. This is presumably mostly because homozygous LoF variants are not observed in outbred populations, which are the main focus of large sequencing projects. However, another reason may be that many genes in the genome, even when completely "knocked out," do not cause a distinct or defined phenotype. Here, we discuss the benefits and implications of studying consanguineous populations, as opposed to the traditional approach of analysing a subset of consanguineous families or individuals with disease. We suggest that studying consanguineous populations "as a whole" can speed up the characterisation of novel gene functions as well as indicating nonessential genes and/or regions in the human genome. We also suggest designing a single nucleotide variant (SNV) array to make the process more efficient.

    Nonsense Mutation in Coiled-Coil Domain Containing 151 Gene (CCDC151) Causes Primary Ciliary Dyskinesia

    Primary ciliary dyskinesia (PCD) is an autosomal-recessive disorder characterized by impaired ciliary function that leads to subsequent clinical phenotypes such as chronic sinopulmonary disease. PCD is also a genetically heterogeneous disorder with many single gene mutations leading to similar clinical phenotypes. Here, we present a novel PCD causal gene, coiled-coil domain containing 151 (CCDC151), which has been shown to be essential in motile cilia of many animals and other vertebrates, but whose effects in humans had not been observed until now. We observed a novel nonsense mutation in a homozygous state in the CCDC151 gene (NM_145045.4:c.925G>T:p.[E309*]) in a clinically diagnosed PCD patient from a consanguineous family of Arabic ancestry. The variant was absent in 238 randomly selected individuals, indicating that the variant is rare and unlikely to be a founder mutation. Our finding also shows that, given prior knowledge from model organisms, even a single whole-exome sequence can be sufficient to discover a novel causal gene.

    HAPRAP: a haplotype-based iterative method for statistical fine mapping using GWAS summary statistics.

    MOTIVATION: Fine mapping is a widely used approach for identifying the causal variant(s) at disease-associated loci. Standard methods (e.g. multiple regression) require individual level genotypes. Recent fine mapping methods using summary-level data require the pairwise correlation coefficients (r²) of the variants. However, haplotypes, rather than pairwise r², are the true biological representation of linkage disequilibrium (LD) among multiple loci. In this paper, we present an empirical iterative method, HAPlotype Regional Association analysis Program (HAPRAP), that enables fine mapping using summary statistics and haplotype information from an individual-level reference panel. RESULTS: Simulations with individual-level genotypes show that the results of HAPRAP and multiple regression are highly consistent. In simulation with summary-level data, we demonstrate that HAPRAP is less sensitive to poor LD estimates. In a parametric simulation using Genetic Investigation of ANthropometric Traits (GIANT) height data, HAPRAP performs well with a small training sample size (N<2000) while other methods become suboptimal. Moreover, HAPRAP's performance is not affected substantially by SNPs with low minor allele frequencies. We applied the method to existing quantitative trait and binary outcome meta-analyses (human height, QTc interval and gallbladder disease); all previously reported association signals were replicated and two additional variants were independently associated with human height. Due to the growing availability of summary level data, the value of HAPRAP is likely to increase markedly for future analyses (e.g. functional prediction and identification of instruments for Mendelian randomization). AVAILABILITY: The HAPRAP package and documentation are available online: http://apps.biocompute.org.uk/haprap
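
    To make the distinction between haplotypes and pairwise r² concrete, the sketch below computes the standard pairwise r² between two biallelic sites from a small set of phased haplotypes. It is not the HAPRAP algorithm; it only shows the pairwise summary that the abstract contrasts with full haplotype information. The function name pairwise_r2 and the toy haplotypes are hypothetical.

```python
def pairwise_r2(haplotypes, i, j):
    """Return r^2 between sites i and j from phased 0/1 haplotypes.

    Uses D = p_ij - p_i * p_j and r^2 = D^2 / (p_i(1-p_i) p_j(1-p_j)).
    """
    n = len(haplotypes)
    p_i = sum(h[i] for h in haplotypes) / n          # allele frequency at site i
    p_j = sum(h[j] for h in haplotypes) / n          # allele frequency at site j
    p_ij = sum(h[i] * h[j] for h in haplotypes) / n  # frequency of the 1-1 haplotype
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    if denom == 0:
        raise ValueError("r^2 is undefined when a site is monomorphic")
    d = p_ij - p_i * p_j
    return d * d / denom


if __name__ == "__main__":
    # Toy reference panel: each tuple is one haplotype over three sites.
    panel = [(0, 0, 1), (0, 0, 0), (1, 1, 0), (1, 1, 1)]
    print(pairwise_r2(panel, 0, 1))
    print(pairwise_r2(panel, 0, 2))
```

    Each r² value summarises only two sites at a time, whereas the haplotype tuples retain the joint configuration across all sites.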

    LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysis.

    MOTIVATION: LD score regression is a reliable and efficient method of using genome-wide association study (GWAS) summary-level results data to estimate the SNP heritability of complex traits and diseases, partition this heritability into functional categories, and estimate the genetic correlation between different phenotypes. Because the method relies on summary level results data, LD score regression is computationally tractable even for very large sample sizes. However, publicly available GWAS summary-level data are typically stored in different databases and have different formats, making it difficult to apply LD score regression to estimate genetic correlations across many different traits simultaneously. RESULTS: In this manuscript, we describe LD Hub - a centralized database of summary-level GWAS results for 173 diseases/traits from different publicly available resources/consortia and a web interface that automates the LD score regression analysis pipeline. To demonstrate functionality and validate our software, we replicated previously reported LD score regression analyses of 49 traits/diseases using LD Hub; and estimated SNP heritability and the genetic correlation across the different phenotypes. We also present new results obtained by uploading a recent atopic dermatitis GWAS meta-analysis to examine the genetic correlation between the condition and other potentially related traits. In response to the growing availability of publicly accessible GWAS summary-level results data, our database and the accompanying web interface will ensure maximal uptake of the LD score regression methodology, provide a useful database for the public dissemination of GWAS results, and provide a method for easily screening hundreds of traits for overlapping genetic aetiologies. 
    AVAILABILITY AND IMPLEMENTATION: The web interface and instructions for using LD Hub are available at http://ldsc.broadinstitute.org/ SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

    Mendelian randomization of blood lipids for coronary heart disease

    AIMS: To investigate the causal role of high-density lipoprotein cholesterol (HDL-C) and triglycerides in coronary heart disease (CHD) using multiple instrumental variables for Mendelian randomization. METHODS AND RESULTS: We developed weighted allele scores based on single nucleotide polymorphisms (SNPs) with established associations with HDL-C, triglycerides, and low-density lipoprotein cholesterol (LDL-C). For each trait, we constructed two scores. The first was unrestricted, including all independent SNPs associated with the lipid trait identified from a prior meta-analysis (threshold P < 2 × 10⁻⁶); and the second a restricted score, filtered to remove any SNPs also associated with either of the other two lipid traits at P ≤ 0.01. Mendelian randomization meta-analyses were conducted in 17 studies including 62,199 participants and 12,099 CHD events. Both the unrestricted and restricted allele scores for LDL-C (42 and 19 SNPs, respectively) were associated with CHD. For HDL-C, the unrestricted allele score (48 SNPs) was associated with CHD (OR: 0.53; 95% CI: 0.40, 0.70), per 1 mmol/L higher HDL-C, but neither the restricted allele score (19 SNPs; OR: 0.91; 95% CI: 0.42, 1.98) nor the unrestricted HDL-C allele score adjusted for triglycerides, LDL-C, or statin use (OR: 0.81; 95% CI: 0.44, 1.46) showed a robust association. For triglycerides, the unrestricted allele score (67 SNPs) and the restricted allele score (27 SNPs) were both associated with CHD (OR: 1.62; 95% CI: 1.24, 2.11 and 1.61; 95% CI: 1.00, 2.59, respectively) per 1-log unit increment. However, the unrestricted triglyceride score adjusted for HDL-C, LDL-C, and statin use gave an OR for CHD of 1.01 (95% CI: 0.59, 1.75). CONCLUSION: The genetic findings support a causal effect of triglycerides on CHD risk, but a causal role for HDL-C, though possible, remains less certain.
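
    The two ingredients named in this abstract, a weighted allele score and an odds ratio per unit of exposure, can be sketched as follows. This is a minimal illustration using the textbook ratio (Wald) estimator, which may differ from the estimator and meta-analysis actually used in the study. The function names, weights, and effect sizes are hypothetical and are not taken from the paper.

```python
import math


def weighted_allele_score(allele_counts, weights):
    """Sum of risk-allele counts (0, 1 or 2 per SNP) times per-SNP weights."""
    if len(allele_counts) != len(weights):
        raise ValueError("allele_counts and weights must have the same length")
    return sum(count * weight for count, weight in zip(allele_counts, weights))


def wald_ratio_or(beta_score_outcome, beta_score_exposure):
    """Odds ratio for the outcome per one-unit higher exposure.

    beta_score_outcome: log odds ratio of the outcome per unit of allele score.
    beta_score_exposure: change in exposure per unit of allele score.
    """
    if beta_score_exposure == 0:
        raise ValueError("the score must be associated with the exposure")
    return math.exp(beta_score_outcome / beta_score_exposure)


if __name__ == "__main__":
    # Hypothetical individual: three SNPs with illustrative weights.
    score = weighted_allele_score([0, 1, 2], [0.1, 0.2, 0.3])
    print(f"allele score: {score:.2f}")

    # Hypothetical score-outcome and score-exposure associations.
    print(f"OR per unit exposure: {wald_ratio_or(-0.05, 0.1):.3f}")
```

    A restricted score in the sense of the abstract would use the same functions with the pleiotropic SNPs dropped from both input lists.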