412 research outputs found

    Capability of common SNPs to tag rare variants

    Genome-wide association studies are based on the linkage disequilibrium pattern between common tagging single-nucleotide polymorphisms (SNPs) (i.e., SNPs having only common alleles) and true causal variants, and association studies with rare SNP alleles aim to detect rare causal variants. To better understand and explain the findings from both types of studies and to provide clues to improve the power of an association study with only common SNPs genotyped, we study the correlation between common SNPs and the presence of rare alleles within a region in the genome and look at the capability of common SNPs in strong linkage disequilibrium with each other to capture single rare alleles. Our results indicate that common SNPs can, to some extent, tag the presence of rare alleles and that including SNPs in strong linkage disequilibrium with each other among the tagging SNPs helps to detect rare alleles
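The tagging idea in this abstract can be illustrated with a small sketch. It assumes genotypes are coded as allele dosages (0, 1, 2) and uses the squared Pearson correlation between dosage vectors as a genotype-level stand-in for haplotype r²; the function name `ld_r2` and the toy data are hypothetical and are not taken from the paper.

```python
def ld_r2(dosages_a, dosages_b):
    """Squared Pearson correlation between two genotype dosage vectors.

    Assumption: each vector holds one value per individual (0, 1 or 2
    copies of an allele, or a 0/1 indicator for carrying a rare allele).
    """
    n = len(dosages_a)
    if n == 0 or n != len(dosages_b):
        raise ValueError("vectors must be non-empty and of equal length")
    mean_a = sum(dosages_a) / n
    mean_b = sum(dosages_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosages_a, dosages_b))
    var_a = sum((a - mean_a) ** 2 for a in dosages_a)
    var_b = sum((b - mean_b) ** 2 for b in dosages_b)
    if var_a == 0 or var_b == 0:
        # A monomorphic marker carries no tagging information.
        return 0.0
    return (cov * cov) / (var_a * var_b)


# Toy data (made up): a common SNP and an indicator for whether each
# individual carries any rare allele in the surrounding region.
common_snp = [0, 1, 2, 0, 1, 2, 0, 0]
rare_carrier = [0, 0, 1, 0, 0, 1, 0, 0]
print(ld_r2(common_snp, rare_carrier))
```

A value near 1 would mean the common SNP tags the presence of rare alleles well; a value near 0 would mean it carries little information about them.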

    Genomic architecture of inflammatory bowel disease in five families with multiple affected individuals.

    Currently, the best clinical predictor for inflammatory bowel disease (IBD) is family history. Over 163 sequence variants have been associated with IBD in genome-wide association studies, but they have weak effects and explain only a fraction of the observed heritability. It is expected that additional variants contribute to the genomic architecture of IBD, possibly including rare variants with effect sizes larger than the identified common variants. Here we applied a family study design and sequenced 38 individuals from five families, under the hypothesis that families with multiple IBD-affected individuals harbor one or more risk variants that (i) are shared among affected family members, (ii) are rare and (iii) have substantial effect on disease development. Our analysis revealed not only novel candidate risk variants but also high polygenic risk scores for common known risk variants in four out of the five families. Functional analysis of our top novel variant in the remaining family, a rare missense mutation in the ubiquitin ligase TRIM11, suggests that it leads to increased nuclear factor of kappa light chain enhancer in B-cells (NF-κB) signaling. We conclude that an accumulation of common weak-effect variants accounts for the high incidence of IBD in most, but not all families we analyzed and that a family study design can identify novel rare variants conferring risk for IBD with potentially large effect size, such as the TRIM11 p.H414Y mutation

    Collapsing-based and kernel-based single-gene analyses applied to Genetic Analysis Workshop 17 mini-exome data

    Recently there has been great interest in identifying rare variants associated with common diseases. We apply several collapsing-based and kernel-based single-gene association tests to Genetic Analysis Workshop 17 (GAW17) rare variant association data with unrelated individuals without knowledge of the simulation model. We also implement modified versions of these methods using additional information, such as minor allele frequency (MAF) and functional annotation. For each of four given traits provided in GAW17, we use the Bayesian mixed-effects model to estimate the phenotypic variance explained by the given environmental and genotypic data and to infer an individual-specific genetic effect to use directly in single-gene association tests. After obtaining information on the GAW17 simulation model, we compare the performance of all methods and examine the top genes identified by those methods. We find that collapsing-based methods with weights based on MAFs are sensitive to the “lower MAF, larger effect size” assumption, whereas kernel-based methods are more robust when this assumption is violated. In addition, many false-positive genes identified by multiple methods often contain variants with exactly the same genotype distribution as the causal variants used in the simulation model. When the sample size is much smaller than the number of rare variants, it is more likely that causal and noncausal variants will share the same or similar genotype distribution. This likely contributes to the low power and large number of false-positive results of all methods in detecting causal variants associated with disease in the GAW17 data set

    Power analysis for genome-wide association studies

    Background: Genome-wide association studies are a promising new tool for deciphering the genetics of complex diseases. To choose the proper sample size and genotyping platform for such studies, power calculations that take into account genetic model, tag SNP selection, and the population of interest are required. Results: The power of genome-wide association studies can be computed using a set of tag SNPs and a large number of genotyped SNPs in a representative population, such as available through the HapMap project. As expected, power increases with increasing sample size and effect size. Power also depends on the tag SNPs selected. In some cases, more power is obtained by genotyping more individuals at fewer SNPs than fewer individuals at more SNPs. Conclusion: Genome-wide association studies should be designed thoughtfully, with the choice of genotyping platform and sample size being determined from careful power calculations.
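How sample size, effect size and tag SNP choice interact can be sketched with a standard 1-degree-of-freedom power calculation. This is not the paper's procedure: it assumes the test statistic is the square of a unit-variance normal variable, that the effect is summarised by a hypothetical `ncp_per_sample` (non-centrality contributed per individual), and that testing a tag SNP instead of the causal variant scales the non-centrality by their r².

```python
from statistics import NormalDist


def power_1df(n_samples, ncp_per_sample, r2=1.0, alpha=5e-8):
    """Power of a two-sided 1-df test at significance level alpha.

    Assumptions (illustrative only):
      - non-centrality = n_samples * ncp_per_sample at the causal variant
      - a tag SNP with squared correlation r2 sees non-centrality * r2
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = (n_samples * ncp_per_sample * r2) ** 0.5
    # P(|Z| > z_crit) where Z ~ Normal(shift, 1)
    upper_tail = 1.0 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail


# Same hypothetical effect, two designs: fewer individuals with a perfect
# tag versus twice as many individuals with a weaker tag (r2 = 0.5).
print(power_1df(n_samples=4000, ncp_per_sample=0.01, r2=1.0))
print(power_1df(n_samples=8000, ncp_per_sample=0.01, r2=0.5))
```

Under these assumptions the two designs have identical power, because halving r² is offset exactly by doubling the sample size. The sketch ignores the fact that alpha itself would depend on the number of SNPs tested.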

    Genetic Evidence Supporting the Association of Protease and Protease Inhibitor Genes with Inflammatory Bowel Disease: A Systematic Review

    As part of the European research consortium IBDase, we addressed the role of proteases and protease inhibitors (P/PIs) in inflammatory bowel disease (IBD), characterized by chronic mucosal inflammation of the gastrointestinal tract, which affects 2.2 million people in Europe and 1.4 million people in North America. We systematically reviewed all published genetic studies on populations of European ancestry (67 studies on Crohn's disease [CD] and 37 studies on ulcerative colitis [UC]) to identify critical genomic regions associated with IBD. We developed a computer algorithm to map the 807 P/PI genes with exact genomic locations listed in the MEROPS database of peptidases onto these critical regions and to rank P/PI genes according to the accumulated evidence for their association with CD and UC. 82 P/PI genes (75 coding for proteases and 7 coding for protease inhibitors) were retained for CD based on the accumulated evidence. The cylindromatosis/turban tumor syndrome gene (CYLD) on chromosome 16 ranked highest, followed by acylaminoacyl-peptidase (APEH), dystroglycan (DAG1), macrophage-stimulating protein (MST1) and ubiquitin-specific peptidase 4 (USP4), all located on chromosome 3. For UC, 18 P/PI genes were retained (14 proteases and 4 protease inhibitors), with a considerably lower amount of accumulated evidence. The ranking of P/PI genes as established in this systematic review is currently used to guide validation studies of candidate P/PI genes, and their functional characterization in interdisciplinary mechanistic studies in vitro and in vivo as part of IBDase. The approach used here overcomes some of the problems encountered when subjectively selecting genes for further evaluation and could be applied to any complex disease and gene family.

    Primary sclerosing cholangitis

    Primary sclerosing cholangitis (PSC) is a chronic cholestatic liver disease of unknown aetiology characterised by inflammation and fibrosis of the biliary tree. The mean age at diagnosis is 40 years and men are affected twice as often as women. There is a reported annual incidence of PSC of 0.9–1.31/100,000 and point prevalence of 8.5–13.6/100,000. The onset of PSC is usually insidious and many patients are asymptomatic at diagnosis or have mild symptoms only such as fatigue, abdominal discomfort and pruritus. In late stages, splenomegaly and jaundice may be a feature. In most, the disease progresses to cirrhosis and liver failure. Cholangiocarcinoma develops in 8–30% of patients. PSC is thought to be immune mediated and is often associated with inflammatory bowel disease, especially ulcerative colitis. The disease is diagnosed on typical cholangiographic and histological findings and after exclusion of secondary sclerosing cholangitis. Median survival has been estimated to be 12 years from diagnosis in symptomatic patients. Patients who are asymptomatic at diagnosis, the majority of whom will develop progressive disease, have a survival rate greater than 70% at 16 years after diagnosis. Liver transplantation remains the only effective therapeutic option for patients with end-stage liver disease from PSC, although high dose ursodeoxycholic acid may have a beneficial effect.

    Pathway analysis comparison using Crohn's disease genome wide association studies

    Background: The use of biological annotation such as genes and pathways in the analysis of gene expression data has aided the identification of genes for follow-up studies and suggested functional information to uncharacterized genes. Several studies have applied similar methods to genome wide association studies and identified a number of disease related pathways. However, many questions remain on how to best approach this problem, such as whether there is a need to obtain a score to summarize association evidence at the gene level, and whether a pathway, dominated by just a few highly significant genes, is of interest. Methods: We evaluated the performance of two pathway-based methods (Random Set, and Binomial approximation to the hypergeometric test) based on their applications to three data sets of Crohn's disease. We consider both the disease status as a phenotype as well as the residuals after conditioning on IL23R, a known Crohn's related gene, as a phenotype. Results: Our results show that the Random Set method has the most power to identify disease related pathways. We confirm previously reported disease related pathways and provide evidence for IL-2 Receptor Beta Chain in T cell Activation and IL-9 signaling as Crohn's disease associated pathways. Conclusions: Our results highlight the need to apply powerful gene score methods prior to pathway enrichment tests, and that controlling for genes that attain genome wide significance enables further biological insight.
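As a rough illustration of what a binomial approximation to the hypergeometric test looks like for pathway enrichment, the sketch below computes both upper-tail probabilities. The abstract does not give the exact formulation used, so the setup is an assumption: `n_total` genes overall, `n_pathway` of them in the pathway, `n_selected` genes called significant, and `k` of those falling in the pathway. All names are hypothetical.

```python
from math import comb


def hypergeom_tail(k, n_total, n_pathway, n_selected):
    """P(X >= k) when n_selected genes are drawn without replacement."""
    upper = min(n_pathway, n_selected)
    favourable = sum(
        comb(n_pathway, i) * comb(n_total - n_pathway, n_selected - i)
        for i in range(max(k, 0), upper + 1)
    )
    return favourable / comb(n_total, n_selected)


def binom_tail(k, n_total, n_pathway, n_selected):
    """P(X >= k) treating each selected gene as an independent draw
    that lands in the pathway with probability n_pathway / n_total."""
    p = n_pathway / n_total
    return sum(
        comb(n_selected, i) * p**i * (1 - p) ** (n_selected - i)
        for i in range(max(k, 0), n_selected + 1)
    )


# Made-up numbers: 20,000 genes, a 200-gene pathway, 100 significant
# genes, 3 of which fall in the pathway.
print(hypergeom_tail(3, 20000, 200, 100))
print(binom_tail(3, 20000, 200, 100))
```

The two tails agree closely when the number of selected genes is small relative to the gene universe, which is the regime in which the binomial form is a reasonable shortcut.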

    On Quality Control Measures in Genome-Wide Association Studies: A Test to Assess the Genotyping Quality of Individual Probands in Family-Based Association Studies and an Application to the HapMap Data

    Allele transmissions in pedigrees provide a natural way of evaluating the genotyping quality of a particular proband in a family-based, genome-wide association study. We propose a transmission test that is based on this feature and that can be used for quality control filtering of genome-wide genotype data for individual probands. The test has one degree of freedom and assesses the average genotyping error rate of the genotyped SNPs for a particular proband. As we show in simulation studies, the test is sufficiently powerful to identify probands with an unreliable genotyping quality that cannot be detected with standard quality control filters. This feature of the test is further exemplified by an application to the third release of the HapMap data. The test is ideally suited as the final layer of quality control filters in the cleaning process of genome-wide association studies. It identifies probands with insufficient genotyping quality that were not removed by standard quality control filtering

    Functional variants in the LRRK2 gene confer shared effects on risk for Crohn's disease and Parkinson's disease

    Crohn’s disease (CD), a form of inflammatory bowel disease, has a higher prevalence in Ashkenazi Jewish than in non-Jewish European populations. To define the role of nonsynonymous mutations, we performed exome sequencing of Ashkenazi Jewish patients with CD, followed by array-based genotyping and association analysis in 2066 CD cases and 3633 healthy controls. We detected association signals in the LRRK2 gene that conferred risk for CD (N2081D variant, P = 9.5 × 10^−10) or protection from CD (N551K variant, tagging R1398H-associated haplotype, P = 3.3 × 10^−8). These variants affected CD age of onset, disease location, LRRK2 activity, and autophagy. Bayesian network analysis of CD patient intestinal tissue further implicated LRRK2 in CD pathogenesis. Analysis of the extended LRRK2 locus in 24,570 CD cases, patients with Parkinson’s disease (PD), and healthy controls revealed extensive pleiotropy, with shared genetic effects between CD and PD in both Ashkenazi Jewish and non-Jewish cohorts. The LRRK2 N2081D CD risk allele is located in the same kinase domain as G2019S, a mutation that is the major genetic cause of familial and sporadic PD. Like the G2019S mutation, the N2081D variant was associated with increased kinase activity, whereas neither N551K nor R1398H variants on the protective haplotype altered kinase activity. We also confirmed that R1398H, but not N551K, increased guanosine triphosphate binding and hydrolyzing enzyme (GTPase) activity, thereby deactivating LRRK2. The presence of shared LRRK2 alleles in CD and PD provides refined insight into disease mechanisms and may have major implications for the treatment of these two seemingly unrelated diseases.