543 research outputs found

    Capability of common SNPs to tag rare variants

    Genome-wide association studies are based on the linkage disequilibrium pattern between common tagging single-nucleotide polymorphisms (SNPs) (i.e., SNPs having only common alleles) and true causal variants, whereas association studies with rare SNP alleles aim to detect rare causal variants. To better understand and explain the findings from both types of studies, and to provide clues for improving the power of an association study in which only common SNPs are genotyped, we study the correlation between common SNPs and the presence of rare alleles within a region of the genome, and we examine the capability of common SNPs in strong linkage disequilibrium with each other to capture single rare alleles. Our results indicate that common SNPs can, to some extent, tag the presence of rare alleles and that including SNPs in strong linkage disequilibrium with each other among the tagging SNPs helps to detect rare alleles.
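
The correlation described in this abstract can be illustrated with a minimal sketch. It assumes phased haplotypes coded 0/1 (1 = minor allele) and measures tagging as the squared Pearson correlation between one common SNP and an indicator of carrying any rare allele in the region. The helper names, the toy haplotypes, and the choice of r² as the measure are illustrative assumptions, not the authors' procedure.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length 0/1 sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    if var_x == 0 or var_y == 0:
        return 0.0  # a monomorphic site carries no tagging information
    return cov * cov / (var_x * var_y)


def rare_allele_presence(haplotypes, rare_sites):
    """1 if a haplotype carries a rare allele at any of the given sites, else 0."""
    return [int(any(h[s] for s in rare_sites)) for h in haplotypes]


# Toy haplotypes (rows) over four sites (columns); 1 = minor allele.
# Hypothetical setup: site 0 is the common tag SNP, sites 2 and 3 are rare.
haplotypes = [
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
tag = [h[0] for h in haplotypes]
presence = rare_allele_presence(haplotypes, rare_sites=[2, 3])
print(round(r_squared(tag, presence), 3))
```

In this toy data both rare alleles sit on haplotypes that carry the minor allele of the tag SNP, so the tag is partially but not perfectly correlated with rare-allele presence.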

    SNPInterForest: A new method for detecting epistatic interactions

    Background: Multiple genetic factors and their interactive effects are speculated to contribute to complex diseases. Detecting such genetic interactive effects, i.e., epistatic interactions, however, remains a significant challenge in large-scale association studies. Results: We have developed a new method, named SNPInterForest, for identifying epistatic interactions by extending an ensemble learning technique called random forest. Random forest is a predictive method that has been proposed for discovering the single-nucleotide polymorphisms (SNPs) most predictive of disease status in association studies. However, it is less sensitive to SNPs with little marginal effect. Furthermore, it does not natively expose information on interaction patterns of susceptibility SNPs. We extended the random forest framework to overcome these limitations by (i) modifying the construction of the random forest and (ii) implementing a procedure for extracting interaction patterns from the constructed random forest. The performance of the proposed method was evaluated on simulated data under a wide spectrum of disease models. SNPInterForest identified pure epistatic interactions with high precision and was still capable of concurrently identifying multiple interactions in the presence of genetic heterogeneity. It was also applied to real GWAS data on rheumatoid arthritis from the Wellcome Trust Case Control Consortium (WTCCC), and novel potential interactions were reported. Conclusions: SNPInterForest, offering an efficient means to detect epistatic interactions without statistical analyses, is promising for practical use as a way to reveal the epistatic interactions involved in common complex diseases.

    Linkage Disequilibrium in Wild Mice

    Crosses between laboratory strains of mice provide a powerful way of detecting quantitative trait loci for complex traits related to human disease. Hundreds of these loci have been detected, but only a small number of the underlying causative genes have been identified. The main difficulty is the extensive linkage disequilibrium (LD) in intercross progeny and the slow process of fine-scale mapping by traditional methods. Recently, new approaches have been introduced, such as association studies with inbred lines and multigenerational crosses. These approaches are very useful for interval reduction, but generally do not provide single-gene resolution because of strong LD extending over one to several megabases. Here, we investigate the genetic structure of a natural population of mice in Arizona to determine its suitability for fine-scale LD mapping and association studies. There are three main findings: (1) Arizona mice have a high level of genetic variation, which includes a large fraction of the sequence variation present in classical strains of laboratory mice; (2) they show clear evidence of local inbreeding but appear to lack stable population structure across the study area; and (3) LD decays with distance at a rate similar to that in human populations, which is considerably more rapid than in laboratory populations of mice. Strong associations in Arizona mice are limited primarily to markers less than 100 kb apart, which provides the possibility of fine-scale association mapping at the level of one or a few genes. Although other considerations, such as sample size requirements and marker discovery, are serious issues in the implementation of association studies, the genetic variation and LD results indicate that wild mice could provide a useful tool for identifying genes that cause variation in complex traits.

    Tools for efficient epistasis detection in genome-wide association study

    Background: Genome-wide association studies (GWAS) aim to find genetic factors underlying complex phenotypic traits, for which detection of epistasis, or gene-gene interaction, is often preferred over a single-locus approach. However, the computational burden has been a major hurdle to applying epistasis tests at the genome-wide scale, owing to the large number of single nucleotide polymorphism (SNP) pairs to be tested. Results: We have developed a set of three efficient programs, FastANOVA, COE and TEAM, that support epistasis tests in a variety of problem settings in GWAS. These programs use permutation tests to properly control error rates such as the family-wise error rate (FWER) and the false discovery rate (FDR). They are guaranteed to find the optimal solutions and significantly speed up the process of epistasis detection in GWAS. Conclusions: A web server with a user interface and the source code are available at http://www.csbio.unc.edu/epistasis/. The source code is also available at SourceForge: http://sourceforge.net/projects/epistasis/.
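
As a rough illustration of permutation-based FWER control over SNP pairs, the following sketch uses the generic max-statistic permutation approach: phenotype labels are shuffled, the largest pairwise statistic is recorded for each shuffle, and an upper quantile of those maxima serves as the threshold. It is a brute-force toy and does not reproduce the algorithms of FastANOVA, COE or TEAM. The interaction score, function names, and simulated data are all assumptions made for illustration.

```python
import random
from itertools import combinations


def pair_statistic(genotypes, phenotype, i, j):
    """Toy interaction score: |mean product in cases - mean product in controls|."""
    cases = [g[i] * g[j] for g, y in zip(genotypes, phenotype) if y == 1]
    controls = [g[i] * g[j] for g, y in zip(genotypes, phenotype) if y == 0]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def max_statistic(genotypes, phenotype):
    """Largest toy score over all SNP pairs."""
    n_snps = len(genotypes[0])
    return max(pair_statistic(genotypes, phenotype, i, j)
               for i, j in combinations(range(n_snps), 2))


def fwer_threshold(genotypes, phenotype, n_perm=200, alpha=0.05, seed=0):
    """Upper-alpha quantile of the permutation distribution of the maximum."""
    rng = random.Random(seed)
    shuffled = list(phenotype)
    null_max = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype association
        null_max.append(max_statistic(genotypes, shuffled))
    null_max.sort()
    index = min(n_perm - 1, int((1 - alpha) * n_perm))
    return null_max[index]


# Simulated data with no real signal: 40 individuals, 5 SNPs coded 0/1/2.
rng = random.Random(1)
genotypes = [[rng.randint(0, 2) for _ in range(5)] for _ in range(40)]
phenotype = [1] * 20 + [0] * 20
threshold = fwer_threshold(genotypes, phenotype)
observed = max_statistic(genotypes, phenotype)
print(f"observed max = {observed:.3f}, FWER threshold = {threshold:.3f}")
```

A pair would be declared significant only if its score exceeded the threshold. Evaluating every pair in every permutation is exactly the cost that becomes prohibitive at genome-wide scale, which is the burden the abstract refers to.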

    Evaluation of association tests for rare variants using simulated data sets in the Genetic Analysis Workshop 17 data

    We evaluate four association tests for rare variants, namely the combined multivariate and collapsing (CMC) method, two weighted-sum methods, and a variable threshold method, by applying them to the simulated data sets of unrelated individuals in the Genetic Analysis Workshop 17 (GAW17) data. The family-wise error rate (FWER) and average power are used as criteria for evaluation. Our results show that when all nonsynonymous SNPs (rare variants and common variants) in a gene are jointly analyzed, the CMC method fails to control the FWER; when only rare variants (single-nucleotide polymorphisms with minor allele frequency less than 0.05) are analyzed, all four methods control the FWER well. All four methods have comparable power, which is low for the analysis of the GAW17 data sets. Three of the methods (not including the CMC method) involve estimation of p-values using permutation procedures that either can be computationally intensive or generate inflated FWERs. We adapt a fast permutation procedure into these three methods. The results show that using the fast permutation procedure can produce FWERs and average powers close to the values obtained from the standard permutation procedure on the GAW17 data sets. The standard permutation procedure is computationally intensive.
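
The rare-variant definition used here (minor allele frequency below 0.05) and the idea of collapsing rare variants within a gene can be sketched as follows. This shows only a generic collapsing step, not the full CMC test or the other evaluated methods. It assumes genotypes are coded as the number of minor-allele copies (0, 1 or 2); the function names and toy data are hypothetical.

```python
def minor_allele_frequency(genotype_column):
    """Allele frequency, assuming genotypes count minor-allele copies (0/1/2)."""
    return sum(genotype_column) / (2 * len(genotype_column))


def collapse_rare(genotypes, maf_threshold=0.05):
    """Return the rare variant indices and a per-individual carrier indicator."""
    n_variants = len(genotypes[0])
    columns = [[row[v] for row in genotypes] for v in range(n_variants)]
    rare = [v for v, col in enumerate(columns)
            if 0 < minor_allele_frequency(col) < maf_threshold]
    # An individual is a carrier if any rare variant has at least one copy.
    indicator = [int(any(row[v] > 0 for v in rare)) for row in genotypes]
    return rare, indicator


# Toy gene with 20 individuals and 3 variants:
# variant 0 is common; variants 1 and 2 are each seen once.
genotypes = [[i % 2, int(i == 3), int(i == 7)] for i in range(20)]
rare, indicator = collapse_rare(genotypes)
print(rare, sum(indicator))
```

The resulting carrier indicator is the kind of single collapsed variable that would then be tested for association with the phenotype.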

    Rare copy number variants: a point of rarity in genetic risk for bipolar disorder and schizophrenia

    Context: Recent studies suggest that copy number variation in the human genome is extensive and may play an important role in susceptibility to disease, including neuropsychiatric disorders such as schizophrenia and autism. The possible involvement of copy number variants (CNVs) in bipolar disorder has received little attention to date. Objectives: To determine whether large (>100 000 base pairs) and rare (found in <1% of the population) CNVs are associated with susceptibility to bipolar disorder and to compare the results with findings in schizophrenia. Design: A genome-wide survey of large, rare CNVs in a case-control sample using a high-density microarray. Setting: The Wellcome Trust Case Control Consortium. Participants: There were 1697 cases of bipolar disorder and 2806 nonpsychiatric controls. All participants were white UK residents. Main Outcome Measures: Overall load of CNVs and presence of rare CNVs. Results: The burden of CNVs in bipolar disorder was not increased compared with controls and was significantly less than in schizophrenia cases. The CNVs previously implicated in the etiology of schizophrenia were not more common in cases with bipolar disorder. Conclusions: Schizophrenia and bipolar disorder differ with respect to CNV burden in general and association with specific CNVs in particular. Our data are consistent with the possibility that possession of large, rare deletions may modify the phenotype in those at risk of psychosis: those possessing such events are more likely to be diagnosed as having schizophrenia, and those without them are more likely to be diagnosed as having bipolar disorder.

    MegaSNPHunter: a learning approach to detect disease predisposition SNPs and high level interactions in genome wide association study

    Background: The interactions of multiple single nucleotide polymorphisms (SNPs) are widely hypothesized to affect an individual's susceptibility to complex diseases. Although much work has been done to identify and quantify the importance of multi-SNP interactions, few methods can handle genome-wide data because of the combinatorially explosive search space and the difficulty of statistically evaluating high-order interactions given limited samples. Results: Three comparative experiments are designed to evaluate the performance of MegaSNPHunter. The first experiment uses synthetic data generated on the basis of epistasis models. The second uses a genome-wide study of Parkinson disease (data acquired using Illumina HumanHap300 SNP chips). The third uses the rheumatoid arthritis study from the Wellcome Trust Case Control Consortium (WTCCC), genotyped with the Affymetrix GeneChip 500K Mapping Array Set. MegaSNPHunter outperforms the best solution in this area and reports many potential interactions for the two real studies. Conclusion: The experimental results on both synthetic data and two real data sets demonstrate that our proposed approach outperforms the best currently available solution for handling large-scale SNP data, both in terms of speed and in terms of detecting potential interactions that were not identified before. To our knowledge, MegaSNPHunter is the first approach that is capable of identifying disease-associated SNP interactions from WTCCC studies, and it is promising for practical disease prognosis.

    Heritability estimates of the Big Five personality traits based on common genetic variants

    According to twin studies, the Big Five personality traits have substantial heritable components explaining 40–60% of the variance, but identification of associated genetic variants has remained elusive. Consequently, knowledge regarding the molecular genetic architecture of personality and to what extent it is shared across the different personality traits is limited. Using genomic-relatedness-matrix residual maximum likelihood analysis (GREML), we here estimated the heritability of the Big Five personality factors (extraversion, agreeableness, conscientiousness, neuroticism and openness to experience) in a sample of 5011 European adults from 527 469 single-nucleotide polymorphisms across the genome. We tested for the heritability of each personality trait, as well as for the genetic overlap between the personality factors. We found significant and substantial heritability estimates for neuroticism (15%, s.e.=0.08, P=0.04) and openness (21%, s.e.=0.08, P<0.01), but not for extraversion, agreeableness and conscientiousness. The bivariate analyses showed that the variance explained by common variants entirely overlapped between neuroticism and openness (rG=1.00, P<0.001), despite low phenotypic correlation (r=−0.09, P<0.001), suggesting that the remaining unique heritability may be determined by rare or structural variants. As far as we are aware, this is the first study estimating the shared and unique heritability of all Big Five personality traits using the GREML approach. Findings should be considered exploratory and suggest that detectable heritability based on common variants is shared between neuroticism and openness to experience.

    A genome-wide association study of sleep habits and insomnia

    Several aspects of sleep behavior such as timing, duration and quality have been demonstrated to be heritable. To identify common variants that influence sleep traits in the population, we conducted a genome-wide association study of six sleep phenotypes assessed by questionnaire in a sample of 2,323 individuals from the Australian Twin Registry. Genotyping was performed on the Illumina 317, 370, and 610K arrays and the SNPs in common between platforms were used to impute non-genotyped SNPs. We tested for association with more than 2,000,000 common polymorphisms across the genome. While no SNPs reached the genome-wide significance threshold, we identified a number of associations in plausible candidate genes. Most notably, a group of SNPs in the third intron of the CACNA1C gene ranked as most significant in the analysis of sleep latency (P=1.3×10⁻⁶). We attempted to replicate this association in an independent sample from the Chronogen Consortium (n=2,034), but found no evidence of association (P=0.73). We have identified several other suggestive associations that await replication in an independent sample. We did not replicate the results from previous genome-wide analyses of self-reported sleep phenotypes after correction for multiple testing.

    Custom CGH array profiling of copy number variations (CNVs) on chromosome 6p21.32 (HLA locus) in patients with venous malformations associated with multiple sclerosis

    Background: Multiple sclerosis (MS) is a complex disorder thought to result from an interaction between environmental and genetic predisposing factors which have not yet been characterised, although it is known to be associated with the HLA region on 6p21.32. Recently, a picture of chronic cerebrospinal venous insufficiency (CCSVI), consequent to stenosing venous malformation of the main extra-cranial outflow routes (VM), has been described in patients affected with MS, introducing an additional phenotype with possible pathogenic significance. Methods: In order to explore the presence of copy number variations (CNVs) within the HLA locus, a custom CGH array was designed to cover 7 Mb of the HLA locus region (6,899,999 bp; chr6:29,900,001-36,800,000). Genomic DNA of the 15 patients with CCSVI/VM and MS was hybridised in duplicate. Results: In total, 322 CNVs, of which 225 were extragenic and 97 intragenic, were identified in 15 patients. 234 known polymorphic CNVs were detected, the majority of these being situated in non-coding or extragenic regions. The overall number of CNVs (both extra- and intragenic) showed a robust and significant correlation with the number of stenosing VMs (Spearman: r = 0.6590, p = 0.0104; linear regression analysis r = 0.6577, p = 0.0106). The region we analysed contains 211 known genes. By using pathway analysis focused on angiogenesis and venous development, MS, and immunity, we tentatively highlight several genes as possible susceptibility factor candidates involved in this peculiar phenotype. Conclusions: The CNVs contained in the HLA locus region in patients with the novel phenotype of CCSVI/VM and MS were mapped in detail, demonstrating a significant correlation between the number of known CNVs found in the HLA region and the number of CCSVI-VMs identified in patients. Pathway analysis revealed common routes of interaction of several of the genes involved in angiogenesis and immunity contained within this region. Despite the small sample size, this pilot study suggests that the number of multiple polymorphic CNVs in the HLA locus deserves further study, owing to their possible involvement in susceptibility to this novel MS/VM plus phenotype, and perhaps even other types of the disease.
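
The figures quoted in the Methods and Results can be checked with a few lines of arithmetic. The sketch assumes the interval chr6:29,900,001-36,800,000 uses 1-based coordinates; whether both endpoints are included is also an assumption, so both conventions are shown.

```python
# Array target interval as given in the abstract (assumed 1-based coordinates).
start, end = 29_900_001, 36_800_000

span_difference = end - start       # plain coordinate difference
span_inclusive = end - start + 1    # base count if both endpoints are included

# CNV counts reported in the Results.
extragenic, intragenic = 225, 97
total_cnvs = extragenic + intragenic

print(span_difference, span_inclusive, total_cnvs)
```

The quoted 6,899,999 bp matches the coordinate difference, while counting both endpoints gives 6,900,000 bp (6.9 Mb), which is consistent with the rounded "7 Mb". The extragenic and intragenic counts sum to the reported total of 322.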