87 research outputs found

    Tools for the identification of variable and potentially variable tandem repeats

    BACKGROUND: Tandem repeat arrays showing variation between sequences within a population, between strains or across species may have functional effects. The increasing availability of genomic sequence data makes routine description of observed variation possible, creating a need for tools to describe such variability. RESULTS: We present a set of programs that facilitate the identification of tandem repeats showing variation across multiple sequences or genomes, and the prediction of potentially polymorphic tandem repeats. The VNTRfinder (Variable Number of Tandem Repeats finder) program enables the detection of sequence length variation between arrays of inter-specific or intra-specific tandem repeats. In the absence of comparable sequences to explore observed variation, predictions are provided describing which tandem repeats are more likely to be variable, to help guide and focus further experimental evaluation. CONCLUSION: These tools represent a resource for researchers interested in tandem repeats in nucleotide sequences that are most likely to be of clinical and evolutionary interest. The tools are available at . Downloadable versions for UNIX/LINUX and WINDOWS which permit the consideration of longer and more numerous sequences are also available.
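The idea of length variation between tandem repeat arrays can be illustrated with a minimal sketch. This is not VNTRfinder's algorithm, which the abstract does not describe; it assumes a known, exact repeat motif and simply compares the longest uninterrupted run of that motif in each sequence. The function names and the example sequences are hypothetical.

```python
import re

def max_copy_number(sequence, motif):
    """Return the largest number of back-to-back exact copies of motif in sequence."""
    pattern = re.compile(r"(?:%s)+" % re.escape(motif))
    # Each match is one uninterrupted run of the motif; convert its length to copies.
    return max((len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)),
               default=0)

def compare_arrays(sequences, motif):
    """Map each sequence name to its copy number and flag whether the counts differ."""
    copies = {name: max_copy_number(seq, motif) for name, seq in sequences.items()}
    return copies, len(set(copies.values())) > 1

# Made-up example sequences carrying a CA dinucleotide repeat.
example = {"strain_a": "GGCACACACATT", "strain_b": "GGCACACACACATT"}
copies, variable = compare_arrays(example, "CA")
print(copies, variable)
```

Real repeat arrays contain mismatches and indels, so an exact-match pattern like this would undercount them; it only shows what "variable copy number" means in the simplest case.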

    Pathway Analyses Implicate Glial Cells in Schizophrenia

    Background: The quest to understand the neurobiology of schizophrenia and bipolar disorder is ongoing, with multiple lines of evidence indicating abnormalities of glia, mitochondria, and glutamate in both disorders. Despite high heritability estimates of 81% for schizophrenia and 75% for bipolar disorder, compelling links between findings from neurobiological studies and findings from large-scale genetic analyses are only beginning to emerge. Method: Ten publicly available gene sets (pathways) related to glia, mitochondria, and glutamate were tested for association with schizophrenia and bipolar disorder using MAGENTA as the primary analysis method. To determine the robustness of associations, secondary analyses were performed with ALIGATOR, INRICH, and Set Screen. Data from the Psychiatric Genomics Consortium (PGC) were used for all analyses. There were 1,068,286 SNP-level p-values for schizophrenia (9,394 cases/12,462 controls), and 2,088,878 SNP-level p-values for bipolar disorder (7,481 cases/9,250 controls). Results: The Glia-Oligodendrocyte pathway was associated with schizophrenia, after correction for multiple tests, according to primary analysis (MAGENTA p = 0.0005, 75% requirement for individual gene significance) and also achieved nominal levels of significance with INRICH (p = 0.0057) and ALIGATOR (p = 0.022). For bipolar disorder, Set Screen yielded nominally and method-wide significant associations with all three glial pathways, with the strongest association for the Glia-Astrocyte pathway (p = 0.002). Conclusions: Consistent with findings of white matter abnormalities in schizophrenia by other methods of study, the Glia-Oligodendrocyte pathway was associated with schizophrenia in our genomic study. These findings suggest that the abnormalities of myelination observed in schizophrenia are at least in part due to inherited factors, contrasted with the alternative of purely environmental causes (e.g. medication effects or lifestyle).
    While not the primary purpose of our study, our results also highlight the consequential nature of alternative choices regarding pathway analysis, in that results varied somewhat across methods, despite application to identical datasets and pathways.

    Genetic Classification of Populations using Supervised Learning

    There are many instances in genetics in which we wish to determine whether two candidate populations are distinguishable on the basis of their genetic structure. Examples include populations which are geographically separated, case-control studies and quality control (when participants in a study have been genotyped at different laboratories). This latter application is of particular importance in the era of large scale genome wide association studies, when collections of individuals genotyped at different locations are being merged to provide increased power. The traditional method for detecting structure within a population is some form of exploratory technique such as principal components analysis. Such methods, which do not utilise our prior knowledge of the membership of the candidate populations, are termed unsupervised. Supervised methods, on the other hand, are able to utilise this prior knowledge when it is available. In this paper we demonstrate that in such cases modern supervised approaches are a more appropriate tool for detecting genetic differences between populations. We apply two such methods (neural networks and support vector machines) to the classification of three populations (two from Scotland and one from Bulgaria). The sensitivity exhibited by both these methods is considerably higher than that attained by principal components analysis and in fact comfortably exceeds a recently conjectured theoretical limit on the sensitivity of unsupervised methods. In particular, our methods can distinguish between the two Scottish populations, where principal components analysis cannot. We suggest, on the basis of our results, that a supervised learning approach should be the method of choice when classifying individuals into pre-defined populations, particularly in quality control for large scale genome wide association studies. Comment: Accepted PLOS On

    Genetic Differences between Five European Populations

    Aims: We sought to examine the magnitude of the differences in SNP allele frequencies between five European populations (Scotland, Ireland, Sweden, Bulgaria and Portugal) and to identify the loci with the greatest differences. Methods: We performed a population-based genome-wide association analysis with Affymetrix 6.0 and 5.0 arrays. We used a χ² test with 4 degrees of freedom to determine the magnitude of stratification for each SNP. We then examined the genes within the most stratified regions, using a highly conservative cutoff of p < 10⁻⁴⁵. Results: We found 40,593 SNPs which are genome-wide significantly (p ≤ 10⁻⁸) stratified between these populations. The largest differences clustered in gene ontology categories for immunity and pigmentation. Some of the top loci span genes that have already been reported as highly stratified: genes for hair color and pigmentation (HERC2, EXOC2, IRF4), the LCT gene, genes involved in NAD metabolism, and in immunity (HLA and the Toll-like receptor genes TLR10, TLR1, TLR6). However, several genes have not previously been reported as stratified within European populations, indicating that they might also have provided selective advantages: several zinc finger genes, two genes involved in glutathione synthesis or function, and most intriguingly, FOXP2, implicated in speech development. Conclusion: Our analysis demonstrates that many SNPs show genome-wide significant differences within European populations and that the magnitude of the differences correlates with the geographical distance. At least some of these differences are due to the selective advantage of polymorphisms within these loci.
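The 4 degrees of freedom in the test above follow from comparing two alleles across five populations: (5 - 1) × (2 - 1) = 4. A minimal sketch of such a per-SNP test, using only the standard library, might look as follows. The allele counts are invented for illustration, the function names are hypothetical, and this is not claimed to be the authors' actual pipeline. The p-value uses the closed-form chi-square survival function, which holds only for even degrees of freedom.

```python
import math

def allele_chi2(counts):
    """Pearson chi-square for a populations x 2 table of (allele_A, allele_B) counts."""
    row_totals = [a + b for a, b in counts]
    col_totals = [sum(a for a, _ in counts), sum(b for _, b in counts)]
    grand = sum(row_totals)
    stat = 0.0
    for (a, b), row_total in zip(counts, row_totals):
        for observed, col_total in ((a, col_totals[0]), (b, col_totals[1])):
            expected = row_total * col_total / grand
            stat += (observed - expected) ** 2 / expected
    df = (len(counts) - 1) * (2 - 1)
    return stat, df

def chi2_sf_even_df(stat, df):
    """Upper-tail chi-square probability; closed form valid only for even df."""
    half = stat / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Hypothetical allele counts for one SNP in five populations (not real data).
counts = [(620, 380), (600, 400), (540, 460), (450, 550), (480, 520)]
stat, df = allele_chi2(counts)
print(df, stat, chi2_sf_even_df(stat, df))
```

The sketch assumes every population has non-zero counts and both alleles are observed somewhere; a monomorphic SNP would produce a zero expected count and a division error.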

    An inherited duplication at the gene p21 protein-activated Kinase 7 (PAK7) is a risk factor for psychosis

    FUNDING: Funding for this study was provided by the Wellcome Trust Case Control Consortium 2 project (085475/B/08/Z and 085475/Z/08/Z), the Wellcome Trust (072894/Z/03/Z, 090532/Z/09/Z and 075491/Z/04/B), NIMH grants (MH 41953 and MH083094) and Science Foundation Ireland (08/IN.1/B1916). We acknowledge use of the Trinity Biobank sample from the Irish Blood Transfusion Service; the Trinity Centre for High Performance Computing; British 1958 Birth Cohort DNA collection funded by the Medical Research Council (G0000934) and the Wellcome Trust (068545/Z/02) and of the UK National Blood Service controls funded by the Wellcome Trust. Chris Spencer is supported by a Wellcome Trust Career Development Fellowship (097364/Z/11/Z). Funding to pay the Open Access publication charges for this article was provided by the Wellcome Trust. ACKNOWLEDGEMENTS: The authors sincerely thank all patients who contributed to this study and all staff who facilitated their involvement. We thank W. Bodmer and B. Winney for use of the People of the British Isles DNA collection, which was funded by the Wellcome Trust. We thank Akira Sawa and Koko Ishzuki for advice on the PAK7–DISC1 interaction experiment and Jan Korbel for discussions on mechanism of structural variation. Peer reviewed.

    Population structure and genome-wide patterns of variation in Ireland and Britain

    Located off the northwestern coast of the European mainland, Britain and Ireland were among the last regions of Europe to be colonized by modern humans after the last glacial maximum. Further, the geographical location of Britain, and in particular of Ireland, is such that the impact of historical migration has been minimal. Genetic diversity studies applying the Y chromosome and mitochondrial systems have indicated reduced diversity and an increased population structure across Britain and Ireland relative to the European mainland. Such characteristics would have implications for genetic mapping studies of complex disease. We set out to further our understanding of the genetic architecture of the region from the perspective of (i) population structure, (ii) linkage disequilibrium (LD), (iii) homozygosity and (iv) haplotype diversity (HD). Analysis was conducted on 3654 individuals from Ireland, Britain (with regional sampling in Scotland), Bulgaria, Portugal, Sweden and the Utah HapMap collection. Our results indicate a subtle but clear genetic structure across Britain and Ireland, although levels of structure were reduced in comparison with average cross-European structure. We observed slightly elevated levels of LD and homozygosity in the Irish population compared with neighbouring European populations. We also report on a cline of HD across Europe with greatest levels in southern populations and lowest levels in Ireland and Scotland. These results are consistent with our understanding of the population history of Europe and promote Ireland and Scotland as relatively homogenous resources for genetic mapping of rare variants.

    No Reliable Association between Runs of Homozygosity and Schizophrenia in a Well-Powered Replication Study

    It is well known that inbreeding increases the risk of recessive monogenic diseases, but it is less certain whether it contributes to the etiology of complex diseases such as schizophrenia. One way to estimate the effects of inbreeding is to examine the association between disease diagnosis and genome-wide autozygosity estimated using runs of homozygosity (ROH) in genome-wide single nucleotide polymorphism arrays. Using data for schizophrenia from the Psychiatric Genomics Consortium (n = 21,868), Keller et al. (2012) estimated that the odds of developing schizophrenia increased by approximately 17% for every additional percent of the genome that is autozygous (β = 16.1, CI(β) = [6.93, 25.7], Z = 3.44, p = 0.0006). Here we describe replication results from 22 independent schizophrenia case-control datasets from the Psychiatric Genomics Consortium (n = 39,830). Using the same ROH calling thresholds and procedures as Keller et al. (2012), we were unable to replicate the significant association between ROH burden and schizophrenia in the independent PGC phase II data, although the effect was in the predicted direction, and the combined (original + replication) dataset yielded an attenuated but significant relationship between Froh and schizophrenia (β = 4.86, CI(β) = [0.90, 8.83], Z = 2.40, p = 0.02). Since Keller et al. (2012), several studies reported inconsistent association of ROH burden with complex traits, particularly in case-control data. These conflicting results might suggest that the effects of autozygosity are confounded by various factors, such as socioeconomic status, education, urbanicity, and religiosity, which may be associated with both real inbreeding and the outcome measures of interest.
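The link between the reported β values and the "approximately 17%" figure can be checked with a short calculation. The sketch below assumes that β is a logistic-regression coefficient and that Froh is expressed as a proportion between 0 and 1, so that one additional percent of autozygous genome corresponds to an increase of 0.01. That assumption is not stated in the abstract, though it is consistent with the numbers given there. The function name is hypothetical.

```python
import math

def percent_odds_change(beta, delta_froh=0.01):
    """Percent change in odds for a delta_froh rise in Froh, given a log-odds slope beta."""
    # Odds ratio for the increment is exp(beta * delta); express it as a percent change.
    return (math.exp(beta * delta_froh) - 1.0) * 100.0

original = percent_odds_change(16.1)   # beta reported for Keller et al. (2012)
combined = percent_odds_change(4.86)   # beta reported for the combined dataset
print(f"original: {original:.1f}%  combined: {combined:.1f}%")
```

Under this assumption the original estimate corresponds to roughly a 17% increase in odds per additional percent of autozygosity, and the attenuated combined estimate to roughly 5%.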