130 research outputs found

    Tools for the identification of variable and potentially variable tandem repeats

    BACKGROUND: Tandem repeat arrays showing variation between sequences within a population, between strains or across species may have functional effects. The increasing availability of genomic sequence data makes routine description of observed variation possible, creating a need for tools to describe such variability. RESULTS: We present a set of programs that facilitate the identification of tandem repeats showing variation across multiple sequences or genomes, and the prediction of potentially polymorphic tandem repeats. The VNTRfinder (Variable Number of Tandem Repeats finder) program enables the detection of sequence length variation between arrays of inter-specific or intra-specific tandem repeats. In the absence of comparable sequences to explore observed variation, predictions are provided describing which tandem repeats are more likely to be variable, to help guide and focus further experimental evaluation. CONCLUSION: These tools represent a resource for researchers interested in tandem repeats in nucleotide sequences that are most likely to be of clinical and evolutionary interest. The tools are available at . Downloadable versions for UNIX/LINUX and WINDOWS which permit the consideration of longer and more numerous sequences are also available

    Multiplex Target Enrichment Using DNA Indexing for Ultra-High Throughput SNP Detection

    Screening large numbers of target regions in multiple DNA samples for sequence variation is an important application of next-generation sequencing but an efficient method to enrich the samples in parallel has yet to be reported. We describe an advanced method that combines DNA samples using indexes or barcodes prior to target enrichment to facilitate this type of experiment. Sequencing libraries for multiple individual DNA samples, each incorporating a unique 6-bp index, are combined in equal quantities, enriched using a single in-solution target enrichment assay and sequenced in a single reaction. Sequence reads are parsed based on the index, allowing sequence analysis of individual samples. We show that the use of indexed samples does not impact on the efficiency of the enrichment reaction. For three- and nine-indexed HapMap DNA samples, the method was found to be highly accurate for SNP identification. Even with sequence coverage as low as 8x, 99% of sequence SNP calls were concordant with known genotypes. Within a single experiment, this method can sequence the exonic regions of hundreds of genes in tens of samples for sequence and structural variation using as little as 1 μg of input DNA per sample
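
The index-based parsing step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes, hypothetically, that the 6-bp index sits at the start of each read (on many platforms the index is read separately), and the index sequences, sample names, and reads are invented.

```python
from collections import defaultdict

INDEX_LENGTH = 6  # the abstract specifies a unique 6-bp index per sample


def demultiplex(reads, index_to_sample, index_length=INDEX_LENGTH):
    """Group reads by sample using the leading index bases.

    Assumes (hypothetically) that the index occupies the first
    `index_length` bases of each read; reads whose index is not in
    `index_to_sample` are collected under the key None.
    """
    by_sample = defaultdict(list)
    for read in reads:
        index, insert = read[:index_length], read[index_length:]
        sample = index_to_sample.get(index)  # None when the index is unknown
        by_sample[sample].append(insert)
    return dict(by_sample)


# Hypothetical indexes and reads, for illustration only.
indexes = {"ACGTAC": "sample_1", "TTGCAA": "sample_2"}
reads = ["ACGTACGGATTC", "TTGCAACCGTAA", "ACGTACTTTGGA", "GGGGGGAAAAAA"]
groups = demultiplex(reads, indexes)
```

Exact matching is the simplest choice here; a production demultiplexer would typically also tolerate a sequencing error in the index.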

    Iron Age and Anglo-Saxon genomes from East England reveal British migration history

    British population history has been shaped by a series of immigrations, including the early Anglo-Saxon migrations after 400 CE. It remains an open question how these events affected the genetic composition of the current British population. Here, we present whole-genome sequences from 10 individuals excavated close to Cambridge in the East of England, ranging from the late Iron Age to the middle Anglo-Saxon period. By analysing shared rare variants with hundreds of modern samples from Britain and Europe, we estimate that on average the contemporary East English population derives 38% of its ancestry from Anglo-Saxon migrations. We gain further insight with a new method, rarecoal, which infers population history and identifies fine-scale genetic ancestry from rare variants. Using rarecoal we find that the Anglo-Saxon samples are closely related to modern Dutch and Danish populations, while the Iron Age samples share ancestors with multiple Northern European populations including Britain

    Evidence that duplications of 22q11.2 protect against schizophrenia

    A number of large, rare copy number variants (CNVs) are deleterious for neurodevelopmental disorders, but large, rare, protective CNVs have not been reported for such phenotypes. Here we show in a CNV analysis of 47 005 individuals, the largest CNV analysis of schizophrenia to date, that large duplications (1.5-3.0 Mb) at 22q11.2--the reciprocal of the well-known, risk-inducing deletion of this locus--are substantially less common in schizophrenia cases than in the general population (0.014% vs 0.085%, OR=0.17, P=0.00086). 22q11.2 duplications represent the first putative protective mutation for schizophrenia

    Multi-locus genome-wide association analysis supports the role of glutamatergic synaptic transmission in the etiology of major depressive disorder

    Major depressive disorder (MDD) is a common psychiatric illness characterized by low mood and loss of interest in pleasurable activities. Despite years of effort, recent genome-wide association studies (GWAS) have identified few susceptibility variants or genes that are robustly associated with MDD. Standard single-SNP (single nucleotide polymorphism)-based GWAS analysis typically has limited power to deal with the extensive heterogeneity and substantial polygenic contribution of individually weak genetic effects underlying the pathogenesis of MDD. Here, we report an alternative, gene-set-based association analysis of MDD in an effort to identify groups of biologically related genetic variants that are involved in the same molecular function or cellular processes and exhibit a significant level of aggregated association with MDD. In particular, we used a text-mining-based data analysis to prioritize candidate gene sets implicated in MDD and conducted a multi-locus association analysis to look for enriched signals of nominally associated MDD susceptibility loci within each of the gene sets. Our primary analysis is based on the meta-analysis of three large MDD GWAS data sets (total N = 4346 cases and 4430 controls). After correction for multiple testing, we found that genes involved in glutamatergic synaptic neurotransmission were significantly associated with MDD (set-based association P = 6.9 × 10^-4). This result is consistent with previous studies that support a role of the glutamatergic system in synaptic plasticity and MDD and support the potential utility of targeting glutamatergic neurotransmission in the treatment of MDD

    Genetic Differences between Five European Populations

    Aims: We sought to examine the magnitude of the differences in SNP allele frequencies between five European populations (Scotland, Ireland, Sweden, Bulgaria and Portugal) and to identify the loci with the greatest differences. Methods: We performed a population-based genome-wide association analysis with Affymetrix 6.0 and 5.0 arrays. We used a 4 degrees of freedom χ² test to determine the magnitude of stratification for each SNP. We then examined the genes within the most stratified regions, using a highly conservative cutoff of p < 10^-45. Results: We found 40,593 SNPs which are genome-wide significantly (p ≤ 10^-8) stratified between these populations. The largest differences clustered in gene ontology categories for immunity and pigmentation. Some of the top loci span genes that have already been reported as highly stratified: genes for hair color and pigmentation (HERC2, EXOC2, IRF4), the LCT gene, genes involved in NAD metabolism, and in immunity (HLA and the Toll-like receptor genes TLR10, TLR1, TLR6). However, several genes have not previously been reported as stratified within European populations, indicating that they might also have provided selective advantages: several zinc finger genes, two genes involved in glutathione synthesis or function, and most intriguingly, FOXP2, implicated in speech development. Conclusion: Our analysis demonstrates that many SNPs show genome-wide significant differences within European populations and the magnitude of the differences correlates with the geographical distance. At least some of these differences are due to the selective advantage of polymorphisms within these loci
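
A test with 4 degrees of freedom arises naturally from a 5 × 2 table of allele counts (five populations, two alleles), since (5 - 1)(2 - 1) = 4. The sketch below assumes that table layout, which the abstract does not spell out, and uses invented counts. It relies on the closed-form upper tail of the chi-square distribution for 4 degrees of freedom, so it needs only the standard library.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic for a rows x columns table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def chi_square_sf_4df(stat):
    """Upper-tail probability of a chi-square variable with 4 df.

    For 4 degrees of freedom the survival function has the closed form
    exp(-x/2) * (1 + x/2).
    """
    return math.exp(-stat / 2) * (1 + stat / 2)


# Hypothetical allele counts (allele A, allele B) in five populations.
counts = [[120, 80], [110, 90], [150, 50], [90, 110], [100, 100]]
stat = chi_square_statistic(counts)
p_value = chi_square_sf_4df(stat)
```

For thresholds as extreme as those quoted in the abstract, the tail probability is better handled on a log scale to avoid floating-point underflow.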

    Genome-wide association study in a Swedish population yields support for greater CNV and MHC involvement in schizophrenia compared with bipolar disorder

    Schizophrenia (SCZ) and bipolar disorder (BD) are highly heritable psychiatric disorders with overlapping susceptibility loci and symptomatology. We conducted a genome-wide association study (GWAS) of these disorders in a large Swedish sample. We report a new and independent case–control analysis of 1507 SCZ cases, 836 BD cases and 2093 controls. No single-nucleotide polymorphisms (SNPs) achieved significance in these new samples; however, combining new and previously reported SCZ samples (2111 SCZ and 2535 controls) revealed a genome-wide significant association in the major histocompatibility complex (MHC) region (rs886424, P = 4.54 × 10^-8). Imputation using multiple reference panels and meta-analysis with the Psychiatric Genomics Consortium SCZ results underscored the broad, significant association in the MHC region in the full SCZ sample. We evaluated the role of copy number variants (CNVs) in these subjects. As in prior reports, deletions were enriched in SCZ, but not BD cases compared with controls. Singleton deletions were more frequent in both case groups compared with controls (SCZ: P = 0.003, BD: P = 0.013), whereas the largest CNVs (>500 kb) were significantly enriched only in SCZ cases (P = 0.0035). Two CNVs with previously reported SCZ associations were also overrepresented in this SCZ sample: 16p11.2 duplications (P = 0.0035) and 22q11 deletions (P = 0.03). These results reinforce prior reports of significant MHC and CNV associations in SCZ, but not BD

    Identifying Consensus Disease Pathways in Parkinson's Disease Using an Integrative Systems Biology Approach

    Parkinson's disease (PD) has had six genome-wide association studies (GWAS) conducted as well as several gene expression studies. However, only variants in MAPT and SNCA have been consistently replicated. To improve the utility of these approaches, we applied pathway analyses integrating both GWAS and gene expression. The top 5000 SNPs (p<0.01) from a joint analysis of three existing PD GWAS were identified and each assigned to a gene. For gene expression, rather than the traditional comparison of one anatomical region between sets of patients and controls, we identified differentially expressed genes between adjacent Braak regions in each individual and adjusted using average control expression profiles. Over-represented pathways were calculated using a hyper-geometric statistical comparison. An integrated, systems meta-analysis of the over-represented pathways combined the expression and GWAS results using a Fisher's combined probability test. Four of the top seven pathways from each approach were identical. The top three pathways in the meta-analysis, with their corrected p-values, were axonal guidance (p = 2.8E-07), focal adhesion (p = 7.7E-06) and calcium signaling (p = 2.9E-05). These results support that a systems biology (pathway) approach will provide additional insight into the genetic etiology of PD and that these pathways have both biological and statistical support to be important in PD
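
Fisher's combined probability test, named in this abstract, merges k independent p-values through the statistic X = -2 Σ ln p_i, which follows a chi-square distribution with 2k degrees of freedom under the null. The sketch below is a generic standard-library implementation and is not taken from the study; the function name is illustrative, and inputs are assumed to be independent p-values in (0, 1].

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null, where k = len(p_values).
    For even degrees of freedom the upper tail has the closed form
    exp(-X/2) * sum_{i<k} (X/2)**i / i!.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals X / 2
    tail = sum(half_stat ** i / math.factorial(i) for i in range(k))
    return math.exp(-half_stat) * tail
```

For a single pathway, `fisher_combined_p([p_gwas, p_expression])` would combine one GWAS-derived and one expression-derived p-value, where both argument names are placeholders.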

    Genetic regulation of Nrxn1 expression: an integrative cross-species analysis of schizophrenia candidate genes

    Neurexin 1 (NRXN1) is a large presynaptic transmembrane protein that has complex and variable patterns of expression in the brain. Sequence variants in NRXN1 are associated with differences in cognition, and with schizophrenia and autism. The murine Nrxn1 gene is also highly polymorphic and is associated with significant variation in expression that is under strong genetic control. Here, we use co-expression analysis, high coverage genomic sequence, and expression quantitative trait locus (eQTL) mapping to study the regulation of this gene in the brain. We profiled a family of 72 isogenic progeny strains of a cross between C57BL/6J and DBA/2J (the BXD family) using exon arrays and massively parallel RNA sequencing. Expression of most Nrxn1 exons has high genetic correlation (r>0.6) because of the segregation of a common trans eQTL on chromosome (Chr) 8 and a common cis eQTL on Chr 17. These two loci are also linked to murine phenotypes relevant to schizophrenia and to a novel human schizophrenia candidate gene with high neuronal expression (Pleckstrin and Sec7 domain containing 3). In both human and mice, NRXN1 is co-expressed with numerous synaptic and cell signaling genes, and known schizophrenia candidates. Cross-species co-expression and protein interaction network analyses identified glycogen synthase kinase 3 beta (GSK3B) as one of the most consistent and conserved covariates of NRXN1. By using the Molecular Genetics of Schizophrenia data set, we were able to test and confirm that markers in NRXN1 and GSK3B have epistatic interactions in human populations that can jointly modulate risk of schizophrenia