
    Variation resources at UC Santa Cruz

    The variation resources within the University of California Santa Cruz Genome Browser include polymorphism data drawn from public collections and analyses of these data, along with their display in the context of other genomic annotations. Primary data from dbSNP is included for many organisms, with added information including genomic alleles and orthologous alleles for closely related organisms. Display filtering and coloring are available by variant type, functional class or other annotations. Annotation of potential errors is highlighted and a genomic alignment of the variant's flanking sequence is displayed. HapMap allele frequencies and linkage disequilibrium (LD) are available for each HapMap population, along with non-human primate alleles. The browsing and analysis tools, downloadable data files and links to documentation and other information can be found at

    MouseIndelDB: a database integrating genomic indel polymorphisms that distinguish mouse strains

    MouseIndelDB is an integrated database resource containing thousands of previously unreported mouse genomic indel (insertion and deletion) polymorphisms ranging from ∼100 nt to 10 Kb in size. The database currently includes polymorphisms identified from our alignment of 26 million whole-genome shotgun sequence traces from four laboratory mouse strains mapped against the reference C57BL/6J genome using GMAP. They can be queried on a local level by chromosomal coordinates, nearby gene names or other genomic feature identifiers, or in bulk format using categories including mouse strain(s), class of polymorphism(s) and chromosome number. The results of such queries are presented either as a custom track on the UCSC mouse genome browser or in tabular format. We anticipate that the MouseIndelDB database will be widely useful for research in mammalian genetics, genomics, and evolutionary biology. Access to the MouseIndelDB database is freely available at: http://variation.osu.edu/
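The two query modes described above (local lookup by chromosomal coordinates, and bulk filtering by strain, polymorphism class and chromosome) can be illustrated with a minimal in-memory sketch. The record layout, the field names and the strain labels `strainA`/`strainB` are assumptions made for illustration; they do not reflect the actual MouseIndelDB schema or web interface.

```python
from typing import List, NamedTuple, Optional


class Indel(NamedTuple):
    """Hypothetical indel record; not the real MouseIndelDB schema."""
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strain: str
    kind: str    # "insertion" or "deletion"


def query_region(records: List[Indel], chrom: str, start: int, end: int) -> List[Indel]:
    """Local query: indels on `chrom` that overlap the interval [start, end]."""
    return [r for r in records
            if r.chrom == chrom and r.start <= end and r.end >= start]


def query_bulk(records: List[Indel],
               strain: Optional[str] = None,
               kind: Optional[str] = None,
               chrom: Optional[str] = None) -> List[Indel]:
    """Bulk query: keep records matching every category that was supplied."""
    return [r for r in records
            if (strain is None or r.strain == strain)
            and (kind is None or r.kind == kind)
            and (chrom is None or r.chrom == chrom)]


# Made-up example records, for illustration only.
records = [
    Indel("chr1", 1000, 1500, "strainA", "deletion"),
    Indel("chr1", 5000, 5200, "strainB", "insertion"),
    Indel("chr2", 1000, 1100, "strainA", "insertion"),
]
```

For example, `query_region(records, "chr1", 1400, 2000)` returns only the first record, because it is the only chr1 indel overlapping that interval.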

    The distribution of a germline methylation marker suggests a regional mechanism of LINE-1 silencing by the piRNA-PIWI system

    Background: A defense system against transposon activity in the human germline based on PIWI proteins and piRNA has recently been discovered. It represses the activity of LINE-1 elements via DNA methylation by a largely unknown mechanism. Based on the dispersed distribution of clusters of piRNA genes in a strand-specific manner on all human chromosomes, we hypothesized that this system might work preferentially on local and proximal sequences. We tested this hypothesis with a methylation-associated SNP (mSNP) marker which is based on the density of C-T transitions in CpG dinucleotides as a surrogate marker for germline methylation.
    Results: We found significantly higher density of mSNPs flanking piRNA clusters in the human genome for flank sizes of 1-16 Mb. A dose-response relationship between number of piRNA genes and mSNP density was found for up to 16 Mb of flanking sequences. The chromosomal density of hypermethylated LINE-1 elements had a significant positive correlation with the chromosomal density of piRNA genes (r = 0.41, P = 0.05). Genome windows of 1-16 Mb containing piRNA clusters had significantly more hypermethylated LINE-1 elements than windows not containing piRNA clusters. Finally, the minimum distance to the next piRNA cluster was significantly shorter for hypermethylated LINE-1 compared to normally methylated elements (14.4 Mb vs 16.1 Mb).
    Conclusions: Our observations support our hypothesis that the piRNA-PIWI system preferentially methylates sequences in close proximity to the piRNA clusters and perhaps physically adjacent sequences on other chromosomes. Furthermore they suggest that this proximity effect extends up to 16 Mb. This could be due to an unknown localization signal, transcription of piRNA genes near the nuclear membrane or the presence of an unknown RNA molecule that spreads across the chromosome and targets the methylation directed by the piRNA-PIWI complex. Our data suggest a region specific molecular mechanism which can be sought experimentally.
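The mSNP marker described in this abstract counts C-T transitions at CpG dinucleotides. The following is a minimal sketch of that idea, assuming 0-based SNP positions on a forward-strand reference string and treating the reverse-strand event (G>A preceded by C) as equivalent. The names `Snp`, `is_cpg_transition` and `msnp_density_per_mb` are hypothetical, and the authors' exact definition and filtering are not given in the abstract.

```python
from typing import Iterable, NamedTuple


class Snp(NamedTuple):
    pos: int   # 0-based position on the reference sequence (assumption)
    ref: str   # reference allele on the forward strand
    alt: str   # alternate allele on the forward strand


def is_cpg_transition(seq: str, snp: Snp) -> bool:
    """True if the SNP is a C>T transition at a CpG site, on either strand."""
    seq = seq.upper()
    ref, alt = snp.ref.upper(), snp.alt.upper()
    i = snp.pos
    if not (0 <= i < len(seq)) or seq[i] != ref:
        return False
    # Forward strand: the C of a CpG mutates to T.
    if ref == "C" and alt == "T":
        return i + 1 < len(seq) and seq[i + 1] == "G"
    # Reverse strand: the same event is seen as G>A with a C immediately before.
    if ref == "G" and alt == "A":
        return i >= 1 and seq[i - 1] == "C"
    return False


def msnp_density_per_mb(seq: str, snps: Iterable[Snp], start: int, end: int) -> float:
    """mSNPs per megabase inside the half-open window [start, end)."""
    count = sum(1 for s in snps
                if start <= s.pos < end and is_cpg_transition(seq, s))
    return count * 1_000_000 / (end - start)
```

Densities computed this way for windows flanking piRNA clusters could then be compared with windows elsewhere in the genome, which is the kind of comparison the Results describe.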

    A database and API for variation, dense genotyping and resequencing data

    Background: Advances in sequencing and genotyping technologies are leading to the widespread availability of multi-species variation data, dense genotype data and large-scale resequencing projects. The 1000 Genomes Project and similar efforts in other species are challenging the methods previously used for storage and manipulation of such data necessitating the redesign of existing genome-wide bioinformatics resources.
    Results: Ensembl has created a database and software library to support data storage, analysis and access to the existing and emerging variation data from large mammalian and vertebrate genomes. These tools scale to thousands of individual genome sequences and are integrated into the Ensembl infrastructure for genome annotation and visualisation. The database and software system is easily expanded to integrate both public and non-public data sources in the context of an Ensembl software installation and is already being used outside of the Ensembl project in a number of database and application environments.
    Conclusions: Ensembl's powerful, flexible and open source infrastructure for the management of variation, genotyping and resequencing data is freely available at http://www.ensembl.org.

    NovelSNPer: A Fast Tool for the Identification and Characterization of Novel SNPs and InDels

    Typically, next-generation resequencing projects produce large lists of variants. NovelSNPer is a software tool that permits fast and efficient processing of such output lists. In a first step, NovelSNPer determines if a variant represents a known variant or a previously unknown variant. In a second step, each variant is classified into one of 15 SNP classes or 19 InDel classes. Besides the classes used by Ensembl, we introduce POTENTIAL_START_GAINED and START_LOST as new functional classes and present a classification scheme for InDels. NovelSNPer is based upon the gene structure information stored in Ensembl. It processes two million SNPs in six hours. The tool can be used online or downloaded
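The two-step flow described above (known-versus-novel lookup, then functional classification) can be sketched roughly as follows. This is not NovelSNPer's code: the `Variant` type, the in-memory set of known variants, and the toy classifier that recognises only a START_LOST-style change on a forward-strand transcript are all assumptions for illustration. The real tool distinguishes 15 SNP and 19 InDel classes using Ensembl gene structures.

```python
from typing import NamedTuple, Set, Tuple


class Variant(NamedTuple):
    chrom: str
    pos: int   # 1-based genomic position (assumption)
    ref: str
    alt: str


def is_known(v: Variant, known: Set[Variant]) -> bool:
    """Step 1: a variant is 'known' if it appears in a reference set of variants."""
    return v in known


def classify_snp(v: Variant, cds_start: int, cds_seq: str) -> str:
    """Step 2 (toy version): report START_LOST when a SNP alters the ATG start codon.

    Assumes a forward-strand transcript whose start codon begins at genomic
    position `cds_start` and is not interrupted by an intron.
    Every other variant falls into a placeholder class, "OTHER".
    """
    offset = v.pos - cds_start
    if 0 <= offset < 3:
        codon = list(cds_seq[:3].upper())
        codon[offset] = v.alt.upper()
        if "".join(codon) != "ATG":
            return "START_LOST"
    return "OTHER"


def annotate(v: Variant, known: Set[Variant],
             cds_start: int, cds_seq: str) -> Tuple[str, str]:
    """Run both steps and return (known/novel status, functional class)."""
    status = "known" if is_known(v, known) else "novel"
    return status, classify_snp(v, cds_start, cds_seq)
```

With `known = {Variant("1", 100, "A", "G")}`, calling `annotate(Variant("1", 100, "A", "G"), known, 100, "ATGGCC")` gives `("known", "START_LOST")`, since the start codon becomes GTG.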

    Concordant Gene Expression in Leukemia Cells and Normal Leukocytes Is Associated with Germline cis-SNPs

    The degree to which gene expression covaries between different primary tissues within an individual is not well defined. We hypothesized that expression that is concordant across tissues is more likely influenced by genetic variability than gene expression which is discordant between tissues. We quantified expression of 11,873 genes in paired samples of primary leukemia cells and normal leukocytes from 92 patients with acute lymphoblastic leukemia (ALL). Genetic variation at >500,000 single nucleotide polymorphisms (SNPs) was also assessed. The expression of only 176/11,783 (1.5%) genes was correlated (p<0.008, FDR = 25%) in the two tissue types, but expression of a high proportion (20 of these 176 genes) was significantly related to cis-SNP genotypes (adjusted p<0.05). In an independent set of 134 patients with ALL, 14 of these 20 genes were validated as having expression related to cis-SNPs, as were 9 of 20 genes in a second validation set of HapMap cell lines. Genes whose expression was concordant among tissue types were more likely to be associated with germline cis-SNPs than genes with discordant expression in these tissues; genes affected were involved in housekeeping functions (GSTM2, GAPDH and NCOR1) and purine metabolism
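The proportions quoted in this abstract can be checked with a few lines of arithmetic. The counts are taken from the text as printed, using the denominator 11,783 that appears in the fraction; the helper name `pct` is only illustrative.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


# Counts as printed in the abstract.
concordant = pct(176, 11_783)   # genes with correlated expression in both tissues
cis_related = pct(20, 176)      # concordant genes related to cis-SNP genotypes
validated_all = pct(14, 20)     # validated in the independent ALL patient set
validated_hapmap = pct(9, 20)   # validated in the HapMap cell-line set

print(f"concordant genes:    {concordant:.1f}%")
print(f"cis-SNP related:     {cis_related:.1f}%")
print(f"validated (ALL set): {validated_all:.0f}%")
print(f"validated (HapMap):  {validated_hapmap:.0f}%")
```

On these figures, the "high proportion" of concordant genes linked to cis-SNPs corresponds to roughly 11% of the 176 genes.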