MouseIndelDB: a database integrating genomic indel polymorphisms that distinguish mouse strains

Author: Agrafioti
Akagi
Altshuler
Beck
Belancio
Bhangale
Churchill
Clark
D. E. Symer
Druker
E. Evdokimov
Eichler
Frazer
Hinrichs
Iafrate
J. Li
K. Akagi
Kidd
M. R. Kuehn
Maksakova
Manaster
N. Volfovsky
R. M. Stephens
Redon
Salmon Hillbertz
She
van de Lagemaat
Varki
Wade
Wang
Waterston
Yang
Zhang
Zhang
Publication venue: Oxford University Press
Publication date: 01/01/2010

MouseIndelDB is an integrated database resource containing thousands of previously unreported mouse genomic indel (insertion and deletion) polymorphisms ranging from ∼100 nt to 10 Kb in size. The database currently includes polymorphisms identified from our alignment of 26 million whole-genome shotgun sequence traces from four laboratory mouse strains mapped against the reference C57BL/6J genome using GMAP. They can be queried on a local level by chromosomal coordinates, nearby gene names or other genomic feature identifiers, or in bulk format using categories including mouse strain(s), class of polymorphism(s) and chromosome number. The results of such queries are presented either as a custom track on the UCSC mouse genome browser or in tabular format. We anticipate that the MouseIndelDB database will be widely useful for research in mammalian genetics, genomics, and evolutionary biology. Access to the MouseIndelDB database is freely available at: http://variation.osu.edu/
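
As a rough illustration of the "local level" query by chromosomal coordinates described in this abstract, the following minimal Python sketch filters indel records that overlap a region. It is not MouseIndelDB's implementation: the `Indel` record, its field names, the `query_region` helper, the strain labels and the example coordinates are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indel:
    """Hypothetical indel record; field names are assumptions."""
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strain: str
    kind: str    # "insertion" or "deletion"


def query_region(indels, chrom, start, end, strain=None):
    """Return indels overlapping chrom:start-end, optionally for one strain."""
    return [
        v for v in indels
        if v.chrom == chrom
        and v.start <= end and v.end >= start   # closed-interval overlap test
        and (strain is None or v.strain == strain)
    ]


# Made-up example records (not taken from the database).
EXAMPLE_INDELS = [
    Indel("chr1", 1_000, 1_450, "strainA", "deletion"),
    Indel("chr1", 5_000, 5_120, "strainB", "insertion"),
    Indel("chr2", 1_200, 3_000, "strainA", "deletion"),
]

if __name__ == "__main__":
    for hit in query_region(EXAMPLE_INDELS, "chr1", 1_400, 5_050):
        print(hit)
```

A bulk query by strain or polymorphism class would follow the same pattern, filtering on `strain` or `kind` instead of coordinates.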

CiteSeerX

Landspítali University Hospital Research Archive

The distribution of a germline methylation marker suggests a regional mechanism of LINE-1 silencing by the piRNA-PIWI system

Author: Bjornsson Hans T
Jonsson Jon J
Sigurdsson Martin I
Smith Albert V
Publication venue: BioMed Central
Publication date: 01/01/2012

Abstract
Background: A defense system against transposon activity in the human germline based on PIWI proteins and piRNA has recently been discovered. It represses the activity of LINE-1 elements via DNA methylation by a largely unknown mechanism. Based on the dispersed distribution of clusters of piRNA genes in a strand-specific manner on all human chromosomes, we hypothesized that this system might work preferentially on local and proximal sequences. We tested this hypothesis with a methylation-associated SNP (mSNP) marker which is based on the density of C-T transitions in CpG dinucleotides as a surrogate marker for germline methylation.
Results: We found significantly higher density of mSNPs flanking piRNA clusters in the human genome for flank sizes of 1-16 Mb. A dose-response relationship between number of piRNA genes and mSNP density was found for up to 16 Mb of flanking sequences. The chromosomal density of hypermethylated LINE-1 elements had a significant positive correlation with the chromosomal density of piRNA genes (r = 0.41, P = 0.05). Genome windows of 1-16 Mb containing piRNA clusters had significantly more hypermethylated LINE-1 elements than windows not containing piRNA clusters. Finally, the minimum distance to the next piRNA cluster was significantly shorter for hypermethylated LINE-1 compared to normally methylated elements (14.4 Mb vs 16.1 Mb).
Conclusions: Our observations support our hypothesis that the piRNA-PIWI system preferentially methylates sequences in close proximity to the piRNA clusters and perhaps physically adjacent sequences on other chromosomes. Furthermore they suggest that this proximity effect extends up to 16 Mb. This could be due to an unknown localization signal, transcription of piRNA genes near the nuclear membrane or the presence of an unknown RNA molecule that spreads across the chromosome and targets the methylation directed by the piRNA-PIWI complex. Our data suggest a region specific molecular mechanism which can be sought experimentally.
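
The mSNP marker mentioned in this abstract is described only as the density of C-T transitions in CpG dinucleotides. The sketch below shows one plausible way such a density could be computed for a stretch of sequence. The function names, the 0-based `(position, alternate allele)` input format, the treatment of G-to-A changes as C-to-T on the opposite strand, and the per-base normalisation are assumptions made for illustration, not the authors' published procedure.

```python
def is_cpg_transition(seq, pos, alt):
    """True if the SNP at 0-based index pos is a C>T change within a CpG.

    Assumption: a G>A change at the G of a CpG is counted as well, since it
    corresponds to C>T on the opposite strand.
    """
    ref = seq[pos].upper()
    alt = alt.upper()
    # Forward strand: the C of a CpG changes to T.
    if ref == "C" and alt == "T":
        return pos + 1 < len(seq) and seq[pos + 1].upper() == "G"
    # Reverse strand: the G of a CpG changes to A.
    if ref == "G" and alt == "A":
        return pos > 0 and seq[pos - 1].upper() == "C"
    return False


def msnp_density(seq, snps):
    """Number of CpG transition SNPs per base of seq.

    snps is an iterable of (pos, alt) pairs with 0-based positions.
    """
    if not seq:
        return 0.0
    hits = sum(1 for pos, alt in snps if is_cpg_transition(seq, pos, alt))
    return hits / len(seq)


if __name__ == "__main__":
    toy_seq = "ACGTTCGA"                      # made-up sequence
    toy_snps = [(1, "T"), (2, "A"), (3, "C"), (5, "A")]
    print(msnp_density(toy_seq, toy_snps))
```

In the toy input, only the first two SNPs fall on a CpG and are transitions of the required type; the other two are ignored.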

A database and API for variation, dense genotyping and resequencing data

Author: Birney Ewan
Chen Yuan
Cunningham Fiona
Flicek Paul
McLaren William M
Rios Daniel
Stabenau Arne
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract
Background: Advances in sequencing and genotyping technologies are leading to the widespread availability of multi-species variation data, dense genotype data and large-scale resequencing projects. The 1000 Genomes Project and similar efforts in other species are challenging the methods previously used for storage and manipulation of such data necessitating the redesign of existing genome-wide bioinformatics resources.
Results: Ensembl has created a database and software library to support data storage, analysis and access to the existing and emerging variation data from large mammalian and vertebrate genomes. These tools scale to thousands of individual genome sequences and are integrated into the Ensembl infrastructure for genome annotation and visualisation. The database and software system is easily expanded to integrate both public and non-public data sources in the context of an Ensembl software installation and is already being used outside of the Ensembl project in a number of database and application environments.
Conclusions: Ensembl's powerful, flexible and open source infrastructure for the management of variation, genotyping and resequencing data is freely available at http://www.ensembl.org.

NovelSNPer: A Fast Tool for the Identification and Characterization of Novel SNPs and InDels

Author: Aßmus Jens
Bortfeldt Ralf H.
Brockmann Gudrun A.
Schmitt Armin O.
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2011

Typically, next-generation resequencing projects produce large lists of variants. NovelSNPer is a software tool that permits fast and efficient processing of such output lists. In a first step, NovelSNPer determines if a variant represents a known variant or a previously unknown variant. In a second step, each variant is classified into one of 15 SNP classes or 19 InDel classes. Besides the classes used by Ensembl, we introduce POTENTIAL_START_GAINED and START_LOST as new functional classes and present a classification scheme for InDels. NovelSNPer is based upon the gene structure information stored in Ensembl. It processes two million SNPs in six hours. The tool can be used online or downloaded.
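
The first step described in this abstract, deciding whether each variant is already known, can be pictured as a set lookup. The sketch below is a minimal illustration and not NovelSNPer's code: the `(chromosome, position, reference allele, alternate allele)` key and the `split_known_novel` helper are hypothetical, and the example variants are made up.

```python
def split_known_novel(variants, known_variants):
    """Partition variants into (known, novel), preserving input order.

    Each variant is a hashable key; here a (chrom, pos, ref, alt) tuple
    is assumed.
    """
    known_set = set(known_variants)   # O(1) membership checks
    known, novel = [], []
    for variant in variants:
        if variant in known_set:
            known.append(variant)
        else:
            novel.append(variant)
    return known, novel


if __name__ == "__main__":
    catalogue = [("1", 1000, "A", "G"), ("2", 5000, "C", "T")]   # made-up
    calls = [("1", 1000, "A", "G"), ("1", 1000, "A", "T"), ("3", 42, "G", "GA")]
    known, novel = split_known_novel(calls, catalogue)
    print("known:", known)
    print("novel:", novel)
```

Keying on alleles as well as position means that a different alternate allele at a catalogued position is reported as novel; whether the tool itself makes that distinction is not stated in the abstract.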

Public Library of Science (PLOS)

Concordant Gene Expression in Leukemia Cells and Normal Leukocytes Is Associated with Germline cis-SNPs

Author: A Holleman
AI Su
AR Whitney
AW Bergen
BE Stranger
C Hartford
CG Mullighan
CH Pui
Ching-Hon Pui
D French
Deborah French
DJ Thomas
E Birney
EE Schadt
EJ Yeoh
G Dennis Jr
Geoffrey Neale
GV Glinsky
HH Goring
JA Warrington
James R. Downing
JC Rocha
JD Storey
JJ Eady
JP Radich
Leo H. Hamilton
LL Hsiao
M Dai
M Morley
Mary V. Relling
ME Ross
MH Cheok
Nancy J. Cox
P Kuehl
P Westfall
R Redon
R Shyamsundar
RS Huang
RS Spielman
S Kishi
S Lugthart
SA Monks
TL Bailey
VG Cheung
W Zhang
Wenjian Yang
William E. Evans
Xiaolin Wu
Yiping Fan
Z Tu
Publication venue: Public Library of Science
Publication date: 01/05/2008

The degree to which gene expression covaries between different primary tissues within an individual is not well defined. We hypothesized that expression that is concordant across tissues is more likely influenced by genetic variability than gene expression which is discordant between tissues. We quantified expression of 11,873 genes in paired samples of primary leukemia cells and normal leukocytes from 92 patients with acute lymphoblastic leukemia (ALL). Genetic variation at >500,000 single nucleotide polymorphisms (SNPs) was also assessed. The expression of only 176/11,783 (1.5%) genes was correlated (p<0.008, FDR = 25%) in the two tissue types, but expression of a high proportion (20 of these 176 genes) was significantly related to cis-SNP genotypes (adjusted p<0.05). In an independent set of 134 patients with ALL, 14 of these 20 genes were validated as having expression related to cis-SNPs, as were 9 of 20 genes in a second validation set of HapMap cell lines. Genes whose expression was concordant among tissue types were more likely to be associated with germline cis-SNPs than genes with discordant expression in these tissues; genes affected were involved in housekeeping functions (GSTM2, GAPDH and NCOR1) and purine metabolism

Ensembl variation resources

Author: Birney Ewan
Brent Simon
Chen Yuan
Cunningham Fiona
Flicek Paul
Kulesha Eugene
Marin-Garcia Pablo
McLaren William M
Pritchard Bethan
Rios Daniel
Smedley Damian
Smith James
Spudich Giulietta M
Publication venue: BioMed Central
Publication date: 01/01/2010

Challenges in the association of human single nucleotide polymorphism mentions with unique database identifiers

Author: A Bhat
A Coulet
A Hamosh
A Maniatis
A Rösler
A Yeh
AA Morgan
Ad Hoc Committee on Mutation Nomenclature
AL Beaudet
B Wolff
Christoph M Friedrich
CR Lee
D Hanisch
D Lechner
D Maglott
D Rebholz-Schuhmann
DA Benson
DJ Thomas
E Beutler
E Ha
F Horn
FS Collins
G Martinez
H Matsuzaki
H Rhee
H Shatkay
International HapMap Consortium
International Human Genome Sequencing Consortium
J Hakenberg
J Lafferty
J Wermter
JA Goldstein
JB Laurila
JC Chang
JG Caporaso
JT den Dunnen
JW Cooper
K Franzén
K Yanase
L Bertram
L Furlong
L Hirschman
L Tanabe
Laura I Furlong
LC Lee
LJ Jensen
M Erdogmus
M Hirakawa
M Kanehisa
M Krallinger
M Margulies
M Weeber
M Wildeman
Martin Hofmann-Apitius
MC Owen
ML Arnold
O Attree
P Flicek
Philippe E Thomas
R Klinger
R Witte
Roman Klinger
RT McDonald
S Antonarakis
S Ogino
ST Bennett
ST Sherry
T Yoneyama
TA Eyre
The UniProt Consortium
U Leser
VM Ingram
X Ke
YL Yip
Z Meng
Publication venue: BioMed Central
Publication date: 01/01/2011

Thomas PE, Klinger R, Furlong LI, Hofmann-Apitius M, Friedrich CM. Challenges in the association of human single nucleotide polymorphism mentions with unique database identifiers. BMC Bioinformatics. 2011;12(Suppl 4): S4