25,006 research outputs found

Ensembl’s 10th year

Author: Albert Vilella
Amonida Zadissa
Andreas Kähäri
Andrew Jenkinson
Anne Parker
Ashburner
Benoit Ballester
Bert Overduin
Bethan Pritchard
Birney
Bronwen L. Aken
Damian Keefe
Damian Smedley
Daniel Lawson
Daniel Rios
den Dunnen
Dwinell
Eugene Bragin
Eugene Kulesha
Ewan Birney
Felix Kokocinski
Fiona Cunningham
Flicek
Gautier Koscielny
Giulietta Spudich
Glenn Proctor
Guy Coates
Guy Slater
Haider
Hindorff
Hubbard
Ian Dunham
Ian Longden
James Smith
Jan Vogel
Javier Herrero
Julio Fernandez-Banet
Kaput
Karine Megy
Kathryn Beal
Kerstin Howe
Koscielny
Kuhn
Leo Gordon
Magali Ruffier
Martin Hammond
Michael Schuster
Mikkelsen
Nathan Johnson
Paten
Paul Flicek
Peter Clapham
Pruitt
Rhoda Kinsella
Richard Durbin
Sayers
Simon Brent
Simon White
Slater
Smedley
Stefan Gräf
Stephen Fitzgerald
Stephen Keenan
Stephen M. J. Searle
Stephen Trevanion
Steven P. Wilder
Susan Fairley
Syed Haider
Tim J. P. Hubbard
Tim Massingham
UniProt Consortium
Vilella
Wallace
Wang
William McLaren
Wilming
Xosé M. Fernández-Suarez
Y. Amy Tang
Yuan Chen
Publication venue: Oxford University Press
Publication date: 01/01/2010

Ensembl (http://www.ensembl.org) integrates genomic information for a comprehensive set of chordate genomes with a particular focus on resources for human, mouse, rat, zebrafish and other high-value sequenced genomes. We provide complete gene annotations for all supported species in addition to specific resources that target genome variation, function and evolution. Ensembl data is accessible through several routes, including our genome browser, API and BioMart. This year marks the tenth anniversary of Ensembl and in that time the project has grown with advances in genome technology. As of release 56 (September 2009), Ensembl supports 51 species including marmoset, pig, zebra finch, lizard, gorilla and wallaby, which were added in the past year. Major additions and improvements to Ensembl since our previous report include the incorporation of the human GRCh37 assembly, enhanced visualisation and data-mining options for the Ensembl regulatory features and continued development of our software infrastructure.

Crossref

HAL AMU

PubMed Central

King's Research Portal

Pig genome sequence - analysis and publication strategy

Author: Archibald Alan L.
Bolund Lars
Churcher Carol
Fredholm Merete
Groenen Martien A. M.
Harlizius Barbara
Lee Kyung-Tai
Milan Denis
Rogers Jane
Rothschild Max F.
Schook Lawrence B.
Uenishi Hirohide
Wang Jun
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: The pig genome is being sequenced and characterised under the auspices of the Swine Genome Sequencing Consortium. The sequencing strategy followed a hybrid approach combining hierarchical shotgun sequencing of BAC clones and whole genome shotgun sequencing. Results: Assemblies of the BAC clone derived genome sequence have been annotated using the Pre-Ensembl and Ensembl automated pipelines and made accessible through the Pre-Ensembl/Ensembl browsers. The current annotated genome assembly (Sscrofa9) was released with Ensembl 56 in September 2009. A revised assembly (Sscrofa10) is under construction and will incorporate whole genome shotgun sequence (WGS) data providing > 30× genome coverage. The WGS sequences, most of which are short Illumina/Solexa reads, were generated from DNA from the same single Duroc sow that was the source of the BAC library from which clones were preferentially selected for sequencing. In accordance with the Bermuda and Fort Lauderdale agreements and the more recent Toronto Statement, the data have been released into public sequence repositories (Genbank/EMBL, NCBI/Ensembl trace repositories) in a timely manner and in advance of publication. Conclusions: In this marker paper, the Swine Genome Sequencing Consortium (SGSC) sets out its plans for analysis of the pig genome sequence and for the application and publication of the results.

Digital Repository @ Iowa State University (ISU)

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Copenhagen University Research Information System

Edinburgh Research Explorer

Wageningen University & Research Publications

ProdInra

GenomeGraphs: integrated genomic data visualization with R.

Author: Bullard James
Dudoit Sandrine
Durinck Steffen
Spellman Paul T
Publication venue: eScholarship, University of California
Publication date: 01/01/2009

Background: Biological studies involve a growing number of distinct high-throughput experiments to characterize samples of interest. There is a lack of methods to visualize these different genomic datasets in a versatile manner. In addition, genomic data analysis requires integrated visualization of experimental data along with constantly changing genomic annotation and statistical analyses. Results: We developed GenomeGraphs, as an add-on software package for the statistical programming environment R, to facilitate integrated visualization of genomic datasets. GenomeGraphs uses the biomaRt package to perform on-line annotation queries to Ensembl and translates these to gene/transcript structures in viewports of the grid graphics package. This allows genomic annotation to be plotted together with experimental data. GenomeGraphs can also be used to plot custom annotation tracks in combination with different experimental data types together in one plot using the same genomic coordinate system. Conclusion: GenomeGraphs is a flexible and extensible software package which can be used to visualize a multitude of genomic datasets within the statistical programming environment R.

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

Ensembl Genomes: an integrative resource for genome-scale data from non-vertebrate species

Author: Andrew Yates
Arnaud Kerhornou
Atwell
Cochrane
Daniel Lawson
Daniel M. Staines
Daniel S. T. Hughes
Derek Wilson
Eugene Kulesha
Ewan Birney
Flicek
Fujita
Gan
Gaudet
Gautier Koscielny
Harris
Helder Pedro
Iliana Toneva
Jay C. Humphrey
Jenkinson
Karine Megy
Kent
Kinsella
Lawson
Liti
Mabey
Mark D. McDowall
Michael Nuhn
Michael Paulini
Nicholas Langridge
Paten
Paul Derwent
Paul J. Kersey
Quevillon
Sherry
Smedley
Stephan Keenan
Swarbreck
The UniProt Consortium
Uma Maheswari
Vilella
Youens-Clark
Publication venue: Oxford University Press

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrative resource for genome-scale data from non-vertebrate species. The project exploits and extends technology (for genome annotation, analysis and dissemination) developed in the context of the (vertebrate-focused) Ensembl project and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. Since its launch in 2009, Ensembl Genomes has undergone rapid expansion, with the goal of providing coverage of all major experimental organisms, and additionally including taxonomic reference points to provide the evolutionary context in which genes can be understood. Against the backdrop of a continuing increase in genome sequencing activities in all parts of the tree of life, we seek to work, wherever possible, with the communities actively generating and using data, and are participants in a growing range of collaborations involved in the annotation and analysis of genomes.

Crossref

PubMed Central

A database of orthologous exons in primates for comparative analysis of RNA-seq data

Author: Ran Blekhman
Publication date: 31/03/2012

RNA-seq technology facilitates the study of gene expression at the level of individual exons and transcripts. Moreover, RNA-seq enables unbiased comparative analysis of expression levels across species. Such analyses typically start by mapping sequenced reads to the appropriate reference genome before comparing expression levels across species. However, this comparison requires prior knowledge of orthology at the exon level. With this in mind, I constructed a database of orthologous exons across three primate species (human, chimpanzee, and rhesus macaque). The database facilitates cross-species comparative analysis of exon- and transcript-level regulation. A web application that allows easy querying of the database is available at http://giladlab.uchicago.edu/orthoExon

Nature Precedings

Automated DNA Motif Discovery

Author: Graillet Olivia Sanchez
Harrison A. P.
Langdon W. B.
Publication date: 01/01/2010

Ensembl's human non-coding and protein coding genes are used to automatically find DNA pattern motifs. The Backus-Naur form (BNF) grammar for regular expressions (RE) is used by genetic programming to ensure the generated strings are legal. The evolved motif suggests that the presence of Thymine followed by one or more Adenines, etc., early in a transcript indicates a non-protein coding gene. Keywords: pseudogene, short and microRNAs, non-coding transcripts, systems biology, machine learning, Bioinformatics, motif, regular expression, strongly typed genetic programming, context-free grammar. Comment: 12 pages, 2 figures.
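
As a rough illustration of the kind of motif this abstract describes, the sketch below tests whether a thymine followed by one or more adenines occurs near the start of a transcript. The regular expression `TA+`, the function name `has_early_motif` and the 20-base window are assumptions made for this example; the motif actually evolved in the paper contains further elements (the abstract's "etc.") that are not reproduced here.

```python
import re

# Assumed, simplified motif: a thymine followed by one or more adenines.
# The motif evolved in the paper is longer; this is only its stated prefix.
MOTIF = re.compile(r"TA+")


def has_early_motif(transcript: str, window: int = 20) -> bool:
    """Return True if the motif lies entirely within the first `window` bases.

    `window` is a hypothetical cut-off for "early in the transcript".
    """
    return MOTIF.search(transcript.upper()[:window]) is not None


if __name__ == "__main__":
    print(has_early_motif("GGTAAACCGT"))   # motif present near the start
    print(has_early_motif("GGCCGGCCGG"))   # no T followed by A
```

A match is only a hint under these assumptions; it does not reproduce the paper's classifier.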

arXiv.org e-Print Archive

UCL Discovery

Publications at Bielefeld University

How and why DNA barcodes underestimate the diversity of microbial eukaryotes

Author: Adam Eyre-Walker
AR Boyko
AZ Worden
B Charlesworth
B Palenik
DT Jones
F Not
G Piganeau
Gwenael Piganeau
Hervé Moreau
J Coyne
J Crow
JJ Welch
K Romari
M Viprey
ML Cuvelier
Nigel Grimsley
P Flicek
P Lopez-Garcia
PD Keightley
Purification Lopez-Garcia
S Gourbiere
S Jancek
S Proost
SB Needleman
SJ Williamson
SL Baldauf
SY Moon-van der Staay
Z Yang
Publication venue: Public Library of Science (PLoS)
Publication date: 01/02/2011

Background: Because many picoplanktonic eukaryotic species cannot currently be maintained in culture, direct sequencing of PCR-amplified 18S ribosomal gene DNA fragments from filtered sea-water has been successfully used to investigate the astounding diversity of these organisms. The recognition of many novel planktonic organisms is thus based solely on their 18S rDNA sequence. However, a species delimited by its 18S rDNA sequence might contain many cryptic species, which are highly differentiated in their protein coding sequences. Principal Findings: Here, we investigate the issue of species identification from one gene to the whole genome sequence. Using 52 whole genome DNA sequences, we estimated the global genetic divergence in protein coding genes between organisms from different lineages and compared this to their ribosomal gene sequence divergences. We show that this relationship between proteome divergence and 18S divergence is lineage dependent. Unicellular lineages have especially low 18S divergences relative to their protein sequence divergences, suggesting that 18S ribosomal genes are too conservative to assess planktonic eukaryotic diversity. We provide an explanation for this lineage dependency, which suggests that most species with large effective population sizes will show far less divergence in 18S than protein coding sequences. Conclusions: There is therefore a trade-off between using genes that are easy to amplify in all species, but which by their nature are highly conserved and underestimate the true number of species, and using genes that give a better description of the number of species, but which are more difficult to amplify. We have shown that this trade-off differs between unicellular and multicellular organisms as a likely consequence of differences in effective population sizes. We anticipate that biodiversity of microbial eukaryotic species is underestimated and that numerous "cryptic species" will become discernible with the future acquisition of genomic and metagenomic sequences.
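
The comparison between 18S divergence and protein divergence can be illustrated with a simple uncorrected distance (the proportion of differing sites in a pairwise alignment). This is a minimal sketch, not the divergence estimator used in the paper, which the abstract does not specify; the function name `p_distance` and the toy sequences are invented for the example.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns containing a gap ('-') in either sequence are skipped.
    This is an uncorrected distance, chosen here only for simplicity.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


if __name__ == "__main__":
    # Toy, made-up alignments for two hypothetical organisms.
    d_18s = p_distance("ACGTACGTAC", "ACGTACGTAT")       # rDNA fragment
    d_protein = p_distance("MKTAYIAKQR", "MRTSYLAKQK")   # protein fragment
    print(f"18S divergence:     {d_18s:.2f}")
    print(f"protein divergence: {d_protein:.2f}")
```

In this toy case the protein fragments differ at more sites than the 18S fragments, which is the pattern the abstract reports for unicellular lineages.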

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Sussex Research Online