117 research outputs found

ENCODE whole-genome data in the UCSC Genome Browser

Author: A. Pohl
A. S. Hinrichs
A. S. Zweig
B. J. Raney
B. Rhead
Celniker
D. Haussler
D. Karolchik
G. P. Barber
K. E. Smith
K. Learned
K. R. Rosenbloom
L. R. Meyer
M. Pheasant
P. A. Fujita
R. M. Kuhn
T. R. Dreszer
T. Wang
The ENCODE Project Consortium
W. J. Kent
Weinstock
Publication venue: Oxford University Press
Publication date: 01/01/2010

The Encyclopedia of DNA Elements (ENCODE) project is an international consortium of investigators funded to analyze the human genome with the goal of producing a comprehensive catalog of functional elements. The ENCODE Data Coordination Center at The University of California, Santa Cruz (UCSC) is the primary repository for experimental results generated by ENCODE investigators. These results are captured in the UCSC Genome Bioinformatics database and download server for visualization and data mining via the UCSC Genome Browser and companion tools (Rhead et al. The UCSC Genome Browser Database: update 2010, in this issue). The ENCODE web portal at UCSC (http://encodeproject.org or http://genome.ucsc.edu/ENCODE) provides information about the ENCODE data and convenient links for access

Crossref

PubMed Central

University of Queensland eSpace

Forces Shaping the Fastest Evolving Regions in the Human Genome

Author: Adam Siepel
Andrew D Kern
Bryan King
Chimpanzee Sequencing and Analysis Consortium
David Haussler
ENCODE Project Consortium
Gene Ontology Consortium
Gill Bejerano
International HapMap Consortium
Jakob S Pedersen
Jim Kent
Kate R Rosenbloom
Katherine S Pollard
Molly Przeworski
Rat Genome Sequencing Project
Robert Baertsch
Sofie R Salama
Sol Katzman
Tim Dreszer
Publication venue: Public Library of Science
Publication date: 01/01/2005

Comparative genomics allows us to search the human genome for segments that were extensively changed in the last ~5 million years since divergence from our common ancestor with chimpanzee, but are highly conserved in other species and thus are likely to be functional. We found 202 genomic elements that are highly conserved in vertebrates but show evidence of significantly accelerated substitution rates in human. These are mostly in non-coding DNA, often near genes associated with transcription and DNA binding. Resequencing confirmed that the five most accelerated elements are dramatically changed in human but not in other primates, with seven times more substitutions in human than in chimp. The accelerated elements, and in particular the top five, show a strong bias for adenine and thymine to guanine and cytosine nucleotide changes and are disproportionately located in high recombination and high guanine and cytosine content environments near telomeres, suggesting either biased gene conversion or isochore selection. In addition, there is some evidence of directional selection in the regions containing the two most accelerated regions. A combination of evolutionary forces has contributed to accelerated evolution of the fastest evolving elements in the human genome.
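The adenine/thymine to guanine/cytosine bias mentioned in this abstract can be illustrated with a small counting routine. The following is only a minimal sketch, not the authors' pipeline: the function name `classify_substitutions`, the two-sequence input format, and the toy sequences are all assumptions made for illustration. It compares an inferred ancestral sequence with a derived one, position by position, and tallies weak-to-strong (A/T to G/C) and strong-to-weak (G/C to A/T) changes.

```python
# Hypothetical helper for illustration only; not taken from the paper.
WEAK = {"A", "T"}      # bases forming two hydrogen bonds
STRONG = {"G", "C"}    # bases forming three hydrogen bonds


def classify_substitutions(ancestral: str, derived: str) -> dict:
    """Count W>S, S>W and other substitutions between two aligned sequences."""
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"W>S": 0, "S>W": 0, "other": 0}
    valid = WEAK | STRONG
    for a, d in zip(ancestral.upper(), derived.upper()):
        # Skip unchanged positions, gaps and ambiguity codes.
        if a == d or a not in valid or d not in valid:
            continue
        if a in WEAK and d in STRONG:
            counts["W>S"] += 1
        elif a in STRONG and d in WEAK:
            counts["S>W"] += 1
        else:
            counts["other"] += 1  # W>W or S>S changes
    return counts


if __name__ == "__main__":
    # Toy sequences, invented for this example.
    print(classify_substitutions("AATTGGCC", "GCTAGGCT"))
```

In practice the ancestral state would have to be inferred from an outgroup or a phylogenetic model; the sketch assumes it is already given.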

Public Library of Science (PLOS)

Crossref

Cold Spring Harbor Laboratory Institutional Repository

Directory of Open Access Journals

PubMed Central

Copenhagen University Research Information System

eScholarship - University of California

The UCSC Genome Browser database: update 2010

Author: A. Pohl
A. S. Hinrichs
A. S. Zweig
Austin
B. Giardine
B. J. Raney
B. Rhead
Berman
Blanchette
D. Haussler
D. Karolchik
F. Hsu
Feuk
G. P. Barber
H. Clawson
Hsu
Iafrate
J. Hillman-Jackson
Jain
K. E. Smith
K. Learned
K. R. Rosenbloom
Kaiser
Karolchik
Kent
L. R. Meyer
M. Diekhans
M. Pheasant
Nord
P. A. Fujita
Pettersen
R. A. Harte
R. M. Kuhn
Sherry
T. R. Dreszer
The ENCODE Project Consortium
The MGC Project Team
W. J. Kent
Publication venue: Oxford University Press
Publication date: 01/01/2010

The University of California, Santa Cruz (UCSC) Genome Browser website (http://genome.ucsc.edu/) provides a large database of publicly available sequence and annotation data along with an integrated tool set for examining and comparing the genomes of organisms, aligning sequence to genomes, and displaying and sharing users’ own annotation data. As of September 2009, genomic sequence and a basic set of annotation ‘tracks’ are provided for 47 organisms, including 14 mammals, 10 non-mammal vertebrates, 3 invertebrate deuterostomes, 13 insects, 6 worms and a yeast. New data highlights this year include an updated human genome browser, a 44-species multiple sequence alignment track, improved variation and phenotype tracks and 16 new genome-wide ENCODE tracks. New features include drag-and-zoom navigation, a Wiki track for user-added annotations, new custom track formats for large datasets (bigBed and bigWig), a new multiple alignment output tool, links to variation and protein structure tools, in silico PCR utility enhancements, and improved track configuration tools

CiteSeerX

Crossref

PubMed Central

University of Queensland eSpace

ENCODE whole-genome data in the UCSC genome browser (2011 update)

Author: Andy Pohl
Angie S. Hinrichs
Ann S. Zweig
Baroni
Bernard B. Suh
Birney
Brian J. Raney
Brooke Rhead
Celniker
Cricket A. Sloan
David Haussler
Donna Karolchik
Galt P. Barber
Greenbaum
Harrow
Hershey
Hesselberth
Hiram Clawson
Kan
Kate R. Rosenbloom
Katrina Learned
Kayla E. Smith
Kent
Khatun
King
Krishna M. Roskin
Kuhn
Laurence R. Meyer
Li
Melissa S. Cline
Pauline A. Fujita
Robert M. Kuhn
Rosenbloom
Timothy R. Dreszer
Vanessa Kirkup
Venkat S. Malladi
Via
W. James Kent
Weirauch
Publication venue: Oxford University Press
Publication date: 01/01/2010

The ENCODE project is an international consortium with a goal of cataloguing all the functional elements in the human genome. The ENCODE Data Coordination Center (DCC) at the University of California, Santa Cruz serves as the central repository for ENCODE data. In this role, the DCC offers a collection of high-throughput, genome-wide data generated with technologies such as ChIP-Seq, RNA-Seq, DNA digestion and others. This data helps illuminate transcription factor-binding sites, histone marks, chromatin accessibility, DNA methylation, RNA expression, RNA binding and other cell-state indicators. It includes sequences with quality scores, alignments, signals calculated from the alignments, and in most cases, element or peak calls calculated from the signal data. Each data set is available for visualization and download via the UCSC Genome Browser (http://genome.ucsc.edu/). ENCODE data can also be retrieved using a metadata system that captures the experimental parameters of each assay. The ENCODE web portal at UCSC (http://encodeproject.org/) provides information about the ENCODE data and links for access

CiteSeerX

Crossref

PubMed Central

Comparative analysis of RNA sequencing methods for degraded or low-input samples

Author: A Roberts
Aaron M Berlin
AL Beyer
Alec Wysoker
Andreas Gnirke
Andrey Sivachenko
Aviv Regev
B Langmead
B Li
BE Maden
C Trapnell
D Aird
D Ramsköld
David S DeLuca
Dawn Anne Thompson
Diego Borges-Rivera
DS DeLuca
F Tang
G Giannoukos
H Aviv
H Li
H Yi
JD Morlan
Joshua Z Levin
JZ Levin
L Yang
M Griffin
MA Tariq
Michele A Busby
Nathalie Pochet
R Huang
R Rosenkranz
Rahul Satija
S Islam
Timothy Fennell
TR Dreszer
X Pan
Xian Adiconis
YH Yang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/02/2013

RNA-seq is an effective method for studying the transcriptome, but it can be difficult to apply to scarce or degraded RNA from fixed clinical samples, rare cell populations or cadavers. Recent studies have proposed several methods for RNA-seq of low-quality and/or low-quantity samples, but the relative merits of these methods have not been systematically analyzed. Here we compare five such methods using metrics relevant to transcriptome annotation, transcript discovery and gene expression. Using a single human RNA sample, we constructed and sequenced ten libraries with these methods and compared them against two control libraries. We found that the RNase H method performed best for chemically fragmented, low-quality RNA, and we confirmed this through analysis of actual degraded samples. RNase H can even effectively replace oligo(dT)-based methods for standard RNA-seq. SMART and NuGEN had distinct strengths for measuring low-quantity RNA. Our analysis allows biologists to select the most suitable methods and provides a benchmark for future method development.

Available in PMC 2014 January 01.

Funding: National Institutes of Health (U.S.) (Pioneer Award DP1-OD003958-01); National Human Genome Research Institute (U.S.) (NHGRI 1P01HG005062-01); National Human Genome Research Institute (U.S.) (NHGRI Center of Excellence in Genome Science Award 1P50HG006193-01); Howard Hughes Medical Institute (Investigator); Merkin Family Foundation for Stem Cell Research; Broad Institute of MIT and Harvard (Klarman Cell Observatory); National Human Genome Research Institute (U.S.) (NHGRI grant HG03067); Fonds voor Wetenschappelijk Onderzoek--Vlaandere

DSpace@MIT

Crossref

PubMed Central

The UCSC Genome Browser database: extensions and updates 2011

Author: A. Pohl
A. S. Hinrichs
A. S. Zweig
Alfoldi
Altschul
Archibald
B. J. Raney
B. M. Giardine
B. Rhead
Bernstein
Blanchette
C. A. Sloan
C. H. Li
Cherry
D. Haussler
D. Karolchik
Durbin
F. Hsu
Firth
G. P. Barber
G. Roe
Gross
H. Clawson
Hubbard
K. Learned
K. R. Rosenbloom
L. Guruvadoo
L. R. Meyer
M. Diekhans
M. Goldman
M. S. Cline
M. Wong
McPherson
P. A. Fujita
R. A. Harte
R. M. Kuhn
Sherry
T. R. Dreszer
V. Kirkup
V. S. Malladi
W. James Kent
Zimin
Publication venue: Oxford University Press

The University of California Santa Cruz Genome Browser (http://genome.ucsc.edu) offers online public access to a growing database of genomic sequence and annotations for a wide variety of organisms. The Browser is an integrated tool set for visualizing, comparing, analyzing and sharing both publicly available and user-generated genomic data sets. In the past year, the local database has been updated with four new species assemblies, and we anticipate another four will be released by the end of 2011. Further, a large number of annotation tracks have been either added, updated by contributors, or remapped to the latest human reference genome. Among these are new phenotype and disease annotations, UCSC genes, and a major dbSNP update, which required new visualization methods. Growing beyond the local database, this year we have introduced ‘track data hubs’, which allow the Genome Browser to provide access to remotely located sets of annotations. This feature is designed to significantly extend the number and variety of annotation tracks that are publicly available for visualization and analysis from within our site. We have also introduced several usability features including track search and a context-sensitive menu of options available with a right-click anywhere on the Browser's image

Crossref

PubMed Central

The UCSC Genome Browser Database: update 2009

Author: A. Pohl
A. S. Hinrichs
A. S. Zweig
B. Giardine
B. J. Raney
B. Rhead
Bellen
Blanchette
D. Haussler
D. Karolchik
F. Hsu
G. P. Barber
H. Clawson
Hinrichs
Hsu
Iafrate
K. E. Smith
K. R. Rosenbloom
Karolchik
Kent
L. Meyer
M. Diekhans
M. Pheasant
Mattes
Nord
P. Fujita
R. A. Harte
R. M. Kuhn
Sherry
T. Dreszer
T. Wang
The ENCODE Project Consortium
The MGC Project Team
W. J. Kent
Yang
Zhu
Publication venue: Oxford University Press

The UCSC Genome Browser Database (GBD, http://genome.ucsc.edu) is a publicly available collection of genome assembly sequence data and integrated annotations for a large number of organisms, including extensive comparative-genomic resources. In the past year, 13 new genome assemblies have been added, including two important primate species, orangutan and marmoset, bringing the total to 46 assemblies for 24 different vertebrates and 39 assemblies for 22 different invertebrate animals. The GBD datasets may be viewed graphically with the UCSC Genome Browser, which uses a coordinate-based display system allowing users to juxtapose a wide variety of data. These data include all mRNAs from GenBank mapped to all organisms, RefSeq alignments, gene predictions, regulatory elements, gene expression data, repeats, SNPs and other variation data, as well as pairwise and multiple-genome alignments. A variety of other bioinformatics tools are also provided, including BLAT, the Table Browser, the Gene Sorter, the Proteome Browser, VisiGene and Genome Graphs

CiteSeerX

Crossref

PubMed Central

Substitution Patterns Are GC-Biased in Divergent Sequences across the Metazoans

Author: Berglund
Birdsell
Blanchette
Charlesworth
Clément
Coop
Cox
Dreszer
Duret
Eyre-Walker
Fiston-Lavier
Fullerton
Galtier
Glémin
Groenen
Harrison
Hernandez
Hubisz
Hunter
Hurst
International Chicken Genome Sequencing Consortium
Jackson Laboratories
John A. Capra
Jones
Karolchik
Katherine S. Pollard
Katzman
Kent
Kong
Kuraku
Lynch
Mancera
Marais
Meunier
Oliver
Pollard
Prabhakar
R Development Core Team
Ratnakumar
Romiguier
Sherry
Shifman
Siepel
Smit
The International Hapmap Consortium
Tsai
Webster
Publication venue: Oxford University Press
Publication date: 01/01/2011

The fastest-evolving regions in the human and chimpanzee genomes show a remarkable excess of weak (A,T) to strong (G,C) nucleotide substitutions since divergence from their common ancestor. We investigated the phylogenetic extent and possible causes of this weak to strong (W→S) bias in divergent sequences (BDS) using recently sequenced genomes and recombination maps from eight trios of eukaryotic species. To quantify evidence for BDS, we inferred substitution histories using an efficient maximum likelihood approach with a context-dependent evolutionary model. We then annotated all lineage-specific substitutions in terms of W→S bias and density on the chromosomes. Finally, we used the inferred substitutions to calculate a BDS score—a log odds ratio between substitution type and density—and assessed its statistical significance with Fisher's exact test. Applying this approach, we found significant BDS in the coding and noncoding sequence of human, mouse, dog, stickleback, fruit fly, and worm. We also observed a significant lack of W→S BDS in chicken and yeast. The BDS score varies between species and across the chromosomes within each species. It is most strongly correlated with different genomic features in different species, but a strong correlation with recombination rates is found in several species. Our results demonstrate that a W→S substitution bias in fast-evolving sequences is a widespread phenomenon. The patterns of BDS observed suggest that a recombination-associated process, such as GC-biased gene conversion, is involved in the production of the bias in many species, but the strength of the BDS likely depends on many factors, including genome stability, variability in recombination rate over time and across the genome, the frequency of meiosis, and the amount of outcrossing in each species
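The abstract describes the BDS score as a log odds ratio between substitution type and density, tested with Fisher's exact test. A minimal sketch of that kind of calculation follows, under stated assumptions: the 2x2 layout (W→S versus other substitutions, in dense versus sparse regions), the function names, and the example counts are all hypothetical, and the paper's actual definition of density is not given here. Only the Python standard library is used.

```python
from math import comb, log

# Assumed 2x2 table (hypothetical layout, not taken from the paper):
#                     W>S   other
#   dense regions      a      b
#   sparse regions     c      d


def log_odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Natural log of the odds ratio (a*d)/(b*c); requires non-zero cells."""
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for a plain log odds ratio")
    return log((a * d) / (b * c))


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for a 2x2 table."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


if __name__ == "__main__":
    # Toy counts, invented for this example.
    print(log_odds_ratio(8, 2, 1, 5))
    print(fisher_exact_two_sided(8, 2, 1, 5))
```

A positive log odds ratio in this layout would indicate that W→S substitutions are over-represented in dense regions; the small relative tolerance in the p-value sum guards against floating-point ties.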

Crossref

PubMed Central

eScholarship - University of California

The Tetraodon nigroviridis reference transcriptome: Developmental transition, length retention and microsynteny of long non-coding RNAs in a compact vertebrate genome

Author: A Kapusta
A Necsulea
A Pauli
A Stabenau
AJ Vilella
AR Quinlan
B Maher
C Nepal
C Trapnell
C Weaver
CA Watson
CM Smith
D Kim
DR Kelley
F Pelegri
G St. Laurent
GT Williams
H Aanes
H Hezroni
H Roest Crollius
H Tilgner
I Ulitsky
J Harrow
J Kim
J Ponjavic
J Ruiz-Orera
J-W Nam
JB Brown
M Blanchette
M Chorev
M Lohse
MD Robinson
MN Cabili
NT Ingolia
O Jaillon
P Flicek
P Heyn
P Miura
R Arrial
RC Gentleman
S Aparicio
S Basu
S Brenner
S Durinck
S Mathavan
SA Harvey
SS Paranjpe
T Derrien
T Kino
TR Dreszer
V Haberle
W Tadros
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

Pufferfish such as fugu and tetraodon carry the smallest genomes among all vertebrates and are ideal for studying genome evolution. However, comparative genomics using these species is hindered by the poor annotation of their genomes. We performed RNA sequencing during key stages of the maternal to zygotic transition of Tetraodon nigroviridis and report its first developmental transcriptome. We assembled 61,033 transcripts (23,837 loci) representing 80% of the annotated gene models and 3816 novel coding transcripts from 2667 loci. We demonstrate the similarities of gene expression profiles between pufferfish and zebrafish during the maternal to zygotic transition and annotated 1120 long non-coding RNAs (lncRNAs), many of which are differentially expressed during development. The promoters for 60% of the assembled transcripts were validated by CAGE-seq. Despite the extreme compaction of the tetraodon genome and the dramatic loss of transposons, the length of lncRNA exons remains comparable to that of other vertebrates, and a small set of lncRNAs appears enriched for transposable elements, suggesting a selective pressure acting on lncRNA length and composition. Finally, a set of lncRNAs are microsyntenic between teleost and vertebrates, which indicates potential regulatory interactions between lncRNAs and their flanking coding genes. Our work provides a fundamental molecular resource for vertebrate comparative genomics and embryogenesis studies.

Crossref

KITopen

University of Birmingham Research Portal

PubMed Central

Sissa Digital Library