23 research outputs found

ENCODE whole-genome data in the UCSC genome browser (2011 update)

Author: Andy Pohl
Angie S. Hinrichs
Ann S. Zweig
Baroni
Bernard B. Suh
Birney
Brian J. Raney
Brooke Rhead
Celniker
Cricket A. Sloan
David Haussler
Donna Karolchik
Galt P. Barber
Greenbaum
Harrow
Hershey
Hesselberth
Hiram Clawson
Kan
Kate R. Rosenbloom
Katrina Learned
Kayla E. Smith
Kent
Khatun
King
Krishna M. Roskin
Kuhn
Laurence R. Meyer
Li
Melissa S. Cline
Pauline A. Fujita
Robert M. Kuhn
Rosenbloom
Timothy R. Dreszer
Vanessa Kirkup
Venkat S. Malladi
Via
W. James Kent
Weirauch
Publication venue: Oxford University Press
Publication date: 01/01/2010

The ENCODE project is an international consortium with a goal of cataloguing all the functional elements in the human genome. The ENCODE Data Coordination Center (DCC) at the University of California, Santa Cruz serves as the central repository for ENCODE data. In this role, the DCC offers a collection of high-throughput, genome-wide data generated with technologies such as ChIP-Seq, RNA-Seq, DNA digestion and others. This data helps illuminate transcription factor-binding sites, histone marks, chromatin accessibility, DNA methylation, RNA expression, RNA binding and other cell-state indicators. It includes sequences with quality scores, alignments, signals calculated from the alignments, and in most cases, element or peak calls calculated from the signal data. Each data set is available for visualization and download via the UCSC Genome Browser (http://genome.ucsc.edu/). ENCODE data can also be retrieved using a metadata system that captures the experimental parameters of each assay. The ENCODE web portal at UCSC (http://encodeproject.org/) provides information about the ENCODE data and links for access.

CiteSeerX

Crossref

PubMed Central

Comparative analysis of RNA sequencing methods for degraded or low-input samples

Author: A Roberts
Aaron M Berlin
AL Beyer
Alec Wysoker
Andreas Gnirke
Andrey Sivachenko
Aviv Regev
B Langmead
B Li
BE Maden
C Trapnell
D Aird
D Ramsköld
David S DeLuca
Dawn Anne Thompson
Diego Borges-Rivera
DS DeLuca
F Tang
G Giannoukos
H Aviv
H Li
H Yi
JD Morlan
Joshua Z Levin
JZ Levin
L Yang
M Griffin
MA Tariq
Michele A Busby
Nathalie Pochet
R Huang
R Rosenkranz
Rahul Satija
S Islam
Timothy Fennell
TR Dreszer
X Pan
Xian Adiconis
YH Yang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/02/2013

RNA-seq is an effective method for studying the transcriptome, but it can be difficult to apply to scarce or degraded RNA from fixed clinical samples, rare cell populations or cadavers. Recent studies have proposed several methods for RNA-seq of low-quality and/or low-quantity samples, but the relative merits of these methods have not been systematically analyzed. Here we compare five such methods using metrics relevant to transcriptome annotation, transcript discovery and gene expression. Using a single human RNA sample, we constructed and sequenced ten libraries with these methods and compared them against two control libraries. We found that the RNase H method performed best for chemically fragmented, low-quality RNA, and we confirmed this through analysis of actual degraded samples. RNase H can even effectively replace oligo(dT)-based methods for standard RNA-seq. SMART and NuGEN had distinct strengths for measuring low-quantity RNA. Our analysis allows biologists to select the most suitable methods and provides a benchmark for future method development.

Funding: National Institutes of Health (U.S.) (Pioneer Award DP1-OD003958-01); National Human Genome Research Institute (U.S.) (NHGRI 1P01HG005062-01); National Human Genome Research Institute (U.S.) (NHGRI Center of Excellence in Genome Science Award 1P50HG006193-01); Howard Hughes Medical Institute (Investigator); Merkin Family Foundation for Stem Cell Research; Broad Institute of MIT and Harvard (Klarman Cell Observatory); National Human Genome Research Institute (U.S.) (NHGRI grant HG03067); Fonds voor Wetenschappelijk Onderzoek--Vlaandere

DSpace@MIT

Crossref

PubMed Central

Biased clustered substitutions in the human genome: The footprints of male-driven biased gene conversion

Author: Dreszer Timothy R.
Haussler David
Pollard Katherine S.
Wall Gregory D.
Publication venue: Cold Spring Harbor Laboratory Press

We examined fixed substitutions in the human lineage since divergence from the common ancestor with the chimpanzee, and determined what fraction are AT to GC (weak-to-strong). Substitutions that are densely clustered on the chromosomes show a remarkable excess of weak-to-strong “biased” substitutions. These unexpected biased clustered substitutions (UBCS) are common near the telomeres of all autosomes but not the sex chromosomes. Regions of extreme bias are enriched for genes. Human and chimp orthologous regions show a striking similarity in the shape and magnitude of their respective UBCS maps, suggesting a relatively stable force leads to clustered bias. The strong and stable signal near telomeres may have participated in the evolution of isochores. One exception to the UBCS pattern found in all autosomes is chromosome 2, which shows a UBCS peak midchromosome, mapping to the fusion site of two ancestral chromosomes. This provides evidence that the fusion occurred as recently as 740,000 years ago and no more than ∼3 million years ago. No biased clustering was found in SNPs, suggesting that clusters of biased substitutions are selected from mutations. UBCS is strongly correlated with male (and not female) recombination rates, which explains the lack of UBCS signal on chromosome X. These observations support the hypothesis that biased gene conversion (BGC), specifically in the male germline, played a significant role in the evolution of the human genome.

Crossref

PubMed Central

Principles of metadata organization at the ENCODE data coordination center

Author: Aditi K. Narayanan
Benjamin C. Hitz
Brian T. Lee
Brinkman
Cricket A. Sloan
Esther T. Chan
Eurie L. Hong
Forrest Tanaka
Greg R. Roe
Idan Gabdank
J. Michael Cherry
J. Seth Strattan
Jason A. Hilton
Jean M. Davidson
Laurence D. Rowe
Marcus Ho
Nikhil R. Podduturi
Sloan
Timothy R. Dreszer
Venkat S. Malladi
Washington
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2016

Crossref

ENCODE data at the ENCODE portal

Author: Aditi K. Narayanan
Benjamin C. Hitz
Brian T. Lee
Cricket A. Sloan
Esther T. Chan
Eurie L. Hong
Forrest Tanaka
Greg Roe
Idan Gabdank
J. Michael Cherry
J. Seth Strattan
Jean M. Davidson
Laurence D. Rowe
Marcus Ho
Nikhil R. Podduturi
Timothy R. Dreszer
Venkat S. Malladi
Publication venue: Oxford University Press (OUP)

Crossref

SnoVault and encodeD: A novel object-based storage system and applications to ENCODE metadata.

Author: Aditi K Narayanan
Benjamin C Hitz
Brian T Lee
Cricket A Sloan
David I Glick
Esther T Chan
Eurie L Hong
Forrest Y Tanaka
Idan Gabdank
J Michael Cherry
J Seth Strattan
Jason Hilton
Jean M Davidson
Kathrina C Onate
Laurence D Rowe
Marcus C Ho
Nikhil R Podduturi
Stuart R Miyasato
Timothy R Dreszer
Ulugbek K Baymuradov
Venkat S Malladi
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2017

The Encyclopedia of DNA Elements (ENCODE) project is an ongoing collaborative effort to create a comprehensive catalog of functional elements initiated shortly after the completion of the Human Genome Project. The current database exceeds 6500 experiments across more than 450 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory and transcriptional landscape of the H. sapiens and M. musculus genomes. All ENCODE experimental data, metadata, and associated computational analyses are submitted to the ENCODE Data Coordination Center (DCC) for validation, tracking, storage, unified processing, and distribution to community resources and the scientific community. As the volume of data increases, the identification and organization of experimental details become increasingly intricate and demand careful curation. The ENCODE DCC has created a general purpose software system, known as SnoVault, that supports metadata and file submission, a database used for metadata storage, web pages for displaying the metadata and a robust API for querying the metadata. The software is fully open-source; code and installation instructions can be found at http://github.com/ENCODE-DCC/snovault/ (for the generic database) and http://github.com/ENCODE-DCC/encoded/ (to store genomic data in the manner of ENCODE). The core database engine, SnoVault (which is completely independent of ENCODE, genomic data, or bioinformatic data), has been released as a separate Python package.

Directory of Open Access Journals

PubMed Central

The UCSC Genome Browser database: 2014 update.

Author: Barber Galt P
Casper Jonathan
Clawson Hiram
Cline Melissa S
Diekhans Mark
Dreszer Timothy R
Fujita Pauline A
Guruvadoo Luvina
Haeussler Maximilian
Harte Rachel A
Haussler David
Heitner Steve
Hinrichs Angie S
Karolchik Donna
Kent W James
Kuhn Robert M
Learned Katrina
Lee Brian T
Li Chin H
Raney Brian J
Rhead Brooke
Rosenbloom Kate R
Sloan Cricket A
Speir Matthew L
Zweig Ann S
Publication venue: eScholarship, University of California
Publication date: 01/01/2014

The University of California Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu) offers online public access to a growing database of genomic sequence and annotations for a large collection of organisms, primarily vertebrates, with an emphasis on the human and mouse genomes. The Browser's web-based tools provide an integrated environment for visualizing, comparing, analysing and sharing both publicly available and user-generated genomic data sets. As of September 2013, the database contained genomic sequence and a basic set of annotation 'tracks' for ∼90 organisms. Significant new annotations include a 60-species multiple alignment conservation track on the mouse, updated UCSC Genes tracks for human and mouse, and several new sets of variation and ENCODE data. New software tools include a Variant Annotation Integrator that returns predicted functional effects of a set of variants uploaded as a custom track, an extension to UCSC Genes that displays haplotype alleles for protein-coding genes and an expansion of data hubs that includes the capability to display remotely hosted user-provided assembly sequence in addition to annotation data. To improve European access, we have added a Genome Browser mirror (http://genome-euro.ucsc.edu) hosted at Bielefeld University in Germany.

eScholarship - University of California

The UCSC Genome Browser database: 2014 update.

Author: Armstrong Joel
Barber Galt P.
Casper Jonathan
Clawson Hiram
Diekhans Mark
Dreszer Timothy R.
Fujita Pauline A.
Guruvadoo Luvina
Haeussler Maximilian
Harte Rachel A.
Haussler David
Heitner Steve
Hickey Glenn
Hinrichs Angie S.
Hubley Robert
Karolchik Donna
Kent W. James
Kuhn Robert M.
Learned Katrina
Lee Brian T.
Li Chin H.
Miga Karen H.
Nguyen Ngan
Paten Benedict
Raney Brian J.
Rosenbloom Kate R.
Smit Arian F. A.
Speir Matthew L.
Zweig Ann S.
Publication venue: eScholarship, University of California
Publication date: 21/11/2013

CiteSeerX

PubMed Central

eScholarship - University of California