203 research outputs found

DBTSS: DataBase of Transcriptional Start Sites progress report in 2012

Author: Barski
Ernst
K. Nakai
Mardis
Mills
R. Yamashita
S. Sugano
Sudmant
Suzuki
The FANTOM Consortium
Y. Suzuki
Publication venue: Oxford University Press

To support transcriptional regulation studies, we have constructed DBTSS (DataBase of Transcriptional Start Sites), which contains exact positions of transcriptional start sites (TSSs), determined with our own technique named TSS-seq, in the genomes of various species. In its latest version, DBTSS covers the data of the majority of human adult and embryonic tissues: it now contains 418 million TSS tag sequences from 28 tissues/cell cultures. Moreover, we integrated a series of our own transcriptomic data, such as the RNA-seq data of subcellular-fractionated RNAs as well as the ChIP-seq data of histone modifications and the binding of RNA polymerase II/several transcription factors in cultured cell lines into our original TSS information. We also included several external epigenomic data, such as the chromatin map of the ENCODE project. We further associated our TSS information with public or original single-nucleotide variation (SNV) data, in order to identify SNVs in the regulatory regions. These data can be browsed in our new viewer, which supports versatile search conditions of users. We believe that our new DBTSS will be an invaluable resource for interpreting the differential uses of TSSs and for identifying human genetic variations that are associated with disordered transcriptional regulation. DBTSS can be accessed at http://dbtss.hgc.jp

Crossref

PubMed Central

DBTSS: database of transcription start sites, progress report 2008

Author: Bentley
H. Wakaguri
K. Nakai
Lamb
Matys
Prabhakar
R. Yamashita
S. Sugano
Suzuki
The FANTOM Consortium
Y. Suzuki
Yamashita
Publication venue: Oxford University Press
Publication date: 01/01/2007

DBTSS is a database of transcriptional start sites, based on our unique collection of precise, experimentally determined 5′-end sequences of full-length cDNAs. Since its first release in 2002, several major updates have been made. In this update, we expanded the human transcriptional start site dataset by 19 million uniquely mapped, and RefSeq-associated, 5′-end sequences, which were generated by a newly introduced Solexa sequencer. Moreover, in order to provide means for interpreting those massive TSS data, we implemented two new analytical tools: one for connecting expression information with predicted transcription factor binding sites; the other for examining evolutionary conservation or species-specificity of promoters and transcripts, which can be browsed by our own comparative genome viewer. With the expanded dataset and the enhanced functionalities, DBTSS provides a unique platform that enables in-depth transcriptome analyses. DBTSS is accessible at http://dbtss.hgc.jp/

CiteSeerX

Crossref

PubMed Central

Conserved temporal ordering of promoter activation implicates common mechanisms governing the immediate early response across cell types and stimuli

Author: Aitken James
Arner Erik
Carninci Piero
Daub Carsten
FANTOM consortium The
Forrest Alistair R. R.
Hayashizaki Yosihide
Itoh Masayoshi
Kawaji Hideya
Lassmann Timo
Semple Colin
Vacca Annalaura
Publication venue: The Royal Society
Publication date: 16/07/2018

Conserved temporal precedence between IEGs (light blue nodes) and other protein-coding genes (green nodes) is shown by directed edges. Genes annotated with the GO term 'response to endoplasmic reticulum stress' (GO:003497) have a red rectangle around the gene name; red squares indicate genes with CAGE clusters enriched for XBP1 transcription factor binding sites

Crossref

Edinburgh Research Explorer

FigShare

NONCODE v2.0: decoding the non-coding

Author: Aravin
B. Bai
Benson
C. Liu
G. Skogerbo
Girard
H. Zhao
Huang
J. Wang
Mattick
R. Chen
Rivas
S. He
T. Liu
The FANTOM Consortium
Y. Zhao
Zemann
Publication venue: Oxford University Press

The NONCODE database is an integrated knowledge database designed for the analysis of non-coding RNAs (ncRNAs). Since NONCODE was first released 3 years ago, the number of known ncRNAs has grown rapidly, and there is growing recognition that ncRNAs play important regulatory roles in most organisms. In the updated version of NONCODE (NONCODE v2.0), the number of collected ncRNAs has reached 206 226, including a wide range of microRNAs, Piwi-interacting RNAs and mRNA-like ncRNAs. The improvements brought to the database include not only new and updated ncRNA data sets, but also an incorporation of BLAST alignment search service and access through our custom UCSC Genome Browser. NONCODE can be found under http://www.noncode.org or http://noncode.bioinfo.org.cn

Crossref

PubMed Central

Penalized likelihood for sparse contingency tables with an application to full-length cDNA libraries

Author: A Mironov
BS Everitt
C Southan
Corinne Dahinden
D Brett
F Liang
Giovanni Parmigiani
International Human Genome Sequencing Consortium
M Yuan
M Zavolan
Mark C Emerick
MR Regan
Peter Bühlmann
R Christensen
R Tibshirani
S Rosset
SL Lauritzen
T Imanishi
The FANTOM Consortium
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: The joint analysis of several categorical variables is a common task in many areas of biology, and is becoming central to systems biology investigations whose goal is to identify potentially complex interactions among variables belonging to a network. Interactions of arbitrary complexity are traditionally modeled in statistics by log-linear models. It is challenging to extend these to the high-dimensional and potentially sparse data arising in computational biology. An important example, which provides the motivation for this article, is the analysis of so-called full-length cDNA libraries of alternatively spliced genes, where we investigate relationships among the presence of various exons in transcript species. Results: We develop methods to perform model selection and parameter estimation in log-linear models for the analysis of sparse contingency tables, to study the interaction of two or more factors. Maximum likelihood estimation of log-linear model coefficients might not be appropriate because of the presence of zeros in the table's cells, and new methods are required. We propose a computationally efficient ℓ1-penalization approach extending the Lasso algorithm to this context, and compare it to other procedures in a simulation study. We then illustrate these algorithms on contingency tables arising from full-length cDNA libraries. Conclusion: We propose regularization methods that can be used successfully to detect complex interaction patterns among categorical variables in a broad range of biological problems.

Repository for Publications and Research Data

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Comprehensive characterisation of transcriptional activity during influenza A virus infection reveals biases in cap-snatching of host RNA sequences.

Author: Baillie Kenneth
Bertin Nicolas
Carninci Piero
Clohisey Sara
Digard Paul
FANTOM consortium The
Forrest Alistair A.
Hayashizaki Yoshihide
Hendry Ross W.
Hume David
Parkinson Nicholas
Summers Kim M
Tomoiu Andru
Wang Bo
Wise Helen
Publication venue: American Society for Microbiology
Publication date: 11/03/2020

Macrophages in the lung detect and respond to influenza A virus (IAV), determining the nature of the immune response. Using terminal-depth cap analysis of gene expression (CAGE), we quantified transcriptional activity of both host and pathogen over a 24-h time course of IAV infection in primary human monocyte-derived macrophages (MDMs). This method allowed us to observe heterogeneous host sequences incorporated into IAV mRNA, "snatched" 5' RNA caps, and corresponding RNA sequences from host RNAs. In order to determine whether cap-snatching is random or exhibits a bias, we systematically compared host sequences incorporated into viral mRNA ("snatched") against a complete survey of all background host RNA in the same cells, at the same time. Using a computational strategy designed to eliminate sources of bias due to read length, sequencing depth, and multimapping, we were able to quantify overrepresentation of host RNA features among the sequences that were snatched by IAV. We demonstrate biased snatching of numerous host RNAs, particularly small nuclear RNAs (snRNAs), and avoidance of host transcripts encoding host ribosomal proteins, which are required by IAV for replication. We then used a systems approach to describe the transcriptional landscape of the host response to IAV, observing many new features, including a failure of IAV-treated MDMs to induce feedback inhibitors of inflammation, seen in response to other treatments. Importance: Infection with influenza A virus (IAV) is responsible for an estimated 500,000 deaths and up to 5 million cases of severe respiratory illness each year. In this study, we looked at human primary immune cells (macrophages) infected with IAV. Our method allows us to look at both the host and the virus in parallel. We used these data to explore a process known as "cap-snatching," where IAV snatches a short nucleotide sequence from capped host RNA. This process was believed to be random. We demonstrate biased snatching of numerous host RNAs, including those associated with snRNA transcription, and avoidance of host transcripts encoding host ribosomal proteins, which are required by IAV for replication. We then describe the transcriptional landscape of the host response to IAV, observing new features, including a failure of IAV-treated MDMs to induce feedback inhibitors of inflammation, seen in response to other treatments.

Crossref

Edinburgh Research Explorer

University of Queensland eSpace

The Functional RNA Database 3.0: databases to support mining and annotation of functional RNAs

Author: A. Yoshizawa
Altschul
Czech
E. Hattori
G. Terai
Griffiths-Jones
H. Okida
Inagaki
K. Asai
K. Yamada
Kawamura
Landgraf
Lestrade
Okamura
Sasaki
T. Komori
T. Mituyama
The FANTOM Consortium
Y. Ono
Publication venue: Oxford University Press

We developed a pair of databases that support two important tasks: annotation of anonymous RNA transcripts and discovery of novel non-coding RNAs. The database combo is called the Functional RNA Database and consists of two databases: a rewrite of the original version of the Functional RNA Database (fRNAdb) and the latest version of the UCSC GenomeBrowser for Functional RNA. The former is a sequence database equipped with a powerful search function and hosts a large collection of known/predicted non-coding RNA sequences acquired from existing databases as well as novel/predicted sequences reported by researchers of the Functional RNA Project. The latter is a UCSC Genome Browser mirror with large additional custom tracks specifically associated with non-coding elements. It also includes several functional enhancements such as a presentation of a common secondary structure prediction at any given genomic window ⩽500 bp. Our GenomeBrowser supports user authentication and user-specific tracks. The current version of the fRNAdb is a complete rewrite of the former version, hosting a larger number of sequences and with a much friendlier interface. The current version of UCSC GenomeBrowser for Functional RNA features a larger number of tracks and richer features than the former version. The databases are available at http://www.ncrna.org/

Crossref

PubMed Central

The UniTrap resource: tools for the biologist enabling optimized use of gene trap clones

Author: Altschul
Austin
Benson
Burge
Cobellis
Curwen
E. Stupka
G. Cobellis
G. Lago
G. Roma
Hamosh
Hicks
Horn
M. Sardiello
Nord
P. Cruz
Pruitt
R. Sanges
Raymond
Rice
Schuler
Stajich
Stanford
Stryke
The FANTOM Consortium
To
Wiles
Zambrowicz
Zambrowicz
Publication venue: Oxford University Press
Publication date: 01/01/2007

We have developed a comprehensive resource devoted to biologists wanting to optimize the use of gene trap clones in their experiments. We have processed 300 602 such clones from both public and private projects to generate 28 199 ‘UniTraps’, i.e. distinct collections of unambiguous insertions at the same subgenic region of annotated genes. The UniTrap resource contains data relative to 9583 trapped genes, which represent 42.3% of the mouse gene content. Among the trapped genes, 7 728 have a counterpart in humans, and 677 are known to be involved in the pathogenesis of human diseases. The aim of this analysis is to provide the wet lab researchers with a comprehensive database and curated tools for (i) identifying and comparing the clones carrying a trap into the genes of interest, (ii) evaluating the severity of the mutation to the protein function in each independent trapping event and (iii) supplying complete information to perform PCR, RT-PCR and restriction experiments to verify the clone and identify the exact point of vector insertion. To share this unique resource with the scientific community, we have designed and implemented a web interface that is freely accessible at http://unitrap.cbm.fvg.it/

CiteSeerX

Crossref

PubMed Central

Sissa Digital Library

Archivio Istituzionale della Ricerca - Università degli Studi della Campania "Luigi Vanvitelli"

Characterization of Transcription Start Sites of Putative Non-coding RNAs by Multifaceted Use of Massively Paralleled Sequencer

Author: A. Kanai
Albert
Babak
Bentley
Brosius
Cooper
Ebisuya
Feng
Gene Ontology Consortium
Gupta
Guttman
Hashimoto
Hüttenhofer
Jiang
K. Nakai
K. Tanimoto
Khaitovich
Matouk
Mortazavi
N. Sathira
Nakaya
Numata
Ota
Ozsolak
Ponting
R. Yamashita
Rinn
Røsok
S. Kanematsu
S. Sugano
Schones
Struhl
Suzuki
Suzuki
Szell
T. Arauchi
The FANTOM Consortium
Y. Suzuki
Yelin
Publication venue: Oxford University Press
Publication date: 01/01/2010

On the basis of integrated transcriptome analysis, we show that not all transcriptional start site clusters (TSCs) in the intergenic regions (iTSCs) have the same properties; thus, it is possible to discriminate the iTSCs that are likely to have biological relevance from the other noise-level iTSCs. We used a total of 251 933 381 short-read sequence tags generated from various types of transcriptome analyses in order to characterize 6039 iTSCs, which have significant expression levels. We analyzed and found that 23% of these iTSCs were located in the proximal regions of the RefSeq genes. These RefSeq-linked iTSCs showed similar expression patterns with the neighboring RefSeq genes, had widely fluctuating transcription start sites and lacked ordered nucleosome positioning. These iTSCs seemed not to form independent transcriptional units, simply representing the by-products of the neighboring RefSeq genes, in spite of their significant expression levels. Similar features were also observed for the TSCs located in the antisense regions of the RefSeq genes. Furthermore, for the remaining iTSCs that were not associated with any RefSeq genes, we demonstrate that integrative interpretation of the transcriptome data provides essential information to specify their biological functions in the hypoxic responses of the cells

CiteSeerX

Crossref

PubMed Central

Chemical synthesis of a very long oligoribonucleotide with 2-cyanoethoxymethyl (CEM) as the 2′-O-protecting group: structural identification and biological activity of a synthetic 110mer precursor-microRNA candidate

A long RNA oligomer, a 110mer with the sequence of a precursor-microRNA candidate, has been chemically synthesized in a single synthesizer run by means of standard automated phosphoramidite chemistry. The synthetic method involved the use of 2-cyanoethoxymethyl (CEM), a 2′-hydroxyl protecting group recently developed in our laboratory. We improved the methodology, introducing better coupling and capping conditions. The overall isolated yield of highly pure 110mer was 5.5%. Such a yield on a 1-μmol scale corresponds to 1 mg of product and emphasizes the practicality of the CEM method for synthesizing oligomers of more than 100 nt in sufficient quantity for biological research. We confirmed the identity of the 110mer by matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry, as well as HPLC, electrophoretic methods, and RNase-digestion experiments. The 110mer also showed sense-selective specific gene-silencing activity. As far as we know, this is the longest chemically synthesized RNA oligomer reported to date. Furthermore, the identity of the 110mer was confirmed by both physicochemical and biological methods

Crossref

PubMed Central