9 research outputs found

Examples of the complex architecture of the human transcriptome revealed by RACE and high-density tiling arrays

Author: Cheng Jill
Dike Sujit
Drenkow Jorg
Gingeras Thomas R.
Helt Gregg
Kapranov Philipp
Long Jeffrey
Publication venue: Cold Spring Harbor Laboratory Press
Publication date: 01/01/2005

Recently, we mapped the sites of transcription across ∼30% of the human genome and elucidated the structures of several hundred novel transcripts. In this report, we describe a novel combination of techniques including the rapid amplification of cDNA ends (RACE) and tiling array technologies that was used to further characterize transcripts in the human transcriptome. This technical approach allows for several important pieces of information to be gathered about each array-detected transcribed region, including strand of origin, start and termination positions, and the exonic structures of spliced and unspliced coding and noncoding RNAs. In this report, the structures of transcripts from 14 transcribed loci, representing both known genes and unannotated transcripts taken from the several hundred randomly selected unannotated transcripts described in our previous work, are represented as examples of the complex organization of the human transcriptome. As a consequence of this complexity, it is not unusual that a single base pair can be part of an intricate network of multiple isoforms of overlapping sense and antisense transcripts, the majority of which are unannotated. Some of these transcripts follow the canonical splicing rules, whereas others combine the exons of different genes or represent other types of noncanonical transcripts. These results have important implications concerning the correlation of genotypes to phenotypes, the regulation of complex interlaced transcriptional patterns, and the definition of a gene.

Crossref

Cold Spring Harbor Laboratory Institutional Repository

PubMed Central

The mouse genome: Experimental examination of gene predictions and transcriptional start sites

Author: Balija Vivekanand S.
Dike Sujit
Hannon Greg
McCombie W. Richard
Nascimento Lidia U.
Ou Jacqueline
Palmer Lance E.
Xuan Zhenyu
Zhang Michael Q.
Zutavern Theresa
Publication venue: Cold Spring Harbor Laboratory Press
Publication date: 01/12/2004

The completion of the mouse and other mammalian genome sequences will provide necessary, but not sufficient, knowledge for an understanding of much of mouse biology at the molecular level. As a requisite next step in this process, the genes in mouse and their structure must be elucidated. In particular, knowledge of the transcriptional start site of these genes will be necessary for further study of their regulatory regions. To assess the current state of mouse genome annotation to support this activity, we identified several hundred gene predictions in mouse with varying levels of supporting evidence and tested them using RACE–PCR. Modifications were made to the procedure allowing pooling of RNA samples, resulting in a scalable procedure. The results illustrate potential errors or omissions in the current 5′ end annotations in 58% of the genes detected. In testing experimentally unsupported gene predictions, we were able to identify 58 that are not usually annotated as genes but produced spliced transcripts (∼25% success rate). In addition, in many genes we were able to detect novel exons not predicted by any gene prediction algorithms. In 19.8% of the genes detected in this study, multiple transcript species were observed. These data show an urgent need to provide direct experimental validation of gene annotations. Moreover, these results show that direct validation using RACE–PCR can be an important component of genome-wide validation. This approach can be a useful tool in the ongoing efforts to increase the quality of gene annotations, especially transcriptional start sites, in complex genomes.

Crossref

Cold Spring Harbor Laboratory Institutional Repository

PubMed Central

Prominent use of distal 5′ transcription start sites and discovery of a large number of additional exons in ENCODE regions

Author: Alioto Tyler
Antonarakis Stylianos E.
Castelo Robert
Chrast Jacqueline
Denoeud France
Dickson Mark C.
Dike Sujit
Drenkow Jorg
Foissac Sylvain
Frankish Adam
Gingeras Thomas R.
Guigó Roderic
Hance Zahra
Harrow Jennifer
Henrichsen Charlotte N.
Holroyd Nancy
Hubbard Tim
Kapranov Philipp
Lagarde Julien
Manzano Caroline
Myers Richard M.
Reymond Alexandre
Rogers Jane
Taylor Ruth
Ucla Catherine
Wyss Carine
Publication venue: Cold Spring Harbor Laboratory Press
Publication date: 01/01/2007

This report presents systematic empirical annotation of transcript products from 399 annotated protein-coding loci across the 1% of the human genome targeted by the Encyclopedia of DNA elements (ENCODE) pilot project using a combination of 5′ rapid amplification of cDNA ends (RACE) and high-density resolution tiling arrays. We identified previously unannotated and often tissue- or cell-line-specific transcribed fragments (RACEfrags), both 5′ distal to the annotated 5′ terminus and internal to the annotated gene bounds for the vast majority (81.5%) of the tested genes. Half of the distal RACEfrags span large segments of genomic sequences away from the main portion of the coding transcript and often overlap with the upstream-annotated gene(s). Notably, at least 20% of the resultant novel transcripts have changes in their open reading frames (ORFs), most of them fusing ORFs of adjacent transcripts. A significant fraction of distal RACEfrags show expression levels comparable to those of known exons of the same locus, suggesting that they are not part of very minority splice forms. These results have significant implications concerning (1) our current understanding of the architecture of protein-coding genes; (2) our views on locations of regulatory regions in the genome; and (3) the interpretation of sequence polymorphisms mapping to regions hitherto considered to be “noncoding,” ultimately relating to the identification of disease-related sequence alterations.

Crossref

Cold Spring Harbor Laboratory Institutional Repository

Serveur académique lausannois

PubMed Central

UPF Digital Repository

King's Research Portal

Archive ouverte UNIGE

Biological function of unannotated transcription during the early development of Drosophila melanogaster

Author: A Bernards
AC Spradling
AL Beyer
Antonio Piccolboni
B Denholm
CL Wei
D Kampa
Frederic Biemar
HJ Bellen
HK Shamloula
Ian Bell
J Cheng
J Jiang
J Robert Manak
Jeff Long
Jill Cheng
M Hild
P Bertone
P Kapranov
P Ng
P Tomayo
Philipp Kapranov
Srinka Ghosh
ST Thibault
Sujit Dike
T Shiraki
Thomas R Gingeras
V Stolc
V Velculescu
VE Foe
Victor Sementchenko
W Tadros
WR Strapps
Y Su
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2006

Many animal and plant genomes are transcribed much more extensively than current annotations predict. However, the biological function of these unannotated transcribed regions is largely unknown. Approximately 7% and 23% of the detected transcribed nucleotides during D. melanogaster embryogenesis map to unannotated intergenic and intronic regions, respectively. Based on computational analysis of coordinated transcription, we conservatively estimate that 29% of all unannotated transcribed sequences function as missed or alternative exons of well-characterized protein-coding genes. We estimate that 15.6% of intergenic transcribed regions function as missed or alternative transcription start sites (TSS) used by 11.4% of the expressed protein-coding genes. Identification of P element mutations within or near newly identified 5′ exons provides a strategy for mapping previously uncharacterized mutations to their respective genes. Collectively, these data indicate that at least 85% of the fly genome is transcribed and processed into mature transcripts representing at least 30% of the fly genome. © 2006 Nature Publishing Group

Crossref

Cold Spring Harbor Laboratory Institutional Repository

Prominent use of distal 5′ transcription start sites and discovery of a large number of additional exons in ENCODE regions

Author: Alioto Tyler
Antonarakis Stylianos E.
Castelo Valdueza Robert
Chrast Jacqueline
Denoeud France
Dickson Mark C.
Dike Sujit
Drenkow Jorg
Foissac Sylvain
Frankish Adam
Gingeras Thomas R.
Guigó Serra Roderic
Hance Zahra
Harrow Jennifer
Henrichsen Charlotte N.
Holroyd Nancy
Hubbard Tim J.
Kapranov Philipp
Lagarde Julien
Manzano Caroline
Myers Richard M.
Reymond Alexandre
Rogers Jane
Taylor Ruth
Ucla Catherine
Wyss Carine
Publication venue: Cold Spring Harbor Laboratory Press-CSHL Press

This report presents systematic empirical annotation of transcript products from 399 annotated protein-coding loci across the 1% of the human genome targeted by the Encyclopedia of DNA elements (ENCODE) pilot project using a combination of 5′ rapid amplification of cDNA ends (RACE) and high-density resolution tiling arrays. We identified previously unannotated and often tissue- or cell-line-specific transcribed fragments (RACEfrags), both 5′ distal to the annotated 5′ terminus and internal to the annotated gene bounds for the vast majority (81.5%) of the tested genes. Half of the distal RACEfrags span large segments of genomic sequences away from the main portion of the coding transcript and often overlap with the upstream-annotated gene(s). Notably, at least 20% of the resultant novel transcripts have changes in their open reading frames (ORFs), most of them fusing ORFs of adjacent transcripts. A significant fraction of distal RACEfrags show expression levels comparable to those of known exons of the same locus, suggesting that they are not part of very minority splice forms. These results have significant implications concerning (1) our current understanding of the architecture of protein-coding genes; (2) our views on locations of regulatory regions in the genome; and (3) the interpretation of sequence polymorphisms mapping to regions hitherto considered to be “noncoding,” ultimately relating to the identification of disease-related sequence alterations.

RECERCAT

Biological function of unannotated transcription during the early development of Drosophila melanogaster

Author: A Bernards
AC Spradling
AL Beyer
Antonio Piccolboni
B Denholm
CL Wei
D Kampa
Frederic Biemar
HJ Bellen
HK Shamloula
Ian Bell
J Cheng
J Jiang
J Robert Manak
Jeff Long
Jill Cheng
M Hild
P Bertone
P Kapranov
P Ng
P Tomayo
Philipp Kapranov
Srinka Ghosh
ST Thibault
Sujit Dike
T Shiraki
Thomas R Gingeras
V Stolc
V Velculescu
VE Foe
Victor Sementchenko
W Tadros
WR Strapps
Y Su
Publication venue: Springer Science and Business Media LLC

Crossref

Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project

Author: Abecasis Gonçalo R.
Abril Josep F.
Adzhubei Ivan
Aldred Shelley Force
Alioto Tyler
Ameur Adam
Andrews Robert M.
Antonarakis Stylianos E.
Armengol Lluis
Asimenos George
Asthana Saurabh
Baertsch Robert
Barber Galt
Barrera Leah
Batzoglou Serafim
Bell Ian
Bhinge Akshay A.
Bickel Peter
Bieda Mark C.
Bird Christine P.
Birney Ewan
Blakesley Robert W.
Bouffard Gerard G.
Boyle Patrick J.
Brent Michael
Broad Institute
Brown James B.
Bruce Alexander W.
Cao Hua
Carninci Piero
Carter Nigel P.
Castelo Robert
Chang Jean L.
Cheng Jill
Cheung Evelyn
Chiu Kuo Ping
Choo Chiou Yu
Choo Siew Woh
Chrast Jacqueline
Church Deanna
Clamp Michele
Clark Taane G.
Clawson Hiram
Clelland Gayle K.
Collins Francis S.
Cooper Gregory M.
Cooper Sara J.
Couttet Phillippe
Crawford Gregory E.
Cuff James
Davis Sean
Davydov Eugene
Day Nathan
de Bakker Paul I. W.
de Jong Pieter J.
Dekker Job
Denoeud France
Dermitzakis Emmanouil T.
Dewey Colin N.
Dhami Pawandeep
Dickson Mark C.
Dike Sujit
Dillon Shane C.
Dimas Antigone
Dorschner Michael O.
Dovey Oliver M.
Drenkow Jorg
Dunham Ian
Dutta Anindya
Ellis Peter D.
Emanuelsson Olof
Enroth Stefan
Estivill Xavier
Euskirchen Ghia
Eyras Eduardo
Farnham Peggy J.
Feingold Elise A.
Fiegler Heike
Flamm Christoph
Flicek Paul
Foissac Sylvain
Fowler Joanna C.
Frankish Adam
Fried Claudia
Frum Tristan T.
Fu Yutao
Fulton Robert
Ganesh Madhavan
Washington University Genome Sequencing Center
Gerstein Mark
Ghosh Srinka
Gibbs Richard A.
Gilbert James
Gingeras Thomas R.
Giresi Paul G.
Glass Christopher K.
Gnerre Sante
Goldman Nick
Goldy Jeff
Good Peter J.
Graves Tina
Green Eric D.
Green Roland D.
Greenbaum Jason A.
Guan Xiaobin
Guigó Roderic
Guyer Mark S.
Hackermüller Jörg
Haidar Jaafar N. S.
Halees Anason
Hallgrímsdóttir Ingileif B.
Hansen Nancy F.
Hardison Ross C.
Harrow Jennifer
Harte Rachel A.
Hartman Stephen
Haussler David
Hawrylycz Michael
Hayashizaki Yoshihide
Haydock Andrew
Heintzman Nate
Henrichsen Charlotte N.
Hertel Jana
Hillman-Jackson Jennifer
Hinrichs Angie S.
Hirsch Heather A.
Hofacker Ivo L.
Holmes Ian
Holroyd Nancy
Hon Gary
Kim Tae Hoon
Hou Minmei
Huang Haiyan
Hubbard Tim
Baylor College of Medicine Human Genome Sequencing Center
Humbert Richard
Huppert Julian
Idol Jacquelyn R.
Inman David
Iyer Vishwanath R.
Jaffe David B.
James Keith D.
Jiang Huaiyang
Jiang Nan
Johnson Brett E.
Johnson Ericka M.
Kai Chikatoshi
Kapranov Philipp
Karaöz Ulaş
Karnani Neerja
Karolchik Donna
Kawai Jun
Keefe Damian
Kent W. James
Kern Andrew D.
Kim Jonghwan
King David C.
Koch Christoph M.
Komorowski Jan
Korbel Jan
Koriabine Maxim
Kraus Peter
Kuehn Michael S.
Kuhn Robert M.
Lagarde Julien
Lander Eric S.
Langford Cordelia F.
Lee Charlie W.H.
Lee Kirsten
Lefebvre Gregory C.
Lian Jin
Lian Zheng
Lieb Jason D.
Liefer Laura A.
Lin Jane M.
Lindblad-Toh Kerstin
Lindemeyer Manja
Liu Jun
Lopez-Bigas Nuria
Lowe Todd M.
Luna Rosa
Löytynoja Ari
Maduro Valerie V.B.
Malhotra Ankit
Manzano Caroline
Mardis Elaine R.
Margulies Elliott H.
Martin Joel D.
Maskeri Baishali
Massingham Tim
Matthews Nicholas
Mattick John S.
McDowell Jennifer C.
Miller Webb
Missal Kristin
Montoya-Burgos Juan I.
Moqtaderi Zarmik
Mullikin James C.
Munn Kyle J.
Muzny Donna M.
Myers Richard M.
Nagalakshmi Ugrappa
Navas Patrick A.
Nefedov Mikhail
Neph Shane
Neri Fidencio
Newburger Peter
Ng Patrick
Nikolaev Sergey
Nix David A.
Noble William S.
Children’s Hospital Oakland Research Institute
Oberley Matthew J.
Ooi Hong Sain
Osoegawa Kazutoyo
Pachter Lior
Pardi Fabio
Park Morgan
Parker Stephen C. J.
Patel Sandeep
Paten Benedict
Pedersen Jakob S.
Qu Chunxu
Rada-Iglesias Alvaro
Ren Bing
Reymond Alexandre
Richmond Todd A.
Rogers Jane
Rosenbloom Kate
Rosenfeld M. Geoff
Rosenzweig Elizabeth R.
Rozowsky Joel
Ruan Yijun
Sabo Peter J.
Sandelin Albin
Sandstrom Richard
Sekinger Edward A.
NISC Comparative Sequencing Program
Seringhaus Michael
Shafer Anthony
Shahab Atif
Shulha Hennady P.
Sidow Arend
Siepel Adam
Singer Michael A.
Smith Kayla
Snyder Michael
Sodergren Erica
Squazzo Sharon
Srinivasan K.G.
Stadler Peter F.
Stamatoyannopoulos John A.
Stone Eric A.
Stranger Barbara E.
Struhl Kevin
Stuart Rhona
Sung Wing-Kin
Sunyaev Shamil
Swarbreck David
Tammana Hari
Tanzer Andrea
Taylor Christopher M.
Taylor James
Taylor Ruth
Thakkapallayil Archana
Thomas Daryl J.
Thomas Pamela J.
Thurman Robert E.
Tress Michael L.
Trinklein Nathan D.
Trumbower Heather
Tullius Thomas D.
Ucla Catherine
Urban Alexander E.
Ureta-Vidal Abel
Valencia Alfonso
Van Calcar Sara
Vega Vinsensius B.
Vetrie David
Wadelius Claes
Wallerman Ola
Wang Kun
Washietl Stefan
Weaver Molly
Wei Chia-Lin
Weinstock George M.
Weirauch Matthew T.
Weissman Sherman
Weng Zhiping
Wetterstrand Kris A.
Wheeler David A.
Whelan Simon
Wilcox Sarah
Wilson Richard K.
Woodroffe Abigail
Worley Kim C.
Wu Jiaqian
Wyss Carine
Xu Mousheng
Xu Xiaoqin
Yang Annie
Yao Fei
Yoshinaga Yuko
Young Alice C.
Yu Yong
Yu Man
Zhang Nancy R.
Zhang Xiaoling
Zhang Xueqing
Zhang Zhengdong D.
Zhao XiaoDong
Zheng Deyou
Zhu Baoli
Zhu Zhou
Zody Michael C.
Zweig Ann S.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2007

We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.

Cold Spring Harbor Laboratory Institutional Repository

Serveur académique lausannois

Fraunhofer-ePrints

eScholarship@UMMS

Enlighten

The University of Manchester - Institutional Repository

University of Queensland eSpace

Archive ouverte UNIGE

Crossref

LSHTM Research Online

Copenhagen University Research Information System

UCL Discovery

King's Research Portal