22 research outputs found

GENCODE 2021

Author: Armstrong J
Barnes I
Berry A
Bignell A
Boix C
Carbonell Sala S
Choudhary JS
Cunningham F
Di Domenico T
Diekhans M
Donaldson S
Fiddes IT
Flicek P
Frankish A
García Girón C
Gerstein M
Gonzalez JM
Grego T
Guigó R
Hardy M
Hourlier T
Howe KL
Hubbard TJP
Hunt T
Izuogu OG
Johnson R
Jungreis I
Kellis M
Lagarde J
Loveland JE
Martin FJ
Martínez L
Mohanan S
Mudge JM
Muir P
Navarro FCP
Parker A
Paten B
Pei B
Pozo F
Riera FC
Ruffier M
Schmitt BM
Sisu C
Stapleton E
Suner MM
Sycheva I
Tress ML
Uszczynska-Ratajczak B
Wolf MY
Wright JC
Xu J
Yang YT
Yates A
Zerbino D
Zhang Y
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2021

© The Author(s) 2020. Published by Oxford University Press on behalf of Nucleic Acids Research. The GENCODE project annotates human and mouse genes and transcripts supported by experimental data with high accuracy, providing a foundational resource that supports genome biology and clinical genomics. GENCODE annotation processes make use of primary data and bioinformatic tools and analysis generated both within the consortium and externally to support the creation of transcript structures and the determination of their function. Here, we present improvements to our annotation infrastructure, bioinformatics tools, and analysis, and the advances they support in the annotation of the human and mouse genomes including: the completion of first pass manual annotation for the mouse reference genome; targeted improvements to the annotation of genes associated with SARS-CoV-2 infection; collaborative projects to achieve convergence across reference annotation databases for the annotation of human and mouse protein-coding genes; and the first GENCODE manually supervised automated annotation of lncRNAs. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.

Funding: National Human Genome Research Institute of the National Institutes of Health [U41HG007234]; the content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health; Wellcome Trust [WT108749/Z/15/Z, WT200990/Z/16/Z]; European Molecular Biology Laboratory; Swiss National Science Foundation through the National Center of Competence in Research ‘RNA & Disease’ (to R.J.); Medical Faculty of the University of Bern (to R.J.). Funding for open access charge: National Institutes of Health.

DSpace@MIT

UPF Digital Repository

King's Research Portal

Bern Open Repository and Information System (BORIS)

Institute of Cancer Research Repository

Brunel University Research Archive

GENCODE: reference annotation for the human and mouse genomes in 2023

Author: Arnan C
Banerjee A
Barnes I
Bennett R
Berry A
Bignell A
Boix C
Calvet F
Carbonell-Sala S
Cerdán-Vélez D
Choudhary JS
Cunningham F
Davidson C
Diekhans M
Donaldson S
Dursun C
Fatima R
Flicek P
Frankish A
Gerstein M
Giorgetti S
Giron CG
Gonzalez JM
Guigo R
Gómez LM
Hardy M
Harrison PW
Hollis Z
Hourlier T
Hubbard TJP
Hunt T
James B
Jiang Y
Johnson R
Jungreis I
Kay M
Kellis M
Kundaje A
Lagarde J
Loveland JE
Martin FJ
Mudge JM
Nair S
Ni P
Paten B
Pozo F
Ramalingam V
Ruffier M
Schmitt BM
Schreiber JM
Sisu C
Steed E
Sumathipala D
Suner M-M
Sycheva I
Tress ML
Uszczynska-Ratajczak B
Wass E
Wright JC
Yang YT
Yates A
Zafrulla Z
Publication venue: 'Oxford University Press (OUP)'
Publication date: 24/11/2022

Data availability: No new data were generated or analysed in support of this research. Copyright © The Author(s) 2022.

GENCODE produces high quality gene and transcript annotation for the human and mouse genomes. All GENCODE annotation is supported by experimental data and serves as a reference for genome biology and clinical genomics. The GENCODE consortium generates targeted experimental data, develops bioinformatic tools and carries out analyses that, along with externally produced data and methods, support the identification and annotation of transcript structures and the determination of their function. Here, we present an update on the annotation of human and mouse genes, including developments in the tools, data, analyses and major collaborations which underpin this progress. For example, we report the creation of a set of non-canonical ORFs identified in GENCODE transcripts, the LRGASP collaboration to assess the use of long transcriptomic data to build transcript models, the progress in collaborations with RefSeq and UniProt to increase convergence in the annotation of human and mouse protein-coding genes, the propagation of GENCODE across the human pan-genome and the development of new tools to support annotation of regulatory features by GENCODE. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.

Funding: National Human Genome Research Institute of the National Institutes of Health [U41HG007234, R01HG004037]; Wellcome Trust [WT222155/Z/20/Z]; European Molecular Biology Laboratory. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Funding for open access charge: National Institutes of Health.

Brunel University Research Archive

Application of epoxy functional silanes in the preparation of DNA microarrays

Author: Chmielewski M.K.
Figlerowicz M.
Frydrych-Tomczak E.
Maciejewski H.
Markiewicz W.T.
Nowicki M.
Ratajczak T.
Uszczynska B.
Publication venue: 'Termedia Sp. z.o.o.'
Publication date: 01/01/2014

Biblioteka Nauki - repozytorium artykułów

Crossref

Profiling subcellular localization of nuclear-encoded mitochondrial gene products in zebrafish

Author: Carbonell-Sala S.
Chacinska A.
Kwiatkowska M.
Migdal M.
Sokol A.
Sugunan S.
Uszczynska-Ratajczak B.
Winata C.
Publication venue: 'Life Science Alliance, LLC'
Publication date: 01/01/2023

Most mitochondrial proteins are encoded by nuclear genes, synthetized in the cytosol and targeted into the organelle. To characterize the spatial organization of mitochondrial gene products in zebrafish (Danio rerio), we sequenced RNA from different cellular fractions. Our results confirmed the presence of nuclear-encoded mRNAs in the mitochondrial fraction, which, in unperturbed conditions, are mainly transcripts encoding large proteins with specific properties, like transmembrane domains. To further explore the principles of mitochondrial protein compartmentalization in zebrafish, we quantified the transcriptomic changes for each subcellular fraction triggered by the chchd4a(-/-) mutation, which causes disorders in mitochondrial protein import. Our results indicate that the proteostatic stress further restricts the population of transcripts on the mitochondrial surface, allowing only the largest and the most evolutionarily conserved proteins to be synthetized there. We also show that many nuclear-encoded mitochondrial transcripts translated by the cytosolic ribosomes stay resistant to the global translation shutdown. Thus, vertebrates, in contrast to yeast, are not likely to use localized translation to facilitate synthesis of mitochondrial proteins under proteostatic stress conditions.

MPG.PuRe

High-throughput annotation of full-length long noncoding RNAs with capture long-read sequencing

Author: Abad A.
Carbonell S.
Davis C.
Frankish A.
Gingeras T. R.
Guigo R.
Harrow J.
Johnson R.
Lagarde J.
Perez-Lluch S.
Uszczynska-Ratajczak B.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Accurate annotation of genes and their transcripts is a foundation of genomics, but currently no annotation technique combines throughput and accuracy. As a result, reference gene collections remain incomplete: many gene models are fragmentary, and thousands more remain uncataloged, particularly for long noncoding RNAs (lncRNAs). To accelerate lncRNA annotation, the GENCODE consortium has developed RNA Capture Long Seq (CLS), which combines targeted RNA capture with third-generation long-read sequencing. Here we present an experimental reannotation of the GENCODE intergenic lncRNA populations in matched human and mouse tissues that resulted in novel transcript models for 3,574 and 561 gene loci, respectively. CLS approximately doubled the annotated complexity of targeted loci, outperforming existing short-read techniques. Full-length transcript models produced by CLS enabled us to definitively characterize the genomic features of lncRNAs, including promoter and gene structure, and protein-coding potential. Thus, CLS removes a long-standing bottleneck in transcriptome annotation and generates manual-quality full-length transcript models at high-throughput scales.

Crossref

Cold Spring Harbor Laboratory Institutional Repository

UPF Digital Repository

Bern Open Repository and Information System (BORIS)

The abundance of the long intergenic non-coding RNA 01087 differentiates between luminal and triple-negative breast cancers and predicts patient outcome

Author: Baldi A.
Botti G.
D'Aiuto M.
D'Argenio V.
De Palma F. D. E.
Del Monaco V.
Guigo R.
Klein C. C.
Kremer M.
Kroemer G.
Maiuri M. C.
Montanaro D.
Pol J. G.
Salvatore F.
Stoll G.
Uszczynska-Ratajczak B.
Vlasova A.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2020

The molecular complexity of human breast cancer (BC) renders the clinical management of the disease challenging. Long non-coding RNAs (lncRNAs) are promising biomarkers for BC patient stratification, early detection, and disease monitoring. Here, we identified the involvement of the long intergenic non-coding RNA 01087 (LINC01087) in breast oncogenesis. LINC01087 appeared significantly downregulated in triple-negative BCs (TNBCs) and upregulated in the luminal BC subtypes in comparison to mammary samples from cancer-free women and matched normal cancer pairs. Interestingly, deregulation of LINC01087 allowed us to accurately distinguish between luminal and TNBC specimens, independently of the clinicopathological parameters, and of the histological and TP53 or BRCA1/2 mutational status. Moreover, increased expression of LINC01087 predicted a better prognosis in luminal BCs, while TNBC tumors that harbored lower levels of LINC01087 were associated with reduced relapse-free survival. Furthermore, bioinformatics analyses were performed on TNBC and luminal BC samples and suggested that the putative tumor suppressor activity of LINC01087 may rely on interferences with pathways involved in cell survival, proliferation, adhesion, invasion, inflammation and drug sensitivity. Altogether, these data suggest that the assessment of LINC01087 deregulation could represent a novel, specific and promising biomarker not only for the diagnosis and prognosis of luminal BC subtypes and TNBCs, but also as a predictive biomarker of pharmacological interventions.

Archivio della ricerca - Università degli studi di Napoli Federico II

Archivio Istituzionale della Ricerca - Università degli Studi della Campania "Luigi Vanvitelli"

Annotation of full-length long noncoding RNAs with capture long-read sequencing (CLS)

Author: A Frankish
A Lanzós
B Uszczynska-Ratajczak
D Sharon
G Bussotti
H Hezroni
H Tilgner
I Ulitsky
IW Deveson
J Lagarde
JS Mattick
KD Hansen
KD Pruitt
KR Sanson
M Jain
MB Clark
S Fang
SA Hardwick
SJ Crider-Miller
T Derrien
T Steijger
TR Mercer
XC Quek
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2021

Metazoan genomes produce thousands of long-noncoding RNAs (lncRNAs), of which just a small fraction have been well characterized. Understanding their biological functions requires accurate annotations, or maps of the precise location and structure of genes and transcripts in the genome. Current lncRNA annotations are limited by compromises between quality and size, with many gene models being fragmentary or uncatalogued. To overcome this, the GENCODE consortium has developed RNA capture long-read sequencing (CLS), an approach combining targeted RNA capture with third-generation long-read sequencing. CLS provides accurate annotations at high-throughput rates. It eliminates the need for noisy transcriptome assembly from short reads, and requires minimal manual curation. The full-length transcript models produced are of quality comparable to present-day manually curated annotations. Here we describe a detailed CLS protocol, from probe design through long-read sequencing to creation of final annotations

Crossref

UPF Digital Repository

Bern Open Repository and Information System (BORIS)