55 research outputs found

    Cross-species inference of long non-coding RNAs greatly expands the ruminant transcriptome

    Get PDF
    Additional file 3. This file contains all supplementary tables relating to lncRNA identification via the conservation of synteny. Table S3. lncRNAs inferred in one species by the genomic alignment of a transcript assembled with the RNA-seq libraries from a related spdecies. Table S12. Presence of intergenic lncRNAs both in sheep and cattle, in regions of conserved synteny. Table S13. Presence of intergenic lncRNAs both in sheep and goat, in regions of conserved synteny. Table S14. Presence of intergenic lncRNAs both in cattle and goat, in regions of conserved synteny. Table S15. Presence of intergenic lncRNAs both in sheep and humans, in regions of conserved synteny. Table S16. Presence of intergenic lncRNAs both in goat and humans, in regions of conserved synteny. Table S17. Presence of intergenic lncRNAs both in cattle and humans, in regions of conserved synteny. Table S18. High-confidence lncRNA pairs, those conserved across species both sequentially and positionally

    RNAcentral: a comprehensive database of non-coding RNA sequences

    Get PDF
    RNAcentral is a database of non-coding RNA (ncRNA) sequences that aggregates data from specialised ncRNA resources and provides a single entry point for accessing ncRNA sequences of all ncRNA types from all organisms. Since its launch in 2014, RNAcentral has integrated twelve new resources, taking the total number of collaborating database to 22, and began importing new types of data, such as modified nucleotides from MODOMICS and PDB. We created new species-specific identifiers that refer to unique RNA sequences within a context of single species. The website has been subject to continuous improvements focusing on text and sequence similarity searches as well as genome browsing functionality. All RNAcentral data is provided for free and is available for browsing, bulk downloads, and programmatic access at http://rnacentral.org/.Biotechnology and Biological Sciences Research Council (BBSRC) [BB/J019232/1]

    Presentació del Servei de Biblioteques als usuaris del Programa ARGÓ

    Get PDF
    Presentació realitzada per Rafa Giráldez als alumnes del programa ARGÓ el 26 de juny de 2015 a la Sala d'Actes del Rectora

    Transcriptomic analyses reveal groups of co-expressed, syntenic lncRNAs in four species of the genus Caenorhabditis

    No full text
    Long non-coding RNAs (lncRNAs) are a heterogeneous class of genes that do not code for proteins. Since lncRNAs (or a fraction thereof) are expected to be functional, many efforts have been dedicated to catalog lncRNAs in numerous organisms, but our knowledge of lncRNAs in non vertebrate species remains very limited. Here, we annotated lncRNAs using transcriptomic data from the same larval stage of four Caenorhabditis species. The number of annotated lncRNAs in self-fertile nematodes was lower than in out-crossing species. We used a combination of approaches to identify putatively homologous lncRNAs: synteny, sequence conservation, and structural conservation. We classified a total of 1,532 out of 7,635 genes from the four species into families of lncRNAs with conserved synteny and expression at the larval stage, suggesting that a large fraction of the predicted lncRNAs may be species specific. Despite both sequence and local secondary structure seem to be poorly conserved, sequences within families frequently shared BLASTn hits and short sequence motifs, which were more likely to be unpaired in the predicted structures. We provide the first multi-species catalog of lncRNAs in nematodes and identify groups of lncRNAs with conserved synteny and expression, that share exposed motifs.This work was supported by the Spanish Ministry of Economy,Industry, and Competitiveness (MEIC), INB Grant (PT17/0009/0023 - ISCIII-SGEFI/ERDF) by the EMBL partnership, and grants ‘Centro de Excelencia Severo Ochoa 2013–2017’ SEV-2012-0208, and BFU2015-67107 cofounded by European Regional 710 Development Fund (ERDF); by the CERCA Programme/Generalitat de Catalunya; from the Catalan Research Agency (AGAUR) SGR857, and grant from the European Union’s Horizon 2020 research and innovation programme under the grant agreement ERC-2016-724173 the Marie Sklodowska-Curie grant agreement No 715 H2020-MSCA-ITN-2014-642095 and the Marie Skłodowska-Curie Actions [H2020-MSCA-IF-2017-793699]

    Intergenic disease-associated regions are abundant in novel transcripts

    No full text
    Background Genotyping of large populations through genome-wide association studies (GWAS) has successfully identified many genomic variants associated with traits or disease risk. Unexpectedly, a large proportion of GWAS single nucleotide polymorphisms (SNPs) and associated haplotype blocks are in intronic and intergenic regions, hindering their functional evaluation. While some of these risk-susceptibility regions encompass cis-regulatory sites, their transcriptional potential has never been systematically explored. Results To detect rare tissue-specific expression, we employed the transcript-enrichment method CaptureSeq on 21 human tissues to identify 1775 multi-exonic transcripts from 561 intronic and intergenic haploblocks associated with 392 traits and diseases, covering 73.9 Mb (2.2%) of the human genome. We show that a large proportion (85%) of disease-associated haploblocks express novel multi-exonic non-coding transcripts that are tissue-specific and enriched for GWAS SNPs as well as epigenetic markers of active transcription and enhancer activity. Similarly, we captured transcriptomes from 13 melanomas, targeting nine melanoma-associated haploblocks, and characterized 31 novel melanoma-specific transcripts that include fusion proteins, novel exons and non-coding RNAs, one-third of which showed allelically imbalanced expression. Conclusions This resource of previously unreported transcripts in disease-associated regions (http://gwas-captureseq.dingerlab.org) should provide an important starting point for the translational community in search of novel biomarkers, disease mechanisms, and drug targets. </div

    Annotation of full-length long noncoding RNAs with capture long-read sequencing (CLS)

    No full text
    Metazoan genomes produce thousands of long-noncoding RNAs (lncRNAs), of which just a small fraction have been well characterized. Understanding their biological functions requires accurate annotations, or maps of the precise location and structure of genes and transcripts in the genome. Current lncRNA annotations are limited by compromises between quality and size, with many gene models being fragmentary or uncatalogued. To overcome this, the GENCODE consortium has developed RNA capture long-read sequencing (CLS), an approach combining targeted RNA capture with third-generation long-read sequencing. CLS provides accurate annotations at high-throughput rates. It eliminates the need for noisy transcriptome assembly from short reads, and requires minimal manual curation. The full-length transcript models produced are of quality comparable to present-day manually curated annotations. Here we describe a detailed CLS protocol, from probe design through long-read sequencing to creation of final annotations
    • …
    corecore