3 research outputs found

    ncRNA orthologies in the vertebrate lineage.

    Annotation of orthologous and paralogous genes is necessary for many aspects of evolutionary analysis. Methods to infer these homology relationships have traditionally focused on protein-coding genes, and the evolutionary models used by these methods normally assume that the positions in the protein evolve independently. However, as our appreciation for the roles of non-coding RNA genes has increased, consistently annotated sets of orthologous and paralogous ncRNA genes are increasingly needed. At the same time, methods such as PHASE or RAxML have implemented substitution models that consider pairs of sites to enable proper modelling of the loops and other features of RNA secondary structure. Here, we present a comprehensive analysis pipeline for the automatic detection of orthologues and paralogues for ncRNA genes. We focus on gene families represented in Rfam and for which a specific covariance model is provided. For each family, the ncRNA genes found in all Ensembl species are aligned using Infernal, and several trees are built using different substitution models. In parallel, a genomic alignment that includes the ncRNA genes and their flanking sequence regions is built with PRANK. This alignment is used to create two additional phylogenetic trees using the neighbour-joining (NJ) and maximum-likelihood (ML) methods. The trees arising from both the ncRNA and genomic alignments are merged using TreeBeST, which reconciles them with the species tree in order to identify speciation and duplication events. The final tree is used to infer the orthologues and paralogues following Fitch's definition. We also determine gene gain and loss events for each family using CAFE. All data are accessible through the Ensembl Comparative Genomics ('Compara') API and on our FTP site, and are fully integrated in the Ensembl genome browser, where they can be accessed in a user-friendly manner. Database URL: http://www.ensembl.org
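
The final inference step described in this abstract can be illustrated with a minimal sketch. It assumes a reconciled gene tree whose internal nodes are already labelled as speciation or duplication events, and applies Fitch's definition: two genes are orthologues if their last common ancestor is a speciation node, and paralogues if it is a duplication node. The tree encoding, the gene names and the function names are hypothetical and chosen only for illustration; this is not the Ensembl Compara or TreeBeST implementation.

```python
# Hypothetical toy reconciled gene tree (an assumption for illustration).
# Internal nodes are tuples (event, left_subtree, right_subtree);
# leaves are gene-name strings.
TREE = (
    "speciation",
    ("duplication", "human_mir1a", "human_mir1b"),
    "zebrafish_mir1",
)


def leaves(node):
    """Return all gene names under a node."""
    if isinstance(node, str):
        return [node]
    _, left, right = node
    return leaves(left) + leaves(right)


def classify_pairs(node, out=None):
    """Label every gene pair by the event at its last common ancestor."""
    if out is None:
        out = {}
    if isinstance(node, str):
        return out
    event, left, right = node
    kind = "orthologue" if event == "speciation" else "paralogue"
    # Genes on opposite sides of this node have this node as their
    # last common ancestor, so its event type decides the relationship.
    for a in leaves(left):
        for b in leaves(right):
            out[frozenset((a, b))] = kind
    classify_pairs(left, out)
    classify_pairs(right, out)
    return out


if __name__ == "__main__":
    for pair, kind in classify_pairs(TREE).items():
        print(sorted(pair), kind)
```

In this toy tree, the two human copies are classified as paralogues of each other, and each is an orthologue of the zebrafish gene.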

    RNAcentral 2021: secondary structure integration, improved sequence search and new member databases

    RNAcentral is a comprehensive database of non-coding RNA (ncRNA) sequences that provides a single access point to 44 RNA resources and >18 million ncRNA sequences from a wide range of organisms and RNA types. RNAcentral now also includes secondary (2D) structure information for >13 million sequences, making RNAcentral the world's largest RNA 2D structure database. The 2D diagrams are displayed using R2DT, a new 2D structure visualization method that uses consistent, reproducible and recognizable layouts for related RNAs. The sequence similarity search has been updated with a faster interface featuring facets for filtering search results by RNA type, organism, source database or any keyword. This sequence search tool is available as a reusable web component, and has been integrated into several RNAcentral member databases, including Rfam, miRBase and snoDB. To allow for a more fine-grained assignment of RNA types and subtypes, all RNAcentral sequences have been annotated with Sequence Ontology terms. The RNAcentral database continues to grow and provide a central data resource for the RNA community.

    Genome-wide microRNA screening in Nile tilapia reveals pervasive isomiRs’ transcription, sex-biased arm switching and increasing complexity of expression throughout development

    MicroRNAs (miRNAs) are key regulators of gene expression in multicellular organisms. The elucidation of miRNA function and evolution depends on the identification and characterization of the miRNA repertoire of strategic organisms, such as the fast-evolving cichlid fishes. Using RNA-seq and comparative genomics, we carried out an in-depth survey of miRNAs in Nile tilapia (Oreochromis niloticus), an emergent model organism for investigating evo-devo mechanisms. Five hundred known miRNAs and almost one hundred putative novel vertebrate miRNAs have been identified, many of which seem to be teleost-specific, cichlid-specific or tilapia-specific. Abundant miRNA isoforms (isomiRs) were identified with modifications in both 5p and 3p miRNA transcripts. Changes in arm usage (arm switching) of nine miRNAs were detected in early development, adult stage and even between male and female samples. We found an increasing complexity of miRNA expression during ontogenetic development, revealing a remarkable synchronism between the rate of recruitment of new miRNAs and morphological changes. Overall, our results enlarge the vertebrate miRNA collection and reveal a notable differential ratio of miRNA arms and isoforms influenced by sex and developmental life stage, providing a better picture of the evolutionary and spatiotemporal dynamics of miRNAs.