
    Birth and expression evolution of mammalian microRNA genes.

    MicroRNAs (miRNAs) are major post-transcriptional regulators of gene expression, yet their origins and functional evolution in mammals remain little understood due to the lack of appropriate comparative data. Using RNA sequencing, we have generated extensive and comparable miRNA data for five organs in six species that represent all main mammalian lineages and birds (the evolutionary outgroup), with the aim of unraveling the evolution of mammalian miRNAs. Our analyses reveal an overall expansion of miRNA repertoires in mammals, with threefold accelerated birth rates of miRNA families in placentals and marsupials, facilitated by the de novo emergence of miRNAs in host gene introns. Generally, our analyses suggest a high rate of miRNA family turnover in mammals, with many newly emerged miRNA families being lost soon after their formation. Selectively preserved mammalian miRNA families gradually evolved higher expression levels, as well as altered mature sequences and target gene repertoires, and were apparently mainly recruited to exert regulatory functions in nervous tissues. However, miRNAs that originated on the X chromosome evolved high expression levels and potentially diverse functions during spermatogenesis, including meiosis, through selectively driven duplication-divergence processes. Overall, our study thus provides detailed insights into the birth and evolution of mammalian miRNA genes and the associated selective forces.

    Gene length and detection bias in single cell RNA sequencing protocols

    Background: Single cell RNA sequencing (scRNA-seq) has rapidly gained popularity for profiling transcriptomes of hundreds to thousands of single cells. This technology has led to the discovery of novel cell types and revealed insights into the development of complex tissues. However, many technical challenges need to be overcome during data generation. Due to minute amounts of starting material, samples undergo extensive amplification, increasing technical variability. A solution for mitigating amplification biases is to include unique molecular identifiers (UMIs), which tag individual molecules. Transcript abundances are then estimated from the number of unique UMIs aligning to a specific gene, with PCR duplicates resulting in copies of the UMI not included in expression estimates. Methods: Here we investigate the effect of gene length bias in scRNA-seq across a variety of datasets that differ in terms of capture technology, library preparation, cell types and species. Results: We find that scRNA-seq datasets that have been sequenced using a full-length transcript protocol exhibit gene length bias akin to bulk RNA-seq data. Specifically, shorter genes tend to have lower counts and a higher rate of dropout. In contrast, protocols that include UMIs do not exhibit gene length bias, with a mostly uniform rate of dropout across genes of varying length. Across four different scRNA-seq datasets profiling mouse embryonic stem cells (mESCs), we found the subset of genes that are only detected in the UMI datasets tended to be shorter, while the subset of genes detected only in the full-length datasets tended to be longer. Conclusions: We find that the choice of scRNA-seq protocol influences the detection rate of genes, and that full-length datasets exhibit gene-length bias. In addition, despite clear differences between UMI and full-length transcript data, we illustrate that full-length and UMI data can be combined to reveal the underlying biology influencing expression of mESCs.

    Parallel derivation of isogenic human primed and naive induced pluripotent stem cells

    Induced pluripotent stem cells (iPSCs) have considerably impacted human developmental biology and regenerative medicine, notably because they circumvent the use of cells from embryonic origin and offer the potential to generate patient-specific pluripotent stem cells. However, conventional reprogramming protocols produce developmentally advanced, or primed, human iPSCs (hiPSCs), restricting their use to postimplantation human development modeling. Hence, there is a need for hiPSCs resembling preimplantation naive epiblast. Here, we develop a method to generate naive hiPSCs directly from somatic cells, using OKMS overexpression and specific culture conditions, further enabling parallel generation of their isogenic primed counterparts. We benchmark naive hiPSCs against human preimplantation epiblast and reveal a remarkable concordance in their transcriptome, dependency on mitochondrial respiration and X chromosome status. Collectively, our results are essential for the understanding of pluripotency regulation throughout preimplantation development and generate new opportunities for disease modeling and regenerative medicine.

    Compound Heterozygosity for Y Box Proteins Causes Sterility Due to Loss of Translational Repression

    The Y-box proteins YBX2 and YBX3 bind RNA and DNA and are required for metazoan development and fertility. However, possible functional redundancy between YBX2 and YBX3 has prevented elucidation of their molecular function as RNA masking proteins and identification of their target RNAs. To investigate possible functional redundancy between YBX2 and YBX3, we attempted to construct Ybx2-/-;Ybx3-/- double mutants using a previously reported Ybx2-/- model and a newly generated global Ybx3-/- model. Loss of YBX3 resulted in reduced male fertility and defects in spermatid differentiation. However, homozygous double mutants could not be generated as haploinsufficiency of both Ybx2 and Ybx3 caused sterility characterized by extensive defects in spermatid differentiation. RNA sequence analysis of mRNP and polysome occupancy in single and compound Ybx2/3 heterozygotes revealed loss of translational repression almost exclusively in the compound Ybx2/3 heterozygotes. RNAseq analysis also demonstrated that Y-box protein dose-dependent loss of translational regulation was inversely correlated with the presence of a Y box recognition target sequence, suggesting that Y box proteins bind RNA hierarchically to modulate translation in a range of targets.

    The evolution of lncRNA repertoires and expression patterns in tetrapods.

    Only a very small fraction of long noncoding RNAs (lncRNAs) are well characterized. The evolutionary history of lncRNAs can provide insights into their functionality, but the absence of lncRNA annotations in non-model organisms has precluded comparative analyses. Here we present a large-scale evolutionary study of lncRNA repertoires and expression patterns in 11 tetrapod species. We identify approximately 11,000 primate-specific lncRNAs and 2,500 highly conserved lncRNAs, including approximately 400 genes that are likely to have originated more than 300 million years ago. We find that lncRNAs, in particular ancient ones, are in general actively regulated and may function predominantly in embryonic development. Most lncRNAs evolve rapidly in terms of sequence and expression levels, but tissue specificities are often conserved. We compared expression patterns of homologous lncRNA and protein-coding families across tetrapods to reconstruct an evolutionarily conserved co-expression network. This network suggests potential functions for lncRNAs in fundamental processes such as spermatogenesis and synaptic transmission, but also in more specific mechanisms such as placenta development through microRNA production.