90 research outputs found

    A systematic evaluation of single cell RNA-seq analysis pipelines

    The recent rapid spread of single cell RNA sequencing (scRNA-seq) methods has created a large variety of experimental and computational pipelines for which best practices have not yet been established. Here, we use simulations based on five scRNA-seq library protocols in combination with nine realistic differential expression (DE) setups to systematically evaluate three mapping, four imputation, seven normalisation and four differential expression testing approaches, resulting in approximately 3000 pipelines and allowing us to also assess interactions among pipeline steps. We find that choices of normalisation and library preparation protocols have the biggest impact on scRNA-seq analyses. Specifically, we find that library preparation determines the ability to detect symmetric expression differences, while normalisation dominates pipeline performance in asymmetric DE setups. Finally, we illustrate the importance of informed choices by showing that a good scRNA-seq pipeline can have the same impact on detecting a biological signal as quadrupling the sample size.

    The impact of amplification on differential expression analyses by RNA-seq

    Currently, quantitative RNA-seq methods are pushed to work with increasingly small starting amounts of RNA that require amplification. However, it is unclear how much noise or bias amplification introduces and how this affects precision and accuracy of RNA quantification. To assess the effects of amplification, reads that originated from the same RNA molecule (PCR duplicates) need to be identified. Computationally, read duplicates are defined by their mapping position, which does not distinguish PCR duplicates from natural duplicates, and hence it is unclear how to treat duplicated reads. Here, we generate and analyse RNA-seq data sets prepared using three different protocols (Smart-Seq, TruSeq and UMI-seq). We find that a large fraction of computationally identified read duplicates are not PCR duplicates and can be explained by sampling and fragmentation bias. Consequently, the computational removal of duplicates improves neither accuracy nor precision and can actually worsen the power and the False Discovery Rate (FDR) for differential gene expression. Even when duplicates are experimentally identified by unique molecular identifiers (UMIs), power and FDR are only mildly improved. However, the pooling of samples as made possible by the early barcoding of the UMI-protocol leads to an appreciable increase in the power to detect differentially expressed genes.
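
    The distinction drawn in this abstract between position-based and UMI-based duplicate identification can be illustrated with a minimal sketch. It is not the authors' code: the read representation (dicts with the keys `chrom`, `pos`, `strand` and `umi`) and the function name `count_duplicates` are assumptions made purely for illustration, and real pipelines work on aligned BAM records and handle UMI sequencing errors.

```python
def count_duplicates(reads, use_umi=False):
    """Count reads flagged as duplicates of an earlier read.

    Each read is a dict with the hypothetical keys
    'chrom', 'pos', 'strand' and 'umi' (illustrative only).
    """
    seen = set()
    duplicates = 0
    for read in reads:
        # Position-only key: cannot tell PCR copies from natural duplicates.
        key = (read["chrom"], read["pos"], read["strand"])
        if use_umi:
            # Adding the UMI separates distinct molecules at the same position.
            key += (read["umi"],)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return duplicates


# Toy data (made up): reads 1 and 2 are PCR copies of one molecule;
# read 3 is a different molecule that happens to map to the same position.
reads = [
    {"chrom": "chr1", "pos": 100, "strand": "+", "umi": "AAAC"},
    {"chrom": "chr1", "pos": 100, "strand": "+", "umi": "AAAC"},
    {"chrom": "chr1", "pos": 100, "strand": "+", "umi": "GGTT"},
    {"chrom": "chr1", "pos": 250, "strand": "-", "umi": "AAAC"},
]

position_only = count_duplicates(reads)            # flags reads 2 and 3
with_umi = count_duplicates(reads, use_umi=True)   # flags read 2 only
print(position_only, with_umi)
```

    In this toy example the position-only rule discards a read from a genuinely distinct molecule, which is the kind of over-removal the abstract attributes to sampling and fragmentation bias.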

    Functional dissection of two amino acid substitutions unique to the human FOXP2 protein

    The transcription factor forkhead box P2 (FOXP2) is involved in the development of language and speech in humans. Two amino acid substitutions (T303N, N325S) occurred in the human FOXP2 after the divergence from the chimpanzee lineage. It has previously been shown that when they are introduced into the FOXP2 protein of mice they alter striatal synaptic plasticity by increasing long-term depression in medium spiny neurons. Here we introduce each of these amino acid substitutions individually into mice and analyze their effects in the striatum. We find that long-term depression in medium spiny neurons is increased in mice carrying only the T303N substitution to the same extent as in mice carrying both amino acid substitutions. In contrast, the N325S substitution has no discernable effects.

    Functional analysis of human and chimpanzee promoters

    BACKGROUND: It has long been argued that changes in gene expression may provide an additional and crucial perspective on the evolutionary differences between humans and chimpanzees. To investigate how often expression differences seen in tissues are caused by sequence differences in the proximal promoters, we tested the expression activity in cultured cells of human and chimpanzee promoters from genes that differ in mRNA expression between human and chimpanzee tissues. RESULTS: Twelve promoters for which the corresponding gene had been shown to be differentially expressed between humans and chimpanzees in liver or brain were tested. Seven showed a significant difference in activity between the human promoter and the orthologous chimpanzee promoter in at least one of the two cell lines used. However, only three of them showed a difference in the same direction as in the tissues. CONCLUSION: Differences in proximal promoter activity are likely to be common between humans and chimpanzees, but are not linked in a simple fashion to gene-expression levels in tissues. This suggests that several genetic differences between humans and chimpanzees might be responsible for a single expression difference and thus that relevant expression differences between humans and chimpanzees will be difficult to predict from cell culture experiments or DNA sequences.

    Retrotransposons as pathogenicity factors of the plant pathogenic fungus Botrytis cinerea

    BACKGROUND: Retrotransposons are genetic elements inducing mutations in all domains of life. Despite their detrimental effect, retrotransposons can become temporarily active during epigenetic reprogramming and cellular stress response, which may accelerate host genome evolution. In fungal pathogens, a positive role has been attributed to retrotransposons when shaping genome architecture and expression of genes encoding pathogenicity factors; thus, retrotransposons are known to influence pathogenicity. RESULTS: We uncover a hitherto unknown role of fungal retrotransposons as pathogenicity factors themselves. The aggressive fungal plant pathogen Botrytis cinerea is known to deliver some long terminal repeat (LTR)-derived regulatory trans-species small RNAs (BcsRNAs) into plant cells to suppress host gene expression for infection. We find that naturally occurring, less aggressive B. cinerea strains possess considerably lower copy numbers of LTR retrotransposons and have lost retrotransposon BcsRNA production. Using a transgenic proof-of-concept approach, we reconstitute retrotransposon expression in a BcsRNA-lacking B. cinerea strain, which results in enhanced aggressiveness in a retrotransposon and BcsRNA expression-dependent manner. Moreover, retrotransposon expression in B. cinerea leads to suppression of plant defence-related genes during infection. CONCLUSIONS: We propose that retrotransposons are pathogenicity factors that manipulate host plant gene expression by encoding trans-species BcsRNAs. Taken together, the novelty that retrotransposons are pathogenicity factors will have a broad impact on studies of host-microbe interactions and pathology.

    zUMIs - A fast and flexible pipeline to process RNA sequencing data with UMIs

    Background: Single-cell RNA-sequencing (scRNA-seq) experiments typically analyze hundreds or thousands of cells after amplification of the cDNA. The high throughput is made possible by the early introduction of sample-specific bar codes (BCs), and the amplification bias is alleviated by unique molecular identifiers (UMIs). Thus, the ideal analysis pipeline for scRNA-seq data needs to efficiently tabulate reads according to both BC and UMI. Findings: zUMIs is a pipeline that can handle both known and random BCs and also efficiently collapse UMIs, either just for exon mapping reads or for both exon and intron mapping reads. If BC annotation is missing, zUMIs can accurately detect intact cells from the distribution of sequencing reads. Another unique feature of zUMIs is the adaptive downsampling function that facilitates dealing with hugely varying library sizes but also allows the user to evaluate whether the library has been sequenced to saturation. To illustrate the utility of zUMIs, we analyzed a single-nucleus RNA-seq dataset and show that more than 35% of all reads map to introns. Also, we show that these intronic reads are informative about expression levels, significantly increasing the number of detected genes and improving the cluster resolution. Conclusions: zUMIs' flexibility makes it possible to accommodate data generated with any of the major scRNA-seq protocols that use BCs and UMIs, and zUMIs is the most feature-rich, fast, and user-friendly pipeline to process such scRNA-seq data.
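
    The idea of tabulating reads by both BC and UMI can be sketched in a few lines. This is not zUMIs code or its interface: the function name `umi_count_table` and the `(barcode, umi, gene)` tuple layout are hypothetical, and the sketch assumes reads have already been assigned to genes and that identical UMI strings denote the same molecule (no error correction).

```python
from collections import defaultdict


def umi_count_table(reads):
    """Collapse reads into unique-UMI counts per (barcode, gene).

    `reads` is an iterable of (barcode, umi, gene) tuples; this layout
    is an assumption for illustration, not a zUMIs data structure.
    """
    umis = defaultdict(set)
    for barcode, umi, gene in reads:
        # Reads sharing barcode, gene and UMI are counted as one molecule.
        umis[(barcode, gene)].add(umi)
    return {key: len(values) for key, values in umis.items()}


# Toy input (made up): the first two reads are amplification copies.
reads = [
    ("ACGT", "AAAA", "GeneA"),
    ("ACGT", "AAAA", "GeneA"),
    ("ACGT", "CCCC", "GeneA"),
    ("ACGT", "AAAA", "GeneB"),
    ("TTGG", "AAAA", "GeneA"),
]

counts = umi_count_table(reads)
for (barcode, gene), n in sorted(counts.items()):
    print(barcode, gene, n)
```

    Using a set per (barcode, gene) pair keeps the collapse step a single pass over the reads; the same UMI seen in a different cell or gene is counted separately.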

    FUNC: a package for detecting significant associations between gene sets and ontological annotations

    BACKGROUND: Genome-wide expression, sequence and association studies typically yield large sets of gene candidates, which must then be further analysed and interpreted. Information about these genes is increasingly being captured and organized in ontologies, such as the Gene Ontology. Relationships between the gene sets identified by experimental methods and biological knowledge can be made explicit and used in the interpretation of results. However, it is often difficult to assess the statistical significance of such analyses since many inter-dependent categories are tested simultaneously. RESULTS: We developed the program package FUNC that includes and expands on currently available methods to identify significant associations between gene sets and ontological annotations. Implemented are several tests particularly well suited to genome-wide sequence comparisons, estimates of the family-wise error rate and the false discovery rate, a sensitive estimator of the global significance of the results, and an algorithm to reduce the complexity of the results. CONCLUSION: FUNC is a versatile and useful tool for the analysis of genome-wide data. It is freely available under the GPL license and also accessible via a web service.

    Mice carrying a human GLUD2 gene recapitulate aspects of human transcriptome and metabolome development

    Whereas all mammals have one glutamate dehydrogenase gene (GLUD1), humans and apes carry an additional gene (GLUD2), which encodes an enzyme with distinct biochemical properties. We inserted a bacterial artificial chromosome containing the human GLUD2 gene into mice and analyzed the resulting changes in the transcriptome and metabolome during postnatal brain development. Effects were most pronounced early postnatally, and predominantly genes involved in neuronal development were affected. Remarkably, the effects in the transgenic mice partially parallel the transcriptome and metabolome differences seen between the humans and macaques analyzed. Notably, the introduction of GLUD2 did not affect glutamate levels in mice, consistent with observations in the primates. Instead, the metabolic effects of GLUD2 center on the tricarboxylic acid cycle, suggesting that GLUD2 affects carbon flux during early brain development, possibly supporting lipid biosynthesis.

    Primate iPS cells as tools for evolutionary analyses

    Induced pluripotent stem cells (iPSCs) are regarded as a central tool to understand human biology in health and disease. Similarly, iPSCs from non-human primates should be a central tool to understand human evolution, in particular for assessing the conservation of regulatory networks in iPSC models. Here, we have generated human, gorilla, bonobo and cynomolgus monkey iPSCs and assessed their usefulness in such a framework. We show that these cells are comparable in their differentiation potential and are generally similar to human, cynomolgus and rhesus monkey embryonic stem cells (ESCs). RNA sequencing reveals that expression differences among clones, individuals and stem cell type are all of very similar magnitude within a species. In contrast, expression differences between closely related primate species are three times larger and most genes show significant expression differences among the analyzed species. However, pseudogenes differ more than twice as much, suggesting that evolution of expression levels in primate stem cells is rapid, but constrained. These patterns in pluripotent stem cells are comparable to those found in other tissues except testis. Hence, primate iPSCs reveal insights into general primate gene expression evolution and should provide a rich source to identify conserved and species-specific gene expression patterns for cellular phenotypes.