13 research outputs found

    Corrigendum: Characterizing noise structure in single-cell RNA-seq distinguishes genuine from technical stochastic allelic expression.

    Nature Communications 6: Article number: 8687 (2015); Published: 22 October 2015; Updated: 11 January 2016. The original version of this Article contained an error in the spelling of the author Tomislav Ilicic, which was incorrectly given as Tomislav Illicic. This has now been corrected in both the PDF and HTML versions of the Article.

    Characterizing noise structure in single-cell RNA-seq distinguishes genuine from technical stochastic allelic expression.

    Single-cell RNA-sequencing (scRNA-seq) facilitates identification of new cell types and gene regulatory networks as well as dissection of the kinetics of gene expression and patterns of allele-specific expression. However, to facilitate such analyses, separating biological variability from the high level of technical noise that affects scRNA-seq protocols is vital. Here we describe and validate a generative statistical model that accurately quantifies technical noise with the help of external RNA spike-ins. Applying our approach to investigate stochastic allele-specific expression in individual cells, we demonstrate that a large fraction of stochastic allele-specific expression can be explained by technical noise, especially for lowly and moderately expressed genes: we predict that only 17.8% of stochastic allele-specific expression patterns are attributable to biological noise, with the remainder due to technical noise.

    Single Cell RNA-Sequencing of Pluripotent States Unlocks Modular Transcriptional Variation

    Embryonic stem cell (ESC) culture conditions are important for maintaining long-term self-renewal, and they influence cellular pluripotency state. Here, we report single cell RNA-sequencing of mESCs cultured in three different conditions: serum, 2i, and the alternative ground state a2i. We find that the cellular transcriptomes of cells grown in these conditions are distinct, with 2i being the most similar to blastocyst cells and including a subpopulation resembling the two-cell embryo state. Overall levels of intercellular gene expression heterogeneity are comparable across the three conditions. However, this masks variable expression of pluripotency genes in serum cells and homogeneous expression in 2i and a2i cells. Additionally, genes related to the cell cycle are more variably expressed in the 2i and a2i conditions. Mining of our dataset for correlations in gene expression allowed us to identify additional components of the pluripotency network, including Ptma and Zfp640, illustrating its value as a resource for future discovery.

    Classification of low quality cells from single-cell RNA-seq data.

    Single-cell RNA sequencing (scRNA-seq) has broad applications across biomedical research. One of the key challenges is to ensure that only single, live cells are included in downstream analysis, as the inclusion of compromised cells inevitably affects data interpretation. Here, we present a generic approach for processing scRNA-seq data and detecting low quality cells, using a curated set of over 20 biological and technical features. Our approach improves classification accuracy by over 30% compared to traditional methods when tested on over 5,000 cells, including CD4+ T cells, bone marrow dendritic cells, and mouse embryonic stem cells.

    Signatures of mutational processes in human cancer.

    All cancers are caused by somatic mutations; however, understanding of the biological processes generating these mutations is limited. The catalogue of somatic mutations from a cancer genome bears the signatures of the mutational processes that have been operative. Here we analysed 4,938,362 mutations from 7,042 cancers and extracted more than 20 distinct mutational signatures. Some are present in many cancer types, notably a signature attributed to the APOBEC family of cytidine deaminases, whereas others are confined to a single cancer class. Certain signatures are associated with age of the patient at cancer diagnosis, known mutagenic exposures or defects in DNA maintenance, but many are of cryptic origin. In addition to these genome-wide mutational signatures, hypermutation localized to small genomic regions, 'kataegis', is found in many cancer types. The results reveal the diversity of mutational processes underlying the development of cancer, with potential implications for understanding of cancer aetiology, prevention and therapy.

    Additional file 6: Figure S3. of Classification of low quality cells from single-cell RNA-seq data

    Post-QC outliers and SVM performance evaluation. (A) Visualization of low and high quality cells after outlier detection with traditional and with our PCA feature-based methods. (B) Schematic of nested cross-validation. The training set was split twice into 10 folds. The inner folds were important to estimate optimal hyperparameters, whereas the outer folds served to measure accuracy. Optimal hyperparameters were saved for later use. (C) Sensitivity and specificity of feature-based PCA and SVM using TPM values. (PDF 558 kb)
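
The nested cross-validation scheme in panel B can be illustrated with a minimal sketch. It assumes a toy k-nearest-neighbour classifier on synthetic one-dimensional data in place of the authors' SVM and cell features, and every function name here is hypothetical. The inner 10 folds pick the hyperparameter, and the outer 10 folds measure accuracy on cells never used for that choice.

```python
import random
from statistics import mean


def k_fold_indices(n_items, k, seed):
    """Shuffle positions 0..n_items-1 and deal them into k near-equal folds."""
    order = list(range(n_items))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


def split(folds, held_out):
    """Return (training positions, held-out positions) for one fold."""
    train = [p for f, fold in enumerate(folds) if f != held_out for p in fold]
    return train, folds[held_out]


def take(values, idx):
    """Select the entries of values at the given positions."""
    return [values[i] for i in idx]


def knn_fit(xs, ys, k):
    """Toy stand-in for the classifier (assumption): k-NN memorises its data."""
    return xs, ys, k


def knn_accuracy(model, xs, ys):
    """Fraction of (xs, ys) that the k-NN model labels correctly."""
    train_x, train_y, k = model
    hits = []
    for x, y in zip(xs, ys):
        nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
        vote = 1 if 2 * sum(train_y[i] for i in nearest) > k else 0
        hits.append(1.0 if vote == y else 0.0)
    return mean(hits)


def nested_cv(xs, ys, candidates, k_folds=10, seed=0):
    """Outer folds estimate accuracy; inner folds choose the hyperparameter."""
    outer = k_fold_indices(len(xs), k_folds, seed)
    accuracies, chosen = [], []
    for o in range(k_folds):
        train_idx, test_idx = split(outer, o)
        # Second split: the outer training set is divided into inner folds.
        inner = k_fold_indices(len(train_idx), k_folds, seed + 1)

        def inner_score(param):
            scores = []
            for j in range(k_folds):
                fit_pos, val_pos = split(inner, j)
                fit_idx = take(train_idx, fit_pos)
                val_idx = take(train_idx, val_pos)
                model = knn_fit(take(xs, fit_idx), take(ys, fit_idx), param)
                scores.append(knn_accuracy(model, take(xs, val_idx), take(ys, val_idx)))
            return mean(scores)

        best = max(candidates, key=inner_score)
        chosen.append(best)  # saved for later use, as in the caption
        model = knn_fit(take(xs, train_idx), take(ys, train_idx), best)
        accuracies.append(knn_accuracy(model, take(xs, test_idx), take(ys, test_idx)))
    return mean(accuracies), chosen


# Synthetic, well-separated two-class data (illustrative only).
xs = [i * 0.01 for i in range(50)] + [5 + i * 0.01 for i in range(50)]
ys = [0] * 50 + [1] * 50
accuracy, chosen = nested_cv(xs, ys, candidates=[1, 3, 5])
print(accuracy, chosen)
```

Because the hyperparameter is selected using only the inner folds, the outer-fold accuracy is not biased by that selection, which is the point of nesting the two loops.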

    Additional file 4: Table S3. of Classification of low quality cells from single-cell RNA-seq data

    P values of t-test comparing features between each type of low quality and high quality cells (training mES dataset). (TXT 1 kb)

    Additional file 7: Figure S4. of Classification of low quality cells from single-cell RNA-seq data

    Datasets distant from mES training data. (A) Comparing log normalized UMI counts (y-axis) and log normalized read counts (x-axis) for each gene in 960 mESCs. (B) PCA of first two principal components of all features. Low quality cells separate from high quality cells. (C, D) PCA plot of features of two published human cancer cell datasets [28, 53]. Boxplots on the left and bottom show the top three features separating low from high quality cells for PC1 and PC2, respectively. They align with our previous findings that the mtDNA and ERCC to mapped reads ratios are upregulated in low quality cells. (E) Feature-based PCA combining mouse ES training set and two published human cancer datasets. ‘Cytoplasm’ separates not only the human from the mouse but also the two different cancer samples from each other, meaning that the features trained on mouse cells are not directly transferable to human cancer cells. (PDF 591 kb)

    Additional file 1: Figure S1. of Classification of low quality cells from single-cell RNA-seq data

    Overview of single cell RNA sequencing datasets. (A) Total number of cells per dataset. (B) Number of high quality and low quality cells per dataset. (C) Proportion of each type of low quality cells (broken, empty, multiple). (D) Number of cells for 2i/LIF, alternative 2i/LIF, and serum/LIF condition for the training dataset (960 mESCs). (PDF 441 kb)