494 research outputs found

    Data Exploration, Quality Control and Testing in Single-Cell qPCR-Based Gene Expression Experiments

    Cell populations are never truly homogeneous; individual cells exist in biochemical states that define functional differences between them. New technology based on microfluidic arrays combined with multiplexed quantitative polymerase chain reactions (qPCR) now enables high-throughput single-cell gene expression measurement, allowing assessment of cellular heterogeneity. However, very few analytic tools have been developed specifically for the statistical and analytical challenges of single-cell qPCR data. We present a statistical framework for the exploration, quality control, and analysis of single-cell gene expression data from microfluidic arrays. We assess accuracy and within-sample heterogeneity of single-cell expression and develop quality control criteria to filter unreliable cell measurements. We propose a statistical model accounting for the fact that genes at the single-cell level can be on (in which case a continuous expression measure is recorded) or dichotomously off (in which case the recorded expression is zero). Based on this model, we derive a combined likelihood-ratio test for differential expression that incorporates both the discrete and continuous components. Using an experiment that examines treatment-specific changes in expression, we show that this combined test is more powerful than either the continuous or dichotomous component in isolation, or a t-test on the zero-inflated data. While developed for measurements from a specific platform (Fluidigm), these tools are generalizable to other multi-parametric measures over large numbers of events. Comment: 9 pages, 5 figures
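
    The idea of combining a discrete and a continuous component into one test can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes one gene measured in two groups of cells, zeros meaning "off", a Bernoulli model for detection, a Gaussian model with a common variance for the positive values, and the usual large-sample chi-square approximation with 2 degrees of freedom for the summed statistic. The function names (binom_loglik, rss, combined_lrt) are hypothetical.

```python
import math

def binom_loglik(k, n):
    """Maximized Bernoulli log-likelihood for k detections in n cells."""
    if k == 0 or k == n:
        return 0.0
    p = k / n
    return k * math.log(p) + (n - k) * math.log(1 - p)

def rss(values):
    """Residual sum of squares around the sample mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def combined_lrt(group_a, group_b):
    """Return (statistic, p_value) for a two-part likelihood-ratio test.

    Assumption: expression values are >= 0 and 0 means 'not detected'.
    """
    pos_a = [x for x in group_a if x > 0]
    pos_b = [x for x in group_b if x > 0]

    # Discrete part: do the detection frequencies differ between groups?
    ll_alt = binom_loglik(len(pos_a), len(group_a)) + binom_loglik(len(pos_b), len(group_b))
    ll_null = binom_loglik(len(pos_a) + len(pos_b), len(group_a) + len(group_b))
    stat_discrete = 2.0 * (ll_alt - ll_null)

    # Continuous part: do the means of the positive values differ?
    if not pos_a or not pos_b:
        stat_continuous = 0.0  # no positive values in one group: uninformative
    else:
        rss_alt = rss(pos_a) + rss(pos_b)
        if rss_alt <= 0:
            raise ValueError("zero within-group variance; test is degenerate")
        pooled = pos_a + pos_b
        stat_continuous = len(pooled) * math.log(rss(pooled) / rss_alt)

    stat = stat_discrete + stat_continuous
    # Chi-square survival function with 2 degrees of freedom is exp(-x / 2).
    return stat, math.exp(-stat / 2.0)
```

    Calling combined_lrt on two lists of per-cell values returns the summed statistic and its approximate p-value; identical groups give a statistic of zero.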

    Graphical models for zero-inflated single cell gene expression

    Bulk gene expression experiments relied on aggregations of thousands of cells to measure the average expression in an organism. Advances in microfluidic and droplet sequencing now permit expression profiling in single cells. This study of cell-to-cell variation reveals that individual cells lack detectable expression of transcripts that appear abundant on a population level, giving rise to zero-inflated expression patterns. To infer gene co-regulatory networks from such data, we propose a multivariate Hurdle model. It comprises a mixture of singular Gaussian distributions. We employ neighborhood selection with the pseudo-likelihood and a group lasso penalty to select and fit undirected graphical models that capture conditional independences between genes. The proposed method is more sensitive than existing approaches in simulations, even under departures from our Hurdle model. The method is applied to data for T follicular helper cells, and a high-dimensional profile of mouse dendritic cells. It infers network structure not revealed by other methods or in bulk data sets. An R implementation is available at https://github.com/amcdavid/HurdleNormal. Comment: Fixed error in software URL

    MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data

    Single-cell transcriptomics reveals gene expression heterogeneity but suffers from stochastic dropout and characteristic bimodal expression distributions in which expression is either strongly non-zero or non-detectable. We propose a two-part, generalized linear model for such bimodal data that parameterizes both of these features. We argue that the cellular detection rate, the fraction of genes expressed in a cell, should be adjusted for as a source of nuisance variation. Our model provides gene set enrichment analysis tailored to single-cell data. It provides insights into how networks of co-expressed genes evolve across an experimental treatment. MAST is available at https://github.com/RGLab/MAST
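
    The cellular detection rate described above, the fraction of genes expressed in a cell, can be computed with a short sketch. This is an illustration only and does not use the MAST package; it assumes a cells-by-genes matrix stored as a list of rows, and the function name and the threshold parameter are hypothetical.

```python
def cellular_detection_rate(expression, threshold=0.0):
    """Fraction of genes with expression above `threshold`, per cell.

    Assumption: `expression` is a list of rows, one row per cell,
    one column per gene, with non-detected genes recorded as 0.
    """
    rates = []
    for cell in expression:
        detected = sum(1 for value in cell if value > threshold)
        rates.append(detected / len(cell))
    return rates

# Two cells measured on four genes.
cdr = cellular_detection_rate([[0, 1, 2, 0], [3, 0, 0, 0]])
print(cdr)
```

    The resulting per-cell values could then be included as a covariate in a regression model, which is one way to adjust for them as nuisance variation.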

    QUAliFiER: An automated pipeline for quality assessment of gated flow cytometry data


    Clinical, radiologic, pathologic, and molecular characteristics of long-term survivors of diffuse intrinsic pontine glioma (DIPG): a collaborative report from the International and European Society for Pediatric Oncology DIPG registries

    Purpose: Diffuse intrinsic pontine glioma (DIPG) is a brainstem malignancy with a median survival of < 1 year. The International and European Society for Pediatric Oncology DIPG Registries collaborated to compare clinical, radiologic, and histomolecular characteristics between short-term survivors (STSs) and long-term survivors (LTSs). Materials and Methods: Data abstracted from registry databases included patients from North America, Australia, Germany, Austria, Switzerland, the Netherlands, Italy, France, the United Kingdom, and Croatia. Results: Among 1,130 pediatric and young adult patients with radiographically confirmed DIPG, 122 (11%) were excluded. Of the 1,008 remaining patients, 101 (10%) were LTSs (survival ≥ 2 years). Median survival time was 11 months (interquartile range, 7.5 to 16 months), and 1-, 2-, 3-, 4-, and 5-year survival rates were 42.3% (95% CI, 38.1% to 44.1%), 9.6% (95% CI, 7.8% to 11.3%), 4.3% (95% CI, 3.2% to 5.8%), 3.2% (95% CI, 2.4% to 4.6%), and 2.2% (95% CI, 1.4% to 3.4%), respectively. LTSs, compared with STSs, more commonly presented at age < 3 or > 10 years (11% v 3% and 33% v 23%, respectively; P < .001) and with longer symptom duration (P < .001). STSs, compared with LTSs, more commonly presented with cranial nerve palsy (83% v 73%, respectively; P = .008), ring enhancement (38% v 23%, respectively; P = .007), necrosis (42% v 26%, respectively; P = .009), and extrapontine extension (92% v 86%, respectively; P = .04). LTSs more commonly received systemic therapy at diagnosis (88% v 75% for STSs; P = .005). Biopsies and autopsies were performed in 299 patients (30%) and 77 patients (10%), respectively; 181 tumors (48%) were molecularly characterized. LTSs were more likely to harbor a HIST1H3B mutation (odds ratio, 1.28; 95% CI, 1.1 to 1.5; P = .002). Conclusion: We report clinical, radiologic, and molecular factors that correlate with survival in children and young adults with DIPG, which are important for risk stratification in future clinical trials.

    Lateral gene transfer and ancient paralogy of operons containing redundant copies of tryptophan-pathway genes in Xylella species and in heterocystous cyanobacteria

    BACKGROUND: Tryptophan-pathway genes that exist within an apparent operon-like organization were evaluated as examples of multi-genic genomic regions that contain phylogenetically incongruous genes and coexist with genes outside the operon that are congruous. A seven-gene cluster in Xylella fastidiosa includes genes encoding the two subunits of anthranilate synthase, an aryl-CoA synthetase, and trpR. A second gene block, present in the Anabaena/Nostoc lineage, but not in other cyanobacteria, contains a near-complete tryptophan operon nested within an apparent supraoperon containing other aromatic-pathway genes. RESULTS: The gene block in X. fastidiosa exhibits a sharply delineated low-GC content. This, as well as bias of codon usage and 3:1 dinucleotide analysis, strongly implicates lateral gene transfer (LGT). In contrast, parametric studies and protein tree phylogenies did not support the origination of the Anabaena/Nostoc gene block by LGT. CONCLUSIONS: Judging from the apparent minimal amelioration, the low-GC gene block in X. fastidiosa probably originated by LGT at a relatively recent time. The surprising inability to pinpoint a donor lineage still leaves room for alternative, albeit less likely, explanations other than LGT. On the other hand, the large Anabaena/Nostoc gene block does not seem to have arisen by LGT. We suggest that the contemporary Anabaena/Nostoc array of divergent paralogs represents an ancient ancestral state of paralog divergence, with extensive streamlining by gene loss occurring in the lineage of descent representing other (unicellular) cyanobacteria
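
    One simple way to look for a sharply delineated low-GC region of the kind mentioned above is a sliding-window GC scan. The following is a minimal sketch under stated assumptions, not the analysis used in the study: the sequence is a plain string of A, C, G, and T, and the function names, window size, and step are hypothetical choices.

```python
def gc_content(seq):
    """Fraction of G and C bases in a DNA string (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "GC") / len(seq)

def sliding_gc(seq, window, step):
    """GC fraction for each full window, as (start_index, gc_fraction) pairs."""
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]

# Toy sequence: a GC-rich flank, an AT-rich block, and another GC-rich flank.
toy = "GCGCGCGC" + "ATATATAT" + "GCGCGCGC"
for start, gc in sliding_gc(toy, window=8, step=8):
    print(start, gc)
```

    A window whose GC fraction drops well below that of its neighbours marks a candidate region; on real genomes the window and step would need to be much larger than in this toy example.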