17 research outputs found

    Flexible, non-parametric modeling using regularized neural networks

    Non-parametric, additive models are able to capture complex data dependencies in a flexible, yet interpretable way. However, choosing the format of the additive components often requires non-trivial data exploration. Here, as an alternative, we propose PrAda-net, a one-hidden-layer neural network, trained with proximal gradient descent and adaptive lasso. PrAda-net automatically adjusts the size and architecture of the neural network to reflect the complexity and structure of the data. The compact network obtained by PrAda-net can be translated to additive model components, making it suitable for non-parametric statistical modeling with automatic model selection. We demonstrate PrAda-net on simulated data, where we compare the test error performance, variable importance and variable subset identification properties of PrAda-net to other lasso-based regularization approaches for neural networks. We also apply PrAda-net to the massive U.K. black smoke data set, to demonstrate how PrAda-net can be used to model complex and heterogeneous data with spatial and temporal components. In contrast to classical, statistical non-parametric approaches, PrAda-net requires no preliminary modeling to select the functional forms of the additive components, yet still results in an interpretable model representation.
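The combination of proximal gradient descent and adaptive lasso mentioned in this abstract can be illustrated on a much simpler problem. The sketch below is not PrAda-net: it assumes a plain linear least-squares model instead of a one-hidden-layer network, and the function names, the `lam` and `gamma` parameters, and the choice of an unpenalized least-squares fit for the adaptive weights are all illustrative assumptions. It only shows how a gradient step followed by weighted soft-thresholding drives unneeded coefficients exactly to zero.

```python
import numpy as np


def soft_threshold(w, thresh):
    """Proximal operator of a (weighted) L1 penalty, applied elementwise."""
    return np.sign(w) * np.maximum(np.abs(w) - thresh, 0.0)


def adaptive_lasso_prox_gd(X, y, lam=0.1, gamma=1.0, n_iter=500):
    """Proximal gradient descent for linear regression with adaptive lasso.

    Assumption for illustration: the adaptive penalty weights come from an
    unpenalized least-squares fit, so small initial coefficients are
    penalized more heavily than large ones.
    """
    n, p = X.shape
    w_init, *_ = np.linalg.lstsq(X, y, rcond=None)
    penalty = lam / (np.abs(w_init) ** gamma + 1e-8)

    # Step size 1/L, where L is the Lipschitz constant of the gradient
    # of the smooth loss (1 / 2n) * ||Xw - y||^2.
    step = n / (np.linalg.norm(X, 2) ** 2)

    w = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n                          # gradient step
        w = soft_threshold(w - step * grad, step * penalty)   # proximal step
    return w


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = np.array([2.0, 0.0, -3.0, 0.0, 0.0])
y = X @ w_true + 0.1 * rng.normal(size=200)
w_hat = adaptive_lasso_prox_gd(X, y)
```

In this toy setting the coefficients whose true value is zero receive a large penalty and end up exactly zero, which is the kind of sparsity that a pruned network architecture relies on.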

    Generation and analysis of context-specific genome-scale metabolic models derived from single-cell RNA-Seq data

    Single-cell RNA sequencing combined with genome-scale metabolic models (GEMs) has the potential to unravel the differences in metabolism across both cell types and cell states but requires new computational methods. Here, we present a method for generating cell-type-specific genome-scale models from clusters of single-cell RNA-Seq profiles. Specifically, we developed a method to estimate the minimum number of cells required to pool to obtain stable models, a bootstrapping strategy for estimating statistical inference, and a faster version of the task-driven integrative network inference for tissues algorithm for generating context-specific GEMs. In addition, we evaluated the effect of different RNA-Seq normalization methods on model topology and differences in models generated from single-cell and bulk RNA-Seq data. We applied our methods on data from mouse cortex neurons and cells from the tumor microenvironment of lung cancer and in both cases found that almost every cell subtype had a unique metabolic profile. In addition, our approach was able to detect cancer-associated metabolic differences between cancer cells and healthy cells, showcasing its utility. We also contextualized models from 202 single-cell clusters across 19 human organs using data from Human Protein Atlas and made these available in the web portal Metabolic Atlas, thereby providing a valuable resource to the scientific community. With the ever-increasing availability of single-cell RNA-Seq datasets and continuously improved GEMs, their combination holds promise to become an important approach in the study of human metabolism.

    Modeling glioblastoma heterogeneity as a dynamic network of cell states

    Tumor cell heterogeneity is a crucial characteristic of malignant brain tumors and underpins phenomena such as therapy resistance and tumor recurrence. Advances in single-cell analysis have enabled the delineation of distinct cellular states of brain tumor cells, but the time-dependent changes in such states remain poorly understood. Here, we construct quantitative models of the time-dependent transcriptional variation of patient-derived glioblastoma (GBM) cells. We build the models by sampling and profiling barcoded GBM cells and their progeny over the course of 3 weeks and by fitting a mathematical model to estimate changes in GBM cell states and their growth rates. Our model suggests a hierarchical yet plastic organization of GBM, where the rates and patterns of cell state switching are partly patient-specific. Therapeutic interventions produce complex dynamic effects, including inhibition of specific states and altered differentiation. Our method provides a general strategy to uncover time-dependent changes in cancer cells and offers a way to evaluate and predict how therapy affects cell state composition.

    DSAVE: Detection of misclassified cells in single-cell RNA-Seq data

    Single-cell RNA sequencing has become a valuable tool for investigating cell types in complex tissues, where clustering of cells enables the identification and comparison of cell populations. Although many studies have sought to develop and compare different clustering approaches, a deeper investigation into the properties of the resulting populations is lacking. Specifically, the presence of misclassified cells can influence downstream analyses, highlighting the need to assess subpopulation purity and to detect such cells. We developed DSAVE (Down-SAmpling based Variation Estimation), a method to evaluate the purity of single-cell transcriptome clusters and to identify misclassified cells. The method utilizes down-sampling to eliminate differences in sampling noise and uses a log-likelihood based metric to help identify misclassified cells. In addition, DSAVE estimates the number of cells needed in a population to achieve a stable average gene expression profile within a certain gene expression range. We show that DSAVE can be used to find potentially misclassified cells that are not detectable by similar tools and reveal the cause of their divergence from the other cells, such as differing cell state or cell type. With the growing use of single-cell RNA-Seq, we foresee that DSAVE will be an increasingly useful tool for comparing and purifying subpopulations in single-cell RNA-Seq datasets.
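The down-sampling idea in this abstract can be sketched generically. The code below is not the DSAVE implementation: the function name `downsample_counts` and the choice of drawing reads without replacement are assumptions made for illustration. It shows one common way to reduce a cell's count vector to a fixed total, so that cells with different sequencing depths carry comparable sampling noise.

```python
import numpy as np


def downsample_counts(counts, target_total, rng):
    """Down-sample a vector of per-gene counts to a fixed total.

    Hypothetical helper: reads are drawn without replacement, so the
    result never exceeds the original count for any gene.
    """
    counts = np.asarray(counts, dtype=np.int64)
    if target_total > counts.sum():
        raise ValueError("target_total exceeds the number of available counts")
    return rng.multivariate_hypergeometric(counts, target_total)


rng = np.random.default_rng(42)
cell = np.array([50, 30, 20, 0])        # toy counts for four genes
down = downsample_counts(cell, 40, rng)  # same cell at a depth of 40 counts
```

After every cell in a cluster is brought to the same total, remaining differences between cells are easier to attribute to biology or misclassification rather than to sequencing depth.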

    Digital twins to personalize medicine

    Personalized medicine requires the integration and processing of vast amounts of data. Here, we propose a solution to this challenge that is based on constructing Digital Twins. These are high-resolution models of individual patients that are computationally treated with thousands of drugs to find the drug that is optimal for the patient.

    Generation and analysis of context-specific genome-scale metabolic models derived from single-cell RNA-Seq data

    Single-cell RNA sequencing has the potential to unravel the differences in metabolism across cell types and cell states in both the healthy and diseased human body. The use of existing knowledge in the form of genome-scale metabolic models (GEMs) holds promise to strengthen such analyses, but the combined use of these two methods requires new computational methods. Here, we present a method for generating cell-type-specific genome-scale models from clusters of single-cell RNA-Seq profiles. Specifically, we developed a method to estimate the number of cells required to pool to obtain stable models, a bootstrapping strategy for estimating statistical inference, and a faster version of the tINIT algorithm for generating context-specific GEMs. In addition, we evaluated the effect of different RNA-Seq normalization methods on model topology and differences in models generated from single-cell and bulk RNA-Seq data. We applied our methods on data from mouse cortex neurons and cells from the tumor microenvironment of lung cancer and in both cases found that almost every cell subtype had a unique metabolic profile, emphasizing the need to study them separately rather than to build models from bulk RNA-Seq data. In addition, our approach was able to detect cancer-associated metabolic differences between cancer cells and healthy cells, showcasing its utility. With the ever-increasing availability of single-cell RNA-Seq datasets and continuously improved GEMs, their combination holds promise to become an important approach in the study of human metabolism.