30 research outputs found

    Single-cell transcriptional uncertainty landscape of cell differentiation [version 2; peer review: 2 approved]

    Background: Single-cell studies have demonstrated the presence of significant cell-to-cell heterogeneity in gene expression. Whether such heterogeneity is only a bystander or has a functional role in the cell differentiation process is still hotly debated. Methods: In this study, we quantified and followed single-cell transcriptional uncertainty – a measure of gene transcriptional stochasticity in single cells – in 10 cell differentiation systems of varying cell lineage progressions, from single to multi-branching trajectories, using the stochastic two-state gene transcription model. Results: By visualizing the transcriptional uncertainty as a landscape over a two-dimensional representation of the single-cell gene expression data, we observed universal features in the cell differentiation trajectories that include: (i) a peak in single-cell uncertainty during transition states, and in systems with bifurcating differentiation trajectories, each branching point represents a state of high transcriptional uncertainty; (ii) a positive correlation of transcriptional uncertainty with transcriptional burst size and frequency; (iii) an increase in RNA velocity preceding the increase in the cell transcriptional uncertainty. Conclusions: Our findings suggest a possible universal mechanism during the cell differentiation process, in which stem cells engage stochastic exploratory dynamics of gene expression at the start of the cell differentiation by increasing gene transcriptional bursts, and disengage such dynamics once cells have decided on a particular terminal cell identity. Notably, the peak of single-cell transcriptional uncertainty signifies the decision-making point in the cell differentiation process
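    The stochastic two-state gene transcription model mentioned in this abstract can be illustrated with a minimal simulation sketch. It assumes a standard Gillespie scheme; the function name `simulate_two_state` and the parameter names `k_on`, `k_off`, `s`, `d` are illustrative and do not come from the paper. In this parametrisation, burst frequency is usually associated with `k_on` and mean burst size with `s / k_off`. The sketch does not compute the paper's transcriptional uncertainty measure.

```python
import random


def simulate_two_state(k_on, k_off, s, d, t_end, seed=0):
    """Gillespie simulation of a two-state (telegraph) gene model.

    The promoter switches OFF -> ON at rate k_on and ON -> OFF at rate k_off.
    While ON, mRNA is produced at rate s; each mRNA degrades at rate d.
    Returns the mRNA count of one simulated cell at time t_end.
    All names are hypothetical, chosen for illustration only.
    """
    rng = random.Random(seed)
    t, promoter_on, mrna = 0.0, False, 0
    while True:
        switch_rate = k_off if promoter_on else k_on
        synth_rate = s if promoter_on else 0.0
        decay_rate = d * mrna
        total = switch_rate + synth_rate + decay_rate
        # Waiting time to the next reaction is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            return mrna
        # Pick which reaction fires, proportionally to its rate.
        u = rng.random() * total
        if u < switch_rate:
            promoter_on = not promoter_on
        elif u < switch_rate + synth_rate:
            mrna += 1
        elif mrna > 0:
            mrna -= 1


if __name__ == "__main__":
    # Simulate a small population of independent cells.
    cells = [simulate_two_state(1.0, 1.0, 20.0, 1.0, 20.0, seed=i) for i in range(300)]
    mean = sum(cells) / len(cells)
    var = sum((c - mean) ** 2 for c in cells) / len(cells)
    print(f"mean={mean:.2f}, Fano factor={var / mean:.2f}")
```

    A Fano factor well above 1 in the printed output indicates bursty, over-dispersed expression relative to a Poisson model, which is the kind of cell-to-cell variability the abstract refers to.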

    Harissa: Stochastic Simulation and Inference of Gene Regulatory Networks Based on Transcriptional Bursting

    Gene regulatory networks, as a powerful abstraction for describing complex biological interactions between genes through their expression products within a cell, are often regarded as virtually deterministic dynamical systems. However, this view is now being challenged by the fundamentally stochastic, ‘bursty’ nature of gene expression revealed at the single-cell level. We present a Python package called Harissa which is dedicated to simulation and inference of such networks, based upon an underlying stochastic dynamical model driven by the transcriptional bursting phenomenon. As part of this tool, network inference can be interpreted as a calibration procedure for a mechanistic model: once calibrated, the model is able to capture the typical variability of single-cell data without requiring ad hoc external noise, unlike ordinary or even stochastic differential equations frequently used in this context. Therefore, Harissa can be used both as an inference tool, to reconstruct biologically relevant networks from time-course scRNA-seq data, and as a simulation tool, to generate quantitative gene expression profiles in a non-trivial way through gene interactions.

    Stochastic Gene Expression with a Multistate Promoter: Breaking Down Exact Distributions

    We consider a stochastic model of gene expression in which transcription depends on a multistate promoter, including the famous two-state model and refractory promoters as special cases, and focus on deriving the exact stationary distribution. Building upon several successful approaches, we present a more unified viewpoint that enables us to simplify and generalize existing results. In particular, the original jump process is deeply related to a multivariate piecewise-deterministic Markov process that may also be of interest beyond the biological field. In a very particular case of promoter configuration, this underlying process is shown to have a simple Dirichlet stationary distribution. In the general case, the corresponding marginal distributions extend the well-known class of Beta products, involving complex parameters that directly relate to spectral properties of the promoter transition matrix. Finally, we illustrate these results with biologically plausible examples.
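    As a worked illustration of the two-state special case named in this abstract, the classical result can be written as follows. The notation (promoter state E, mRNA level X, mRNA count M, rates k_on, k_off, s, d) is assumed here for illustration and is not taken from the paper, whose multistate results are more general.

```latex
% Two-state special case (notation assumed for illustration).
% The promoter state E(t) in {0,1} switches 0 -> 1 at rate k_on and
% 1 -> 0 at rate k_off; between switches, X(t) follows a linear ODE.
\begin{align}
  \frac{dX}{dt} &= s\,E(t) - d\,X(t), \\
  \frac{d}{s}\,X &\sim \mathrm{Beta}\!\left(\frac{k_{\mathrm{on}}}{d},\ \frac{k_{\mathrm{off}}}{d}\right)
  \quad \text{(stationary regime)}, \\
  M \mid X &\sim \mathrm{Poisson}(X)
  \quad \Longrightarrow \quad
  \mathbb{E}[M] = \frac{s}{d}\,\frac{k_{\mathrm{on}}}{k_{\mathrm{on}} + k_{\mathrm{off}}}.
\end{align}
```

    The last line is the usual Poisson-Beta mixture for the discrete two-state model; its mean follows from the mean of the Beta distribution.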

    Harissa: tools for mechanistic gene network inference from single-cell data

    Harissa (HARtree approximation for Inference along with a Stochastic Simulation Algorithm) is a Python package for both inference and simulation of gene regulatory networks, based on stochastic gene expression with transcriptional bursting. It was implemented in the context of a mechanistic approach to gene regulatory network inference from single-cell data

    Modélisation stochastique de l’expression des gènes et inférence de réseaux de régulation

    Gene expression in cells has long been observable only through averaged quantities over cell populations. The recent development of single-cell techniques has enabled RNA and protein levels to be measured in individual cells: it turns out that even in an isogenic population, the molecular variability between cells can be very large. In particular, an average description is not sufficient to account for cell differentiation, that is, the way stem cells make specialization choices. In this thesis, we are interested in the emergence of such cell decision-making from underlying gene regulatory networks, which we would like to infer from data. The starting point is the construction of a stochastic gene network model that is able to explain the data using physical arguments. Genes are then seen as an interacting particle system that happens to be a piecewise-deterministic Markov process, and our aim is to derive a tractable statistical model from its invariant distribution. We present two approaches: the first one is a self-consistent field approximation that is popular in physics, for which we obtain a concentration result, and the second one is based on an analytically tractable particular case, which provides a hidden Markov random field with interesting properties.

    Gene regulatory network inference from single-cell data using a self-consistent proteomic field

    The well-known issue of reconstructing regulatory networks from gene expression measurements has been somewhat disrupted by the emergence and rapid development of single-cell data. Indeed, the traditional way of seeing a gene regulatory network as a deterministic system affected by small noise is being challenged by the highly stochastic, bursty nature of gene expression revealed at single-cell level. In previous work, we described a promising strategy in which network inference is seen as a calibration procedure for a mechanistic model driven by transcriptional bursting: this model inherently captures the typical variability of single-cell data without requiring ad hoc external noise, unlike ordinary or even stochastic differential equations often used in this context. The resulting algorithm, based on approximate resolution of the related master equation using a self-consistent field, was derived in detail but only applied as a proof of concept to simulated two-gene networks. Here we derive a simplified version of the algorithm and apply it, in more relevant situations, to both simulated and real single-cell RNA-Seq data. We point out three interesting features of this approach: it is computationally tractable with realistic numbers of cells and genes, it provides inferred networks with biological interpretability, and the underlying mechanistic model allows testable predictions to be made. A practical implementation of the inference procedure, together with an efficient stochastic simulation algorithm for the model, is available as a Python package

    From stochastic modelling of gene expression to inference of regulatory networks

    Gene expression in a cell has long been observable only through averaged quantities over cell populations. The recent development of single-cell techniques has enabled RNA and protein levels to be measured in individual cells: it turns out that even in an isogenic population, the molecular variability between cells can be very large. In particular, an averaged description is not sufficient to account for cell differentiation, that is, the way stem cells make specialization choices. In this thesis, we are interested in the emergence of such cell decision-making from underlying gene regulatory networks, which we would like to infer from data. The starting point is the construction of a stochastic gene network model that is able to explain the data using physical arguments. Genes are then seen as an interacting particle system that happens to be a piecewise-deterministic Markov process, and our aim is to derive a tractable statistical model from its stationary distribution. We present two approaches: the first one is a field approximation that is popular in physics, for which we obtain a concentration result, and the second one is based on an analytically tractable particular case, which provides a hidden Markov random field with interesting properties.
