Trajectory-based differential expression analysis for single-cell sequencing data
Trajectory inference has radically enhanced single-cell RNA-seq research by enabling the study of dynamic changes in gene expression. Downstream of trajectory inference, it is vital to discover genes that are (i) associated with the lineages in the trajectory, or (ii) differentially expressed between lineages, to illuminate the underlying biological processes. Current data analysis procedures, however, either fail to exploit the continuous resolution provided by trajectory inference, or fail to pinpoint the exact types of differential expression. We introduce tradeSeq, a powerful generalized additive model framework based on the negative binomial distribution that allows flexible inference of both within-lineage and between-lineage differential expression. By incorporating observation-level weights, the model can additionally account for zero inflation. We evaluate the method on simulated datasets and on real datasets from droplet-based and full-length protocols, and show that it yields biological insights through a clear interpretation of the data. Downstream of trajectory inference for cell lineages based on scRNA-seq data, differential expression analysis yields insight into biological processes. Here, Van den Berge et al. develop tradeSeq, a framework for the inference of within-lineage and between-lineage differential expression, based on negative binomial generalized additive models.
Manifold Interpolating Optimal-Transport Flows for Trajectory Inference
We present a method called Manifold Interpolating Optimal-Transport Flow
(MIOFlow) that learns stochastic, continuous population dynamics from static
snapshot samples taken at sporadic timepoints. MIOFlow combines dynamic models,
manifold learning, and optimal transport by training neural ordinary
differential equations (Neural ODE) to interpolate between static population
snapshots as penalized by optimal transport with manifold ground distance.
Further, we ensure that the flow follows the geometry by operating in the
latent space of an autoencoder that we call a geodesic autoencoder (GAE). In
GAE the latent space distance between points is regularized to match a novel
multiscale geodesic distance on the data manifold that we define. We show that
this method is superior to normalizing flows, Schrödinger bridges and other
generative models that are designed to flow from noise to data in terms of
interpolating between populations. Theoretically, we link these trajectories
with dynamic optimal transport. We evaluate our method on simulated data with
bifurcations and merges, as well as scRNA-seq data from embryoid body
differentiation and acute myeloid leukemia treatment. Comment: Presented at NeurIPS 2022, 24 pages, 7 tables, 14 figures
Simulating multiple faceted variability in single cell RNA sequencing.
The abundance of new computational methods for processing and interpreting transcriptomes at a single cell level raises the need for in silico platforms for evaluation and validation. Here, we present SymSim, a simulator that explicitly models the processes that give rise to data observed in single cell RNA-Seq experiments. The components of the SymSim pipeline pertain to the three primary sources of variation in single cell RNA-Seq data: noise intrinsic to the process of transcription, extrinsic variation indicative of different cell states (both discrete and continuous), and technical variation due to low sensitivity and measurement noise and bias. We demonstrate how SymSim can be used for benchmarking methods for clustering, differential expression and trajectory inference, and for examining the effects of various parameters on their performance. We also show how SymSim can be used to evaluate the number of cells required to detect a rare population under various scenarios.
TinGa : fast and flexible trajectory inference with Growing Neural Gas
Motivation: During the last decade, trajectory inference (TI) methods have emerged as a novel framework to model cell developmental dynamics, most notably in the area of single-cell transcriptomics. At present, more than 70 TI methods have been published, and recent benchmarks showed that even state-of-the-art methods only perform well for certain trajectory types but not others. Results: In this work, we present TinGa, a new TI model that is fast and flexible, and that is based on Growing Neural Graphs. We performed an extensive comparison of TinGa to five state-of-the-art methods for TI on a set of 250 datasets, including both synthetic and real datasets. Overall, TinGa improves on the state of the art by producing accurate models (comparable to or better than the state of the art) across the whole spectrum of data complexity, from the simplest linear datasets to the most complex disconnected graphs. In addition, TinGa obtained the fastest execution times, showing that it is one of the most versatile methods to date.
The boundary coefficient : a vertex measure for visualizing and finding structure in weighted graphs