Discovering Neuronal Cell Types and Their Gene Expression Profiles Using a Spatial Point Process Mixture Model
Cataloging the neuronal cell types that comprise circuitry of individual
brain regions is a major goal of modern neuroscience and the BRAIN initiative.
Single-cell RNA sequencing can now be used to measure the gene expression
profiles of individual neurons and to categorize neurons based on their gene
expression profiles. While the single-cell techniques are extremely powerful
and hold great promise, they are currently still labor intensive, have a high
cost per cell, and, most importantly, do not provide information on spatial
distribution of cell types in specific regions of the brain. We propose a
complementary approach that uses computational methods to infer the cell types
and their gene expression profiles through analysis of brain-wide single-cell
resolution in situ hybridization (ISH) imagery contained in the Allen Brain
Atlas (ABA). We measure the spatial distribution of neurons labeled in the ISH
image for each gene and model it as a spatial point process mixture, whose
mixture weights are given by the cell types which express that gene. By fitting
a point process mixture model jointly to the ISH images, we infer both the
spatial point process distribution for each cell type and its gene expression
profile. We validate our predictions of cell type-specific gene expression
profiles using single-cell RNA sequencing data, recently published for the
mouse somatosensory cortex. Jointly with the gene expression profiles, cell
features such as cell size, orientation, intensity and local density level are
inferred per cell type.
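The mixture idea in this abstract can be illustrated with a much simpler toy problem. The sketch below is not the authors' model: it assumes the spatial distribution of each cell type is already known and discretized into bins, and it only estimates the mixture weights for one gene by expectation-maximization. The function name `em_mixture_weights`, the bin layout, and the toy numbers are all hypothetical.

```python
def em_mixture_weights(components, counts, n_iters=500):
    """Estimate mixture weights by EM, with the component distributions held fixed.

    components: list of K lists, each a probability distribution over B spatial bins
                (assumed known here; the paper infers them jointly).
    counts:     list of B observed cell counts for one gene.
    Assumes every bin with a nonzero count has nonzero probability under
    at least one component.
    """
    k = len(components)
    total = float(sum(counts))
    weights = [1.0 / k] * k  # start from uniform weights
    for _ in range(n_iters):
        new_weights = [0.0] * k
        for b, n_b in enumerate(counts):
            if n_b == 0:
                continue  # empty bins contribute nothing to the update
            mix = sum(weights[j] * components[j][b] for j in range(k))
            for j in range(k):
                # E-step responsibility of component j for bin b, weighted by the count
                new_weights[j] += n_b * weights[j] * components[j][b] / mix
        # M-step: normalize the expected counts into weights
        weights = [w / total for w in new_weights]
    return weights


# Toy data (hypothetical): two cell types over three spatial bins.
cell_type_a = [0.8, 0.1, 0.1]
cell_type_b = [0.1, 0.1, 0.8]
# Counts that match a 0.3 / 0.7 mixture of the two types exactly.
gene_counts = [310, 100, 590]

estimated = em_mixture_weights([cell_type_a, cell_type_b], gene_counts)
print(estimated)
```

In this toy setting the recovered weights play the role of a gene's expression across cell types; the joint fit described in the abstract additionally has to learn the per-type spatial distributions.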
Three-dimensional morphology and gene expression in the Drosophila blastoderm at cellular resolution II: dynamics.
Background: To accurately describe gene expression and computationally model animal transcriptional networks, it is essential to determine the changing locations of cells in developing embryos. Results: Using automated image analysis methods, we provide the first quantitative description of temporal changes in morphology and gene expression at cellular resolution in whole embryos, using the Drosophila blastoderm as a model. Analyses based on both fixed and live embryos reveal complex, previously undetected three-dimensional changes in nuclear density patterns caused by nuclear movements prior to gastrulation. Gene expression patterns move, in part, with these changes in morphology, but additional spatial shifts in expression patterns are also seen, supporting a previously proposed model of pattern dynamics based on the induction and inhibition of gene expression. We show that mutations that disrupt either the anterior/posterior (a/p) or the dorsal/ventral (d/v) transcriptional cascades alter morphology and gene expression along both the a/p and d/v axes in a way suggesting that these two patterning systems interact via both transcriptional and morphological mechanisms. Conclusion: Our work establishes a new strategy for measuring temporal changes in the locations of cells and gene expression patterns that uses fixed cell material and computational modeling. It also provides a coordinate framework for the blastoderm embryo that will allow increasingly accurate spatio-temporal modeling of both the transcriptional control network and morphogenesis.
DOT: A flexible multi-objective optimization framework for transferring features across single-cell and spatial omics
Single-cell RNA sequencing (scRNA-seq) and spatially-resolved
imaging/sequencing technologies have revolutionized biomedical research. On one
hand, scRNA-seq provides information about a large portion of the transcriptome
for individual cells, but lacks the spatial context. On the other hand,
spatially-resolved measurements come with a trade-off between resolution and
gene coverage. Combining scRNA-seq with different spatially-resolved
technologies can thus provide a more complete map of tissues with enhanced
cellular resolution and gene coverage. Here, we propose DOT, a novel
multi-objective optimization framework for transferring cellular features
across these data modalities. DOT is flexible and can be used to infer
categorical (cell type or cell state) or continuous features (gene expression)
in different types of spatial omics. Our optimization model combines practical
aspects related to tissue composition, technical effects, and integration of
prior knowledge, thereby providing flexibility to combine scRNA-seq and both
low- and high-resolution spatial data. Our fast implementation based on the
Frank-Wolfe algorithm achieves state-of-the-art or improved performance in
localizing cell features in high- and low-resolution spatial data and
estimating the expression of unmeasured genes in low-coverage spatial data
across different tissues. DOT is freely available and can be deployed
efficiently without large computational resources; typical case studies can be
run on a laptop, facilitating its use. Comment: 36 pages, 6 figures
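For readers unfamiliar with the Frank-Wolfe algorithm named in this abstract, here is a minimal, generic sketch. It is not DOT's implementation or objective: it minimizes a toy quadratic over the probability simplex, and the names `frank_wolfe_simplex`, `toy_gradient`, and `target` are hypothetical. The point it shows is that each iteration needs only a gradient and a linear minimization step, with no projection.

```python
def frank_wolfe_simplex(grad, dim, n_iters=2000):
    """Minimize a smooth convex function over the probability simplex.

    grad: function mapping a weight vector (list of floats) to its gradient.
    dim:  number of coordinates.
    """
    w = [1.0 / dim] * dim  # feasible starting point
    for k in range(n_iters):
        g = grad(w)
        # Linear minimization oracle: over the simplex, <g, s> is minimized
        # at the vertex with the smallest gradient coordinate.
        i_min = min(range(dim), key=lambda i: g[i])
        step = 2.0 / (k + 2.0)  # standard diminishing step size
        # Convex combination of the current point and the chosen vertex,
        # so the iterate stays on the simplex without any projection.
        w = [(1.0 - step) * wi for wi in w]
        w[i_min] += step
    return w


# Toy objective (hypothetical): f(w) = 0.5 * ||w - target||^2.
target = [0.7, 0.2, 0.1]


def toy_gradient(w):
    return [wi - ti for wi, ti in zip(w, target)]


weights = frank_wolfe_simplex(toy_gradient, dim=3)
print(weights)
```

Because every update is a convex combination with a single vertex, the per-iteration cost stays low, which is consistent with the abstract's remark that typical case studies run on a laptop.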
A CAUSAL HIERARCHICAL MARKOV FRAMEWORK FOR THE CLASSIFICATION OF MULTIRESOLUTION AND MULTISENSOR REMOTE SENSING IMAGES
In this paper, a multiscale Markov framework is proposed in order to address the problem of the classification of multiresolution and multisensor remotely sensed data. The proposed framework makes use of a quadtree to model the interactions across different spatial resolutions and a Markov model with respect to a generic total order relation to deal with contextual information at each scale in order to favor applicability to very high resolution imagery. The methodological properties of the proposed hierarchical framework are investigated. Firstly, we prove the causality of the overall proposed model, a particularly advantageous property in terms of computational cost of the inference. Secondly, we prove the expression of the marginal posterior mode criterion for inference on the proposed framework. Within this framework, a specific algorithm is formulated by defining, within each layer of the quadtree, a Markov chain model with respect to a pixel scan that combines both a zig-zag trajectory and a Hilbert space-filling curve. Data collected by distinct sensors at the same spatial resolution are fused through gradient boosted regression trees. The developed algorithm was experimentally validated with two very high resolution datasets including multispectral, panchromatic and radar satellite images. The experimental results confirm the effectiveness of the proposed algorithm as compared to previous techniques based on alternate approaches to multiresolution fusion.
Automatic Annotation of Spatial Expression Patterns via Sparse Bayesian Factor Models
Advances in reporters for gene expression have made it possible to document and quantify expression patterns in 2D–4D. In contrast to microarrays, which provide data for many genes but averaged and/or at low resolution, images reveal the high spatial dynamics of gene expression. Developing computational methods to compare, annotate, and model gene expression based on images is imperative, considering that available data are rapidly increasing. We have developed a sparse Bayesian factor analysis model in which the observed expression diversity among a large set of high-dimensional images is modeled by a small number of hidden common factors. We apply this approach to embryonic expression patterns from a Drosophila RNA in situ image database, and show that the automatically inferred factors provide for a meaningful decomposition and represent common co-regulation or biological functions. The low-dimensional set of factor mixing weights is further used as features by a classifier to annotate expression patterns with functional categories. On human-curated annotations, our sparse approach reaches similar or better classification of expression patterns at different developmental stages, when compared to other automatic image annotation methods using thousands of hard-to-interpret features. Our study therefore outlines a general framework for large microscopy data sets, in which both the generative model itself, as well as its application for analysis tasks such as automated annotation, can provide insight into biological questions.
Automated annotation of gene expression image sequences via non-parametric factor analysis and conditional random fields
Motivation: Computational approaches for the annotation of phenotypes from image data have shown promising results across many applications, and provide rich and valuable information for studying gene function and interactions. While data are often available both at high spatial resolution and across multiple time points, phenotypes are frequently annotated independently, for individual time points only. In particular, for the analysis of developmental gene expression patterns, it is biologically sensible when images across multiple time points are jointly accounted for, such that spatial and temporal dependencies are captured simultaneously. Methods: We describe a discriminative undirected graphical model to label gene-expression time-series image data, with an efficient training and decoding method based on the junction tree algorithm. The approach is based on an effective feature selection technique, consisting of a non-parametric sparse Bayesian factor analysis model. The result is a flexible framework, which can handle large-scale data with noisy incomplete samples, i.e. it can tolerate data missing from individual time points. Results: Using the annotation of gene expression patterns across stages of Drosophila embryonic development as an example, we demonstrate that our method achieves superior accuracy, gained by jointly annotating phenotype sequences, when compared with previous models that annotate each stage in isolation. The experimental results on missing data indicate that our joint learning method successfully annotates genes for which no expression data are available for one or more stages.
Modeling and Reconstruction of Mixed Functional and Molecular Patterns
Functional medical imaging promises powerful tools for the
visualization and elucidation of important disease-causing
biological processes in living tissue. Recent research aims to
dissect the distribution or expression of multiple biomarkers
associated with disease progression or response, where the signals
often represent a composite of more than one distinct source
independent of spatial resolution. Formulating the task as a blind
source separation or composite signal factorization problem, we
report here a statistically principled method for modeling and
reconstruction of mixed functional or molecular patterns. The
computational algorithm is based on a latent variable model whose
parameters are estimated using clustered component analysis. We
demonstrate the principle and performance of the approaches on the
breast cancer data sets acquired by dynamic contrast-enhanced
magnetic resonance imaging.