157 research outputs found

    Three-dimensional imaging mass cytometry for highly multiplexed molecular and cellular mapping of tissues and the tumor microenvironment

    A holistic understanding of tissue and organ structure and function requires the detection of molecular constituents in their original three-dimensional (3D) context. Imaging mass cytometry (IMC) enables simultaneous detection of up to 40 antigens and transcripts using metal-tagged antibodies but has so far been restricted to two-dimensional imaging. Here we report the development of 3D IMC for multiplexed 3D tissue analysis at single-cell resolution and demonstrate the utility of the technology by analysis of human breast cancer samples. The resulting 3D models reveal cellular and microenvironmental heterogeneity and cell-level tissue organization not detectable in two dimensions. 3D IMC will prove powerful in the study of phenomena occurring in 3D space such as tumor cell invasion and is expected to provide invaluable insights into cellular microenvironments and tissue architecture.

    Whole-cell segmentation of tissue images with human-level performance using large-scale data annotation and deep learning

    Understanding the spatial organization of tissues is of critical importance for both basic and translational research. While recent advances in tissue imaging are opening an exciting new window into the biology of human tissues, interpreting the data that they create is a significant computational challenge. Cell segmentation, the task of uniquely identifying each cell in an image, remains a substantial barrier for tissue imaging, as existing approaches are inaccurate or require a substantial amount of manual curation to yield useful results. Here, we addressed the problem of cell segmentation in tissue imaging data through large-scale data annotation and deep learning. We constructed TissueNet, an image dataset containing >1 million paired whole-cell and nuclear annotations for tissue images from nine organs and six imaging platforms. We created Mesmer, a deep learning-enabled segmentation algorithm trained on TissueNet that performs nuclear and whole-cell segmentation in tissue imaging data. We demonstrated that Mesmer has better speed and accuracy than previous methods, generalizes to the full diversity of tissue types and imaging platforms in TissueNet, and achieves human-level performance for whole-cell segmentation. Mesmer enabled the automated extraction of key cellular features, such as subcellular localization of protein signal, which was challenging with previous approaches. We further showed that Mesmer could be adapted to harness cell lineage information present in highly multiplexed datasets. We used this enhanced version to quantify cell morphology changes during human gestation. All underlying code and models are released with permissive licenses as a community resource.

    Segmentation-free inference of cell types from in situ transcriptomics data

    Recent advances in the fields of genome editing, whole-genome sequencing, single-cell RNA sequencing, and in situ spatial transcriptomics have enabled the cost-efficient production of high-throughput big data. However, the lack of dedicated bioinformatics algorithms to analyze such data has been a major hurdle. In this thesis, several novel bioinformatics tools applicable to each field are presented. First, a series of web-based tools for genome editing is presented: Cpf1-Database, Cas-Analyzer, web-based Digenome-seq software, and BE-Designer/Analyzer. These tools have been developed to help researchers use genome editing systems based on Cas9 or Cpf1 through an easily accessible web-based interface. Second, the development of two bioinformatics pipelines is described: a small variant calling pipeline to process tumor genome sequencing data without a matched control, and a pipeline to pre-process single-cell RNA sequencing data. Third, a novel segmentation-free algorithm to call cell types from in situ transcriptomics data, namely Spot-based Spatial cell-type Analysis by Multidimensional mRNA density estimation (SSAM), is presented. Recent advances in in situ spatial transcriptomics techniques, such as multiplexed fluorescence in situ hybridization or in situ/intact tissue sequencing, have enabled the discovery of spatial heterogeneity of cell types at the tissue level. However, cell type calling methods are often limited by cell segmentation algorithms due to various imaging problems. SSAM circumvents these problems by modeling spatial gene expression as a density estimate of the mRNA in a spatial context and identifying de novo cell types and their spatial organization without the need to segment cells. Optionally, SSAM can be guided by external sources of cell-type information, integrating them in a spatial context.
    In this thesis, SSAM is demonstrated with three different mouse brain tissues imaged by different imaging techniques: the somatosensory cortex (SSp) imaged by osmFISH; the hypothalamic preoptic region (POA) by MERFISH; and the visual cortex (VISp) by multiplexed smFISH. SSAM produces results similar to those of segmentation-based methods and outperforms them when cell segmentation is the limiting factor. In summary, the bioinformatics tools presented in this thesis overcome major obstacles that would normally hinder effective analysis: the web-based tools for genome editing have a wide user base due to their easy-to-use web-based interfaces; the omics data analysis pipelines enable fast analysis of omics data on a compute cluster and facilitate hypothesis generation when control tissue is lacking; and SSAM enables the analysis of in situ spatial transcriptomics data without being limited by cell segmentation. All of the tools and pipelines described in this thesis are open-sourced and freely accessible for non-profit, research-purpose use.
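The idea of treating gene expression as an mRNA density, rather than as per-cell counts, can be illustrated with a minimal sketch. This is not the SSAM implementation: the function names `mrna_density` and `expression_field`, the Gaussian kernel, and the `bandwidth` value are all assumptions chosen for illustration, and the downstream steps (selecting representative locations and clustering them into cell types) are omitted.

```python
import numpy as np


def mrna_density(spots, grid_shape, bandwidth=2.5):
    """Gaussian kernel density of mRNA spot coordinates on a pixel grid.

    spots: iterable of (x, y) positions for one gene (hypothetical input format).
    grid_shape: (width, height) of the output grid in pixels.
    """
    spots = np.asarray(spots, dtype=float).reshape(-1, 2)
    gx, gy = np.meshgrid(
        np.arange(grid_shape[0]), np.arange(grid_shape[1]), indexing="ij"
    )
    # Squared distance from every grid point to every spot: shape (W, H, n).
    d2 = (gx[..., None] - spots[:, 0]) ** 2 + (gy[..., None] - spots[:, 1]) ** 2
    kernel = np.exp(-d2 / (2 * bandwidth**2)) / (2 * np.pi * bandwidth**2)
    # Sum the kernel contributions of all spots at each grid point.
    return kernel.sum(axis=-1)


def expression_field(spots_by_gene, grid_shape, bandwidth=2.5):
    """Stack per-gene densities into a (W, H, n_genes) expression field."""
    genes = sorted(spots_by_gene)
    field = np.stack(
        [mrna_density(spots_by_gene[g], grid_shape, bandwidth) for g in genes],
        axis=-1,
    )
    return genes, field


if __name__ == "__main__":
    toy = {"geneA": [(5, 5)], "geneB": [(15, 15), (16, 15)]}
    genes, field = expression_field(toy, (20, 20), bandwidth=2.0)
    print(genes, field.shape)
```

Each pixel of the resulting field carries an expression vector, so pixels can be compared or clustered directly without first assigning spots to segmented cells.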

    Mesmerize is a dynamically adaptable user-friendly analysis platform for 2D and 3D calcium imaging data

    Calcium imaging is an increasingly valuable technique for understanding neural circuits, neuroethology, and cellular mechanisms. The analysis of calcium imaging data presents challenges in image processing, data organization, analysis, and accessibility. Tools have been created to address these problems independently; however, a comprehensive user-friendly package does not exist. Here we present Mesmerize, an efficient, expandable and user-friendly analysis platform, which uses a Findable, Accessible, Interoperable and Reproducible (FAIR) system to encapsulate the entire analysis process, from raw data to interactive visualizations for publication. Mesmerize provides a user-friendly graphical interface to state-of-the-art analysis methods for signal extraction and downstream analysis. We demonstrate the broad scientific scope of Mesmerize’s applications by analyzing neuronal datasets from mouse and a volumetric zebrafish dataset. We also applied contemporary time-series analysis techniques to analyze a novel dataset comprising neuronal, epidermal, and migratory mesenchymal cells of the protochordate Ciona intestinalis.

    Visualization and exploration of next-generation proteomics data

    Understanding cellular differentiation by modelling of single-cell gene expression data

    Over the course of the last decade single-cell RNA sequencing (scRNA-seq) has revolutionized the study of cellular heterogeneity, as one experiment routinely covers the expression of thousands of genes in tens or hundreds of thousands of cells. By quantifying differences between the single cell transcriptomes it is possible to reconstruct the process that gives rise to different cell fates from a progenitor population and gain access to trajectories of gene expression over developmental time. Tree reconstruction algorithms must deal with the high levels of noise, the high dimensionality of gene expression space, and strong non-linear dependencies between genes. In this thesis we address three aspects of working with scRNA-seq data: (1) lineage tree reconstruction, where we propose MERLoT, a novel trajectory inference method, (2) method comparison, where we propose PROSSTT, a novel algorithm that simulates scRNA-seq count data of complex differentiation trajectories, and (3) noise modelling, where we propose a novel probabilistic description of count data, a statistically motivated local averaging strategy, and an adaptation of the cross validation approach for the evaluation of gene expression imputation strategies. While statistical modelling of the data was our primary motivation, due to time constraints we did not manage to fully realize our plans for it. Increasingly complex processes like whole-organism development are being studied by single-cell transcriptomics, producing large amounts of data. Methods for trajectory inference must therefore efficiently reconstruct a priori unknown lineage trees with many cell fates. We propose MERLoT, a method that can reconstruct trees in sub-quadratic time by utilizing a local averaging strategy, scaling very well on large datasets. MERLoT compares favorably to the state of the art, both on real data and a large synthetic benchmark.
    The absence of data with known complex underlying topologies makes it challenging to quantitatively compare tree reconstruction methods to each other. PROSSTT is a novel algorithm that simulates count data from complex differentiation processes, facilitating comparisons between algorithms. We created the largest synthetic dataset to date, and the first to contain simulations with up to 12 cell fates. Additionally, PROSSTT can learn simulation parameters from reconstructed lineage trees and produce cells with expression profiles similar to the real data. Quantifying similarity between single-cell transcriptomes is crucial for clustering scRNA-seq profiles into cell types or inferring developmental trajectories, and appropriate statistical modelling of the data should improve such similarity calculations. We propose a Gaussian mixture of negative binomial distributions where gene expression variance depends on the square of the average expression. The model hyperparameters can be learned via the hybrid Monte Carlo algorithm, and a good initialization of average expression and variance parameters can be obtained by trajectory inference. A way to limit noise in the data is to apply local averaging, using the nearest neighbours of each cell to recover expression of non-captured mRNA. Our proposal, nearest neighbour smoothing with optimal bias-variance trade-off, optimizes the k-nearest neighbours approach by reducing the contribution of inappropriate neighbours. We also propose a way to assess the quality of gene expression imputation. After reconstructing a trajectory with imputed data, each cell can be projected to the trajectory using non-overlapping subsets of genes. The robustness of these assignments over multiple partitions of the genes is a novel estimator of imputation performance. Finally, I was involved in the planning and initial stages of a mouse ovary cell atlas as a collaboration.
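The local averaging idea mentioned in this abstract can be sketched with a plain k-nearest-neighbour smoother. This is only the unweighted baseline, not the thesis's bias-variance-optimal variant, which down-weights inappropriate neighbours; the function name `knn_smooth`, the log-transformed Euclidean distance, and the default `k` are assumptions made for illustration.

```python
import numpy as np


def knn_smooth(counts, k=5):
    """Replace each cell's profile by the mean over itself and its k nearest cells.

    counts: (n_cells, n_genes) array of expression counts.
    """
    counts = np.asarray(counts, dtype=float)
    # Compare cells on log-transformed counts so highly expressed genes
    # do not dominate the distance (an illustrative choice).
    logc = np.log1p(counts)
    sq = (logc**2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * logc @ logc.T
    # Force every cell to be its own first neighbour.
    np.fill_diagonal(d2, -np.inf)
    k = min(k, counts.shape[0] - 1)
    idx = np.argsort(d2, axis=1)[:, : k + 1]
    # counts[idx] has shape (n_cells, k + 1, n_genes); average over neighbours.
    return counts[idx].mean(axis=1)


if __name__ == "__main__":
    toy = [[10, 0], [12, 0], [8, 0], [0, 10], [0, 12], [0, 8]]
    print(knn_smooth(toy, k=2))
```

Uniform averaging reduces sampling noise but blurs differences between neighbouring cell states, which is the bias that a weighted scheme would aim to limit.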

    DESIGN OF A GAIT ACQUISITION AND ANALYSIS SYSTEM FOR ASSESSING THE RECOVERY OF MICE POST-SPINAL CORD INJURY

    Current methods of assessing spinal cord recovery in mice after a directed injury are qualitative, owing to the small size and quick movement of mice. This thesis presents a design for a gait acquisition and analysis system able to capture the footfalls of a mouse, extract position and timing data, and report quantitative gait metrics to the operator. These metrics can then be used to evaluate the recovery of the mouse. This work presents the design evolution of the system, from initial sensor design concepts through prototyping and testing to the final implementation. The system utilizes a machine vision camera, a well-designed walkway enclosure, and image processing techniques to capture and analyze paw strikes. Quantitative results gained from live animal experiments are presented, and it is shown how the measurements can be used to determine healthy, injured, and recovered gait.

    An image-based data-driven analysis of cellular architecture in a developing tissue

    Quantitative microscopy is becoming increasingly crucial in efforts to disentangle the complexity of organogenesis, yet adoption of the potent new toolbox provided by modern data science has been slow, primarily because it is often not directly applicable to developmental imaging data. We tackle this issue with a newly developed algorithm that uses point cloud-based morphometry to unpack the rich information encoded in 3D image data into a straightforward numerical representation. This enabled us to employ data science tools, including machine learning, to analyze and integrate cell morphology, intracellular organization, gene expression and annotated contextual knowledge. We apply these techniques to construct and explore a quantitative atlas of cellular architecture for the zebrafish posterior lateral line primordium, an experimentally tractable model of complex self-organized organogenesis. In doing so, we are able to retrieve both previously established and novel biologically relevant patterns, demonstrating the potential of our data-driven approach.