1,114 research outputs found

    Clustering environmental flow cytometry data by searching density peaks

    Microbial single cells can be characterized by their phenotypic properties using flow cytometry, so flow cytometry can be used to analyze various aspects of environmental microbial communities. In recent years, researchers have focused on fully exploiting the multivariate data that such analyses generate. Because the quantity of interest is the diversity of an environmental sample, a proper estimate of the number of species and their abundances is needed. We modified a recently published algorithm to estimate microbial diversity from flow cytometry data. After a brief sketch of the problem setup, we review this algorithm alongside its various implementations. We then present our current implementation together with the future challenges we foresee.
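The abstract does not spell out the algorithm, but the title points to density-peak clustering. As a rough illustration of that general idea only (not the authors' modified version), the sketch below computes a local density and a distance to the nearest denser point for every event, picks the events where both are large as cluster centers, and lets every other event inherit the label of its nearest denser neighbour. The function name `density_peaks`, the Gaussian density estimate, and the parameters `d_c` and `n_clusters` are assumptions made for this example.

```python
import numpy as np

def density_peaks(points, d_c, n_clusters):
    """Toy density-peak clustering (illustrative; d_c and n_clusters are assumed inputs)."""
    n = len(points)
    # Pairwise Euclidean distances between all events.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    # Local density: Gaussian-weighted neighbour count, excluding the point itself.
    rho = np.exp(-(dist / d_c) ** 2).sum(axis=1) - 1.0
    order = np.argsort(-rho)  # indices sorted from densest to sparsest
    delta = np.zeros(n)       # distance to the nearest denser point
    parent = np.full(n, -1)   # index of that nearest denser point
    # By convention the densest point gets the largest distance in its row.
    delta[order[0]] = dist[order[0]].max()
    for rank in range(1, n):
        i = order[rank]
        denser = order[:rank]
        j = denser[np.argmin(dist[i, denser])]
        delta[i] = dist[i, j]
        parent[i] = j
    # Centers combine high density with a large distance to any denser point.
    score = rho * delta
    score[order[0]] = np.inf  # the global density peak is always a center
    centers = np.argsort(-score)[:n_clusters]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    # Visit points by decreasing density so every parent is labelled first.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers

# Usage on two synthetic, well-separated populations.
rng = np.random.default_rng(0)
events = np.vstack([rng.normal(0.0, 0.5, size=(50, 2)),
                    rng.normal(10.0, 0.5, size=(50, 2))])
labels, centers = density_peaks(events, d_c=1.0, n_clusters=2)
```

In practice the number of centers is usually read off a plot of density against distance rather than fixed in advance; passing `n_clusters` directly is a simplification for this sketch.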

    Understanding Health and Disease with Multidimensional Single-Cell Methods

    Current efforts in the biomedical sciences and related interdisciplinary fields are focused on gaining a molecular understanding of health and disease, which is a problem of daunting complexity that spans many orders of magnitude in characteristic length scales, from small molecules that regulate cell function to cell ensembles that form tissues and organs working together as an organism. In order to uncover the molecular nature of the emergent properties of a cell, it is essential to measure multiple cell components simultaneously in the same cell. In turn, cell heterogeneity requires multiple cells to be measured in order to understand health and disease in the organism. This review summarizes current efforts towards a data-driven framework that leverages single-cell technologies to build robust signatures of healthy and diseased phenotypes. While some approaches focus on multicolor flow cytometry data and other methods are designed to analyze high-content image-based screens, we emphasize the so-called Supercell/SVM paradigm (recently developed by the authors of this review and collaborators) as a unified framework that captures mesoscopic-scale emergence to build reliable phenotypes. Beyond their specific contributions to basic and translational biomedical research, these efforts illustrate, from a larger perspective, the powerful synergy that might be achieved from bringing together methods and ideas from statistical physics, data mining, and mathematics to solve the most pressing problems currently facing the life sciences. Comment: 25 pages, 7 figures; revised version with minor changes. To appear in J. Phys.: Cond. Mat.

    Computational and Systems Biology Advances to Enable Bioagent-Agnostic Signatures

    Enumerated threat agent lists have long driven biodefense priorities. The global SARS-CoV-2 pandemic demonstrated the limitations of searching for known threat agents as compared to a more agnostic approach. Recent technological advances are enabling agent-agnostic biodefense, especially through the integration of multi-modal observations of host-pathogen interactions directed by a human immunological model. Although well-developed technical assays exist for many aspects of human-pathogen interaction, the analytic methods and pipelines to combine and holistically interpret the results of such assays are immature and require further investment to exploit new technologies. In this manuscript, we discuss potential immunologically based bioagent-agnostic approaches and the computational tool gaps the community should prioritize filling.

    Machine learning in marine ecology: an overview of techniques and applications

    Machine learning covers a large set of algorithms that can be trained to identify patterns in data. Thanks to the increase in the amount of data and computing power available, it has become pervasive across scientific disciplines. We first highlight why machine learning is needed in marine ecology. Then we provide a quick primer on machine learning techniques and vocabulary. We built a database of ∼1000 publications that implement such techniques to analyse marine ecology data. For various data types (images, optical spectra, acoustics, omics, geolocations, biogeochemical profiles, and satellite imagery), we present a historical perspective on applications that proved influential, can serve as templates for new work, or represent the diversity of approaches. Then, we illustrate how machine learning can be used to better understand ecological systems, by combining various sources of marine data. Through this coverage of the literature, we demonstrate an increase in the proportion of marine ecology studies that use machine learning, the pervasiveness of images as a data source, the dominance of machine learning for classification-type problems, and a shift towards deep learning for all data types. This overview is meant to guide researchers who wish to apply machine learning methods to their marine datasets.

    Challenges in the Multivariate Analysis of Mass Cytometry Data: The Effect of Randomization

    Cytometry by time-of-flight (CyTOF) has emerged as a high-throughput single-cell technology able to provide large samples of protein readouts. Already, there exists a large pool of advanced high-dimensional analysis algorithms that explore the observed heterogeneous distributions and draw intriguing biological inferences. A fact largely overlooked by these methods, however, is the effect of the established data preprocessing pipeline on the distributions of the measured quantities. In this article, we focus on randomization, a transformation used for improving data visualization, which can negatively affect multivariate data analysis methods such as dimensionality reduction, clustering, and network reconstruction algorithms. Our results indicate that randomization should be used only for visualization purposes, and not in conjunction with high-dimensional analytical tools.

    Chronic helminth infection burden differentially affects haematopoietic cell development while ageing selectively impairs adaptive responses to infection

    Throughout the lifespan of an individual, the immune system undergoes complex changes while facing novel and chronic infections. Helminths, which infect over one billion people and impose heavy livestock productivity losses, typically cause chronic infections by avoiding and suppressing host immunity. Yet, how age affects immune responses to lifelong parasitic infection is poorly understood. To disentangle the processes involved, we employed supervised statistical learning techniques to identify which factors among haematopoietic stem and progenitor cells (HSPC), and both innate and adaptive responses regulate parasite burdens and how they are affected by host age. Older mice harboured greater numbers of the parasites’ offspring than younger mice. Protective immune responses that did not vary with age were dominated by HSPC, while ageing specifically eroded adaptive immunity, with reduced numbers of naïve T cells, poor T cell responsiveness to parasites, and impaired antibody production. We identified immune factors consistent with previously reported immune responses to helminths, and also revealed novel interactions between helminths and HSPC maturation. Our approach thus allowed disentangling the concurrent effects of ageing and infection across the full maturation cycle of the immune response and highlights the potential of such approaches to improve understanding of the immune system within the whole organism.

    A Kernel-Based Change Detection Method to Map Shifts in Phytoplankton Communities Measured by Flow Cytometry

    1. Automated, ship-board flow cytometers provide high-resolution maps of phytoplankton composition over large swaths of the world's oceans. They therefore pave the way for understanding how environmental conditions shape community structure. Identification of community changes along a cruise transect commonly segments the data into distinct regions. However, existing segmentation methods are generally not applicable to flow cytometry data, as these data are recorded as ‘point cloud’ data, with hundreds or thousands of particles measured during each time interval. Moreover, nonparametric segmentation methods that do not rely on prior knowledge of the number of species are desirable to map community shifts. 2. We present CytoSegmenter, a kernel-based change-point estimation method for segmenting point cloud data. Our method allows us to represent and summarize a point cloud of data points by a single element in a Hilbert space. The change-point locations can be found using a fast dynamic programming algorithm. 3. Through an analysis of 12 cruises, we demonstrate that CytoSegmenter allows us to locate abrupt changes in phytoplankton community structure. We show that the changes in community structure generally coincide with changes in the temperature and salinity of the ocean. We also illustrate how the main parameter of CytoSegmenter can be easily calibrated using limited auxiliary annotated data. 4. CytoSegmenter is generally applicable for segmenting series of point cloud data from any domain. Moreover, it readily scales to thousands of point clouds, each containing thousands of points. In the context of flow cytometry data collected during research cruises, it does not require prior clustering of particles to define taxa labels, eliminating a potential source of error. This represents an important advance in automating the analysis of large datasets now emerging in biological oceanography and other fields. It also allows for the approach to be applied during research cruises.
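The two ingredients named in this abstract, summarizing each point cloud as a single element of a Hilbert space and locating change points by dynamic programming, can be illustrated with a small sketch. This is not the CytoSegmenter implementation: the Gaussian kernel, the `bandwidth` parameter, the fixed `n_segments`, and all function names are assumptions chosen for the example. Each cloud is represented by its kernel mean embedding, inner products between embeddings are averages of kernel values, and the dynamic program minimizes the within-segment squared distance of embeddings to their segment mean.

```python
import numpy as np

def rbf_mean_gram(clouds, bandwidth):
    """Gram matrix G[s, t] = inner product of the kernel mean embeddings of clouds s and t."""
    T = len(clouds)
    G = np.zeros((T, T))
    for s in range(T):
        for t in range(s, T):
            # Squared distances between every point of cloud s and every point of cloud t.
            sq = ((clouds[s][:, None, :] - clouds[t][None, :, :]) ** 2).sum(-1)
            # Averaging the Gaussian kernel over all pairs gives <mu_s, mu_t>.
            G[s, t] = G[t, s] = np.exp(-sq / (2 * bandwidth ** 2)).mean()
    return G

def segment_cost(G, a, b):
    """Sum of squared Hilbert-space distances of embeddings a..b-1 to their mean."""
    block = G[a:b, a:b]
    return np.trace(block) - block.sum() / (b - a)

def change_points(G, n_segments):
    """Dynamic program returning the n_segments - 1 segment start indices."""
    T = len(G)
    cost = np.full((n_segments + 1, T + 1), np.inf)
    back = np.zeros((n_segments + 1, T + 1), dtype=int)
    cost[0, 0] = 0.0
    for k in range(1, n_segments + 1):
        for b in range(k, T + 1):
            for a in range(k - 1, b):
                c = cost[k - 1, a] + segment_cost(G, a, b)
                if c < cost[k, b]:
                    cost[k, b], back[k, b] = c, a
    # Walk back from the end to recover where each segment starts.
    cps, b = [], T
    for k in range(n_segments, 1, -1):
        b = back[k, b]
        cps.append(int(b))
    return cps[::-1]

# Usage: six synthetic clouds whose distribution shifts after the third one.
rng = np.random.default_rng(1)
clouds = [rng.normal(0.0, 1.0, size=(200, 2)) for _ in range(3)]
clouds += [rng.normal(5.0, 1.0, size=(200, 2)) for _ in range(3)]
G = rbf_mean_gram(clouds, bandwidth=1.0)
boundaries = change_points(G, n_segments=2)
```

Because the dynamic program only touches the Gram matrix of embeddings, the individual particles never need to be clustered or labelled, which matches the abstract's point that no prior taxa assignment is required.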