Current efforts in the biomedical sciences and related interdisciplinary
fields are focused on gaining a molecular understanding of health and disease,
which is a problem of daunting complexity that spans many orders of magnitude
in characteristic length scales, from small molecules that regulate cell
function to cell ensembles that form tissues and organs working together as an
organism. In order to uncover the molecular nature of the emergent properties
of a cell, it is essential to measure multiple cell components simultaneously
in the same cell. In turn, cell heterogeneity requires multiple cells to be
measured in order to understand health and disease in the organism. This review
summarizes current efforts towards a data-driven framework that leverages
single-cell technologies to build robust signatures of healthy and diseased
phenotypes. While some approaches focus on multicolor flow cytometry data and
other methods are designed to analyze high-content image-based screens, we
emphasize the so-called Supercell/SVM paradigm (recently developed by the
authors of this review and collaborators) as a unified framework that captures
mesoscopic-scale emergence to build reliable phenotypes. Beyond their specific
contributions to basic and translational biomedical research, these efforts
illustrate, from a larger perspective, the powerful synergy that might be
achieved from bringing together methods and ideas from statistical physics,
data mining, and mathematics to solve the most pressing problems currently
facing the life sciences.Comment: 25 pages, 7 figures; revised version with minor changes. To appear in
J. Phys.: Cond. Mat