2,593 research outputs found

Data reduction for spectral clustering to analyze high throughput flow cytometry data

Author: Brinkman Ryan R.
Gupta Arvind
Shooshtari Parisa
Zare Habil
Publication venue: Scholarship@Western
Publication date: 28/07/2010

Background: Recent biological discoveries have shown that clustering large datasets is essential for better understanding biology in many areas. Spectral clustering in particular has proven to be a powerful tool amenable to many applications. However, it cannot be directly applied to large datasets due to time and memory limitations. To address this issue, we have modified spectral clustering by adding an information-preserving sampling procedure and applying a post-processing stage. We call this entire algorithm SamSPECTRAL. Results: We tested our algorithm on flow cytometry data as an example of large, multidimensional data containing potentially hundreds of thousands of data points (i.e., events in flow cytometry, typically corresponding to cells). Compared to two state-of-the-art model-based flow cytometry clustering methods, SamSPECTRAL demonstrates significant advantages in proper identification of populations with non-elliptical shapes, low-density populations close to dense ones, minor subpopulations of a major population, and rare populations. Conclusions: This work is the first successful attempt to apply spectral methodology to flow cytometry data. An implementation of our algorithm as an R package is freely available through Bioconductor. © 2010 Zare et al; licensee BioMed Central Ltd.
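The general idea in this abstract (cluster a reduced sample spectrally, then label the full dataset) can be illustrated with a minimal sketch. It is not the SamSPECTRAL implementation: the abstract does not specify the sampling or post-processing steps, so this sketch assumes plain uniform subsampling, a two-way split on the sign of the Fiedler vector, and nearest-representative label assignment. The function name `sample_then_spectral` and its parameters `n_samples` and `sigma` are hypothetical.

```python
import numpy as np


def sample_then_spectral(points, n_samples, sigma, seed=0):
    """Two-way spectral split on a uniform subsample, extended to all points.

    Assumption: uniform subsampling stands in for the paper's
    information-preserving sampling, which the abstract does not detail.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(points), size=n_samples, replace=False)
    reps = points[idx]  # representatives that are actually clustered

    # Gaussian affinity between representatives (n_samples x n_samples).
    d2 = ((reps[:, None, :] - reps[None, :, :]) ** 2).sum(axis=-1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(w, 0.0)

    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    deg = w.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    lap = np.eye(n_samples) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order; the eigenvector of the
    # second-smallest eigenvalue (Fiedler vector) encodes the two-way split.
    _, vecs = np.linalg.eigh(lap)
    fiedler = vecs[:, 1] * d_inv_sqrt
    rep_labels = (fiedler > 0).astype(int)

    # Assumed post-processing: each point takes the label of its nearest
    # representative.
    d2_all = ((points[:, None, :] - reps[None, :, :]) ** 2).sum(axis=-1)
    return rep_labels[d2_all.argmin(axis=1)]


# Usage on synthetic data: two Gaussian blobs of 200 points each.
rng = np.random.default_rng(1)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(200, 2))
blob_b = rng.normal(loc=[6.0, 0.0], scale=0.5, size=(200, 2))
labels = sample_then_spectral(np.vstack([blob_a, blob_b]), n_samples=60, sigma=1.5)
```

Only the 60 representatives enter the eigendecomposition, so the affinity matrix is 60 × 60 rather than 400 × 400, which is the time and memory saving the abstract refers to. The R package on Bioconductor remains the reference for the actual method.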

Scholarship@Western

Understanding Health and Disease with Multidimensional Single-Cell Methods

Author: Banavar Jayanth R.
Candia Julián
Losert Wolfgang
Publication date: 01/12/2013

Current efforts in the biomedical sciences and related interdisciplinary fields are focused on gaining a molecular understanding of health and disease, which is a problem of daunting complexity that spans many orders of magnitude in characteristic length scales, from small molecules that regulate cell function to cell ensembles that form tissues and organs working together as an organism. In order to uncover the molecular nature of the emergent properties of a cell, it is essential to measure multiple cell components simultaneously in the same cell. In turn, cell heterogeneity requires multiple cells to be measured in order to understand health and disease in the organism. This review summarizes current efforts towards a data-driven framework that leverages single-cell technologies to build robust signatures of healthy and diseased phenotypes. While some approaches focus on multicolor flow cytometry data and other methods are designed to analyze high-content image-based screens, we emphasize the so-called Supercell/SVM paradigm (recently developed by the authors of this review and collaborators) as a unified framework that captures mesoscopic-scale emergence to build reliable phenotypes. Beyond their specific contributions to basic and translational biomedical research, these efforts illustrate, from a larger perspective, the powerful synergy that might be achieved from bringing together methods and ideas from statistical physics, data mining, and mathematics to solve the most pressing problems currently facing the life sciences.

Comment: 25 pages, 7 figures; revised version with minor changes. To appear in J. Phys.: Cond. Mat

arXiv.org e-Print Archive

CONICET Digital

PubMed Central

Comprehensive Immune Monitoring of Clinical Trials to Advance Human Immunotherapy.

Author: Amir El-Ad D
Babdor Joel
Bendall Sean C
Gherardini Pier Federico
Hartmann Felix J
Jones Kyle
Krutzik Peter
Maecker Holden T
Marquez Diana M
Meyer Everett
O'Donnell Erika
Sahaf Bita
Sigal Natalia
Spitzer Matthew H
Publication venue: eScholarship, University of California
Publication date: 01/07/2019

The success of immunotherapy has led to a myriad of clinical trials accompanied by efforts to gain mechanistic insight and identify predictive signatures for personalization. However, many immune monitoring technologies face investigator bias, missing unanticipated cellular responses in limited clinical material. We present here a mass cytometry (CyTOF) workflow for standardized, systems-level biomarker discovery in immunotherapy trials. To broadly enumerate immune cell identity and activity, we established and extensively assessed a reference panel of 33 antibodies to cover major cell subsets, simultaneously quantifying activation and immune checkpoint molecules in a single assay. This assay enumerates ≥98% of peripheral immune cells with ≥4 positively identifying antigens. Robustness and reproducibility are demonstrated on multiple sample types, across two research centers, and by orthogonal measurements. Using automated analysis, we identify stratifying immune signatures in bone marrow transplantation-associated graft-versus-host disease. Together, this validated workflow ensures comprehensive immunophenotypic analysis and data comparability and will accelerate biomarker discovery.

eScholarship - University of California

ART

A computational framework to emulate the human perspective in flow cytometric data analysis

Author: AP Dempster
B Ellis
B Lindsay
BG Lindsay
BW Silverman
C Jarque
Christopher V. Rao
D Novo
D Sarkar
DJ Marchette
DR Parks
E Choy
E Lugli
F Hahne
G Finak
G Luta
G McLachlan
H Zare
J Li
J Trotter
JA Hartigan
JM Irish
JP Baudry
K Lo
L Herzenberg
LM Maier
MC Minnotte
MY Cheng
N Aghaeepour
PM Hartigan
R Scheuermann
R Tibshirani
RR Brinkman
S Pyne
S Ray
Saumyadipta Pyne
Surajit Ray
T Lin
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2012

Background: In recent years, intense research efforts have focused on developing methods for automated flow cytometric data analysis. However, while designing such applications, little or no attention has been paid to the human perspective that is absolutely central to the manual gating process of identifying and characterizing cell populations. In particular, the assumption of many common techniques that cell populations could be modeled reliably with pre-specified distributions may not hold true in real-life samples, which can have populations of arbitrary shapes and considerable inter-sample variation. Results: To address this, we developed a new framework, flowScape, for emulating certain key aspects of the human perspective in analyzing flow data, which we implemented in multiple steps. First, flowScape begins with creating a mathematically rigorous map of the high-dimensional flow data landscape based on dense and sparse regions defined by relative concentrations of events around modes. In the second step, these modal clusters are connected with a global hierarchical structure. This representation allows flowScape to perform ridgeline analysis for both traversing the landscape and isolating cell populations at different levels of resolution. Finally, we extended manual gating with a new capacity for constructing templates that can identify target populations in terms of their relative parameters, as opposed to the more commonly used absolute or physical parameters. This allows flowScape to apply such templates in batch mode for detecting the corresponding populations in a flexible, sample-specific manner. We also demonstrated different applications of our framework to flow data analysis and showed its superiority over other analytical methods. Conclusions: The human perspective, built on top of intuition and experience, is a very important component of flow cytometric data analysis. By emulating some of its approaches and extending these with automation and rigor, flowScape provides a flexible and robust framework for computational cytomics.

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

Enlighten

Single Cell Proteomics in Biomedicine: High-dimensional Data Acquisition, Visualization and Analysis

Author: Shi Qihui
Su Yapeng
Wei Wei
Publication venue: Royal College of Obstetricians & Gynaecologists (RCOG)
Publication date: 01/02/2017

New insights into cellular heterogeneity in the last decade have spurred the development of a variety of single cell omics tools at a lightning pace. The resultant high-dimensional single cell data generated by these tools require new theoretical approaches and analytical algorithms for effective visualization and interpretation. In this review, we briefly survey the state-of-the-art single cell proteomic tools with a particular focus on data acquisition and quantification, followed by an elaboration of a number of statistical and computational approaches developed to date for dissecting the high-dimensional single cell data. The underlying assumptions, unique features, and limitations of the analytical methods, along with the biological questions they seek to answer, will be discussed. Particular attention will be given to those information theoretical approaches that are anchored in a set of first principles of physics and can yield detailed (and often surprising) predictions.

Caltech Authors

Development of machine learning techniques for flow cytometry data

Author: Van Gassen Sofie
Publication venue: Ghent University. Faculty of Engineering and Architecture
Publication date: 01/01/2017

Ghent University Academic Bibliography

Computational approaches in high-throughput proteomics data analysis

Author: Lahesmaa-Korpinen Anna-Maria
Publication venue: University of Helsinki Libraries
Publication date: 29/06/2012

Proteins are key components in biological systems as they mediate the signaling responsible for information processing in a cell and organism. In biomedical research, one goal is to elucidate the mechanisms of cellular signal transduction pathways to identify possible defects that cause disease. Advancements in technologies such as mass spectrometry and flow cytometry enable the measurement of multiple proteins from a system. Proteomics, or the large-scale study of proteins of a system, thus plays an important role in biomedical research. The analysis of all high-throughput proteomics data requires the use of advanced computational methods. Thus, the combination of bioinformatics and proteomics has become an important part of research on signal transduction pathways. The main objective in this study was to develop and apply computational methods for the preprocessing, analysis and interpretation of high-throughput proteomics data. The methods focused on data from tandem mass spectrometry and single cell flow cytometry, and integration of proteomics data with gene expression microarray data and information from various biological databases. Overall, the methods developed and applied in this study have led to new ways of management and preprocessing of proteomics data. Additionally, the available tools have successfully been used to help interpret biomedical data and to facilitate analysis of data that would have been cumbersome to do without the use of computational methods.

Proteins play an important role in biological systems because they coordinate various processes in cells and organisms. One goal of biomedical research is to shed light on cellular signaling pathways and the changes in their function that occur in various diseases, so that such changes could be corrected. Proteomics is the large-scale study of proteins from a cell, tissue, or organism. Proteomics methods such as mass spectrometry and flow cytometry are key methods in biomedical research that can measure several proteins from a sample simultaneously. Today's advanced proteomics measurement technologies produce large datasets and require computational methods for data analysis. Bioinformatics methods have therefore become an important part of proteomics analysis and signaling pathway research. The main objective of this study was to develop and apply efficient computational methods for the preprocessing, analysis, and interpretation of large-scale proteomics data. In this study, a preprocessing method for mass spectrometry data and an automated analysis method for flow cytometry data were developed. Protein-level information was combined with measurements of gene transcription levels and with existing information extracted from biological databases. The dissertation shows that computational methods play a central role in the management, preprocessing, and analysis of proteomics data. The analysis methods developed in the study considerably advance the broader use and understanding of biomedical information.

Helsingin yliopiston digitaalinen arkisto

Flow cytometry data standards

Author: F Hahne
J Spidlen
JA Lee
Josef Spidlen
Parisa Shooshtari
R Gentleman
RF Murphy
Ryan R Brinkman
Tobias R Kollmann
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Flow cytometry is a widely used analytical technique for examining microscopic particles, such as cells. The Flow Cytometry Standard (FCS) was developed in 1984 for storing flow data and it is supported by all instrument and third party software vendors. However, FCS does not capture the full scope of flow cytometry (FCM)-related data and metadata, and data standards have recently been developed to address this shortcoming. Findings: The Data Standards Task Force (DSTF) of the International Society for the Advancement of Cytometry (ISAC) has developed several data standards to complement the raw data encoded in FCS files. Efforts started with the Minimum Information about a Flow Cytometry Experiment (MIFlowCyt), a minimal data reporting standard of details necessary to include when publishing FCM experiments to facilitate third party understanding. MIFlowCyt is now being recommended to authors by publishers as part of manuscript submission, and manuscripts are being checked by reviewers and editors for compliance. Gating-ML was then introduced to capture gating descriptions, an essential part of FCM data analysis describing the selection of cell populations of interest. The Classification Results File Format was developed to accommodate results of the gating process, mostly within the context of automated clustering. Additionally, the Archival Cytometry Standard bundles data with all the other components describing experiments. Here, we introduce these recent standards and provide the very first example of how they can be used to report FCM data including analysis and results in a standardized, computationally exchangeable form. Conclusions: Reporting standards and open file formats are essential for scientific collaboration and independent validation. The recently developed FCM data standards are now being incorporated into third party software tools and data repositories, which will ultimately facilitate understanding and data reuse.

Crossref

Scholarship@Western

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central