653 research outputs found

Merging Mixture Components for Cell Population Identification in Flow Cytometry

Author: Bashashati Ali
Brinkman Ryan
Finak Greg
Gottardo Raphaël
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2009

We present a framework for the identification of cell subpopulations in flow cytometry (FCM) data based on merging mixture components using the flowClust methodology. We show that the cluster merging algorithm under our framework improves model fit and provides a better estimate of the number of distinct cell subpopulations than either Gaussian mixture models or flowClust, especially for complicated flow cytometry data distributions. Our framework allows the automated selection of the number of distinct cell subpopulations, and we are able to identify cases where the algorithm fails, thus making it suitable for application in a high-throughput FCM analysis pipeline. Furthermore, we demonstrate a method for summarizing complex merged cell subpopulations in a simple manner that integrates with the existing flowClust framework and enables downstream data analysis. We demonstrate the performance of our framework on simulated and real FCM data. The software is available in the flowMerge package through the Bioconductor project.

Crossref

Directory of Open Access Journals

PubMed Central

A computational framework to emulate the human perspective in flow cytometric data analysis

Author: AP Dempster
B Ellis
B Lindsay
BG Lindsay
BW Silverman
C Jarque
Christopher V. Rao
D Novo
D Sarkar
DJ Marchette
DR Parks
E Choy
E Lugli
F Hahne
G Finak
G Luta
G McLachlan
H Zare
J Li
J Trotter
JA Hartigan
JM Irish
JP Baudry
K Lo
L Herzenberg
LM Maier
MC Minnotte
MY Cheng
N Aghaeepour
PM Hartigan
R Scheuermann
R Tibshirani
RR Brinkman
S Pyne
S Ray
Saumyadipta Pyne
Surajit Ray
T Lin
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2012

Background: In recent years, intense research efforts have focused on developing methods for automated flow cytometric data analysis. However, while designing such applications, little or no attention has been paid to the human perspective that is absolutely central to the manual gating process of identifying and characterizing cell populations. In particular, the assumption of many common techniques that cell populations could be modeled reliably with pre-specified distributions may not hold true in real-life samples, which can have populations of arbitrary shapes and considerable inter-sample variation. Results: To address this, we developed a new framework, flowScape, for emulating certain key aspects of the human perspective in analyzing flow data, which we implemented in multiple steps. First, flowScape begins by creating a mathematically rigorous map of the high-dimensional flow data landscape based on dense and sparse regions defined by relative concentrations of events around modes. In the second step, these modal clusters are connected with a global hierarchical structure. This representation allows flowScape to perform ridgeline analysis for both traversing the landscape and isolating cell populations at different levels of resolution. Finally, we extended manual gating with a new capacity for constructing templates that can identify target populations in terms of their relative parameters, as opposed to the more commonly used absolute or physical parameters. This allows flowScape to apply such templates in batch mode for detecting the corresponding populations in a flexible, sample-specific manner. We also demonstrated different applications of our framework to flow data analysis and showed its superiority over other analytical methods. Conclusions: The human perspective, built on top of intuition and experience, is a very important component of flow cytometric data analysis. By emulating some of its approaches and extending these with automation and rigor, flowScape provides a flexible and robust framework for computational cytomics.

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

Enlighten

Identification and visualization of multidimensional antigen-specific T-cell populations in polychromatic cytometry data.

Author: Bart P.A.
DeRosa S.
Finak G.
Frelinger J.
Gottardo R.
Jiang W.
Lin L.
McElrath J.
Pantaleo G.
Seshadri C.
Publication venue: Wiley
Publication date: 01/01/2015

An important aspect of immune monitoring for vaccine development, clinical trials, and research is the detection, measurement, and comparison of antigen-specific T-cells from subject samples under different conditions. Antigen-specific T-cells compose a very small fraction of total T-cells. Developments in cytometry technology over the past five years have enabled the measurement of single cells in a multivariate and high-throughput manner. This growth in both dimensionality and quantity of data continues to pose a challenge for effective identification and visualization of rare cell subsets, such as antigen-specific T-cells. Dimension reduction and feature extraction play a pivotal role in both identifying and visualizing cell populations of interest in large, multi-dimensional cytometry datasets. However, the automated identification and visualization of rare, high-dimensional cell subsets remains challenging. Here we demonstrate how a systematic and integrated approach combining targeted feature extraction with dimension reduction can be used to identify and visualize biological differences in rare, antigen-specific cell populations. By using OpenCyto to perform semi-automated gating and feature extraction of flow cytometry data, followed by dimensionality reduction with t-SNE, we are able to identify polyfunctional subpopulations of antigen-specific T-cells and visualize treatment-specific differences between them.

Serveur académique lausannois

PubMed Central

gEM/GANN: a multivariate computational strategy for auto-characterizing relationships between cellular and clinical phenotypes and predicting disease progression time using high-dimensional flow cytometry data

Author: Aghaeepour
Bagwell
Barlogie
Boedigheimer
Do
Edwards
Finak
Hall
Herzenberg
Jaye
Krutzik
Laerum
Lo
Moon
Novo
Parks
Pyne
Schierz
Tong
Valet
Publication venue: Wiley
Publication date: 08/01/2015

The dramatic increase in the complexity of flow cytometric datasets requires the development of new computation-based approaches that can maximize the amount of information derived and overcome the limitations of traditional gating strategies. Herein, we present a multivariate computational analysis of the HIV-infected flow cytometry datasets that were provided as part of the FlowCAP-IV Challenge using unsupervised and supervised learning techniques. Out of 383 samples (stimulated and unstimulated), 191 samples were used as a training set (34 individuals whose disease did not progress, and 157 individuals whose disease did progress). Using the results from the training set, the participants in the Challenge were then asked to predict the condition and progression time of the remaining individuals (45 ‘non-progressors’ and 147 ‘progressors’). To achieve this, we first scaled down data resolution. We then excluded doublet cells from the analysis using Expectation Maximization approaches. We then standardized all samples into histograms and used a Genetic Algorithm-Neural Network (GANN) approach to extract feature sets from the datasets, the reliability of which was examined using WEKA-implemented classifiers. The selected feature set resulted in a high sensitivity and specificity for the discrimination of progressors and non-progressors in the training set (average True Positive Rate = 1.00 and average False Positive Rate = 0.033). The capacity of the feature set to predict real-time survival time was better when using data from the ‘unstimulated’ training set (r = 0.825). The p-values and 95% confidence interval logrank ratios between actual and predicted survival time in the test set were 0.682 and 0.9542±0.24 for the unstimulated dataset, and 0.4451 and 0.9173±0.23 for the stimulated dataset. Our analytic strategy has demonstrated a promising capacity to extract useful information from complex flow cytometry datasets, despite a significant imbalance and variation between the training and test sets.
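
The sample counts quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are hypothetical, and the numbers are taken directly from the abstract rather than from the FlowCAP-IV data itself.

```python
# Counts as stated in the abstract (not recomputed from the raw data).
train_non_progressors = 34
train_progressors = 157
test_non_progressors = 45
test_progressors = 147

train_total = train_non_progressors + train_progressors
test_total = test_non_progressors + test_progressors
total_samples = train_total + test_total

# Share of progressors in each split, which illustrates the class imbalance
# the abstract refers to.
train_progressor_share = train_progressors / train_total
test_progressor_share = test_progressors / test_total

print(train_total, test_total, total_samples)
print(round(train_progressor_share, 3), round(test_progressor_share, 3))
```

Under these figures the training and test sets add up to the 383 samples reported, and progressors make up roughly 82% of the training set and 77% of the test set.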

Crossref

Nottingham Trent Institutional Repository (IRep)

Clinico-pathological and transcriptomic determinants of SLFN11 expression in invasive breast carcinoma

Author: B Haibe-Kains
C Desmedt
G Finak
G Zoppoli
J Friedman
M Li
P Farmer
W Huang da
Publication venue: Springer Science and Business Media LLC

Crossref

Agile workflow for interactive analysis of mass cytometry data

Author: Almeida
Amir
Angerer
Brodin
Cervera
Chen
Dix
Ellis
Finak
Galli
Höllt
Kotecha
Nowicka
Qiu
Simpson
Spidlen
Van Der
van Unen
Weber
Publication date: 14/12/2020

Motivation: Single-cell proteomics technologies, such as mass cytometry, have enabled characterization of cell-to-cell variation and cell populations at single-cell resolution. These large amounts of data require dedicated, interactive tools for translating the data into knowledge. Results: We present a comprehensive, interactive method called Cyto to streamline analysis of large-scale cytometry data. Cyto is a workflow-based open-source solution that automates the use of state-of-the-art single-cell analysis methods with interactive visualization. We show the utility of Cyto by applying it to mass cytometry data from peripheral blood and high-grade serous ovarian cancer (HGSOC) samples. Our results show that Cyto is able to reliably capture the immune cell subpopulations from peripheral blood and cellular compositions of unique immune and cancer cell subpopulations in HGSOC tumor and ascites samples.

Crossref

PubMed Central

Helsingin yliopiston digitaalinen arkisto

MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data

Author: Deng Jingyuan
Finak Greg
Gersuk Vivian
Gottardo Raphael
Linsley Peter S.
McDavid Andrew
McElrath M. Juliana
Miller Hannah W.
Prlic Martin
Shalek Alex
Slichter Chloe K.
Yajima Masanao
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2015

Single-cell transcriptomics reveals gene expression heterogeneity but suffers from stochastic dropout and characteristic bimodal expression distributions in which expression is either strongly non-zero or non-detectable. We propose a two-part, generalized linear model for such bimodal data that parameterizes both of these features. We argue that the cellular detection rate, the fraction of genes expressed in a cell, should be adjusted for as a source of nuisance variation. Our model provides gene set enrichment analysis tailored to single-cell data. It provides insights into how networks of co-expressed genes evolve across an experimental treatment. MAST is available at https://github.com/RGLab/MAST.
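
As a rough illustration of two quantities named in this abstract, the sketch below computes a cellular detection rate (the fraction of genes expressed in a cell) and splits one gene's values into a detected/not-detected indicator plus its positive values, which are the two inputs a two-part model would work with. This is not the MAST implementation or its API: the function names, the zero threshold, and the toy matrix are all assumptions made for the example.

```python
def cellular_detection_rate(cell_expression, threshold=0.0):
    """Fraction of genes in one cell whose expression exceeds the threshold."""
    detected = sum(1 for value in cell_expression if value > threshold)
    return detected / len(cell_expression)


def split_two_part(gene_expression, threshold=0.0):
    """Split one gene's values across cells into the two parts of a
    hurdle-style model: a 0/1 detection indicator and the positive values."""
    indicator = [1 if value > threshold else 0 for value in gene_expression]
    positives = [value for value in gene_expression if value > threshold]
    return indicator, positives


# Hypothetical toy data: rows are cells, columns are genes.
toy_expression = [
    [0.0, 2.1, 0.0, 3.4],
    [1.2, 0.0, 0.0, 0.0],
    [0.5, 1.7, 2.2, 4.0],
]

cdr_per_cell = [cellular_detection_rate(cell) for cell in toy_expression]
first_gene = [cell[0] for cell in toy_expression]
indicator, positives = split_two_part(first_gene)

print(cdr_per_cell)
print(indicator, positives)
```

In a model of the kind the abstract describes, the per-cell detection rate would enter as a covariate, so that differences in how many genes are detected per cell are not mistaken for treatment effects.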

DSpace@MIT

Crossref

Boston University Institutional Repository (OpenBU)

PubMed Central

Optimizing transformations for automated, high throughput analysis of flow cytometry data

Author: A Bashashati
Andrew Weng
C Bagwell
D Novo
D Parks
F Hahne
G Box
G Finak
G Walther
Greg Finak
J Dvorak
JJ Gosink
Juan-Manuel Perez
K Lo
L Herzenberg
M Boedigheimer
P Mahalanobis
R Gentleman
R Gottardo
R Ihaka
Raphael Gottardo
S Johnson
S Pyne
U Naumann
W Rogers
WT Rogers
Publication venue: BioMed Central
Publication date: 01/01/2010

Crossref

Springer - Publisher Connector

PubMed Central