An Active Learning Approach for Rapid Characterization of Endothelial Cells in Human Tumors
Currently, no available pathological or molecular measures of tumor angiogenesis predict response to the antiangiogenic therapies used in clinical practice. Recognizing that tumor endothelial cells (EC) and EC activation and survival signaling are the direct targets of these therapies, we sought to develop an automated platform for quantifying activity of critical signaling pathways and other biological events in EC of patient tumors by histopathology. Computer image analysis of EC in highly heterogeneous human tumors by a statistical classifier trained on examples selected by human experts performed poorly due to subjectivity and selection bias. We hypothesized that the analysis could be optimized by a more active process that aids experts in identifying informative training examples. To test this hypothesis, we incorporated a novel active learning (AL) algorithm into the FARSIGHT image analysis software that assists the expert by seeking out informative examples for the operator to label. The resulting FARSIGHT-AL system identified EC with specificity and sensitivity consistently greater than 0.9 and outperformed traditional supervised classification algorithms. The system modeled individual operator preferences and generated reproducible results. Using the results of EC classification, we also quantified proliferation (Ki67) and activity in important signal transduction pathways (MAP kinase, STAT3) in immunostained human clear cell renal cell carcinoma and other tumors. FARSIGHT-AL enables characterization of EC in conventionally preserved human tumors in a more automated process suitable for testing and validation in clinical trials. The results of our study support a unique opportunity for quantifying angiogenesis in a manner that can now be tested for its ability to identify novel predictive and response biomarkers.
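The querying process described above can be sketched as a minimal pool-based active learning loop with uncertainty sampling. Everything below is an illustrative assumption, not the FARSIGHT-AL algorithm itself: a one-dimensional cell feature, a toy "expert" oracle, and a trivial threshold classifier stand in for the real image features, human trainer, and classifier.

```python
def train_threshold(labeled):
    """Fit a trivial classifier: threshold at the midpoint of the class means."""
    pos = [x for x, y in labeled if y == 1]
    neg = [x for x, y in labeled if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def most_uncertain(pool, threshold):
    """Uncertainty sampling: pick the unlabeled point closest to the boundary."""
    return min(pool, key=lambda x: abs(x - threshold))


def active_learn(pool, oracle, seed, n_queries=5):
    """Pool-based active learning: repeatedly query the expert (oracle)
    on the most informative example, then retrain."""
    labeled = list(seed)  # (feature, label) pairs; seed must contain both classes
    pool = list(pool)
    for _ in range(n_queries):
        t = train_threshold(labeled)
        x = most_uncertain(pool, t)      # example shown to the expert for labeling
        pool.remove(x)
        labeled.append((x, oracle(x)))
    return train_threshold(labeled)


def oracle(x):
    """Toy expert: cells with feature value above 5 are EC (label 1)."""
    return 1 if x > 5 else 0


seed = [(1.0, 0), (9.0, 1)]
pool = [2.0, 3.5, 4.8, 5.2, 6.5, 8.0]
threshold = active_learn(pool, oracle, seed)
```

After a handful of queries the learned threshold settles near the oracle's true boundary of 5, having concentrated the expert's labeling effort on the ambiguous examples near the boundary rather than on easy ones.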
An active learning framework for enhancing identification of non-artifactual intracranial pressure waveforms
Objective: Intracranial pressure (ICP) is an important and established clinical measurement used in the management of severe acute brain injury. ICP waveforms are usually triphasic and are susceptible to artifact caused by transient catheter malfunction or routine patient care. Existing methods for artifact detection are threshold-based, stability-based, or template matching, and yield either higher false positives (when there is variability in the ICP waveforms) or higher false negatives (when the ICP waveforms lack complete triphasic components but are valid). Approach: We hypothesized that artifact labeling of ICP waveforms can be optimized by an active learning approach that includes interactive querying of domain experts to identify a manageable number of informative training examples. Main results: The resulting active learning based framework identified non-artifactual ICP pulses with a superior AUC of 0.96 ± 0.012, compared to existing methods: template matching (AUC: 0.71 ± 0.04), ICP stability (AUC: 0.51 ± 0.036), and threshold-based (AUC: 0.5 ± 0.02). Significance: The proposed active learning framework will support real-time ICP-derived analytics by improving the precision of artifact labeling.
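The AUC values reported above can be computed from classifier scores with the rank-based (Mann-Whitney) estimator: the probability that a randomly chosen valid pulse is scored higher than a randomly chosen artifact. A minimal sketch, using hypothetical detector scores rather than anything from the ICP dataset:

```python
def auc(scores_pos, scores_neg):
    """Empirical AUC via the Mann-Whitney statistic:
    fraction of (positive, negative) pairs where the positive
    example outranks the negative one; ties count as 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical scores: higher should mean "valid (non-artifactual) pulse"
valid = [0.9, 0.8, 0.75, 0.6]
artifact = [0.4, 0.65, 0.3, 0.2]
score = auc(valid, artifact)  # 15 of 16 pairs ranked correctly -> 0.9375
```

An AUC of 0.5 (the threshold-based baseline above) corresponds to chance-level ranking, while 1.0 means every valid pulse is scored above every artifact.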
Comparison of FARSIGHT-AL performance with other feature selection algorithms.
<p>Mean classification accuracy of 25 independent simulations plotted as a function of the number of training examples for different automated feature selection algorithms, including FARSIGHT-AL (blue lines), on four different datasets. FARSIGHT-AL selected 50 training examples sequentially based on the increase in information gain, whereas logistic regression was used to classify examples after feature selection by PCA (green), T-Test (purple), and MRMR (orange). Standard logistic regression (red) with no feature selection performs poorly compared to the other algorithms. The bars indicate the standard error of the mean of classification accuracy. For the STS dataset, the features chosen by MRMR and T-Test were identical, resulting in identical classifier performance, indicated by the orange line.</p>
FARSIGHT-AL EC classification performance metrics.
<p>*Sensitivity = True Positive/(True Positive + False Negative).</p>†<p>Specificity = True Negative/(True Negative + False Positive).</p>▴<p>PPV = True Positive/(True Positive + False Positive).</p>•<p>NPV = True Negative/(True Negative + False Negative).</p>
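The four metrics defined in the table footnotes follow directly from confusion-matrix counts. A minimal sketch, with hypothetical EC-classification counts (the numbers below are illustrative, not from the paper's tables):

```python
def ec_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV, and NPV from confusion-matrix counts,
    matching the formulas in the table footnotes."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts for an EC vs. non-EC classifier
m = ec_metrics(tp=90, fp=8, tn=92, fn=10)
```

With these counts, sensitivity is 90/100 = 0.90 and specificity is 92/100 = 0.92, consistent with the "greater than 0.9" performance the abstract reports for FARSIGHT-AL.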
Analysis of analyte expression for 22 ccRCC tumors.
<p>(<b>A</b>) Results of endothelial cell (EC) classification and analysis of EC analyte expression performed on 22 ccRCC tumors by FARSIGHT-AL using the auto-select trainer 2 EC classification model and analyte-specific classification models with operator-defined thresholds. Shown are the proportion of total cells classified as EC <b>(top)</b> and of EC staining positively for Ki67, p-ERK, and p-STAT3. Identifying codes for individual tumors are provided along the horizontal axis. EC proportions are represented by the median (solid circles) and the 75th and 25th percentile (upper and lower bars) values for the set of 10–12 images collected for each tumor-analyte. (<b>B</b>) Similar results as in (A) for manual feature selection.</p>
Venn diagrams for patterns of agreement in ccRCC20 dataset.
<p>Patterns of agreement between the classification calls made by each of the three experts and the FARSIGHT-AL classification models trained by them. Agreement was quantified on a subset of cells in the ccRCC20 image set; cells classified as non-EC by all three experts were excluded from the analysis. (<b>A</b>) When classifying cells in this subset as EC or non-EC, the three experts agreed with one another in only ∼64% of cases. (<b>B</b>) Agreement among the FARSIGHT-AL classification models trained by each of the experts was ∼74%. Trainer 2 exhibits the least degree of “idiosyncrasy” in classification calls and agrees better with trainer 1 than with trainer 3.</p>
FARSIGHT–AL software interface.
<p>The FARSIGHT-AL software interface integrates multiple views of the image data in a linked manner. The Image view (<b>A</b>) allows the user to adjust the complexity of the visual input with respect to the number of biomarkers he or she wishes to view. It also shows the pre-algorithm mask image to illustrate the detection capability of the software. The Scatterplot view (<b>B</b>), Histogram view (<b>C</b>), and Table view (<b>D</b>) enable the user to visualize the data in ways intended to extract different types of information. All views are hot-linked, i.e., a cell selected in the image view is also highlighted in the scatter plot and table views (indicated by arrows). The FARSIGHT-AL query window (<b>E</b>) displays the informative examples selected by the algorithm for labeling, along with an image snapshot of each cell. The 5 most important features selected by the algorithm are also displayed to the user. The evolving heatmap (<b>F</b>) shows an emerging structure across the active learning iterations (i–vi), providing an indication of convergence of the classification.</p>
Quantifying trainer agreement using the Kappa<sup>ψ</sup> (κ) statistic.
<p><b><sup>ψ</sup></b>The Kappa statistic is the ratio of observed agreement between raters to perfect agreement, controlling for agreement expected by chance alone. Refer to the “Kappa Statistics and Inter-Observer Agreements” sub-section of the <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0090495#s4" target="_blank">Methods</a> section for details on how Kappa values are calculated. The values in the table reflect agreement between the raters for EC classification in the ccRCC20 image set. The diagonal values are all 1, as every rater agrees perfectly with himself or herself.</p>
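The footnote's definition of kappa, (observed agreement − chance agreement) / (1 − chance agreement), can be sketched as a short Cohen's kappa computation. The EC/non-EC call sequences below are hypothetical examples, not the paper's rater data:

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e), where p_o is observed
    agreement and p_e is the agreement expected by chance from each
    rater's marginal label frequencies."""
    n = len(labels_a)
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    categories = set(labels_a) | set(labels_b)
    p_e = sum((labels_a.count(c) / n) * (labels_b.count(c) / n)
              for c in categories)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical EC classification calls by two raters on six cells
rater_1 = ["EC", "EC", "non-EC", "EC", "non-EC", "non-EC"]
rater_2 = ["EC", "non-EC", "non-EC", "EC", "non-EC", "EC"]
kappa = cohens_kappa(rater_1, rater_2)
```

Here the raters agree on 4 of 6 cells (p_o ≈ 0.67) but chance agreement is 0.5, so kappa ≈ 0.33, far below the diagonal value of 1 a rater achieves against himself or herself.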