
    Relationship between SSNR and endpoint predictability based on 60 and 120 training samples.

    <p>The relationship between SSNR values and endpoint predictability (prediction MCC) based on (a) 60 and (b) 120 training samples using <i>NCentroid</i>, respectively. Here blue columns and black bars represent the means and SDs of SSNR values in 100 repetitions, while yellow rectangles and red bars are the means and SDs of MCC values.</p>

    Impact of training sample size.

    <p>Prediction MCC based on different numbers of training samples for 10 endpoints using <i>NCentroid</i>.</p>

    Determination of Minimum Training Sample Size for Microarray-Based Cancer Outcome Prediction–An Empirical Assessment

    <div><p>The promise of microarray technology in providing prediction classifiers for cancer outcome estimation has been confirmed by a number of demonstrable successes. However, the reliability of prediction results depends heavily on the accuracy of the statistical parameters in these classifiers, which cannot be estimated reliably from only a small number of training samples. It is therefore vitally important to determine the minimum number of training samples needed to ensure the clinical value of microarrays in cancer outcome prediction. We extensively evaluated the impact of training sample size on model performance using 3 large-scale cancer microarray datasets provided by the second phase of the MicroArray Quality Control project (MAQC-II). An SSNR-based (scale of signal-to-noise ratio) protocol was proposed in this study for determining the minimum training sample size. External validation on another 3 cancer datasets confirmed that the SSNR-based approach not only determines the minimum number of training samples efficiently, but also provides a valuable strategy for estimating the underlying performance of classifiers in advance. Once translated into routine clinical applications, the SSNR-based protocol would greatly facilitate microarray-based cancer outcome prediction by improving classifier reliability.</p></div>
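    The abstract reports classifier performance as prediction MCC, i.e., the Matthews correlation coefficient computed from confusion-matrix counts. A minimal sketch of the standard formula (the function name <i>mcc</i> and the example counts are illustrative and not taken from the paper):

    ```python
    import math

    def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
        """Matthews correlation coefficient from confusion-matrix counts.

        Returns a value in [-1, 1]; 1 is perfect prediction, 0 is chance-level,
        -1 is total disagreement. Returns 0.0 when any marginal sum is zero.
        """
        numerator = tp * tn - fp * fn
        denominator = math.sqrt(
            (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
        )
        return numerator / denominator if denominator else 0.0

    # Perfect classification of 5 positives and 5 negatives
    print(mcc(5, 5, 0, 0))   # 1.0
    ```

    MCC is often preferred over accuracy for the class-imbalanced endpoints common in cancer outcome data, since it only rewards classifiers that do well on both classes.
    
    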

    External validation for impact of training sample size.

    <p>Prediction MCC based on different numbers of training samples for three external validation datasets.</p>

    Relationship between SSNR and endpoint predictability based on all training samples.

    <p>The ex post facto relationship between SSNR values and endpoint predictability (prediction MCC) based on (a) normal and (b) swap modeling using <i>NCentroid</i> on all training samples. Here the green (a) and orange (b) columns represent the SSNR values obtained from the original training and validation sets, while the yellow-faced rectangles are the corresponding prediction MCC values of models on the original validation and training samples, respectively.</p>

    Study work flow.

    <p>Work flow for evaluating the impact of different numbers of training samples.</p>

    A concise summary of datasets.

    a<p>BR – Breast Cancer; MM – Multiple Myeloma; NB – Neuroblastoma; pCR – Pathologic Complete Response; erpos – ER Positive; OS – Overall Survival; EFS – Event-Free Survival; NHL – Non-Hodgkin Lymphoma; PC – Positive Control; NC – Negative Control.</p>b<p>Ratio of good to poor prognoses (i.e., good/poor prognoses).</p>

    Correlation between slope rate and Cohen's <i>d</i> for the <i>kNN</i> classifier.

    <p>The slopes are obtained from regression analysis based on the linear portion of the confidence-MCC curve, while Cohen's <i>d</i> represents the inherent predictability of the dataset.</p>
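    The caption above uses Cohen's <i>d</i> as a measure of a dataset's inherent predictability. Cohen's <i>d</i> is the standardized mean difference between two groups, normalized by the pooled standard deviation; a minimal sketch (the function name and example data are illustrative, not taken from the study):

    ```python
    import statistics

    def cohens_d(group_a: list[float], group_b: list[float]) -> float:
        """Cohen's d: standardized mean difference between two groups.

        Uses the pooled sample standard deviation as the denominator.
        """
        n_a, n_b = len(group_a), len(group_b)
        var_a = statistics.variance(group_a)  # sample variance (ddof=1)
        var_b = statistics.variance(group_b)
        pooled_sd = (((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)) ** 0.5
        return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

    # Two hypothetical groups two pooled SDs apart
    print(abs(cohens_d([1.0, 2.0, 3.0], [3.0, 4.0, 5.0])))  # 2.0
    ```

    Larger |<i>d</i>| between outcome classes means the endpoint is easier to separate, which is consistent with the correlation against the confidence-MCC slope reported in the figure.
    
    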

    Overall survival (OS) curves for patients with different clinical confidences using <i>kNN</i>, where ‘LC’, ‘MC’, and ‘HC’ denote ‘low confidence (0.6)’, ‘medium confidence (0.8)’, and ‘high confidence (1)’, respectively.
