19 research outputs found

    Disparate Censorship & Undertesting: A Source of Label Bias in Clinical Machine Learning

    As machine learning (ML) models gain traction in clinical applications, understanding the impact of clinician and societal biases on ML models is increasingly important. While biases can arise in the labels used for model training, the many sources from which these biases arise are not yet well-studied. In this paper, we highlight disparate censorship (i.e., differences in testing rates across patient groups) as a source of label bias that clinical ML models may amplify, potentially causing harm. Many patient risk-stratification models are trained using the results of clinician-ordered diagnostic and laboratory tests as labels. Patients without test results are often assigned a negative label, which assumes that untested patients do not experience the outcome. Since orders are affected by clinical and resource considerations, testing may not be uniform across patient populations, giving rise to disparate censorship. Disparate censorship in patients of equivalent risk leads to undertesting in certain groups and, in turn, more biased labels for such groups. Using such biased labels in standard ML pipelines could contribute to gaps in model performance across patient groups. Here, we theoretically and empirically characterize conditions in which disparate censorship or undertesting affects model performance across subgroups. Our findings call attention to disparate censorship as a source of label bias in clinical ML models. Comment: 48 pages, 18 figures. Machine Learning for Healthcare Conference (MLHC 2022)

    Lost in Transmission: On the Impact of Networking Corruptions on Video Machine Learning Models

    We study how networking corruptions (data corruptions caused by networking errors) affect video machine learning (ML) models. We discover apparent networking corruptions in Kinetics-400, a benchmark video ML dataset. In a simulation study, we investigate (1) what artifacts networking corruptions cause, (2) how such artifacts affect ML models, and (3) whether standard robustness methods can mitigate their negative effects. We find that networking corruptions cause visual and temporal artifacts (i.e., smeared colors or frame drops). These networking corruptions degrade performance on a variety of video ML tasks, but effects vary by task and dataset, depending on how much temporal context the tasks require. Lastly, we evaluate data augmentation, a standard defense for data corruptions, but find that it does not recover performance. Comment: 12 pages, 12 figures (with supplemental: 34 pages)

    Prostate-specific antigen 10–20 ng/mL: A predictor of degree of upgrading to ≥8 among patients with biopsy Gleason score 6

    Purpose: This study aimed to identify the predictors of upgrading and degree of upgrading among patients with an initial Gleason score (GS) of 6 treated with robot-assisted radical prostatectomy (RARP). Materials and Methods: A retrospective review was performed of the data of 359 men with initial biopsy GS 6 localized prostate cancer who underwent RARP between July 2005 and June 2010. They were grouped into group 1 (nonupgraded) and group 2 (upgraded) based on their prostatectomy specimen GS. Logistic regression analysis of studied cases identified significant predictors of upgrading and the degree of upgrading after RARP. Results: The mean age and prostate-specific antigen (PSA) were 63±7.5 years and 8.9±8.77 ng/mL, respectively. Median follow-up was 59 months (interquartile range, 47–70 months). On multivariable analysis, age, PSA, PSA density and ≥2 cores positive were predictors of upgrading (odds ratio [OR], 1.03; 95% confidence interval [CI], 1.01–1.06; p=0.003; OR, 1.006; 95% CI, 1.01–1.11; p=0.018; OR, 0.65; 95% CI, 0.43–0.98; p=0.04, respectively). On subanalysis, only a PSA level of 10–20 ng/mL was associated with upgrading to GS ≥8. These patients also had lower biochemical recurrence-free survival, cancer-specific survival, and overall survival (p≤0.001, p=0.003, and p=0.01, respectively). Conclusions: Gleason score 6 patients with PSA of 10–20 ng/mL have an increased risk of upgrading to pathologic GS ≥8 and subsequently poorer oncological outcomes, and thus require stricter follow-up. These patients should be carefully counseled in making an optimal treatment decision.