2 research outputs found

    Development of Training Materials for Pathologists to Provide Machine Learning Validation Data of Tumor-Infiltrating Lymphocytes in Breast Cancer

    The High Throughput Truthing project aims to develop a dataset for validating artificial intelligence and machine learning (AI/ML) models fit for regulatory purposes. The context of this AI/ML validation dataset is the reporting of stromal tumor-infiltrating lymphocytes (sTILs) density evaluations in hematoxylin and eosin-stained invasive breast cancer biopsy specimens. After completing the pilot study, we found notable variability in the sTILs estimates as well as inconsistencies and gaps in the training provided to pathologists. Using the pilot study data and an expert panel, we created custom training materials to improve pathologist annotation quality for the pivotal study. We categorized regions of interest (ROIs) based on their mean sTILs density and selected ROIs with the highest and lowest sTILs variability. In a series of eight one-hour sessions, the expert panel reviewed each ROI and provided verbal density estimates and comments on features that confounded the sTILs evaluation. We aggregated and shaped the comments to identify pitfalls and instructions to improve our training materials. From these selected ROIs, we created a training set and a proficiency test set to improve pathologist training, with the goal of improving data collection for the pivotal study. We are not exploring AI/ML performance in this paper. Instead, we are creating materials that will train crowd-sourced pathologists to be the reference standard in a pivotal study to create an AI/ML model validation dataset. The issues discussed here are also important for clinicians to understand about the evaluation of sTILs in clinical practice and can provide insight to developers of AI/ML models.
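
The ROI selection step described in this abstract (group ROIs by mean sTILs density, then pick those with the highest and lowest variability) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the density bins or the variability measure, so the bin edges (10 and 40 percent), the use of the population standard deviation, the function name `select_rois`, and the toy estimates are all hypothetical.

```python
from statistics import mean, pstdev


def select_rois(estimates, bin_edges=(10, 40), k=1):
    """Group ROIs by mean sTILs density and pick variability extremes per group.

    estimates: dict mapping ROI id -> list of pathologist density estimates (%).
    bin_edges: ASSUMED density cut points; the abstract does not state them.
    k: number of ROIs to take at each extreme within a bin.
    """
    bins = {}
    for roi_id, values in estimates.items():
        m = mean(values)
        # Bin index = number of edges the mean density reaches or exceeds.
        bin_index = sum(m >= edge for edge in bin_edges)
        # ASSUMPTION: variability is measured as the population standard deviation.
        bins.setdefault(bin_index, []).append((pstdev(values), roi_id))

    selected = {}
    for bin_index, items in sorted(bins.items()):
        items.sort()  # ascending by variability, ties broken by ROI id
        selected[bin_index] = {
            "lowest_variability": [roi for _, roi in items[:k]],
            "highest_variability": [roi for _, roi in items[-k:]],
        }
    return selected


# Hypothetical toy data: three pathologists per ROI.
toy_estimates = {
    "roi_a": [5, 5, 5],
    "roi_b": [0, 5, 10],
    "roi_c": [20, 25, 30],
    "roi_d": [10, 25, 40],
    "roi_e": [60, 60, 60],
}
selected = select_rois(toy_estimates)
```

With a single ROI in a bin (as for `roi_e` here), the same ROI is returned at both extremes; a real selection would need enough ROIs per bin for the two lists to differ.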

    Interobserver Variation in the Assessment of Immunohistochemistry Expression Levels in HER2-Negative Breast Cancer: Can We Improve the Identification of Low Levels of HER2 Expression by Adjusting the Criteria? An International Interobserver Study

    The classification of human epidermal growth factor receptor 2 (HER2) expression is optimized to detect HER2-amplified breast cancer (BC). However, novel HER2-targeting agents are also effective for BCs with low levels of HER2. This raises the question whether the current guidelines for HER2 testing are sufficiently reproducible to identify HER2-low BC. The aim of this multicenter international study was to assess the interobserver agreement of specific HER2 immunohistochemistry scores in cases with negative HER2 results (0, 1+, or 2+/in situ hybridization negative) according to the current American Society of Clinical Oncology/College of American Pathologists (ASCO/CAP) guidelines. Furthermore, we evaluated whether the agreement improved by redefining immunohistochemistry (IHC) scoring criteria or by adding fluorescent in situ hybridization (FISH). We conducted a 2-round study of 105 nonamplified BCs. During the first assessment, 16 pathologists used the latest version of the ASCO/CAP guidelines. After a consensus meeting, the same pathologists scored the same digital slides using modified IHC scoring criteria based on the 2007 ASCO/CAP guidelines, and an extra "ultralow" category was added. Overall, the interobserver agreement was limited (4.7% of cases with 100% agreement) in the first round, but this was improved by clustering IHC categories. In the second round, the highest reproducibility was observed when comparing IHC 0 with the ultralow/1+/2+ grouped cluster (74.3% of cases with 100% agreement). The FISH results were not statistically different between HER2-0 and HER2-low cases, regardless of the IHC criteria used. In conclusion, our study suggests that the modified 2007 ASCO/CAP criteria were more reproducible in distinguishing HER2-0 from HER2-low cases than the 2018 ASCO/CAP criteria. However, the reproducibility was still moderate, which was not improved by adding FISH. This could lead to a suboptimal selection of patients eligible for novel HER2-targeting agents. If the threshold between HER2 IHC 0 and 1+ is to be clinically actionable, there is a need for clearer, more reproducible IHC definitions, training, and/or development of more accurate methods to detect this subtle difference in protein expression levels.
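
The agreement measure quoted in this abstract, the percentage of cases on which all raters agree, and the effect of clustering IHC categories can be illustrated with a minimal sketch. The function names and the toy scores below are hypothetical and are not the study's data or code; the grouping mirrors the "IHC 0 versus ultralow/1+/2+" comparison named in the abstract.

```python
def full_agreement_rate(scores, cluster=lambda s: s):
    """Return the fraction of cases in which every rater gave the same score.

    scores: list of cases, each a list of per-rater category labels.
    cluster: optional mapping applied to each label before comparison,
             used to merge categories (for example, 0 versus everything else).
    """
    if not scores:
        raise ValueError("need at least one case")
    unanimous = sum(
        1 for case in scores if len({cluster(label) for label in case}) == 1
    )
    return unanimous / len(scores)


def zero_vs_rest(label):
    """Collapse ultralow/1+/2+ into one 'low' group, keeping IHC 0 separate."""
    return "0" if label == "0" else "low"


# Hypothetical toy data: rows are cases, columns are three raters.
toy_scores = [
    ["0", "0", "0"],
    ["0", "ultralow", "1+"],
    ["1+", "1+", "2+"],
    ["ultralow", "1+", "1+"],
]

raw_rate = full_agreement_rate(toy_scores)                      # 1 of 4 cases
clustered_rate = full_agreement_rate(toy_scores, zero_vs_rest)  # 3 of 4 cases
```

In this toy example, merging the non-zero categories turns two partially discordant cases into unanimous ones, which is the same mechanism by which clustering raised the reported agreement; the second case stays discordant because the raters disagree across the 0 versus non-zero boundary itself.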