
    LESS: Label-efficient Multi-scale Learning for Cytological Whole Slide Image Screening

    In computational pathology, multiple instance learning (MIL) is widely used to circumvent the computational impasse of giga-pixel whole slide image (WSI) analysis. It usually consists of two stages: patch-level feature extraction and slide-level aggregation. Recently, pretrained models or self-supervised learning have been used to extract patch features, but these approaches suffer from low effectiveness or inefficiency because they overlook the task-specific supervision provided by slide labels. Here we propose LESS, a weakly supervised, label-efficient WSI screening method for cytological WSI analysis that requires only slide-level labels and can be applied effectively to small datasets. First, we use variational positive-unlabeled (VPU) learning to uncover the hidden labels of both benign and malignant patches, exploiting slide-level labels to supervise and improve patch-level feature learning. Second, to account for the sparse and random arrangement of cells in cytological WSIs, we crop patches at multiple scales and use a cross-attention vision transformer (CrossViT) to fuse information across scales for WSI classification. Together, the two steps achieve task alignment, improving both effectiveness and efficiency. We validate LESS on a urine cytology WSI dataset of 130 samples (13,000 patches) and on the FNAC 2019 dataset of 212 samples (21,200 patches). In terms of accuracy, AUC, sensitivity and specificity, LESS reaches 84.79%, 85.43%, 91.79% and 78.30% on the urine cytology dataset and 96.88%, 96.86%, 98.95% and 97.06% on FNAC 2019. It outperforms state-of-the-art MIL methods on pathology WSIs and enables automatic cytological WSI cancer screening. Comment: This paper was submitted to Medical Image Analysis and is under review.
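The cross-scale fusion step can be sketched as follows. This is an illustrative toy, not the authors' code: a single-head cross-attention in which coarse-scale patch tokens query fine-scale tokens, followed by mean pooling into a slide embedding. All shapes, the single head, and the pooling choice are simplifying assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values):
    """Single-head cross-attention: tokens from one scale attend to the other."""
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)   # (n_q, n_kv)
    return softmax(scores, axis=-1) @ keys_values   # (n_q, d)

rng = np.random.default_rng(0)
fine = rng.normal(size=(16, 64))    # features of 16 fine-scale patches
coarse = rng.normal(size=(4, 64))   # features of 4 coarse-scale patches

# Fuse: coarse tokens query fine tokens, then mean-pool into a slide embedding
# that a downstream classifier would consume.
fused = cross_attention(coarse, fine)
slide_embedding = fused.mean(axis=0)
assert slide_embedding.shape == (64,)
```

In the actual CrossViT-style design, the exchange runs in both directions and through multiple layers; this sketch shows only the core attention mechanism that lets information flow between scales.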

    Applications of machine and deep learning to thyroid cytology and histopathology: a review.

    This review synthesises past research into how machine and deep learning can improve the cyto- and histopathology processing pipelines for thyroid cancer diagnosis. The current gold-standard preoperative technique, fine-needle aspiration cytology, has high interobserver variability, often returns indeterminate samples and cannot reliably identify some pathologies; histopathology analysis addresses these issues to an extent, but it requires surgical resection of the suspicious lesions and so cannot influence preoperative decisions. Motivated by these issues, as well as by the chronic shortage of trained pathologists, much research has been conducted into how artificial intelligence could improve current pipelines and reduce the pressure on clinicians. Many past studies have indicated the significant potential of automated image analysis in classifying thyroid lesions, particularly papillary thyroid carcinoma, but these have generally been retrospective, so questions remain about both the practical efficacy of these automated tools and the realities of integrating them into clinical workflows. Furthermore, the nature of thyroid lesion classification is significantly more nuanced in practice than many current studies have addressed, and this, along with the heterogeneous nature of processing pipelines in different laboratories, means that no solution has proven itself robust enough for clinical adoption. There are, therefore, multiple avenues for future research: examine the practical implementation of these algorithms as pathologist decision-support systems; improve interpretability, which is necessary for developing trust with clinicians and regulators; and investigate multiclassification on diverse multicentre datasets, aiming for methods that demonstrate high performance in a process- and equipment-agnostic manner.

    Towards Interpretable Machine Learning in Medical Image Analysis

    Over the past few years, machine learning (ML) has demonstrated human-expert-level performance in many medical image analysis tasks. However, due to the black-box nature of classic deep ML models, translating these models from the bench to the bedside to support the corresponding stakeholders brings substantial challenges. One solution is interpretable ML, which attempts to reveal the working mechanisms of complex models. From a human-centered design perspective, interpretability is not a property of the ML model but an affordance, i.e., a relationship between algorithm and user. Thus, prototyping and user evaluations are critical to attaining solutions that afford interpretability. Following human-centered design principles in highly specialized, high-stakes domains such as medical image analysis is challenging due to limited access to end users, a dilemma further exacerbated by the large knowledge imbalance between ML designers and end users. To overcome this predicament, we first define four levels of clinical evidence that can be used to justify the interpretability of ML model designs. We argue that designing ML models around two of these levels, 1) commonly used clinical evidence, such as clinical guidelines, and 2) clinical evidence developed iteratively with end users, is more likely to yield models that are indeed interpretable to end users. In this dissertation, we first address how to design interpretable ML in medical image analysis that affords interpretability with these two levels of clinical evidence. We strongly recommend formative user research as the first step of interpretable model design, to understand user needs and domain requirements, and we highlight the importance of empirical user evaluation to support transparent ML design choices and facilitate the adoption of human-centered design principles.
Together, these aspects increase the likelihood that the algorithms afford interpretability and enable stakeholders to capitalize on the benefits of interpretable ML. In detail, we first propose neural symbolic reasoning to implement public clinical evidence in the designed models for various routinely performed clinical tasks. We utilize the routinely applied clinical taxonomy for abnormality classification in chest x-rays, and we establish a spleen injury grading system that strictly follows clinical guidelines for symbolic reasoning over the detected and segmented salient clinical features. We then propose an end-to-end interpretable pipeline for uveal melanoma (UM) prognostication with cytopathology images. Our formative user research found that pathologists consider cell composition informative for UM prognostication, so we build a model that analyzes cell composition directly. Finally, we conduct a comprehensive user study to assess the human factors of human-machine teaming with the designed model, e.g., whether the proposed model indeed affords interpretability to pathologists. The human-centered design process proved genuinely interpretable to pathologists for UM prognostication. All in all, this dissertation introduces a comprehensive human-centered design approach for interpretable ML solutions in medical image analysis that afford interpretability to end users.
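The guideline-driven symbolic reasoning described above can be illustrated with a toy rule-based grader: detected and segmented features are mapped to a grade via explicit rules, so every prediction is traceable to a clinical criterion. The function name and thresholds below are simplified placeholders, not the actual guideline criteria used in the dissertation.

```python
# Toy illustration of symbolic reasoning over segmented features. The
# thresholds here are invented for demonstration and do NOT reproduce any
# real clinical grading scale.
def grade_spleen_injury(hematoma_frac, laceration_depth_cm, vascular_injury):
    """Return an interpretable grade plus the rule that fired."""
    if vascular_injury:
        return 4, "vascular injury present"
    if laceration_depth_cm > 3.0:
        return 3, "laceration depth > 3 cm"
    if hematoma_frac > 0.5 or laceration_depth_cm > 1.0:
        return 2, "large hematoma or laceration depth > 1 cm"
    return 1, "small subcapsular hematoma / superficial laceration"

grade, rule = grade_spleen_injury(hematoma_frac=0.2,
                                  laceration_depth_cm=1.5,
                                  vascular_injury=False)
print(grade, "-", rule)  # prints the grade together with the triggering rule
```

The point of such a design is that the reasoning step itself is transparent: a clinician can audit which rule produced the grade, which is exactly the kind of interpretability affordance the dissertation argues for.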

    Deep weakly-supervised learning methods for classification and localization in histology images: a survey

    Using state-of-the-art deep learning models for cancer diagnosis presents several challenges related to the nature and availability of labeled histology images. In particular, cancer grading and localization in these images normally rely on both image- and pixel-level labels, the latter requiring a costly annotation process. In this survey, deep weakly-supervised learning (WSL) models are investigated to identify and locate diseases in histology images without the need for pixel-level annotations. Given training data with global image-level labels, these models can simultaneously classify histology images and yield pixel-wise localization scores, thereby identifying the corresponding regions of interest (ROI). Since relevant WSL models have mainly been investigated within the computer vision community and validated on natural scene images, we assess the extent to which they apply to histology images, which have challenging properties, e.g. very large size, similarity between foreground and background, highly unstructured regions, stain heterogeneity, and noisy/ambiguous labels. The most relevant deep WSL models are compared experimentally in terms of accuracy (classification and pixel-wise localization) on several public benchmark histology datasets for breast and colon cancer: BACH ICIAR 2018, BreaKHis, CAMELYON16, and GlaS. Furthermore, for large-scale evaluation of WSL models on histology images, we propose a protocol to construct WSL datasets from whole slide imaging. Results indicate that several deep learning models can provide a high level of classification accuracy, although accurate pixel-wise localization of cancer regions remains an issue for such images. Code is publicly available. Comment: 35 pages, 18 figures.
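One common family of WSL localization techniques surveyed here derives pixel-wise scores from an image-level classifier via class activation mapping (CAM): the convolutional feature maps are weighted by the classifier weights of the target class. A minimal sketch, with synthetic features and weights standing in for a real network:

```python
import numpy as np

def class_activation_map(features, class_weights):
    """features: (C, H, W) conv feature maps; class_weights: (C,) weights of the
    target class in a global-average-pooled classifier.
    Returns an (H, W) localization score map normalized to [0, 1]."""
    cam = np.tensordot(class_weights, features, axes=(0, 0))  # weighted channel sum
    cam -= cam.min()
    if cam.max() > 0:
        cam /= cam.max()          # normalize for thresholding or overlay
    return cam

rng = np.random.default_rng(1)
feats = rng.random((32, 7, 7))    # toy conv features (channels, height, width)
w = rng.random(32)                # toy weights of the "tumor" output unit
heatmap = class_activation_map(feats, w)
assert heatmap.shape == (7, 7)
```

Upsampling the low-resolution map back to the input size and thresholding it yields the ROI mask; the survey's pixel-wise localization metrics evaluate exactly such maps against ground-truth annotations.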

    Deep Learning in Medical Image Analysis

    The accelerating power of deep learning in diagnosing diseases will empower physicians and speed up decision making in clinical environments. Applications of modern medical instruments and the digitalization of medical care have generated enormous amounts of medical images in recent years. In this big data arena, new deep learning methods and computational models for efficient data processing, analysis, and modeling of the generated data are crucially important for clinical applications and for understanding the underlying biological processes. This book presents and highlights novel algorithms, architectures, techniques, and applications of deep learning for medical image analysis.

    Evaluation of PD-L1 expression in various formalin-fixed paraffin embedded tumour tissue samples using SP263, SP142 and QR1 antibody clones

    Background & objectives: Cancer cells can avoid immune destruction through the inhibitory ligand PD-L1. PD-1 is a cell-surface receptor of the immunoglobulin family; its ligand PD-L1 is expressed by tumour cells and stromal tumour-infiltrating lymphocytes (TILs). Methods: Forty-four cancer cases were included in this study (24 triple-negative breast cancers (TNBC), 10 non-small cell lung cancers (NSCLC) and 10 malignant melanomas). Three monoclonal primary antibody clones were compared: QR1 (Quartett), SP142 and SP263 (Ventana). For visualization, the ultraView Universal DAB Detection Kit from Ventana was used on the Ventana BenchMark GX automated immunohistochemical staining platform. Results: Comparing the sensitivity of two clones on the same TNBC tissue samples, the QR1 clone gave a higher percentage of positive cells than clone SP142, but the difference was not statistically significant. On the same malignant melanoma tissue samples, the SP263 clone gave a higher percentage of positive cells than the QR1 clone, but again the difference was not statistically significant. On the same NSCLC tissue samples, the QR1 clone gave a higher percentage of positive cells than the SP142 clone, but once again the difference was not statistically significant. Conclusion: The three antibody clones from the two manufacturers, Ventana and Quartett, gave comparable results, with no statistically significant difference in staining intensity or percentage of positive tumour and/or immune cells. Therefore, different PD-L1 clones from different manufacturers can potentially be used to evaluate PD-L1 status in different tumour tissues.
Due to the serious implications of PD-L1 analysis for subsequent treatment decisions in cancer patients, every antibody clone, staining protocol and evaluation process should be carefully and meticulously validated.
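A paired comparison of this kind can be sketched as follows. The study does not state which statistical test it used; this illustration applies an exact two-sided sign test to made-up paired percentages of positive cells, purely to show the shape of the analysis.

```python
# Illustrative sketch only: the (QR1, SP142) percentages below are invented.
from math import comb

def sign_test_p(pairs):
    """Exact two-sided sign test on paired measurements (ties dropped)."""
    diffs = [a - b for a, b in pairs if a != b]
    n = len(diffs)
    k = sum(d > 0 for d in diffs)          # cases where the first clone is higher
    tail = min(k, n - k)
    p = 2 * sum(comb(n, i) for i in range(tail + 1)) / 2**n
    return min(p, 1.0)

# Hypothetical paired % positive cells per TNBC case: (QR1, SP142)
pairs = [(40, 35), (10, 10), (60, 55), (25, 30), (80, 70), (15, 10)]
p = sign_test_p(pairs)
print(f"two-sided sign-test p = {p:.3f}")   # p > 0.05: no significant difference
```

With so few cases, one clone can be numerically higher in most pairs and yet fail to reach significance, which matches the pattern reported across all three tumour types.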

    AI-based volumetric analysis of liver metastatic burden in patients with neuroendocrine neoplasms (NEN)

    Background: Quantification of liver tumor load in patients with liver metastases from neuroendocrine neoplasms is essential for therapeutic management. However, accurate measurement of three-dimensional (3D) volumes is time-consuming and difficult to achieve. Even though the common criteria for assessing treatment response have simplified the measurement of liver metastases, the workload of following up patients with neuroendocrine liver metastases (NELMs) remains heavy for radiologists due to the patients' increased morbidity and prolonged survival. Among the many imaging methods, gadoxetic acid (Gd-EOB)-enhanced magnetic resonance imaging (MRI) has shown the highest accuracy. Methods: 3D volumetric segmentations of NELMs and livers were performed manually in 278 Gd-EOB MRI scans from 118 patients. Eighty percent (222 scans) were randomly assigned to the training dataset and the remaining 20% (56 scans) formed the internal validation dataset. An additional 33 patients from a different time period, who underwent Gd-EOB MRI at both baseline and 12-month follow-up examinations, were collected for external and clinical validation (n = 66 scans). Model measurements (NELM volume; hepatic tumor load (HTL)) and the respective absolute (ΔabsNELM; ΔabsHTL) and relative changes (ΔrelNELM; ΔrelHTL) between baseline and follow-up imaging were correlated with multidisciplinary cancer conference (MCC) decisions (treatment success/failure). Three readers manually segmented each MRI slice, blinded to clinical data and independently of one another; all images were reviewed by a senior radiologist. Results: The model segmented NELMs and liver with high accuracy in both internal and external validation (Matthews correlation coefficient (ϕ): 0.76/0.95 and 0.80/0.96, respectively). In the internal validation dataset, the group with higher NELM volume (> 16.17 cm³) showed a higher ϕ than the group with lower NELM volume (ϕ = 0.80 vs. 0.71; p = 0.0025). In the external validation dataset, all response variables (ΔabsNELM; ΔabsHTL; ΔrelNELM; ΔrelHTL) differed significantly across MCC decision groups (all p < 0.001). The AI model correctly detected the response trend based on ΔrelNELM and ΔrelHTL in all 33 MCC patients and showed optimal discrimination between treatment success and failure at +56.88% and +57.73%, respectively (AUC: 1.000; p < 0.001). Conclusions: The AI-based segmentation model performed well in the three-dimensional quantification of NELMs and HTL in Gd-EOB MRI and showed good agreement with the MCC's assessment of treatment response.
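The two evaluation quantities used above can be sketched directly. The Matthews correlation coefficient here is the standard voxel-wise formula, and the relative-change computation mirrors ΔrelNELM; the masks and volumes are synthetic, and only the +56.88% cutoff comes from the abstract.

```python
import numpy as np

def mcc(pred, truth):
    """Voxel-wise Matthews correlation coefficient between two boolean masks."""
    tp = np.sum(pred & truth);  tn = np.sum(~pred & ~truth)
    fp = np.sum(pred & ~truth); fn = np.sum(~pred & truth)
    denom = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def rel_change(baseline_vol, followup_vol):
    """Relative change in tumor volume (%) between baseline and follow-up."""
    return 100.0 * (followup_vol - baseline_vol) / baseline_vol

# Synthetic 8x8x8 volumes: ground truth vs. a slightly over-segmented prediction.
truth = np.zeros((8, 8, 8), bool); truth[2:5, 2:5, 2:5] = True
pred = np.zeros_like(truth);       pred[2:5, 2:5, 2:6] = True
print(round(mcc(pred, truth), 2))

# A 16 -> 26 cm^3 NELM volume is a 62.5% increase, above the +56.88% cutoff,
# i.e. on the "treatment failure" side of the reported threshold.
print(rel_change(16.0, 26.0) > 56.88)
```

In the study, such per-scan volumes are derived by summing segmented voxels times voxel size; the sketch only shows how the reported ϕ and Δrel quantities are computed once masks and volumes exist.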