79 research outputs found

    Towards generalizable machine learning models for computer-aided diagnosis in medicine

    Get PDF
    Hidden stratification represents a phenomenon in which a training dataset contains unlabeled (hidden) subsets of cases that may affect machine learning model performance. Machine learning models that ignore the hidden stratification phenomenon--despite promising overall performance measured as accuracy and sensitivity--often fail at predicting the low prevalence cases, but those cases remain important. In the medical domain, patients with diseases are often less common than healthy patients, and a misdiagnosis of a patient with a disease can have significant clinical impacts. Therefore, to build a robust and trustworthy CAD system and a reliable treatment effect prediction model, we cannot only pursue machine learning models with high overall accuracy, but we also need to discover any hidden stratification in the data and evaluate the proposing machine learning models with respect to both overall performance and the performance on certain subsets (groups) of the data, such as the ‘worst group’. In this study, I investigated three approaches for data stratification: a novel algorithmic deep learning (DL) approach that learns similarities among cases and two schema completion approaches that utilize domain expert knowledge. I further proposed an innovative way to integrate the discovered latent groups into the loss functions of DL models to allow for better model generalizability under the domain shift scenario caused by the data heterogeneity. My results on lung nodule Computed Tomography (CT) images and breast cancer histopathology images demonstrate that learning homogeneous groups within heterogeneous data significantly improves the performance of the computer-aided diagnosis (CAD) system, particularly for low-prevalence or worst-performing cases. This study emphasizes the importance of discovering and learning the latent stratification within the data, as it is a critical step towards building ML models that are generalizable and reliable. Ultimately, this discovery can have a profound impact on clinical decision-making, particularly for low-prevalence cases

    A survey on generative adversarial networks for imbalance problems in computer vision tasks

    Get PDF
    Any computer vision application development starts off by acquiring images and data, then preprocessing and pattern recognition steps to perform a task. When the acquired images are highly imbalanced and not adequate, the desired task may not be achievable. Unfortunately, the occurrence of imbalance problems in acquired image datasets in certain complex real-world problems such as anomaly detection, emotion recognition, medical image analysis, fraud detection, metallic surface defect detection, disaster prediction, etc., are inevitable. The performance of computer vision algorithms can significantly deteriorate when the training dataset is imbalanced. In recent years, Generative Adversarial Neural Networks (GANs) have gained immense attention by researchers across a variety of application domains due to their capability to model complex real-world image data. It is particularly important that GANs can not only be used to generate synthetic images, but also its fascinating adversarial learning idea showed good potential in restoring balance in imbalanced datasets. In this paper, we examine the most recent developments of GANs based techniques for addressing imbalance problems in image data. The real-world challenges and implementations of synthetic image generation based on GANs are extensively covered in this survey. Our survey first introduces various imbalance problems in computer vision tasks and its existing solutions, and then examines key concepts such as deep generative image models and GANs. After that, we propose a taxonomy to summarize GANs based techniques for addressing imbalance problems in computer vision tasks into three major categories: 1. Image level imbalances in classification, 2. object level imbalances in object detection and 3. pixel level imbalances in segmentation tasks. We elaborate the imbalance problems of each group, and provide GANs based solutions in each group. Readers will understand how GANs based techniques can handle the problem of imbalances and boost performance of the computer vision algorithms

    Adversarial Machine Learning For Advanced Medical Imaging Systems

    Get PDF
    Although deep neural networks (DNNs) have achieved significant advancement in various challenging tasks of computer vision, they are also known to be vulnerable to so-called adversarial attacks. With only imperceptibly small perturbations added to a clean image, adversarial samples can drastically change models’ prediction, resulting in a significant drop in DNN’s performance. This phenomenon poses a serious threat to security-critical applications of DNNs, such as medical imaging, autonomous driving, and surveillance systems. In this dissertation, we present adversarial machine learning approaches for natural image classification and advanced medical imaging systems. We start by describing our advanced medical imaging systems to tackle the major challenges of on-device deployment: automation, uncertainty, and resource constraint. It is followed by novel unsupervised and semi-supervised robust training schemes to enhance the adversarial robustness of these medical imaging systems. These methods are designed to tackle the unique challenges of defending against adversarial attacks on medical imaging systems and are sufficiently flexible to generalize to various medical imaging modalities and problems. We continue on developing novel training scheme to enhance adversarial robustness of the general DNN based natural image classification models. Based on a unique insight into the predictive behavior of DNNs that they tend to misclassify adversarial samples into the most probable false classes, we propose a new loss function as a drop-in replacement for the cross-entropy loss to improve DNN\u27s adversarial robustness. Specifically, it enlarges the probability gaps between true class and false classes and prevents them from being melted by small perturbations. Finally, we conclude the dissertation by summarizing original contributions and discussing our future work that leverages DNN interpretability constraint on adversarial training to tackle the central machine learning problem of generalization gap

    Deep learning in medical image registration

    Get PDF
    Image registration is a fundamental task in multiple medical image analysis applications. With the advent of deep learning, there have been significant advances in algorithmic performance for various computer vision tasks in recent years, including medical image registration. The last couple of years have seen a dramatic increase in the development of deep learning-based medical image registration algorithms. Consequently, a comprehensive review of the current state-of-the-art algorithms in the field is timely, and necessary. This review is aimed at understanding the clinical applications and challenges that drove this innovation, analysing the functionality and limitations of existing approaches, and at providing insights to open challenges and as yet unmet clinical needs that could shape future research directions. To this end, the main contributions of this paper are: (a) discussion of all deep learning-based medical image registration papers published since 2013 with significant methodological and/or functional contributions to the field; (b) analysis of the development and evolution of deep learning-based image registration methods, summarising the current trends and challenges in the domain; and (c) overview of unmet clinical needs and potential directions for future research in deep learning-based medical image registration

    Towards Secure and Intelligent Diagnosis: Deep Learning and Blockchain Technology for Computer-Aided Diagnosis Systems

    Get PDF
    Cancer is the second leading cause of death across the world after cardiovascular disease. The survival rate of patients with cancerous tissue can significantly decrease due to late-stage diagnosis. Nowadays, advancements of whole slide imaging scanners have resulted in a dramatic increase of patient data in the domain of digital pathology. Large-scale histopathology images need to be analyzed promptly for early cancer detection which is critical for improving patient's survival rate and treatment planning. Advances of medical image processing and deep learning methods have facilitated the extraction and analysis of high-level features from histopathological data that could assist in life-critical diagnosis and reduce the considerable healthcare cost associated with cancer. In clinical trials, due to the complexity and large variance of collected image data, developing computer-aided diagnosis systems to support quantitative medical image analysis is an area of active research. The first goal of this research is to automate the classification and segmentation process of cancerous regions in histopathology images of different cancer tissues by developing models using deep learning-based architectures. In this research, a framework with different modules is proposed, including (1) data pre-processing, (2) data augmentation, (3) feature extraction, and (4) deep learning architectures. Four validation studies were designed to conduct this research. (1) differentiating benign and malignant lesions in breast cancer (2) differentiating between immature leukemic blasts and normal cells in leukemia cancer (3) differentiating benign and malignant regions in lung cancer, and (4) differentiating benign and malignant regions in colorectal cancer. Training machine learning models, disease diagnosis, and treatment often requires collecting patients' medical data. Privacy and trusted authenticity concerns make data owners reluctant to share their personal and medical data. Motivated by the advantages of Blockchain technology in healthcare data sharing frameworks, the focus of the second part of this research is to integrate Blockchain technology in computer-aided diagnosis systems to address the problems of managing access control, authentication, provenance, and confidentiality of sensitive medical data. To do so, a hierarchical identity and attribute-based access control mechanism using smart contract and Ethereum Blockchain is proposed to securely process healthcare data without revealing sensitive information to an unauthorized party leveraging the trustworthiness of transactions in a collaborative healthcare environment. The proposed access control mechanism provides a solution to the challenges associated with centralized access control systems and ensures data transparency and traceability for secure data sharing, and data ownership
    • …
    corecore