4 research outputs found

    MDF-Net for Abnormality Detection by Fusing X-Rays with Clinical Data

    Full text link
    This study investigates the effects of including patients' clinical information on the performance of deep learning (DL) classifiers for disease localization in chest X-ray images. Although current classifiers achieve high performance using chest X-ray images alone, our interviews with radiologists indicate that clinical data is highly informative and essential for interpreting images and making proper diagnoses. In this work, we propose a novel architecture consisting of two fusion methods that enable the model to simultaneously process patients' clinical data (structured data) and chest X-rays (image data). Since these data modalities are in different dimensional spaces, we propose a spatial arrangement strategy, spatialization, to facilitate the multimodal learning process in a Mask R-CNN model. We performed an extensive experimental evaluation using MIMIC-Eye, a dataset comprising three modalities: MIMIC-CXR (chest X-ray images), MIMIC IV-ED (patients' clinical data), and REFLACX (annotations of disease locations in chest X-rays). Results show that incorporating patients' clinical data in a DL model together with the proposed fusion methods improves disease localization in chest X-rays by 12% in terms of Average Precision compared to a standard Mask R-CNN using only chest X-rays. Further ablation studies also emphasize the importance of multimodal DL architectures and the incorporation of patients' clinical data in disease localization. The architecture proposed in this work is publicly available to promote the scientific reproducibility of our study (https://github.com/ChihchengHsieh/multimodal-abnormalities-detection).
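
    One plausible reading of the spatialization idea is broadcasting an embedded clinical vector over the spatial grid of the image features before a convolutional fusion step. Below is a minimal PyTorch sketch of that reading; the module and parameter names are illustrative, not taken from the paper, and the actual MDF-Net fuses inside a Mask R-CNN pipeline rather than in a standalone module.

        import torch
        import torch.nn as nn

        class SpatializationFusion(nn.Module):
            """Illustrative fusion: tile a clinical embedding over H x W and
            concatenate it with the image feature map (hypothetical module)."""

            def __init__(self, clinical_dim, img_channels, fused_channels):
                super().__init__()
                self.project = nn.Linear(clinical_dim, img_channels)   # embed tabular data
                self.fuse = nn.Conv2d(2 * img_channels, fused_channels, kernel_size=1)

            def forward(self, img_feats, clinical):
                b, c, h, w = img_feats.shape
                emb = self.project(clinical)                        # (B, C)
                emb_map = emb[:, :, None, None].expand(b, c, h, w)  # spatialize over the grid
                fused = torch.cat([img_feats, emb_map], dim=1)      # (B, 2C, H, W)
                return self.fuse(fused)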

    DiCE4EL: Interpreting Process Predictions using a Milestone-Aware Counterfactual Approach

    No full text
    Predictive process analytics often applies machine learning to predict the future states of a running business process. However, the internal mechanisms of many existing predictive algorithms are opaque, and a human decision-maker is unable to understand why a certain activity was predicted. Recently, counterfactuals have been proposed in the literature to derive human-understandable explanations from predictive models. Current counterfactual approaches consist of finding the minimum feature change that can make a certain prediction flip its outcome. Although many algorithms have been proposed, their application to multi-dimensional sequence data like event logs has not been explored in the literature. In this paper, we explore the use of a recent, popular model-agnostic counterfactual algorithm, DiCE, in the context of predictive process analytics. The analysis reveals that DiCE is unable to derive explanations for process predictions, due to (1) process domain knowledge not being taken into account, (2) long process execution traces that tend to be less understandable, and (3) difficulties in optimising the counterfactual search with categorical variables. We design an extension of DiCE, namely DiCE4EL (DiCE for Event Logs), that can generate counterfactual explanations for process predictions, and propose an approach that supports deriving milestone-aware counterfactual explanations at key intermediate stages along process execution to promote interpretability. We apply our approach to a publicly available real-life event log, and the analysis results demonstrate the effectiveness of the proposed approach.
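
    Stripped of DiCE's diversity and domain constraints, the core objective named in the abstract, the minimum feature change that flips a prediction, can be sketched for a differentiable classifier as a two-term loss. This gradient-based sketch is illustrative only: DiCE itself is model-agnostic, and, as noted above, plain gradient search handles categorical variables poorly.

        import torch
        import torch.nn.functional as F

        def counterfactual_search(model, x, target_class, steps=500, lr=0.05, dist_weight=0.5):
            """Perturb x until its predicted class flips to target_class while
            staying close to the original (sketch, not the DiCE4EL algorithm)."""
            x_cf = x.clone().detach().requires_grad_(True)
            opt = torch.optim.Adam([x_cf], lr=lr)
            for _ in range(steps):
                opt.zero_grad()
                logits = model(x_cf.unsqueeze(0))
                flip_loss = F.cross_entropy(logits, torch.tensor([target_class]))
                proximity = (x_cf - x).abs().sum()   # L1 distance keeps the change minimal
                (flip_loss + dist_weight * proximity).backward()
                opt.step()
            return x_cf.detach()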

    MIMIC-Eye: Integrating MIMIC Datasets with REFLACX and Eye Gaze for Multimodal Deep Learning Applications

    No full text
    Deep learning (DL) technologies have been widely adopted in medical imaging due to their ability to extract features from images and make accurate diagnoses automatically. These models are particularly useful because they can be trained to detect subtle differences in images that are hard for human radiologists to detect. In the real world, radiologists must rely on various types of patient information to assess medical images confidently. However, most DL applications in medical imaging only utilize image data, mainly because the literature on medical datasets combining different data modalities is scarce. In this study, we present MIMIC-Eye, a dataset that integrates several MIMIC-related datasets. It includes a comprehensive range of patient information: medical images and reports (MIMIC-CXR and MIMIC-CXR-JPG), clinical data (MIMIC IV-ED), a detailed account of the patient's hospital journey (MIMIC IV), and eye-tracking data containing gaze information and pupil dilations together with image annotations (REFLACX and EYE GAZE). Integrating eye-tracking data with the various MIMIC modalities may provide a more comprehensive understanding of radiologists' visual search behavior and facilitate the development of more robust, accurate, and reproducible deep learning models for medical imaging diagnosis.
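
    Because the MIMIC-related datasets share patient-level identifiers, the integration described above essentially reduces to table joins. A minimal pandas sketch under hypothetical file names (the real releases use their own layouts, require credentialed PhysioNet access, and also need time alignment between ED stays and imaging studies):

        import pandas as pd

        # Hypothetical file names; real directory layouts and columns differ.
        cxr_meta = pd.read_csv("mimic-cxr-metadata.csv")    # subject_id, study_id, dicom_id
        ed_stays = pd.read_csv("mimic-iv-ed-stays.csv")     # subject_id, stay_id, vitals
        reflacx  = pd.read_csv("reflacx-annotations.csv")   # dicom_id, lesion boxes, gaze

        # Join on the shared identifiers; a real pipeline would additionally
        # match ED stay times to imaging study times to avoid cross products.
        merged = (cxr_meta
                  .merge(ed_stays, on="subject_id", how="inner")
                  .merge(reflacx, on="dicom_id", how="inner"))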

    EyeXNet: Enhancing Abnormality Detection and Diagnosis via Eye-Tracking and X-ray Fusion

    No full text
    Integrating eye gaze data with chest X-ray images in deep learning (DL) has led to contradictory conclusions in the literature. Some authors assert that eye gaze data can enhance prediction accuracy, while others consider eye tracking irrelevant for predictive tasks. We argue that this disagreement lies in how researchers process eye-tracking data: most remain agnostic to the human component and apply the data directly to DL models without proper preprocessing. We present EyeXNet, a multimodal DL architecture that combines images and radiologists' fixation masks to predict abnormality locations in chest X-rays. We focus on fixation maps during reporting moments, as radiologists are more likely to focus on regions with abnormalities and thus provide more targeted regions to the predictive models. Our analysis compares radiologist fixations in silent and reporting moments, revealing that more targeted and focused fixations occur during reporting. Our results show that integrating the fixation masks in a multimodal DL architecture outperformed the baseline model in five out of eight experiments in terms of average Recall and in six out of eight in terms of average Precision. Incorporating fixation masks representing radiologists' classification patterns in a multimodal DL architecture benefits lesion detection in chest X-ray (CXR) images, particularly when there is a strong correlation between fixation masks and generated proposal regions. This highlights the potential of leveraging fixation masks to enhance multimodal DL architectures for CXR image analysis. This work represents a first step towards human-centered DL, moving away from traditional data-driven, human-agnostic approaches.
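
    The fixation masks described above can be built by rasterizing (x, y, duration) fixation points into a dwell-time-weighted heatmap that is stacked with the X-ray as an extra input channel. A minimal NumPy/SciPy sketch; EyeXNet's exact mask construction (smoothing kernel, normalization) may differ.

        import numpy as np
        from scipy.ndimage import gaussian_filter

        def fixations_to_mask(fixations, height, width, sigma=25.0):
            """Rasterize (x, y, duration) fixation points into a smooth heatmap
            (illustrative sketch, not the paper's exact procedure)."""
            mask = np.zeros((height, width), dtype=np.float32)
            for x, y, dur in fixations:
                xi, yi = int(round(x)), int(round(y))
                if 0 <= yi < height and 0 <= xi < width:
                    mask[yi, xi] += dur                 # weight by dwell time
            mask = gaussian_filter(mask, sigma=sigma)   # spread each fixation
            if mask.max() > 0:
                mask /= mask.max()                      # normalize to [0, 1]
            return mask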