100 research outputs found

    Building Trust in Artificial Intelligence: Findings from Healthcare Organization

    Get PDF
    Artificial intelligence (AI) is commonly applied to the diagnostic process, to developing treatment protocols, and to personalized medicine and patient care. Physicians use second-order cognitive processes to control their reasoning while evaluating AI advice. Inadequate diagnostic decisions often result from deficiencies in the use of metacognition related both to the decision-maker's own reasoning (self-monitoring) and to the AI-based system (system monitoring). Physicians then fall back on decisions based on beliefs rather than on real data, or seek out superficial and inappropriate information. Inappropriate diagnostic decisions are, therefore, linked to a lack of trust in AI. This article aims to understand how trust in AI is built among hospital practitioners. A 20-month ethnographic study was conducted in a medical research center where hospital practitioners apply AI daily in their medical processes. This work demonstrates that trust in AI is built through both cognition and emotion, and that factors such as peer validation and social imagination play an important role in creating that trust.

    Efficient Scopeformer: Towards Scalable and Rich Feature Extraction for Intracranial Hemorrhage Detection

    Full text link
    The quality and richness of feature maps extracted by convolutional neural networks (CNNs) and vision Transformers (ViTs) directly relate to robust model performance. In medical computer vision, these information-rich features are crucial for detecting rare cases within large datasets. This work presents the "Scopeformer," a novel multi-CNN-ViT model for intracranial hemorrhage classification in computed tomography (CT) images. The Scopeformer architecture is scalable and modular, which allows utilizing various CNN architectures as the backbone with diversified output features and pre-training strategies. We propose effective feature projection methods to reduce redundancies among CNN-generated features and to control the input size of ViTs. Extensive experiments with various Scopeformer models show that the model performance is proportional to the number of convolutional blocks employed in the feature extractor. Using multiple strategies, including diversifying the pre-training paradigms for CNNs, different pre-training datasets, and style transfer techniques, we demonstrate an overall improvement in model performance at various computational budgets. We then propose smaller, compute-efficient Scopeformer versions with three different types of input and output ViT configurations. Efficient Scopeformers use four different pre-trained CNN architectures as feature extractors to increase feature richness. Our best Efficient Scopeformer model achieved an accuracy of 96.94% and a weighted logarithmic loss of 0.083 with an eight-fold reduction in the number of trainable parameters compared to the base Scopeformer. Another version of the Efficient Scopeformer model further reduced the parameter space by almost 17 times with negligible performance reduction. Hybrid CNNs and ViTs might provide the desired feature richness for developing accurate medical computer vision models.
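
    The sketch below illustrates the general multi-CNN-to-ViT pattern the abstract describes: several pretrained CNN backbones extract features, 1x1 projections reduce redundancy and control the token width, and a Transformer encoder consumes the concatenated tokens. The backbone choices, projection width, and tiny two-layer encoder are illustrative assumptions, not the paper's exact Scopeformer configuration.

```python
# Minimal multi-CNN + projection + Transformer-encoder sketch (assumed configuration).
import torch
import torch.nn as nn
import torchvision.models as models


class MultiCNNViT(nn.Module):
    def __init__(self, num_classes=2, proj_dim=256):
        super().__init__()
        # Two pretrained CNN backbones with diverse features (the paper uses up to four).
        r18 = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        r34 = models.resnet34(weights=models.ResNet34_Weights.DEFAULT)
        self.backbones = nn.ModuleList([
            nn.Sequential(*list(r18.children())[:-2]),  # -> (B, 512, H/32, W/32)
            nn.Sequential(*list(r34.children())[:-2]),  # -> (B, 512, H/32, W/32)
        ])
        # 1x1 convolutions project CNN features to a common width, reducing
        # redundancy and controlling the token dimension fed to the Transformer.
        self.projections = nn.ModuleList([nn.Conv2d(512, proj_dim, 1) for _ in self.backbones])
        encoder_layer = nn.TransformerEncoderLayer(d_model=proj_dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
        self.cls_head = nn.Linear(proj_dim, num_classes)

    def forward(self, x):
        tokens = []
        for backbone, proj in zip(self.backbones, self.projections):
            f = proj(backbone(x))                        # (B, proj_dim, h, w)
            tokens.append(f.flatten(2).transpose(1, 2))  # (B, h*w, proj_dim)
        z = self.encoder(torch.cat(tokens, dim=1))       # concatenate token sequences
        return self.cls_head(z.mean(dim=1))              # pooled classification logits


logits = MultiCNNViT()(torch.randn(1, 3, 224, 224))
```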

    The Importance of Robust Features in Mitigating Catastrophic Forgetting

    Full text link
    Continual learning (CL) is an approach to addressing catastrophic forgetting, which refers to neural networks forgetting previously learned knowledge when trained on new tasks or data distributions. Work on adversarial robustness has decomposed features into robust and non-robust types and demonstrated that models trained on robust features exhibit significantly enhanced adversarial robustness. However, no study has examined, from the perspective of CL models, the efficacy of robust features in mitigating catastrophic forgetting. In this paper, we introduce the CL robust dataset and train four baseline models on both the standard and CL robust datasets. Our results demonstrate that CL models trained on the CL robust dataset experienced less catastrophic forgetting of previously learned tasks than when trained on the standard dataset. Our observations highlight the significance of the features provided to the underlying CL models, showing that CL robust features can alleviate catastrophic forgetting.
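
    A hedged sketch of how a "robust" training example can be constructed in the robust/non-robust feature-decomposition style the abstract builds on: an input is optimized so that a robust model's penultimate-layer features match those of the original image, and it inherits the original label. The `robust_features` callable (an adversarially trained network's feature extractor), step count, and learning rate are assumptions for illustration, not the paper's exact recipe.

```python
# Feature-matching construction of a robust training example (assumed procedure).
import torch


def robustify(x, robust_features, steps=100, lr=0.1):
    """Optimize a new input whose robust features match those of x."""
    target = robust_features(x).detach()
    x_r = torch.rand_like(x, requires_grad=True)   # start from noise
    opt = torch.optim.Adam([x_r], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(robust_features(x_r), target)
        loss.backward()
        opt.step()
        x_r.data.clamp_(0.0, 1.0)                  # keep a valid image
    return x_r.detach()                            # labeled with x's original label
```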

    Exploring Robustness of Neural Networks through Graph Measures

    Get PDF
    Motivated by graph theory, artificial neural networks (ANNs) are traditionally structured as layers of neurons (nodes), which learn useful information through the passage of data over interconnections (edges). In the machine learning realm, graph structures (i.e., neurons and connections) of ANNs have recently been explored using various graph-theoretic measures linked to their predictive performance. In network science (NetSci), on the other hand, certain graph measures, including entropy and curvature, are known to provide insight into the robustness and fragility of real-world networks. In this work, we use these graph measures to explore the robustness of various ANNs to adversarial attacks. To this end, we (1) explore the design space of inter-layer and intra-layer connectivity regimes of ANNs in the graph domain and record their predictive performance after training under different types of adversarial attacks, (2) use graph representations of both inter-layer and intra-layer connectivity regimes to calculate various graph-theoretic measures, including curvature and entropy, and (3) analyze the relationship between these graph measures and the adversarial performance of ANNs. We show that curvature and entropy, while operating purely in the graph domain, can quantify the robustness of ANNs without having to train them. Our results suggest that real-world networks, including brain, financial, and social networks, may provide important clues for the neural architecture search for robust ANNs. We propose a search strategy that efficiently finds robust ANNs among a set of well-performing ANNs without needing to train all of them.
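
    For context, a minimal sketch of the two kinds of graph measures discussed above, computed on a stand-in connectivity graph with networkx. The random small-world graph stands in for an ANN's inter-layer connectivity regime, degree-distribution entropy is one common entropy notion, and the simple edge form of Forman-Ricci curvature is used; these are illustrative choices, not necessarily the exact definitions used in the paper.

```python
# Entropy and curvature of a graph as robustness proxies (illustrative measures).
import math
import networkx as nx


def degree_entropy(G):
    """Shannon entropy of the degree distribution."""
    degrees = [d for _, d in G.degree()]
    n = len(degrees)
    probs = [degrees.count(k) / n for k in set(degrees)]
    return -sum(p * math.log(p) for p in probs)


def mean_forman_curvature(G):
    """Average Forman-Ricci edge curvature: F(u, v) = 4 - deg(u) - deg(v)."""
    curvatures = [4 - G.degree(u) - G.degree(v) for u, v in G.edges()]
    return sum(curvatures) / len(curvatures)


G = nx.watts_strogatz_graph(n=64, k=6, p=0.3)   # stand-in ANN connectivity graph
print(degree_entropy(G), mean_forman_curvature(G))
```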

    Approximate kernel reconstruction for time-varying networks

    Get PDF
    Most existing algorithms for modeling and analyzing molecular networks assume a static or time-invariant network topology. Such a view, however, does not capture the temporal evolution of the underlying biological process, as molecular networks are typically “re-wired” over time in response to cellular development and environmental changes. In our previous work, we formulated the inference of time-varying or dynamic networks as a tracking problem, where the target state is the ensemble of edges in the network, and used the Kalman filter to track the network topology over time. Unfortunately, the output of the Kalman filter does not reflect known properties of molecular networks, such as sparsity.
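
    A minimal sketch of the tracking formulation described above: the state is the vector of edge weights and a standard Kalman filter updates it as noisy linear observations arrive. The random-walk dynamics, the observation matrix H, and the noise covariances are illustrative assumptions; as the abstract notes, nothing in this plain filter enforces sparsity of the recovered topology.

```python
# One Kalman filter predict/update cycle over an edge-weight state vector (assumed model).
import numpy as np


def kalman_step(x, P, y, H, Q, R):
    """Update edge-weight state x (covariance P) with observation y = H x + noise."""
    x_pred, P_pred = x, P + Q                       # predict: random-walk dynamics (F = I)
    S = H @ P_pred @ H.T + R                        # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)             # Kalman gain
    x_new = x_pred + K @ (y - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new


n_edges, n_obs = 10, 4
x, P = np.zeros(n_edges), np.eye(n_edges)
H = np.random.randn(n_obs, n_edges)
Q, R = 0.01 * np.eye(n_edges), 0.1 * np.eye(n_obs)
y = H @ np.random.randn(n_edges) + 0.1 * np.random.randn(n_obs)
x, P = kalman_step(x, P, y, H, Q, R)
```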

    Out-of-distribution Object Detection through Bayesian Uncertainty Estimation

    Full text link
    The superior performance of object detectors is often established under the assumption that test samples are drawn from the same distribution as the training data. However, in many practical applications, out-of-distribution (OOD) instances are inevitable and usually lead to uncertainty in the results. In this paper, we propose a novel, intuitive, and scalable probabilistic object detection method for OOD detection. Unlike other uncertainty-modeling methods that either require huge computational costs to infer the weight distributions or rely on model training with synthetic outlier data, our method distinguishes between in-distribution (ID) and OOD data by sampling weight parameters from Gaussian distributions built around pre-trained networks. We demonstrate that our Bayesian object detector achieves satisfactory OOD identification performance, reducing the FPR95 score by up to 8.19% and increasing the AUROC score by up to 13.94% when trained on the BDD100k and VOC datasets as ID data and evaluated on the COCO2017 dataset as OOD data.
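
    A hedged sketch of the weight-sampling idea: draw several weight samples from Gaussians centered on the pre-trained parameters, run multiple forward passes, and treat the disagreement between passes as an OOD score. The noise scale, the variance-based score, and the use of a plain classifier in place of a detector are illustrative assumptions, not the paper's exact formulation.

```python
# Monte Carlo weight sampling around pre-trained parameters as an OOD score (assumed setup).
import copy
import torch
import torchvision.models as models


@torch.no_grad()
def ood_score(model, x, num_samples=8, sigma=0.01):
    probs = []
    for _ in range(num_samples):
        noisy = copy.deepcopy(model)
        for p in noisy.parameters():
            p.add_(sigma * torch.randn_like(p))      # sample weights ~ N(w, sigma^2)
        probs.append(torch.softmax(noisy(x), dim=-1))
    probs = torch.stack(probs)                        # (num_samples, B, C)
    return probs.var(dim=0).mean(dim=-1)              # higher variance -> more OOD-like


model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
score = ood_score(model, torch.randn(2, 3, 224, 224))
```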

    Inception Modules Enhance Brain Tumor Segmentation.

    Get PDF
    Magnetic resonance images of brain tumors are routinely used in neuro-oncology clinics for diagnosis, treatment planning, and post-treatment tumor surveillance. Currently, physicians spend considerable time manually delineating different structures of the brain. Spatial and structural variations, as well as intensity inhomogeneity across images, make the problem of computer-assisted segmentation very challenging. We propose a new image segmentation framework for tumor delineation that benefits from two state-of-the-art machine learning architectures in computer vision, i.e., Inception modules and the U-Net image segmentation architecture. Furthermore, our framework includes two learning regimes, i.e., learning to segment intra-tumoral structures (necrotic and non-enhancing tumor core, peritumoral edema, and enhancing tumor) or learning to segment glioma sub-regions (whole tumor, tumor core, and enhancing tumor). These learning regimes are incorporated into a newly proposed loss function based on the Dice similarity coefficient (DSC). In our experiments, we quantified the impact of introducing Inception modules into the U-Net architecture, as well as changing the objective function of the learning algorithm from segmenting the intra-tumoral structures to segmenting the glioma sub-regions. We found that incorporating Inception modules significantly improved the segmentation performance (p < 0.001) for all glioma sub-regions. Moreover, in architectures with Inception modules, the models trained with the learning objective of segmenting the intra-tumoral structures outperformed the models trained with the objective of segmenting the glioma sub-regions for the whole tumor (p < 0.001). The improved performance is linked to the multiscale features extracted by the newly introduced Inception modules and the modified loss function based on the DSC.
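
    A minimal sketch of a soft Dice loss of the kind the proposed objective builds on: DSC = 2|X ∩ Y| / (|X| + |Y|), computed per channel and averaged. The per-channel layout (one channel per tumor structure or sub-region) is an illustrative assumption rather than the paper's exact loss.

```python
# Soft Dice loss over per-region channels (generic form, not the paper's exact objective).
import torch


def soft_dice_loss(pred, target, eps=1e-6):
    """pred, target: (B, C, H, W) with per-region probabilities / one-hot masks."""
    intersection = (pred * target).sum(dim=(2, 3))
    denom = pred.sum(dim=(2, 3)) + target.sum(dim=(2, 3))
    dice = (2 * intersection + eps) / (denom + eps)
    return 1.0 - dice.mean()                         # minimize 1 - mean DSC


pred = torch.softmax(torch.randn(2, 3, 64, 64), dim=1)
target = torch.nn.functional.one_hot(torch.randint(0, 3, (2, 64, 64)), 3).permute(0, 3, 1, 2).float()
loss = soft_dice_loss(pred, target)
```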

    Kernel Reconstruction: an Exact Greedy Algorithm for Compressive Sensing

    Get PDF
    Compressive sensing is the theory of sparse signal recovery from undersampled measurements or observations. Exact signal reconstruction is an NP-hard problem. A convex approximation using the ℓ1-norm has received a great deal of theoretical attention. Exact recovery using the ℓ1 approximation is only possible under strict conditions on the measurement matrix, which are difficult to check. Many greedy algorithms have therefore been proposed. However, none of them is guaranteed to lead to the optimal (sparsest) solution. In this paper, we present a new greedy algorithm that provides an exact sparse solution of the problem. Unlike other greedy approaches, which only approximate the exact sparse solution, the proposed greedy approach, called Kernel Reconstruction, leads to the exact optimal solution in fewer operations than the original combinatorial problem. An application to the recovery of sparse gene regulatory networks is presented.
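
    For context, a minimal sketch of a standard greedy sparse-recovery loop (orthogonal matching pursuit). It illustrates the class of approximate greedy methods the abstract contrasts with; it is not the Kernel Reconstruction algorithm itself, which, unlike OMP, is claimed to return the exact sparsest solution.

```python
# Orthogonal matching pursuit: greedy column selection for y ≈ A x with sparse x.
import numpy as np


def omp(A, y, sparsity):
    """Recover a `sparsity`-sparse x from y ≈ A x by greedily selecting columns of A."""
    residual, support = y.copy(), []
    x = np.zeros(A.shape[1])
    for _ in range(sparsity):
        j = int(np.argmax(np.abs(A.T @ residual)))   # column most correlated with residual
        support.append(j)
        x_s, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ x_s           # re-fit on the chosen support
    x[support] = x_s
    return x


A = np.random.randn(30, 100)
x_true = np.zeros(100); x_true[[5, 42, 77]] = [1.0, -2.0, 0.5]
x_hat = omp(A, A @ x_true, sparsity=3)
```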