80 research outputs found
Locality Guided Neural Networks for Explainable Artificial Intelligence
In current deep network architectures, deeper layers tend to contain hundreds of independent neurons, which makes it hard for humans to understand how they interact with each other. By organizing the neurons by correlation, humans can observe how clusters of neighbouring neurons interact with each other. In this paper, we propose a novel back-propagation algorithm, called Locality Guided Neural Network (LGNN), that preserves locality between neighbouring neurons within each layer of a deep network during training. Heavily motivated by the Self-Organizing Map (SOM), the goal is to enforce a local topology on each layer of a deep network such that neighbouring neurons are highly correlated with each other. This method contributes to the domain of Explainable Artificial Intelligence (XAI), which aims to alleviate the black-box nature of current AI methods and make them understandable by humans. Our method aims to achieve XAI in deep learning without changing the structure of current models and without requiring any post-processing. This paper focuses on Convolutional Neural Networks (CNNs), but the method can theoretically be applied to any type of deep learning architecture. In our experiments, we train various VGG and Wide ResNet (WRN) networks for image classification on CIFAR100. In-depth analyses presenting both qualitative and quantitative results demonstrate that our method is capable of enforcing a topology on each layer while achieving a small increase in classification accuracy.
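To make the idea concrete, the following is a minimal sketch (not the authors' implementation) of a locality penalty that encourages neighbouring channels of a layer to produce correlated activations, in the spirit of LGNN's SOM-like topology constraint. The weighting factor `lambda_loc` and the 1D neighbourhood are illustrative assumptions.

```python
# Sketch only: penalize dissimilarity between neighbouring channels on a 1D lattice.
import torch
import torch.nn.functional as F

def locality_penalty(feature_map: torch.Tensor) -> torch.Tensor:
    """feature_map: (batch, channels, H, W) activations of one layer."""
    b, c, h, w = feature_map.shape
    flat = feature_map.reshape(b, c, h * w)                 # per-channel activation vectors
    flat = F.normalize(flat - flat.mean(dim=2, keepdim=True), dim=2)
    # Correlation between each channel and its immediate neighbour on the lattice.
    corr = (flat[:, :-1] * flat[:, 1:]).sum(dim=2)          # (batch, channels-1)
    return (1.0 - corr).mean()                              # 0 when neighbours are perfectly correlated

# Hypothetical use inside a training step:
#   loss = cross_entropy(logits, labels) + lambda_loc * locality_penalty(layer_activations)
```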
Real-Time System for Human Activity Analysis
We propose a real-time human activity analysis system, where a user's activity can be quantitatively evaluated with respect to a ground truth recording. We use two Kinects to solve the problem of self-occlusion by extracting optimal joint positions using Singular Value Decomposition (SVD) and Sequential Quadratic Programming (SQP). Incremental Dynamic Time Warping (IDTW) is used to compare the user against the expert (ground truth) and quantitatively score the user's performance. Furthermore, the user's performance is displayed through a visual feedback system, where colors on the skeleton represent the user's score. Our experiments use a motion capture suit as ground truth to compare our dual-Kinect setup to a single Kinect. We also show that with our visual feedback method, users gain a statistically significant boost to learning as opposed to watching a simple video.
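As an illustration of the scoring step, the sketch below shows classic dynamic time warping between a user and an expert joint sequence. The paper uses an incremental variant (IDTW) so the score can be updated in real time; this batch version only shows the core alignment cost.

```python
# Illustrative batch DTW; the incremental version used in the paper updates this online.
import numpy as np

def dtw_distance(user: np.ndarray, expert: np.ndarray) -> float:
    """user, expert: (frames, features) sequences of skeleton measurements."""
    n, m = len(user), len(expert)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(user[i - 1] - expert[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

# A lower distance means the user's motion tracks the expert more closely.
```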
Intuitive volume exploration through spherical self-organizing map and color harmonization
Finding an appropriate transfer function (TF) for mapping color and opacity values in direct volume rendering (DVR) can be a daunting task. This paper presents a novel approach towards TF generation for DVR, where the traditional low-level color and opacity parameter manipulations are not necessary. The TF generation process is hidden behind a simple and intuitive spherical self-organizing map (SSOM) visualization. The SSOM represents a visual form of the topological relations among the clusters. The user interacts with the SSOM lattice to find interesting regions in the volume. The color and opacity values are generated automatically from the voxel features based on the user's perception. We also use harmonic colors to present a visually pleasing result. Due to the independence of SSOM from feature type, our proposed method is flexible in nature and can be integrated with any set of features. The GPU implementation provides real-time volume rendering and fast interaction. Experimental results on several benchmark volume datasets show the effectiveness of our proposed method.
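For readers unfamiliar with self-organizing maps, here is a rough sketch of a single SOM training step on voxel feature vectors. The paper uses a spherical lattice (SSOM); a flat grid of node positions stands in here purely to show the update rule, and the learning-rate and neighbourhood-width values are arbitrary.

```python
# One SOM codebook update; the spherical lattice geometry of the SSOM is not reproduced here.
import numpy as np

def som_step(weights, grid_xy, x, lr=0.1, sigma=1.5):
    """weights: (nodes, dim) codebook; grid_xy: (nodes, 2) node positions; x: (dim,) sample."""
    bmu = np.argmin(np.linalg.norm(weights - x, axis=1))      # best-matching unit
    dist2 = np.sum((grid_xy - grid_xy[bmu]) ** 2, axis=1)     # lattice distance to BMU
    h = np.exp(-dist2 / (2 * sigma ** 2))                     # neighbourhood function
    weights += lr * h[:, None] * (x - weights)                # pull neighbours toward the sample
    return weights
```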
ImmerVol: An Immersive Volume Visualization System
Volume visualization is a popular technique for analyzing 3D datasets, especially in the medical domain. An immersive visual environment provides easier navigation through the rendered dataset. However, visualization is only one part of the problem. Finding an appropriate Transfer Function (TF) for mapping color and opacity values in Direct Volume Rendering (DVR) is difficult. This paper combines the benefits of the CAVE Automatic Virtual Environment with a novel approach towards TF generation for DVR, where the traditional low-level color and opacity parameter manipulations are eliminated. The TF generation process is hidden behind a Spherical Self-Organizing Map (SSOM). The user interacts with the visual form of the SSOM lattice on a mobile device while viewing the corresponding rendering of the volume dataset in real time in the CAVE. The SSOM lattice is obtained through high-dimensional features extracted from the volume dataset. The color and opacity values of the TF are automatically generated based on the user's perception. Hence, the resulting TF can expose complex structures in the dataset within seconds, which the user can analyze easily and efficiently through complete immersion.
A Novel Image-centric Approach Towards Direct Volume Rendering
Transfer Function (TF) generation is a fundamental problem in Direct Volume Rendering (DVR). A TF maps voxels to color and opacity values to reveal inner structures. Existing TF tools are complex and unintuitive for the users, who are more likely to be medical professionals than computer scientists. In this paper, we propose a novel image-centric method for TF generation where, instead of complex tools, the user directly manipulates volume data to generate DVR. The user's work is further simplified by presenting only the most informative volume slices for selection. Based on the selected parts, the voxels are classified using our novel Sparse Nonparametric Support Vector Machine classifier, which combines both local and near-global distributional information of the training data. The voxel classes are mapped to aesthetically pleasing and distinguishable color and opacity values using harmonic colors. Experimental results on several benchmark datasets and a detailed user survey show the effectiveness of the proposed method.
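The pipeline can be sketched as follows: voxels the user marks on a few informative slices become training labels, and a classifier assigns the remaining voxels to classes that drive color and opacity. The paper's Sparse Nonparametric SVM is not reproduced here; a standard scikit-learn SVM stands in purely for illustration, and the feature dimensions are made up.

```python
# Stand-in classifier for the image-centric TF workflow; not the paper's novel SVM.
import numpy as np
from sklearn.svm import SVC

def classify_voxels(labeled_feats, labels, all_feats):
    """labeled_feats: (n, d) features of user-selected voxels; all_feats: (N, d) all voxels."""
    clf = SVC(kernel="rbf").fit(labeled_feats, labels)
    return clf.predict(all_feats)   # one class per voxel, later mapped to harmonic colors

# Example with random stand-in features:
# classes = classify_voxels(np.random.rand(200, 5), np.random.randint(0, 3, 200),
#                           np.random.rand(10000, 5))
```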
Machine Learning on Biomedical Images: Interactive Learning, Transfer Learning, Class Imbalance, and Beyond
In this paper, we highlight three issues that limit the performance of machine learning on biomedical images, and tackle them through three case studies: 1) Interactive Machine Learning (IML): we show how IML can drastically improve exploration time and the quality of direct volume rendering; 2) transfer learning: we show how transfer learning along with intelligent pre-processing can result in better Alzheimer's diagnosis using a much smaller training set; 3) data imbalance: we show how our novel focal Tversky loss function can provide better segmentation results by taking into account the imbalanced nature of segmentation datasets. The case studies are accompanied by an in-depth analytical discussion of results with possible future directions.
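For the third case study, a minimal sketch of a focal Tversky loss for binary segmentation is shown below, assuming the common formulation: the Tversky index TI = TP / (TP + α·FN + β·FP), with the loss (1 − TI) raised to a focusing power. The exact exponent convention and parameter values used in the paper may differ; those below are illustrative defaults.

```python
# Minimal focal Tversky loss sketch for class-imbalanced binary segmentation.
import torch

def focal_tversky_loss(pred, target, alpha=0.7, beta=0.3, gamma=0.75, eps=1e-7):
    """pred: probabilities in [0, 1]; target: binary mask; both shaped (batch, ...)."""
    pred = pred.reshape(pred.size(0), -1)
    target = target.reshape(target.size(0), -1)
    tp = (pred * target).sum(dim=1)
    fn = ((1 - pred) * target).sum(dim=1)
    fp = (pred * (1 - target)).sum(dim=1)
    tversky = (tp + eps) / (tp + alpha * fn + beta * fp + eps)
    return torch.pow(1.0 - tversky, gamma).mean()   # exponent < 1 focuses training on hard examples
```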
ECG Heartbeat Classification Using Multimodal Fusion
Electrocardiogram (ECG) is an authoritative source to diagnose and counter critical cardiovascular syndromes such as arrhythmia and myocardial infarction (MI). Current machine learning techniques either depend on manually extracted features or on large and complex deep learning networks that merely utilize the 1D ECG signal directly. Since intelligent multimodal fusion can perform at the state-of-the-art level with an efficient deep network, in this paper we propose two computationally efficient multimodal fusion frameworks for ECG heartbeat classification, called Multimodal Image Fusion (MIF) and Multimodal Feature Fusion (MFF). At the input of these frameworks, we convert the raw ECG data into three different images using the Gramian Angular Field (GAF), Recurrence Plot (RP) and Markov Transition Field (MTF). In MIF, we first perform image fusion by combining the three imaging modalities to create a single image modality, which serves as input to the Convolutional Neural Network (CNN). In MFF, we extract features from the penultimate layer of the CNNs and fuse them to obtain the unique and interdependent information necessary for better classifier performance. These fused features are finally used to train a Support Vector Machine (SVM) classifier for ECG heartbeat classification. We demonstrate the superiority of the proposed fusion models by performing experiments on PhysioNet's MIT-BIH dataset for five distinct conditions of arrhythmias, consistent with the AAMI EC57 protocols, and on the PTB diagnostics dataset for myocardial infarction (MI) classification. We achieve classification accuracies of 99.7% and 99.2% on arrhythmia and MI classification, respectively. Source code is available at https://github.com/zaamad/ECG-Heartbeat-Classification-Using-Multimodal-Fusion.
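As a sketch of one of the three imaging steps, the snippet below converts a 1D heartbeat into a Gramian Angular (summation) Field image; RP and MTF images would be built similarly and then fused before the CNN. The exact preprocessing (segmentation, resampling) in the paper may differ.

```python
# Gramian Angular Summation Field for one heartbeat; illustrative preprocessing only.
import numpy as np

def gramian_angular_field(signal: np.ndarray) -> np.ndarray:
    """signal: (T,) one segmented heartbeat; returns a (T, T) GASF image."""
    x = 2 * (signal - signal.min()) / (signal.max() - signal.min() + 1e-12) - 1  # rescale to [-1, 1]
    phi = np.arccos(np.clip(x, -1.0, 1.0))                                       # angular encoding
    return np.cos(phi[:, None] + phi[None, :])                                   # pairwise angle sums

# image = gramian_angular_field(beat)   # beat: a segmented, resampled heartbeat
```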
Interpretable Artificial Intelligence through Locality Guided Neural Networks
In current deep learning architectures, each of the deeper layers in a network tends to contain hundreds of unorganized neurons, which makes it hard for humans to understand how they interact with each other. By organizing the neurons using correlation as the criterion, humans can observe how clusters of neighbouring neurons interact with each other. Research in Explainable Artificial Intelligence (XAI) aims to alleviate the black-box nature of current AI methods and make them understandable by humans. In this paper, we extend our previous algorithm for XAI in deep learning, called Locality Guided Neural Network (LGNN). LGNN preserves locality between neighbouring neurons within each layer of a deep network during training. Motivated by Self-Organizing Maps (SOMs), the goal is to enforce a local topology on each layer of a deep network such that neighbouring neurons are highly correlated with each other. Our algorithm can easily be plugged into current state-of-the-art Convolutional Neural Network (CNN) models to make the neighbouring neurons more correlated. A cluster of neighbouring neurons activating for a class makes the network both quantitatively and qualitatively more interpretable when visualized, as we show through our experiments. This paper focuses on image processing with CNNs, but the method can theoretically be applied to any type of deep learning architecture. In our experiments, we train VGG and WRN networks for image classification on CIFAR100 and Imagenette. Our experiments analyse different perceptible clusters of activations in response to different input classes.
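One simple way to check whether a trained layer exhibits this kind of topology is to compare the activation correlation of adjacent channels against random channel pairs. This is an analysis sketch under that assumption, not the paper's exact evaluation protocol.

```python
# Compare neighbour-pair vs random-pair channel correlations for one layer.
import numpy as np

def neighbour_vs_random_correlation(acts: np.ndarray, seed: int = 0):
    """acts: (samples, channels) mean activations of one layer over a dataset."""
    rng = np.random.default_rng(seed)
    corr = np.corrcoef(acts.T)                                   # (channels, channels)
    c = corr.shape[0]
    neighbour = np.mean([corr[i, i + 1] for i in range(c - 1)])  # adjacent channels on the lattice
    idx = rng.choice(c, size=(200, 2))
    rand = np.mean([corr[i, j] for i, j in idx if i != j])       # random channel pairs
    return neighbour, rand   # a clear gap (neighbour >> rand) suggests a local topology
```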
Covariance-guided One-Class Support Vector Machine
In one-class classification, the low variance directions in the training data carry crucial information to build a good model of the target class. Boundary-based methods like the One-Class Support Vector Machine (OSVM) preferentially separate the data from outliers along the large variance directions. On the other hand, retaining only the low variance directions can sacrifice some initial properties of the original data and is not desirable, especially in the case of limited training samples. This paper introduces a Covariance-guided One-Class Support Vector Machine (COSVM) classification method which emphasizes the low variance projection directions of the training data without compromising any important characteristics. COSVM improves upon the OSVM method by controlling the direction of the separating hyperplane through incorporation of the estimated covariance matrix from the training data. Our proposed method is a convex optimization problem resulting in one global optimum solution, which can be solved efficiently with the help of existing numerical methods. The method also keeps the principal structure of the OSVM method intact, and can be implemented easily with existing OSVM libraries. Comparative experimental results with contemporary one-class classifiers on numerous artificial and benchmark datasets demonstrate that our method results in significantly better classification performance.
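The sketch below is not the COSVM optimization itself, which alters the OSVM objective directly; it only illustrates the underlying intuition by whitening the data with the training covariance before applying a standard one-class SVM, so that low-variance directions carry more weight. The regularization constant and ν value are arbitrary.

```python
# Rough illustration of covariance-aware one-class modeling, not the COSVM formulation.
import numpy as np
from sklearn.svm import OneClassSVM

def whitened_ocsvm(train: np.ndarray, nu: float = 0.1):
    """train: (n, d) target-class samples only."""
    mean = train.mean(axis=0)
    cov = np.cov(train - mean, rowvar=False) + 1e-6 * np.eye(train.shape[1])  # regularized covariance
    w = np.linalg.cholesky(np.linalg.inv(cov))                                # whitening transform
    model = OneClassSVM(kernel="rbf", nu=nu).fit((train - mean) @ w)
    return model, mean, w

# score = model.decision_function((x_test - mean) @ w)   # higher = more target-like
```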
- …