
    Visualizing Convolutional Networks for MRI-based Diagnosis of Alzheimer's Disease

    Visualizing and interpreting convolutional neural networks (CNNs) is an important task for increasing trust in automated medical decision-making systems. In this study, we train a 3D CNN to detect Alzheimer's disease from structural MRI scans of the brain. We then apply four different gradient-based and occlusion-based visualization methods that explain the network's classification decisions by highlighting relevant areas in the input image, and compare the methods qualitatively and quantitatively. We find that all four methods focus on brain regions known to be involved in Alzheimer's disease, such as the inferior and middle temporal gyri. While the occlusion-based methods focus more on specific regions, the gradient-based methods pick up distributed relevance patterns. Additionally, the distribution of relevance varies across patients: some show a stronger focus on the temporal lobe, whereas for others more cortical areas are relevant. In summary, we show that applying different visualization methods is important for understanding the decisions of a CNN, a step that is crucial for increasing clinical impact and trust in computer-based decision support systems.
    Comment: MLCN 201
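
    The occlusion-based approach in particular is straightforward to reproduce: slide an occluding patch over the volume, re-run the classifier, and record how much the target-class probability drops at each position. Below is a minimal PyTorch sketch of this idea; `model`, `volume`, the patch size, and the target class index are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of occlusion-based relevance mapping for a 3D CNN.
# Assumes `model` outputs class logits for a (1, 1, D, H, W) MRI volume.
import torch

def occlusion_map(model, volume, patch=8, stride=8, target=1):
    """Zero out a sliding cube and record the drop in target-class probability."""
    model.eval()
    with torch.no_grad():
        base = torch.softmax(model(volume), dim=1)[0, target].item()
        _, _, D, H, W = volume.shape
        relevance = torch.zeros(D, H, W)
        for z in range(0, D - patch + 1, stride):
            for y in range(0, H - patch + 1, stride):
                for x in range(0, W - patch + 1, stride):
                    occluded = volume.clone()
                    occluded[:, :, z:z+patch, y:y+patch, x:x+patch] = 0
                    p = torch.softmax(model(occluded), dim=1)[0, target].item()
                    # A larger probability drop marks the region as more relevant.
                    relevance[z:z+patch, y:y+patch, x:x+patch] += base - p
    return relevance
```

    The resulting relevance volume can be overlaid on the MRI scan; gradient-based methods compute a comparable map in a single backward pass rather than many forward passes.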

    M2Net: Multi-modal Multi-channel Network for Overall Survival Time Prediction of Brain Tumor Patients

    Early and accurate prediction of overall survival (OS) time can help to obtain better treatment planning for brain tumor patients. Although many OS time prediction methods have been developed and obtain promising results, several issues remain. First, conventional methods rely on radiomic features extracted from the local lesion area of a magnetic resonance (MR) volume, which may not represent the full image or capture complex tumor patterns. Second, different types of MR scans (i.e., multi-modal data) are sensitive to different brain regions, which makes it challenging to effectively exploit the complementary information across modalities while preserving the modality-specific properties. Third, existing methods focus on the prediction model itself, ignoring the complex relationship between the data and the labels. To address these issues, we propose an end-to-end OS time prediction model, the Multi-modal Multi-channel Network (M2Net). Specifically, we first project the 3D MR volume onto 2D images in different directions, which reduces computational cost while preserving important information and enabling pre-trained models to be transferred from other tasks. Then, we use a modality-specific network to extract implicit, high-level features from each type of MR scan. A multi-modal shared network fuses these features using a bilinear pooling model, exploiting their correlations to provide complementary information. Finally, we integrate the outputs of each modality-specific network and the multi-modal shared network to generate the final prediction. Experimental results demonstrate the superiority of our M2Net model over other methods.
    Comment: Accepted by MICCAI'2
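
    Two of the building blocks described above are easy to sketch: the multi-directional 2D projection of the 3D MR volume and the bilinear pooling used to fuse modality-specific features. The sketch below uses mean-intensity projections and a signed-square-root/L2 normalization common in the bilinear pooling literature; the exact choices in M2Net may differ, and all names are illustrative.

```python
# Hedged sketch of multi-view projection and bilinear feature fusion.
import torch
import torch.nn.functional as F

def project_views(volume):
    """Project a (batch, channel, D, H, W) volume to three 2D views,
    one per axis, via mean-intensity projection."""
    return [volume.mean(dim=axis) for axis in (2, 3, 4)]

def bilinear_fuse(feat_a, feat_b):
    """Fuse two modality-specific feature vectors via their outer product,
    capturing pairwise cross-modal feature correlations."""
    outer = torch.einsum('bi,bj->bij', feat_a, feat_b)           # (batch, Da, Db)
    fused = outer.flatten(start_dim=1)                           # (batch, Da * Db)
    fused = torch.sign(fused) * torch.sqrt(fused.abs() + 1e-8)   # signed sqrt
    return F.normalize(fused, dim=1)                             # L2 normalization

# Example: fuse 64-dim features from two modality-specific networks.
fused = bilinear_fuse(torch.randn(4, 64), torch.randn(4, 64))    # shape (4, 4096)
```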

    Autonomous Navigation for Mobile Robots: Machine Learning-based Techniques for Obstacle Avoidance

    Department of System Design and Control Engineering
    Autonomous navigation of unmanned aerial vehicles (UAVs) poses several challenges due to limits on the number and size of sensors that can be attached to the vehicle. Although sensors such as LiDARs, which directly measure distances to the surrounding environment, have proven effective for obstacle avoidance, their weight and cost restrict their use on UAVs as recent trends push toward ever smaller platforms. One practical option is a monocular vision sensor, which is lightweight and relatively inexpensive; its main drawback is that it is difficult to derive reliable rules from the sensor data.

    Conventional visual navigation methods make use of features within the image or estimate depth using techniques such as optical flow. These features and methods, however, still rely on human-designed rules, so robustness can become an issue. More recent approaches to vision-based obstacle avoidance exploit deep learning, which has shown state-of-the-art performance in fields such as image processing and speech recognition. Because such models automatically select the features that matter for classification or prediction, they avoid the hand-crafted rules of conventional methods and can achieve superior performance.

    In this thesis, we propose an imitation learning framework based on deep learning for UAV obstacle avoidance, in which neural networks are trained on flight data obtained from human experts, extracting the features and rules needed to carry out designated tasks. The system consists of three parts: a data acquisition and preprocessing phase, a model training phase, and a model application phase. A convolutional neural network (CNN), a 3D-CNN, and a deep neural network (DNN) are each applied within the framework and evaluated with respect to collision ratios to validate obstacle avoidance performance.
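
    To make the imitation learning (behavior cloning) step concrete, the sketch below shows a small CNN that maps a monocular camera frame to a discrete steering command and is trained on (frame, expert command) pairs. The architecture, label scheme, and image size are illustrative assumptions, not the networks evaluated in the thesis.

```python
# Hedged sketch of imitation learning (behavior cloning) for obstacle avoidance.
import torch
import torch.nn as nn

class SteeringCNN(nn.Module):
    """Map a camera frame to one of a few steering commands (e.g., left/straight/right)."""
    def __init__(self, n_commands=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, n_commands)

    def forward(self, frame):                      # frame: (batch, 3, H, W)
        return self.classifier(self.features(frame).flatten(1))

# One supervised step on expert demonstrations (dummy data stands in for flight logs).
model = SteeringCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
frames = torch.randn(8, 3, 120, 160)               # batch of camera frames
commands = torch.randint(0, 3, (8,))               # expert steering labels
loss = nn.CrossEntropyLoss()(model(frames), commands)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

    At deployment, the trained network runs on live camera frames and its predicted command drives the avoidance maneuver; the collision ratio over test flights then serves as the evaluation metric.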