Multi-modal Medical Neurological Image Fusion using Wavelet Pooled Edge Preserving Autoencoder
Medical image fusion integrates the complementary diagnostic information of
the source image modalities for improved visualization and analysis of
underlying anomalies. Recently, deep learning-based models have surpassed
conventional fusion methods by performing feature extraction, feature
selection, and feature fusion simultaneously. However, most existing
convolutional neural network (CNN) architectures use conventional pooling or
strided convolution to downsample feature maps. This blurs or discards
important diagnostic information and edge detail present in the source
images and dilutes the efficacy of the feature extraction process.
Therefore, this paper presents an end-to-end unsupervised fusion model
for multimodal medical images based on an edge-preserving dense autoencoder
network. In the proposed model, feature extraction is improved by using wavelet
decomposition-based attention pooling of feature maps (see the sketch after
this abstract). This helps preserve the fine edge details present in the
source images and enhances the visual perception of the fused images.
Further, the proposed model is trained on a variety of medical image pairs,
which helps capture the intensity distributions of the source images and
preserve the diagnostic information effectively. Extensive experiments
demonstrate that the proposed method provides improved visual and
quantitative results compared to other state-of-the-art fusion methods.
Comment: 8 pages, 5 figures, 6 tables
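To make the pooling idea concrete, below is a minimal PyTorch sketch of wavelet decomposition-based attention pooling, assuming a single-level Haar transform and a 1x1 convolution that scores the four subbands; the module name and all internals are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class WaveletAttentionPool(nn.Module):
    """Downsample by a 2D Haar DWT and fuse the four subbands with learned,
    spatially varying attention weights (a sketch, not the paper's code)."""

    def __init__(self, channels):
        super().__init__()
        # One attention logit per subband at every spatial location.
        self.attn = nn.Conv2d(4 * channels, 4, kernel_size=1)

    @staticmethod
    def haar_dwt(x):
        # x: (B, C, H, W) with even H and W; split even/odd rows and columns.
        a = x[:, :, 0::2, 0::2]
        b = x[:, :, 0::2, 1::2]
        c = x[:, :, 1::2, 0::2]
        d = x[:, :, 1::2, 1::2]
        ll = (a + b + c + d) / 2   # low-frequency approximation
        lh = (a - b + c - d) / 2   # detail along columns
        hl = (a + b - c - d) / 2   # detail along rows
        hh = (a - b - c + d) / 2   # diagonal detail
        return ll, lh, hl, hh

    def forward(self, x):
        ll, lh, hl, hh = self.haar_dwt(x)
        w = torch.softmax(self.attn(torch.cat([ll, lh, hl, hh], dim=1)), dim=1)
        bands = torch.stack([ll, lh, hl, hh], dim=1)   # (B, 4, C, H/2, W/2)
        return (w.unsqueeze(2) * bands).sum(dim=1)     # (B, C, H/2, W/2)
```

Unlike max pooling, the high-frequency subbands carry edge information explicitly, so the attention weights can retain it in the downsampled map rather than discard it.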
Deep learning based approaches for imitation learning
Imitation learning refers to an agent's ability to mimic a desired behaviour by learning from observations. The field is rapidly gaining attention due to recent advances in computational and communication capabilities as well as rising demand for intelligent applications. The goal of imitation learning is to describe the desired behaviour through demonstrations rather than instructions, enabling agents to learn complex behaviours with general learning methods that require minimal task-specific information. However, imitation learning faces many challenges. The objective of this thesis is to advance the state of the art in imitation learning by adopting deep learning methods to address two major challenges of learning from demonstrations.

The first challenge is representing the demonstrations in a manner that is adequate for learning. We propose novel Convolutional Neural Network (CNN) based methods to automatically extract feature representations from raw visual demonstrations and learn to replicate the demonstrated behaviour. This alleviates the need for task-specific feature extraction and provides a general learning process that is applicable to multiple problems.

The second challenge is generalizing a policy to situations not seen in the training demonstrations. This is a common problem because demonstrations typically show the best way to perform a task and offer no information about recovering from suboptimal actions. Several methods are investigated to improve the agent's generalization ability based on its initial performance. Our contributions in this area are threefold. First, we propose an active data aggregation method that queries the demonstrator in situations of low confidence (see the sketch after this abstract). Second, we investigate combining learning from demonstrations with reinforcement learning: a deep reward shaping method is proposed that learns a potential reward function from demonstrations. Finally, memory architectures in deep neural networks are investigated to provide context to the agent when taking actions; using recurrent neural networks addresses the dependency between the state-action sequences taken by the agent.

The experiments are conducted in simulated environments on 2D and 3D navigation tasks learned from raw visual data, as well as a 2D soccer simulator, and the proposed methods are compared to state-of-the-art deep reinforcement learning methods. The results show that deep learning architectures can learn suitable representations from raw visual data and effectively map them to atomic actions. The proposed methods for addressing generalization show improvements over using supervised learning or reinforcement learning alone. The results are thoroughly analysed to identify the benefits of each approach and the situations in which it is most suitable.
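As an illustration of the active data aggregation contribution, here is a minimal sketch of a confidence-gated aggregation rollout; the `policy`, `expert`, and Gym-style `env` interfaces and the threshold value are assumptions for the example, not the thesis code.

```python
import torch

def aggregate_demonstrations(policy, expert, env, conf_threshold=0.8):
    """Roll out the current policy; whenever its confidence (max action
    probability) drops below the threshold, query the demonstrator and
    record the correction for the next supervised training round."""
    new_pairs = []
    obs, done = env.reset(), False
    while not done:
        with torch.no_grad():
            probs = torch.softmax(policy(obs), dim=-1)
        confidence, action = probs.max(dim=-1)
        if confidence.item() < conf_threshold:
            action = expert(obs)             # ask the expert when in doubt
            new_pairs.append((obs, action))  # aggregate the labelled state
        obs, reward, done, info = env.step(int(action))
    return new_pairs
```

For the reward shaping contribution, a standard instantiation is potential-based shaping, r'(s, a, s') = r(s, a, s') + gamma * Phi(s') - Phi(s), with the potential Phi here learned from demonstrations; this form is known to leave the optimal policy unchanged (Ng et al., 1999), though the thesis may differ in the exact formulation.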
Understanding and Improving Features Learned in Deep Functional Maps
Deep functional maps have recently emerged as a successful paradigm for
non-rigid 3D shape correspondence tasks. An essential step in this pipeline
consists in learning feature functions that are used as constraints to solve
for a functional map inside the network. However, the precise nature of the
information learned and stored in these functions is not yet well understood.
Specifically, a major question is whether these features can be used for any
other objective, apart from their purely algebraic role in solving for
functional map matrices. In this paper, we show that under some mild
conditions, the features learned within deep functional map approaches can be
used as point-wise descriptors and thus are directly comparable across
different shapes, even without the necessity of solving for a functional map at
test time. Furthermore, informed by our analysis, we propose effective
modifications to the standard deep functional map pipeline, which promote
structural properties of learned features, significantly improving the matching
results. Finally, we demonstrate that previously unsuccessful attempts at using
extrinsic architectures for deep functional map feature extraction can be
remedied via simple architectural changes, which encourage the theoretical
properties suggested by our analysis. We thus bridge the gap between intrinsic
and extrinsic surface-based learning, suggesting the necessary and sufficient
conditions for successful shape matching. Our code is available at
https://github.com/pvnieo/clover.
Comment: 16 pages, 8 figures, 8 tables, to be published in the 2023 IEEE
Conference on Computer Vision and Pattern Recognition (CVPR)
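To make the descriptor claim concrete, below is a minimal NumPy sketch contrasting the usual in-network functional map solve with direct nearest-neighbour matching of the learned features; the array names, the pseudo-inverse projection (a real pipeline projects with the mass matrix), and the omission of regularizers are simplifying assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.spatial import cKDTree

def functional_map_from_features(feat_x, feat_y, evecs_x, evecs_y):
    """Classic pipeline step: express the (n, d) feature functions in each
    shape's (n, k) Laplace-Beltrami eigenbasis, then solve least squares
    for the (k, k) map C minimizing ||C A - B||_F."""
    A = np.linalg.pinv(evecs_x) @ feat_x   # spectral coefficients on X
    B = np.linalg.pinv(evecs_y) @ feat_y   # spectral coefficients on Y
    return B @ np.linalg.pinv(A)           # C maps X's basis to Y's

def match_by_descriptors(feat_x, feat_y):
    """The paper's observation, sketched: if the learned features are
    pointwise comparable across shapes, a nearest-neighbour search in
    feature space already gives a point-to-point correspondence, without
    solving for a functional map at test time."""
    _, idx = cKDTree(feat_y).query(feat_x)  # nearest point on Y for each X
    return idx
```

The least-squares solve follows the common C = B A^+ convention; Laplacian commutativity and other regularizers used in practice are omitted for brevity.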