542 research outputs found
Towards Visually Explaining Variational Autoencoders
Recent advances in Convolutional Neural Network (CNN) model interpretability
have led to impressive progress in visualizing and understanding model
predictions. In particular, gradient-based visual attention methods have driven
much recent effort in using visual attention maps as a means for visual
explanations. A key problem, however, is these methods are designed for
classification and categorization tasks, and their extension to explaining
generative models, e.g. variational autoencoders (VAE) is not trivial. In this
work, we take a step towards bridging this crucial gap, proposing the first
technique to visually explain VAEs by means of gradient-based attention. We
present methods to generate visual attention from the learned latent space, and
also demonstrate such attention explanations serve more than just explaining
VAE predictions. We show how these attention maps can be used to localize
anomalies in images, demonstrating state-of-the-art performance on the MVTec-AD
dataset. We also show how they can be infused into model training, helping
bootstrap the VAE into learning improved latent space disentanglement,
demonstrated on the Dsprites dataset
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, of advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as it relates to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.Comment: 64 pages, 411 references. To appear in Journal of Applied Remote
Sensin
A survey on generative adversarial networks for imbalance problems in computer vision tasks
Any computer vision application development starts off by acquiring images and data, then preprocessing and pattern recognition steps to perform a task. When the acquired images are highly imbalanced and not adequate, the desired task may not be achievable. Unfortunately, the occurrence of imbalance problems in acquired image datasets in certain complex real-world problems such as anomaly detection, emotion recognition, medical image analysis, fraud detection, metallic surface defect detection, disaster prediction, etc., are inevitable. The performance of computer vision algorithms can significantly deteriorate when the training dataset is imbalanced. In recent years, Generative Adversarial Neural Networks (GANs) have gained immense attention by researchers across a variety of application domains due to their capability to model complex real-world image data. It is particularly important that GANs can not only be used to generate synthetic images, but also its fascinating adversarial learning idea showed good potential in restoring balance in imbalanced datasets. In this paper, we examine the most recent developments of GANs based techniques for addressing imbalance problems in image data. The real-world challenges and implementations of synthetic image generation based on GANs are extensively covered in this survey. Our survey first introduces various imbalance problems in computer vision tasks and its existing solutions, and then examines key concepts such as deep generative image models and GANs. After that, we propose a taxonomy to summarize GANs based techniques for addressing imbalance problems in computer vision tasks into three major categories: 1. Image level imbalances in classification, 2. object level imbalances in object detection and 3. pixel level imbalances in segmentation tasks. We elaborate the imbalance problems of each group, and provide GANs based solutions in each group. Readers will understand how GANs based techniques can handle the problem of imbalances and boost performance of the computer vision algorithms
Efficient Deep Learning Models with Autoencoder Regularization and Information Bottleneck Compression
Improving efficiency in deep learning models implies achieving a more accurate model for a given computational budget, or conversely a faster, leaner model without losing accuracy. In order to improve efficiency, we can use regularization to to improve generalization to the real world, and compression to improve speed. Due to the information-restricting nature of regularization, these two methods are related. Firstly we present a novel autoencoder architecture as a method of regularization for Pedestrian Detection. Secondly, we present a hyperparameter-free, iterative compression method based on measuring the information content of the model with the Information Bottleneck principle.Thesis (MPhil) -- University of Adelaide, School of Computer Science, 201
- …