Using KL-divergence to focus Deep Visual Explanation
We present a method for explaining the image classification predictions of deep convolutional neural networks by highlighting the pixels in the image which influence the final class prediction. Our method requires a heuristic for selecting the parameters hypothesized to be most relevant to this prediction, and here we use Kullback-Leibler divergence to provide this focus. Overall, our approach helps in understanding and interpreting deep network predictions and, we hope, contributes to a foundation for such understanding of deep learning networks. In this brief paper, our experiments evaluate the performance of two popular networks in this context of interpretability.
Comment: Presented at NIPS 2017 Symposium on Interpretable Machine Learning.
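To make the KL-based focusing concrete, here is a minimal, generic sketch of scoring image regions by how much perturbing them shifts the predicted class distribution, with the shift measured by Kullback-Leibler divergence. The `predict_fn` interface, the patch-occlusion perturbation, and all names are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete class distributions."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def kl_focus_map(predict_fn, image, patch=8):
    """Score each image patch by how much occluding it shifts the network's
    predicted class distribution (hypothetical `predict_fn`: image -> softmax)."""
    base = predict_fn(image)
    h, w = image.shape[:2]
    heatmap = np.zeros((h // patch, w // patch))
    for i in range(0, (h // patch) * patch, patch):
        for j in range(0, (w // patch) * patch, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = 0.0   # occlude one patch
            heatmap[i // patch, j // patch] = kl_divergence(base, predict_fn(occluded))
    return heatmap  # high values mark regions that strongly influence the prediction
```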
How (not) to Train your Generative Model: Scheduled Sampling, Likelihood, Adversary?
Modern applications and progress in deep learning research have created
renewed interest in generative models of text and of images. However, even
today it is unclear what objective functions one should use to train and
evaluate these models. In this paper we present two contributions.
Firstly, we present a critique of scheduled sampling, a state-of-the-art
training method that contributed to the winning entry to the MSCOCO image
captioning benchmark in 2015. Here we show that despite this impressive
empirical performance, the objective function underlying scheduled sampling is
improper and leads to an inconsistent learning algorithm.
Secondly, we revisit the problems that scheduled sampling was meant to
address, and present an alternative interpretation. We argue that maximum
likelihood is an inappropriate training objective when the end-goal is to
generate natural-looking samples. We go on to derive an ideal objective
function to use in this situation instead. We introduce a generalisation of
adversarial training, and show how such a method can interpolate between maximum likelihood training and our ideal training objective. To our knowledge, this is the first theoretical analysis that explains why adversarial training tends to produce samples with higher perceived quality.
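For reference, the training method under critique can be sketched in a few lines: at each decoding step the model is fed the gold token with some probability and its own previous prediction otherwise, with that probability annealed toward zero over training. The `model.step` interface below is a hypothetical stand-in, not the benchmark system's code.

```python
import random

def scheduled_sampling_sequence(model, target_tokens, epsilon):
    """One sequence of training under scheduled sampling. `model.step(prev, gold)`
    is a hypothetical interface returning (predicted_token, loss) for one step."""
    prev = target_tokens[0]                 # e.g. a <BOS> symbol
    total_loss = 0.0
    for gold in target_tokens[1:]:
        pred, loss = model.step(prev, gold)
        total_loss += loss
        # The coin flip decides what the *next* step conditions on; annealing
        # epsilon from 1 toward 0 over training gives the usual schedule.
        prev = gold if random.random() < epsilon else pred
    return total_loss
```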
Manifold: A Model-Agnostic Framework for Interpretation and Diagnosis of Machine Learning Models
Interpretation and diagnosis of machine learning models have gained renewed
interest in recent years with breakthroughs in new approaches. We present
Manifold, a framework that utilizes visual analysis techniques to support
interpretation, debugging, and comparison of machine learning models in a more
transparent and interactive manner. Conventional techniques usually focus on visualizing the internal logic of a specific model type (e.g., deep neural networks) and lack the ability to extend to more complex scenarios where
different model types are integrated. To this end, Manifold is designed as a
generic framework that does not rely on or access the internal logic of the
model and solely observes the input (i.e., instances or features) and the
output (i.e., the predicted result and probability distribution). We describe
the workflow of Manifold as an iterative process consisting of three major
phases that are commonly involved in the model development and diagnosis
process: inspection (hypothesis), explanation (reasoning), and refinement
(verification). The visual components supporting these tasks include a scatterplot-based visual summary that provides an overview of the models' outcomes and a customizable tabular view that reveals feature discrimination. We demonstrate current applications of the framework on classification and regression tasks and discuss other potential machine learning use scenarios where Manifold can be applied.
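A toy illustration of the kind of output-only, model-agnostic signal Manifold's scatterplot summary is built on: with nothing but two models' predicted probabilities and the ground-truth labels, one can already see per-instance agreement and complementarity. The function name and inputs are hypothetical, not Manifold's actual interface.

```python
import numpy as np

def per_instance_agreement(probs_a, probs_b, labels):
    """Given two models' predicted class probabilities on the same instances
    (shape [n, num_classes]) and integer ground-truth labels, return each
    model's confidence in the correct class per instance. Plotting one array
    against the other shows where the models agree, disagree, or where only
    one of them is reliable."""
    idx = np.arange(len(labels))
    return probs_a[idx, labels], probs_b[idx, labels]
```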
Learning Multimodal Transition Dynamics for Model-Based Reinforcement Learning
In this paper we study how to learn stochastic, multimodal transition
dynamics in reinforcement learning (RL) tasks. We focus on evaluating
transition function estimation, while we defer planning over this model to
future work. Stochasticity is a fundamental property of many task environments.
However, discriminative function approximators have difficulty estimating
multimodal stochasticity. In contrast, deep generative models do capture
complex high-dimensional outcome distributions. First we discuss why, amongst
such models, conditional variational inference (VI) is theoretically most
appealing for model-based RL. Subsequently, we compare different VI models on
their ability to learn complex stochasticity on simulated functions, as well as
on a typical RL gridworld with multimodal dynamics. Results show VI
successfully predicts multimodal outcomes, but also robustly ignores these for
deterministic parts of the transition dynamics. In summary, we show a robust
method to learn multimodal transitions using function approximation, which is a key prerequisite for model-based RL in stochastic domains.
Comment: Scaling Up Reinforcement Learning (SURL) Workshop @ European Machine Learning Conference (ECML).
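A minimal sketch of the conditional variational inference idea for transition models: a conditional VAE whose latent variable selects the mode of p(s' | s, a). Layer sizes, the Gaussian likelihood, and the class interface are illustrative assumptions rather than the configurations evaluated in the paper.

```python
import torch
import torch.nn as nn

class TransitionCVAE(nn.Module):
    """Minimal conditional VAE sketch for a transition model p(s' | s, a)."""
    def __init__(self, state_dim, action_dim, latent_dim=4, hidden=64):
        super().__init__()
        cond_dim = state_dim + action_dim
        self.encoder = nn.Sequential(
            nn.Linear(cond_dim + state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * latent_dim))     # outputs mean and log-variance
        self.decoder = nn.Sequential(
            nn.Linear(cond_dim + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim))          # predicted next state

    def loss(self, s, a, s_next):
        cond = torch.cat([s, a], dim=-1)
        mu, logvar = self.encoder(torch.cat([cond, s_next], dim=-1)).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        recon = self.decoder(torch.cat([cond, z], dim=-1))
        recon_loss = ((recon - s_next) ** 2).sum(dim=-1)          # Gaussian likelihood term
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=-1)
        return (recon_loss + kl).mean()                           # negative ELBO
```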
Examining CNN Representations with respect to Dataset Bias
Given a pre-trained CNN without any testing samples, this paper proposes a
simple yet effective method to diagnose feature representations of the CNN. We
aim to discover representation flaws caused by potential dataset bias. More
specifically, when the CNN is trained to estimate image attributes, we mine
latent relationships between representations of different attributes inside the
CNN. Then, we compare the mined attribute relationships with ground-truth
attribute relationships to discover the CNN's blind spots and failure modes due
to dataset bias. In fact, representation flaws caused by dataset bias cannot be
examined by conventional evaluation strategies based on testing images, because
testing images may also have a similar bias. Experiments have demonstrated the effectiveness of our method.
Comment: In AAAI 2018.
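One hedged way to picture the diagnosis step: compare a matrix of attribute relationships mined from the network's outputs against ground-truth attribute relationships and flag the pairs that disagree most. The matrices, names, and threshold below are illustrative stand-ins, not the paper's actual mining procedure.

```python
import numpy as np

def flag_bias_suspects(mined_rel, true_rel, attribute_names, threshold=0.3):
    """Return attribute pairs whose relationship inside the network (mined_rel)
    deviates strongly from the ground-truth relationship (true_rel); both are
    assumed to be [n_attr, n_attr] matrices on a comparable scale."""
    suspects = []
    n = len(attribute_names)
    for i in range(n):
        for j in range(i + 1, n):
            gap = abs(mined_rel[i, j] - true_rel[i, j])
            if gap > threshold:
                suspects.append((attribute_names[i], attribute_names[j], gap))
    return sorted(suspects, key=lambda t: -t[2])  # biggest discrepancies first
```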
Confident Multiple Choice Learning
Ensemble methods are arguably the most trustworthy techniques for boosting
the performance of machine learning models. Popular independent ensembles (IE), which rely on a naive averaging/voting scheme, have been the typical choice for most applications involving deep neural networks, but they do not consider advanced collaboration among ensemble models. In this paper, we propose a new ensemble method specialized for deep neural networks, called confident multiple choice learning (CMCL): a variant of multiple choice learning (MCL) that addresses its overconfidence issue. In particular, the major components of CMCL beyond the original MCL scheme are (i) a new loss, the confident oracle loss; (ii) a new architecture, feature sharing; and (iii) a new training method, stochastic labeling. We demonstrate the effect of CMCL via experiments on image classification on CIFAR and SVHN, and foreground-background segmentation on iCoseg. In particular, CMCL using 5 residual networks provides 14.05% and 6.60% relative reductions in the top-1 error rates over the corresponding IE scheme for the classification task on
CIFAR and SVHN, respectively.
Comment: Accepted at ICML 2017.
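A rough sketch of a confident-oracle-style loss under simplifying assumptions (a hard argmin assignment and a fixed weight `beta`): the best ensemble member per example pays the usual cross-entropy, while the remaining members are regularized toward a uniform prediction so they do not become overconfident outside their specialization. This illustrates the idea, not the paper's exact formulation.

```python
import math
import torch
import torch.nn.functional as F

def confident_oracle_loss(logits_list, targets, beta=0.75):
    """logits_list: list of [batch, num_classes] tensors, one per ensemble member."""
    ce = torch.stack([F.cross_entropy(l, targets, reduction="none")
                      for l in logits_list])          # [models, batch]
    winner = ce.argmin(dim=0)                          # best member per example
    num_classes = logits_list[0].size(-1)
    total = torch.zeros_like(ce[0])
    for m, logits in enumerate(logits_list):
        log_probs = F.log_softmax(logits, dim=-1)
        # KL(uniform || p) up to a constant: pushes non-assigned members toward uniform.
        uniform_kl = -log_probs.mean(dim=-1) - math.log(num_classes)
        assigned = (winner == m).float()
        total = total + assigned * ce[m] + (1 - assigned) * beta * uniform_kl
    return total.mean()
```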
InfoMask: Masked Variational Latent Representation to Localize Chest Disease
The scarcity of richly annotated medical images is limiting supervised deep-learning-based solutions to medical image analysis tasks, such as localizing
discriminatory radiomic disease signatures. Therefore, it is desirable to
leverage unsupervised and weakly supervised models. Most recent weakly
supervised localization methods apply attention maps or region proposals in a
multiple instance learning formulation. Attention maps, however, can be noisy, leading to erroneously highlighted regions, and it is not simple to decide on an optimal window/bag size for multiple instance learning approaches. In this
paper, we propose a learned spatial masking mechanism to filter out irrelevant
background signals from attention maps. The proposed method minimizes mutual
information between a masked variational representation and the input while
maximizing the information between the masked representation and class labels.
This results in more accurate localization of discriminatory regions. We tested
the proposed model on the ChestX-ray8 dataset to localize pneumonia from chest
X-ray images without using any pixel-level or bounding-box annotations.
Comment: Accepted to MICCAI 2019.
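As a loose illustration of a learned spatial mask with a variational penalty (not InfoMask's actual architecture): a per-location gate filters the feature map, and a KL term toward a unit Gaussian limits how much information about the input the masked representation can carry, while the task loss keeps it informative about the label.

```python
import torch
import torch.nn as nn

class SpatialInfoMask(nn.Module):
    """Toy sketch: 1x1 conv predicts a per-location gate, the gated features pass
    through a variational (mean / log-variance) layer, and the KL term is added
    to the task loss as an information penalty. Sizes are illustrative."""
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Conv2d(channels, 1, kernel_size=1)
        self.mu = nn.Conv2d(channels, channels, kernel_size=1)
        self.logvar = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, features):
        mask = torch.sigmoid(self.gate(features))              # [B, 1, H, W] in (0, 1)
        masked = features * mask
        mu, logvar = self.mu(masked), self.logvar(masked)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
        return z, mask, kl   # z feeds the classifier; kl penalizes information
```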
An Information Bottleneck Approach for Controlling Conciseness in Rationale Extraction
Decisions of complex language understanding models can be rationalized by
limiting their inputs to a relevant subsequence of the original text. A
rationale should be as concise as possible without significantly degrading task
performance, but this balance can be difficult to achieve in practice. In this
paper, we show that it is possible to better manage this trade-off by
optimizing a bound on the Information Bottleneck (IB) objective. Our fully
unsupervised approach jointly learns an explainer that predicts sparse binary
masks over sentences, and an end-task predictor that considers only the
extracted rationale. Using IB, we derive a learning objective that allows
direct control of mask sparsity levels through a tunable sparse prior.
Experiments on ERASER benchmark tasks demonstrate significant gains over
norm-minimization techniques for both task performance and agreement with human
rationales. Furthermore, we find that in the semi-supervised setting, a modest
amount of gold rationales (25% of training examples) closes the gap with a
model that uses the full input.
Comment: EMNLP 2020 main track accepted paper.
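The sparsity control can be pictured as a KL penalty between the explainer's per-sentence mask probabilities and a sparse Bernoulli prior; the prior's keep-rate and the weight `beta` are the tunable knobs. The snippet below is a schematic rendering of that term, with names and values chosen for illustration rather than taken from the paper.

```python
import torch

def sparsity_kl(mask_probs, prior=0.2, eps=1e-6):
    """KL between per-sentence Bernoulli mask probabilities and a sparse
    Bernoulli prior with keep-rate `prior` (an assumed illustrative value)."""
    p = mask_probs.clamp(eps, 1 - eps)
    return (p * torch.log(p / prior)
            + (1 - p) * torch.log((1 - p) / (1 - prior))).mean()

# Hypothetical overall objective: the explainer predicts mask_probs over sentences,
# the end-task predictor sees only the masked text, and
#     loss = task_cross_entropy + beta * sparsity_kl(mask_probs, prior=0.2)
```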
TSXplain: Demystification of DNN Decisions for Time-Series using Natural Language and Statistical Features
Neural networks (NNs) are considered black boxes due to the lack of explainability and transparency in their decisions. This significantly hampers their deployment in environments where explainability is essential along with
the accuracy of the system. Recently, significant efforts have been made for
the interpretability of these deep networks with the aim to open up the
black-box. However, most of these approaches are specifically developed for
visual modalities. In addition, the interpretations provided by these systems
require expert knowledge and understanding for intelligibility. This points to a significant gap between the explainability these systems provide and what a novice user can follow. To bridge this gap, we present a novel framework, the Time-Series eXplanation (TSXplain) system, which produces a natural-language explanation of the decision taken by an NN. It uses extracted statistical features to describe the decision of the NN, merging the deep learning world with that of statistics. The two-level explanation provides an ample description of the decision made by the network to aid expert and novice users alike.
Our survey and reliability assessment test confirm that the generated
explanations are meaningful and correct. We believe that generating natural-language descriptions of the network's decisions is a big step towards opening up the black box.
Comment: Pre-print.
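A toy statistics-to-text sketch in the spirit of the two-level explanation: compute a few simple features of the input series and slot them into a natural-language template. The chosen features and wording are illustrative, not TSXplain's validated feature set or phrasing.

```python
import numpy as np

def describe_decision(series, predicted_label):
    """Turn a 1-D time series and its predicted label into a short
    template-based natural-language explanation (illustrative only)."""
    peak_idx = int(np.argmax(np.abs(series - np.mean(series))))
    mean, std = float(np.mean(series)), float(np.std(series))
    peak_value = float(series[peak_idx])
    return (f"The series was classified as '{predicted_label}'. "
            f"Its mean level is {mean:.2f} with a standard deviation of {std:.2f}; "
            f"the most anomalous point is at step {peak_idx} with value "
            f"{peak_value:.2f}, which deviates most strongly from the overall trend.")
```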
Generative Adversarial Networks (GANs): What it can generate and What it cannot?
In recent years, Generative Adversarial Networks (GANs) have received
significant attention from the research community. With a straightforward
implementation and outstanding results, GANs have been used for numerous
applications. Despite the success, GANs lack a proper theoretical explanation.
These models suffer from issues like mode collapse, non-convergence, and
instability during training. To address these issues, researchers have proposed theoretically rigorous frameworks inspired by fields as varied as game theory, statistical theory, and dynamical systems.
In this paper, we propose a structure for studying these contributions systematically. We categorize the papers based on the issues they raise and the kind of novelty they introduce to address them. In addition, we provide insight into how each of the discussed articles solves the concerned problems. We compare and contrast different results and put forth a summary of theoretical contributions about GANs with a focus on image/visual applications. We expect this summary paper to give a bird's-eye view to a person wishing to understand the theoretical progress in GANs so far.