Unmasking Clever Hans Predictors and Assessing What Machines Really Learn
Current learning machines have successfully solved hard application problems,
reaching high accuracy and displaying seemingly "intelligent" behavior. Here we
apply recent techniques for explaining decisions of state-of-the-art learning
machines and analyze various tasks from computer vision and arcade games. This
showcases a spectrum of problem-solving behaviors ranging from naive and
short-sighted, to well-informed and strategic. We observe that standard
performance evaluation metrics can fail to distinguish these diverse
problem-solving behaviors. Furthermore, we propose our semi-automated Spectral
Relevance Analysis that provides a practically effective way of characterizing
and validating the behavior of nonlinear learning machines. This helps to
assess whether a learned model indeed delivers reliably for the problem that it
was conceived for. Our work also intends to add a voice of caution to
the ongoing excitement about machine intelligence and calls for evaluating and
judging some of these recent successes in a more nuanced manner.
Comment: Accepted for publication in Nature Communication
An Interpretable Machine Vision Approach to Human Activity Recognition using Photoplethysmograph Sensor Data
The current gold standard for human activity recognition (HAR) is based on
the use of cameras. However, the poor scalability of camera systems renders
them impractical in pursuit of the goal of wider adoption of HAR in mobile
computing contexts. Consequently, researchers instead rely on wearable sensors
and, in particular, inertial sensors. A particularly prevalent wearable is the
smartwatch, which, owing to its integrated inertial and optical sensing
capabilities, holds great potential for realising better HAR in a non-obtrusive
way. This paper seeks to simplify the wearable approach to HAR by
determining whether the wrist-mounted optical sensor typically found in a
smartwatch or similar device can, on its own, serve as a useful source of data for
activity recognition. The approach has the potential to eliminate the need for
the inertial sensing element, which would in turn reduce the cost and
complexity of smartwatches and fitness trackers. This could potentially
commoditise the hardware requirements for HAR while retaining the functionality
of both heart rate monitoring and activity capture all from a single optical
sensor. Our approach relies on the adoption of machine vision for activity
recognition based on suitably scaled plots of the optical signals. We take this
approach so as to produce classifications that are easily explainable and
interpretable by non-technical users. More specifically, images of
photoplethysmography signal time series are used to retrain the penultimate
layer of a convolutional neural network which has initially been trained on the
ImageNet database. We then use the 2048-dimensional features from the
penultimate layer as input to a support vector machine. Results from the
experiment yielded an average classification accuracy of 92.3%. This result
outperforms that of an optical and inertial sensor combined (78%) and
illustrates the capability of HAR systems using...
Comment: 26th AIAI Irish Conference on Artificial Intelligence and Cognitive Scienc
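The abstract above describes turning photoplethysmography time series into "suitably scaled plots" before they reach the network. A minimal sketch of that first step follows, assuming a simple min-max scaling and a one-pixel-per-column trace. The function name `signal_to_image`, the grid size, and the synthetic sine-wave signal are all hypothetical and not taken from the paper; the CNN feature extraction and SVM stages are not reproduced here.

```python
import math


def signal_to_image(signal, height=32, width=64):
    """Rasterize a 1-D signal into a height x width grid of 0/1 pixels.

    Assumption: min-max scaling to the full image height, one pixel per column.
    """
    if not signal:
        raise ValueError("signal must be non-empty")
    lo, hi = min(signal), max(signal)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat signal
    image = [[0] * width for _ in range(height)]
    for col in range(width):
        # Pick the sample that falls under this pixel column.
        idx = min(len(signal) - 1, col * len(signal) // width)
        scaled = (signal[idx] - lo) / span  # value in 0.0 .. 1.0
        row = (height - 1) - round(scaled * (height - 1))  # row 0 is the top
        image[row][col] = 1
    return image


# Synthetic stand-in for a PPG trace: a 1.2 Hz pulse sampled at 50 Hz for 4 s.
ppg = [math.sin(2 * math.pi * 1.2 * t / 50) for t in range(200)]
img = signal_to_image(ppg)
```

In a pipeline like the one the abstract outlines, an image of this kind would be passed through a pretrained convolutional network, and the resulting feature vector would be handed to a support vector machine.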
Interpreting Deep Visual Representations via Network Dissection
The success of recent deep convolutional neural networks (CNNs) depends on
learning hidden representations that can summarize the important factors of
variation behind the data. However, CNNs are often criticized as being black boxes
that lack interpretability, since they have millions of unexplained model
parameters. In this work, we describe Network Dissection, a method that
interprets networks by providing labels for the units of their deep visual
representations. The proposed method quantifies the interpretability of CNN
representations by evaluating the alignment between individual hidden units and
a set of visual semantic concepts. By identifying the best alignments, units
are given human interpretable labels across a range of objects, parts, scenes,
textures, materials, and colors. The method reveals that deep representations
are more transparent and interpretable than expected: we find that
representations are significantly more interpretable than they would be under a
random equivalently powerful basis. We apply the method to interpret and
compare the latent representations of various network architectures trained to
solve different supervised and self-supervised training tasks. We then examine
factors affecting network interpretability, such as the number of
training iterations, regularization, different initializations, and
network depth and width. Finally, we show that the interpreted units can be used
to provide explicit explanations of a prediction given by a CNN for an image.
Our results highlight that interpretability is an important property of deep
neural networks that provides new insights into their hierarchical structure.
Comment: *B. Zhou and D. Bau contributed equally to this work. 15 pages, 27 figure
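The abstract above speaks of "evaluating the alignment between individual hidden units and a set of visual semantic concepts" without naming the measure. One plausible way to score such alignment is intersection over union (IoU) between a thresholded activation map and a binary concept mask. The sketch below assumes that measure; the function names, the fixed threshold, and the toy 3x3 data are hypothetical illustrations, not the authors' implementation.

```python
def iou(activation, concept_mask, threshold):
    """IoU between the region where a unit fires and a binary concept mask."""
    inter = 0
    union = 0
    for act_row, mask_row in zip(activation, concept_mask):
        for a, m in zip(act_row, mask_row):
            unit_on = a > threshold   # pixel where the unit is considered active
            concept_on = bool(m)      # pixel annotated with the concept
            inter += int(unit_on and concept_on)
            union += int(unit_on or concept_on)
    return inter / union if union else 0.0


def best_label(activation, concept_masks, threshold):
    """Return the (concept name, IoU) pair with the highest alignment."""
    scores = {name: iou(activation, mask, threshold)
              for name, mask in concept_masks.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Toy data: one unit's activation map and two hypothetical concept masks.
activation = [[0.9, 0.8, 0.1],
              [0.7, 0.2, 0.0],
              [0.1, 0.0, 0.0]]
concept_masks = {
    "sky":   [[1, 1, 1], [0, 0, 0], [0, 0, 0]],
    "grass": [[0, 0, 0], [0, 0, 0], [1, 1, 1]],
}
label, score = best_label(activation, concept_masks, threshold=0.5)
# label == "sky", score == 0.5 (2 overlapping pixels out of 4 in the union)
```

A unit would then receive the label of its best-aligned concept, typically only if the score clears some minimum value; that cutoff is a further design choice left out of this sketch.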
Learning and Interpreting Multi-Multi-Instance Learning Networks
We introduce an extension of the multi-instance learning problem where
examples are organized as nested bags of instances (e.g., a document could be
represented as a bag of sentences, which in turn are bags of words). This
framework can be useful in various scenarios, such as text and image
classification, as well as supervised learning over graphs. As a further
advantage, multi-multi-instance learning enables a particular way of
interpreting predictions and the decision function. Our approach is based on a
special neural network layer, called bag-layer, whose units aggregate bags of
inputs of arbitrary size. We prove theoretically that the associated class of
functions contains all Boolean functions over sets of sets of instances and we
provide empirical evidence that functions of this kind can actually be learned
on semi-synthetic datasets. Finally, we present experiments on text
classification, citation graphs, and social graph data, which show that our
model achieves accuracy competitive with
other approaches such as convolutional networks on graphs, while at the same
time supporting a general approach to interpreting the learnt model as well as
explaining individual predictions.
Comment: JML
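The abstract above describes a bag-layer whose units "aggregate bags of inputs of arbitrary size", applied to nested bags. A minimal sketch of that idea follows, assuming each instance passes through an affine map and a ReLU and that the results are combined with an element-wise maximum. The choice of ReLU and max, the function names, and the hand-picked weights are assumptions made for illustration; the abstract does not specify them.

```python
def relu(x):
    return max(0.0, x)


def bag_layer(bag, weights, bias):
    """Map a bag (list of equal-length vectors) of any size to one vector.

    Assumption: affine map + ReLU per instance, then element-wise max over the bag.
    """
    if not bag:
        raise ValueError("bag must contain at least one instance")
    output = []
    for w_row, b in zip(weights, bias):
        activations = [
            relu(sum(w * x for w, x in zip(w_row, instance)) + b)
            for instance in bag
        ]
        output.append(max(activations))
    return output


def nested_forward(top_bag, w1, b1, w2, b2):
    """Apply one bag-layer to each inner bag, then a second one to the results."""
    inner_vectors = [bag_layer(sub_bag, w1, b1) for sub_bag in top_bag]
    return bag_layer(inner_vectors, w2, b2)


# A "document" as a bag of "sentences", each a bag of 2-D "word" vectors.
document = [
    [[1.0, 0.0], [0.0, 1.0]],  # sentence with two words
    [[0.5, 0.5]],              # sentence with one word
]
w1, b1 = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]  # identity map on words
w2, b2 = [[1.0, 1.0]], [-1.0]                  # one unit over sentence vectors
result = nested_forward(document, w1, b1, w2, b2)
# result == [1.0]
```

Because the maximum ignores ordering and bag size, the output is unchanged when sentences or words are permuted, which is the property a bag-level aggregator needs.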