Understanding Convolutional Neural Networks in Terms of Category-Level Attributes
Abstract. It has recently been reported that convolutional neural networks (CNNs) perform well in many image recognition tasks, significantly outperforming previous approaches not based on neural networks, particularly for object category recognition. These results are arguably owing to their ability to discover, through learning, image features better suited to recognition tasks, resulting in better internal representations of the inputs. However, despite this strong performance, it remains an open question why CNNs work so well and how they learn such good representations. In this study, we conjecture that the learned representation can be interpreted as category-level attributes with good properties. We conducted several experiments using the AwA (Animals with Attributes) dataset and a CNN trained for ILSVRC-2012 in a fully supervised setting to examine this conjecture. We report that there exist units in the CNN that can predict some of the 85 semantic attributes fairly accurately, along with the detailed observation that this holds only for visual attributes and not for non-visual ones. It is natural to think that the CNN may discover not only semantic attributes but also non-semantic ones (or ones that are difficult to express in words). To explore this possibility, we perform zero-shot learning by regarding the activation patterns of upper layers as attributes describing the categories. The result shows that this approach outperforms the state of the art by a significant margin.
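As a concrete illustration of the attribute-prediction half of this conjecture, the sketch below asks how accurately the single best unit of an upper layer predicts each binary attribute. The data, the median-thresholding scheme, and the layer width are all stand-in assumptions, not the paper's protocol; real inputs would be activations from the trained ILSVRC-2012 network and the AwA attribute annotations.

```python
import numpy as np

# Hedged sketch: random stand-ins for upper-layer CNN activations and
# the 85 binary AwA attribute labels.
rng = np.random.default_rng(0)
acts = rng.random((1000, 4096))            # images x units
attrs = rng.integers(0, 2, (1000, 85))     # images x binary attributes

# Binarize each unit at its median activation, then score every unit
# as a one-unit predictor of each attribute.
binarized = acts > np.median(acts, axis=0)
for a in range(attrs.shape[1]):
    per_unit_acc = (binarized == attrs[:, a:a+1]).mean(axis=0)
    print(f"attribute {a}: best unit accuracy = {per_unit_acc.max():.3f}")
```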
Semantic bottleneck for computer vision tasks
This paper introduces a novel method for the representation of images that is
semantic by nature, addressing the question of computational intelligibility in
computer vision tasks. More specifically, we propose to introduce what
we call a semantic bottleneck in the processing pipeline: a crossing
point at which the representation of the image is expressed entirely in
natural language, while retaining the efficiency of numerical representations.
We show that our approach is able to generate semantic representations that
give state-of-the-art results on semantic content-based image retrieval and
also perform very well on image classification tasks. Intelligibility is
evaluated through user-centered experiments on failure detection.
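The bottleneck idea itself is simple to sketch. In the toy version below (hand-written stages and a made-up five-word vocabulary, purely illustrative), the only information allowed to cross from the vision stage to the task stage is a short list of words:

```python
import numpy as np

VOCAB = ["dog", "grass", "ball", "indoor", "person"]   # toy tag vocabulary

def vision_stage(tag_scores: np.ndarray, k: int = 3) -> list[str]:
    """Keep only the k highest-scoring tags; everything else is discarded."""
    return [VOCAB[i] for i in np.argsort(tag_scores)[::-1][:k]]

def task_stage(words: list[str]) -> np.ndarray:
    """The downstream task sees words only, re-encoded as a bag-of-words vector."""
    return np.array([w in words for w in VOCAB], dtype=float)

scores = np.array([0.9, 0.7, 0.1, 0.4, 0.2])  # toy tag scores from a CNN
words = vision_stage(scores)                   # human-readable crossing point
features = task_stage(words)                   # numeric again for the task
```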
Learning the Roots of Visual Domain Shift
In this paper we focus on the spatial nature of visual domain shift,
attempting to learn where domain adaptation originates in each given image of
the source and target sets. We borrow concepts and techniques from the CNN
visualization literature and learn domainness maps able to localize the degree
of domain specificity in images. From these maps we derive features related to
different domainness levels, and we show that using them as a
preprocessing step for a domain adaptation algorithm strongly improves the
final classification performance. Combined with the whole-image representation,
these features provide state-of-the-art results on the Office dataset.
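One plausible reading of "borrowing from the CNN visualization literature" is input-level saliency of a domain classifier. The sketch below (an assumption for illustration, not the paper's exact derivation) produces a toy domainness map by backpropagating a source-vs-target classifier's score to the pixels:

```python
import torch
import torch.nn as nn

# Toy binary source/target domain classifier (architecture is arbitrary).
domain_clf = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1))

img = torch.rand(1, 3, 224, 224, requires_grad=True)  # toy input image
domain_clf(img).sum().backward()
domainness_map = img.grad.abs().sum(dim=1)  # H x W saliency of domain cues
```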
Visualizing Convolutional Networks for MRI-based Diagnosis of Alzheimer's Disease
Visualizing and interpreting convolutional neural networks (CNNs) is an
important task for increasing trust in automated medical decision-making systems.
In this study, we train a 3D CNN to detect Alzheimer's disease based on
structural MRI scans of the brain. Then, we apply four different gradient-based
and occlusion-based visualization methods that explain the network's
classification decisions by highlighting relevant areas in the input image. We
compare the methods qualitatively and quantitatively. We find that all four
methods focus on brain regions known to be involved in Alzheimer's disease,
such as inferior and middle temporal gyrus. While the occlusion-based methods
focus more on specific regions, the gradient-based methods pick up distributed
relevance patterns. Additionally, we find that the distribution of relevance
varies across patients, with some having a stronger focus on the temporal lobe,
whereas for others more cortical areas are relevant. In summary, we show that
applying different visualization methods is important to understand the
decisions of a CNN, a step that is crucial to increase clinical impact and
trust in computer-based decision support systems.
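For the occlusion-based family of methods, the core loop is simple enough to sketch. The code below assumes a `model` that maps a single-channel 3D volume to a scalar disease score; that interface, and the patch and stride sizes, are simplifying assumptions rather than the paper's exact protocol.

```python
import torch

def occlusion_map_3d(model, volume, patch=8, stride=8):
    """Relevance of each region: the drop in the model's disease score
    when that region is masked out. `volume` has shape (1, D, H, W)."""
    model.eval()
    with torch.no_grad():
        base = model(volume.unsqueeze(0)).item()       # unoccluded score
        _, D, H, W = volume.shape
        zs = range(0, D - patch + 1, stride)
        ys = range(0, H - patch + 1, stride)
        xs = range(0, W - patch + 1, stride)
        heat = torch.zeros(len(zs), len(ys), len(xs))
        for i, z in enumerate(zs):
            for j, y in enumerate(ys):
                for k, x in enumerate(xs):
                    masked = volume.clone()
                    masked[:, z:z+patch, y:y+patch, x:x+patch] = 0
                    heat[i, j, k] = base - model(masked.unsqueeze(0)).item()
    return heat   # large values = regions the classifier relies on
```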
Hierarchical ResNeXt Models for Breast Cancer Histology Image Classification
Microscopic histology image analysis is a cornerstone of early detection of
breast cancer. However, these images are very large, and manual analysis is
error prone and very time consuming, so automating this process is in high
demand. We propose a hierarchical system of convolutional neural networks
(CNNs) that automatically classifies patches of these images into four
pathologies: normal, benign, in situ carcinoma, and invasive carcinoma. We
evaluated our system on the BACH challenge dataset for image-wise
classification and on a small dataset that we used to extend it. Using a
75%/25% train/test split, we achieved an accuracy of 0.99 on the test split of
the BACH dataset and 0.96 on that of the extension. On the BACH challenge test
set, we reached an accuracy of 0.81, which ranks us 8th out of 51 teams.
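The hierarchy can be sketched in a few lines. The two-stage split below (carcinoma vs. non-carcinoma first, then the finer label) is one plausible reading of such a design, not the authors' confirmed layout, and the lambdas are toy stand-ins for the trained CNNs:

```python
import numpy as np

# Toy stand-ins for trained CNNs; each maps a patch to a label.
coarse_cnn = lambda p: "carcinoma" if p.mean() > 0.5 else "non-carcinoma"
carcinoma_cnn = lambda p: "invasive" if p.std() > 0.28 else "in situ"
benign_cnn = lambda p: "benign" if p.max() > 0.9 else "normal"

def classify_patch(patch: np.ndarray) -> str:
    """Route the patch through the coarse stage, then the matching
    fine-grained stage, yielding one of the four pathologies."""
    if coarse_cnn(patch) == "carcinoma":
        return carcinoma_cnn(patch)
    return benign_cnn(patch)

print(classify_patch(np.random.rand(512, 512)))
```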
DeepAPT: Nation-State APT Attribution Using End-to-End Deep Neural Networks
In recent years, numerous advanced malware samples, known as advanced
persistent threats (APTs), have allegedly been developed by nation-states. The
task of attributing an APT to a specific nation-state is extremely challenging
for several reasons. Each nation-state usually has more than a single cyber
unit that develops such advanced malware, rendering traditional authorship
attribution algorithms useless. Furthermore, these APTs use state-of-the-art
evasion techniques, making feature extraction challenging. Finally, the
dataset of available APTs is extremely small.
In this paper we describe how deep neural networks (DNNs) can be
successfully employed for nation-state APT attribution. We use sandbox reports
(recording the behavior of the APT when run dynamically) as raw input for the
neural network, allowing the DNN to learn high-level feature abstractions of
the APTs themselves. Using a test set of 1,000 Chinese- and Russian-developed
APTs, we achieved an accuracy rate of 94.6%.
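A minimal sketch of this kind of pipeline: hash each sandbox report as raw text into a fixed-size vector and train a small fully connected network on top. The two report snippets and the labels are made up, and the feature encoding and architecture are assumptions rather than the paper's exact design.

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.neural_network import MLPClassifier

# Made-up sandbox report fragments standing in for real dynamic traces.
reports = ["CreateRemoteThread WriteProcessMemory connect 10.0.0.1",
           "RegSetValueEx CreateService StartService beacon.example"]
labels = ["state_a", "state_b"]          # toy nation-state labels

vec = HashingVectorizer(n_features=2**12)  # raw text -> fixed-size vector
clf = MLPClassifier(hidden_layer_sizes=(256, 64), max_iter=500)
clf.fit(vec.transform(reports), labels)
print(clf.predict(vec.transform(["CreateRemoteThread connect 10.0.0.1"])))
```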
VConv-DAE: Deep Volumetric Shape Learning Without Object Labels
With the advent of affordable depth sensors, 3D capture is becoming
increasingly ubiquitous and has already made its way into commercial products.
Yet capturing the geometry or complete shapes of everyday objects using
scanning devices (e.g. Kinect) still comes with several challenges that result
in noisy or even incomplete shapes. Recent success in deep learning has shown
how to learn complex shape distributions in a data-driven way from large-scale
3D CAD model collections and to utilize them for 3D processing on volumetric
representations, thereby circumventing problems of topology and tessellation.
Prior work has shown encouraging results on problems ranging from shape
completion to recognition. We provide an analysis of such approaches and
discover that training, as well as the resulting representation, is strongly
and unnecessarily tied to the notion of object labels. We therefore propose a
fully convolutional volumetric autoencoder that learns volumetric
representations from noisy data by estimating voxel occupancy grids. The
proposed method outperforms prior work on challenging tasks like denoising and
shape completion. We also show that the obtained deep embedding gives
competitive performance when used for classification and promising results for
shape interpolation.
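A minimal sketch of such a fully convolutional volumetric denoising autoencoder is below; the layer sizes, grid resolution, and noise model are illustrative assumptions, not the paper's configuration. Note that no object labels appear anywhere in the objective.

```python
import torch
import torch.nn as nn

class VoxAE(nn.Module):
    """Toy fully convolutional autoencoder over voxel occupancy grids."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv3d(1, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(16, 32, 4, stride=2, padding=1), nn.ReLU())
        self.dec = nn.Sequential(
            nn.ConvTranspose3d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose3d(16, 1, 4, stride=2, padding=1))

    def forward(self, x):
        return self.dec(self.enc(x))   # logits for voxel occupancy

model = VoxAE()
clean = (torch.rand(8, 1, 32, 32, 32) > 0.5).float()   # toy occupancy grids
noisy = (clean + 0.1 * torch.randn_like(clean)).clamp(0, 1)
loss = nn.functional.binary_cross_entropy_with_logits(model(noisy), clean)
loss.backward()   # denoising objective: reconstruct clean occupancy
```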
Part Detector Discovery in Deep Convolutional Neural Networks
Current fine-grained classification approaches often rely on a robust
localization of object parts to extract localized feature representations
suitable for discrimination. However, part localization is a challenging task
due to the large variation of appearance and pose. In this paper, we show how
pre-trained convolutional neural networks can be used for robust and efficient
object part discovery and localization without needing to train
the network on the current dataset. Our approach, called "part detector
discovery" (PDD) is based on analyzing the gradient maps of the network outputs
and finding activation centers spatially related to annotated semantic parts or
bounding boxes.
This allows us not only to obtain excellent performance on the CUB200-2011
dataset, but also, in contrast to previous approaches, to perform detection and
bird classification jointly, without requiring a bounding box annotation
during testing or ground-truth parts during training. The code is available at
http://www.inf-cv.uni-jena.de/part_discovery and
https://github.com/cvjena/PartDetectorDisovery.
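The core of the gradient-map step can be sketched in a few lines: backpropagate a single output unit to the input and take the peak of the gradient magnitude as a candidate part location. The network choice, the unit index, and the peak-picking are simplified assumptions; the paper's full method operates on these maps more carefully.

```python
import torch
from torchvision import models

net = models.resnet18(weights=None)   # untrained here; use pretrained weights in practice
net.eval()

img = torch.rand(1, 3, 224, 224, requires_grad=True)  # toy input image
net(img)[0, 123].backward()                            # unit 123: arbitrary choice
gmap = img.grad.abs().sum(dim=1)[0]                    # H x W gradient map
y, x = divmod(gmap.argmax().item(), gmap.shape[1])     # activation center
print(f"candidate part location: ({x}, {y})")
```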
Right for the Right Reason: Training Agnostic Networks
We consider the problem of a neural network being requested to classify
images (or other inputs) without making implicit use of a "protected concept",
that is, a concept that should not play any role in the decision of the network.
Typically these concepts include information such as gender or race, or other
contextual information such as image backgrounds that might be implicitly
reflected in unknown correlations with other variables, making it insufficient
to simply remove them from the input features. In other words, making accurate
predictions is not good enough if those predictions rely on information that
should not be used: predictive performance is not the only important metric for
learning systems. We apply a method developed in the context of domain
adaptation to address this problem of "being right for the right reason", where
we request a classifier to make a decision in a way that is entirely 'agnostic'
to a given protected concept (e.g. gender, race, background etc.), even if this
could be implicitly reflected in other attributes via unknown correlations.
After defining the concept of an 'agnostic model', we demonstrate how the
Domain-Adversarial Neural Network can remove unwanted information from a model
using a gradient reversal layer.
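The gradient reversal layer itself is compact. A standard PyTorch version is below; it matches the general DANN-style construction rather than necessarily the authors' exact code. The layer is the identity on the forward pass and multiplies the gradient by -lambda on the backward pass, so the shared features are trained to fool the branch that predicts the protected concept.

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity forward; gradient scaled by -lambda on the way back."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lam * grad_out, None   # no gradient w.r.t. lambda

feats = torch.randn(4, 8, requires_grad=True)   # toy feature batch
GradReverse.apply(feats, 1.0).sum().backward()
assert torch.allclose(feats.grad, -torch.ones_like(feats))
```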
Food Recognition using Fusion of Classifiers based on CNNs
With the arrival of convolutional neural networks, the complex problem of
food recognition has seen important improvements in recent years. The
best results have been obtained using methods based on very deep convolutional
neural networks, which show that the deeper the model, the better the
classification accuracy obtained. However, very deep neural networks may
suffer from overfitting. In this paper, we propose a combination of
multiple classifiers based on different convolutional models that complement
each other and thus achieve an improvement in performance. We evaluate
our approach on two public datasets: Food-101, a dataset with a wide
variety of fine-grained dishes, and Food-11, a dataset of high-level food
categories, on which our approach outperforms the independent CNN models.
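Late fusion of this kind reduces, in its simplest form, to averaging the class posteriors of the individual models. A minimal sketch with toy logits standing in for the real CNN outputs (the paper's exact combination rule may be weighted or learned):

```python
import torch

logits_a = torch.randn(16, 101)   # e.g. one model's Food-101 logits
logits_b = torch.randn(16, 101)   # another architecture's logits

# Average the softmax probabilities, then take the fused decision.
fused = (logits_a.softmax(dim=1) + logits_b.softmax(dim=1)) / 2
pred = fused.argmax(dim=1)
```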