Structured Binary Neural Networks for Image Recognition
We propose methods to train convolutional neural networks (CNNs) with both
binarized weights and activations, yielding quantized models that are
well suited to mobile devices with limited power and computational
resources. Previous work on quantizing CNNs typically seeks to
approximate the floating-point information with a set of discrete values,
which we call value approximation, usually assuming the same architecture
as the full-precision network. Here we take a novel "structure
approximation" view of quantization: architectures designed specifically
for low-bit networks are likely to achieve better performance. In
particular, we propose a "network decomposition" strategy, termed
Group-Net, in which we divide the network into groups so that each
full-precision group can be effectively reconstructed by aggregating a
set of homogeneous binary branches. In addition, we learn effective
connections among groups to improve the representation capability. The
proposed Group-Net also generalizes well to other tasks: we extend it to
accurate semantic segmentation by embedding rich context into the binary
structure, and, for the first time, we apply binary neural networks to
object detection. Experiments on classification, semantic segmentation,
and object detection tasks demonstrate the superior performance of the
proposed methods over various quantized networks in the literature. Our
methods outperform the previous best binary neural networks in both
accuracy and computational efficiency.
Comment: 15 pages. Extended version of the conference paper arXiv:1811.1041
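The core idea of approximating a full-precision block with a sum of scaled binary branches can be illustrated with a small sketch. This is a greedy residual fit over sign-binarized bases, an assumption made for illustration; the paper trains its binary branches end to end rather than fitting them post hoc, and `binary_branch_approx` is a hypothetical helper name.

```python
import numpy as np

def binarize(x):
    # sign binarization to {-1, +1}, the standard choice for binary nets
    return np.where(x >= 0, 1.0, -1.0)

def binary_branch_approx(W, num_branches=4):
    """Approximate a full-precision weight tensor W by aggregating
    scaled binary branches (illustrative greedy residual fit, not the
    paper's exact training procedure)."""
    approx = np.zeros_like(W)
    branches = []
    residual = W.copy()
    for _ in range(num_branches):
        B = binarize(residual)
        # mean absolute value minimizes the L2 error for a sign basis
        alpha = np.abs(residual).mean()
        branches.append((alpha, B))
        approx += alpha * B
        residual = W - approx
    return branches, approx
```

Each added branch only binarizes the remaining residual, so the reconstruction error of the aggregate is non-increasing in the number of branches, which is the intuition behind reconstructing a group from several homogeneous binary branches.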
Counting the learnable functions of structured data
Cover's function counting theorem is a milestone in the theory of artificial
neural networks. It provides an answer to the fundamental question of
determining how many binary assignments (dichotomies) of p points in n
dimensions can be linearly realized. Regrettably, it has proved hard to
extend the same approach to problems more advanced than the classification
of points. In particular, an emerging necessity is to find methods to deal
with structured data, and specifically with non-pointlike patterns. A
prominent case is that of invariant recognition, whereby identification of a
stimulus is insensitive to irrelevant transformations of the inputs (such as
rotations or changes of perspective in an image). An object is therefore
represented by an extended perceptual manifold, consisting of inputs that
are classified similarly. Here, we develop a function counting theory for
structured data of this kind by extending Cover's combinatorial technique,
and we derive analytical expressions for the average number of dichotomies
of generically correlated sets of patterns. As an application, we obtain a
closed formula for the capacity of a binary classifier trained to
distinguish general polytopes of any dimension. These results may help
extend our theoretical understanding of generalization, feature extraction,
and invariant object recognition by neural networks.
Counting the learnable functions of geometrically structured data
Cover's function counting theorem is a milestone in the theory of artificial neural networks. It provides an answer to the fundamental question of determining how many binary assignments (dichotomies) of p points in n dimensions can be linearly realized. Regrettably, it has proved hard to extend the same approach to more advanced problems than the classification of points. In particular, an emerging necessity is to find methods to deal with geometrically structured data, and specifically with non-point-like patterns. A prominent case is that of invariant recognition, whereby identification of a stimulus is insensitive to irrelevant transformations on the inputs (such as rotations or changes in perspective in an image). An object is thus represented by an extended perceptual manifold, consisting of inputs that are classified similarly. Here, we develop a function counting theory for structured data of this kind, by extending Cover's combinatorial technique, and we derive analytical expressions for the average number of dichotomies of generically correlated sets of patterns. As an application, we obtain a closed formula for the capacity of a binary classifier trained to distinguish general polytopes of any dimension. These results extend our theoretical understanding of the role of data structure in machine learning, and provide useful quantitative tools for the analysis of generalization, feature extraction, and invariant object recognition by neural networks.
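Cover's count of linearly realizable dichotomies, which these abstracts build on, has a simple closed form for points in general position and is easy to evaluate directly:

```python
from math import comb

def cover_count(p, n):
    """Cover (1965): the number of dichotomies of p points in general
    position in n dimensions realizable by a linear threshold unit:
        C(p, n) = 2 * sum_{k=0}^{n-1} binom(p-1, k)
    """
    return 2 * sum(comb(p - 1, k) for k in range(n))
```

For p <= n every one of the 2^p dichotomies is realizable, and at p = 2n exactly half are, which is the classical statement that the capacity of a perceptron is two patterns per dimension.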
Multi-Label Zero-Shot Learning with Structured Knowledge Graphs
In this paper, we propose a novel deep learning architecture for multi-label
zero-shot learning (ML-ZSL), which is able to predict multiple unseen class
labels for each input instance. Inspired by the way humans use semantic
knowledge about objects of interest, we propose a framework that
incorporates knowledge graphs to describe the relationships between
multiple labels. Our model learns an information propagation mechanism from
the semantic label space, which can be applied to model the
interdependencies between seen and unseen class labels. By exploiting
structured knowledge graphs for visual reasoning, we show that our model
can solve both multi-label classification and ML-ZSL tasks, achieving
comparable or improved performance over state-of-the-art approaches.
Comment: CVPR 201
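The propagation idea (evidence for seen labels flowing along graph edges to related unseen labels) can be sketched with a damped neighbor-averaging step. This is a deliberate simplification: the uniform adjacency and fixed damping factor are assumptions for the sketch, whereas the paper learns a gated propagation mechanism.

```python
import numpy as np

def propagate_scores(scores, adj, steps=2, damping=0.5):
    """Illustrative score propagation over a label relation graph
    (simplified stand-in for the learned propagation in ML-ZSL models)."""
    # row-normalize the adjacency so each label averages its neighbors
    A = adj / np.maximum(adj.sum(axis=1, keepdims=True), 1e-12)
    s = scores.astype(float).copy()
    for _ in range(steps):
        # mix each label's own evidence with its neighbors' current scores
        s = (1.0 - damping) * scores + damping * (A @ s)
    return s
```

An unseen label with no direct classifier evidence, but connected to a confidently predicted seen label, inherits part of that score, which is the mechanism that makes zero-shot prediction possible.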
Learning Structured Inference Neural Networks with Label Relations
Images of scenes contain various objects as well as abundant attributes, and
diverse levels of visual categorization are possible. A natural image can be
assigned fine-grained labels that describe major components, coarse-grained
labels that depict high-level abstractions, or a set of labels that reveal
attributes. Such categorization at different concept layers can be modeled
with label graphs encoding label information. In this paper, we exploit this
rich information with a state-of-the-art deep learning framework and propose
a generic structured model that leverages diverse label relations to improve
image classification performance. Our approach employs a novel stacked label
prediction neural network, capturing both inter-level and intra-level label
semantics. We evaluate our method on benchmark image datasets, and empirical
results illustrate the efficacy of our model.
Comment: Conference on Computer Vision and Pattern Recognition (CVPR) 201
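The stacked inter-level idea can be made concrete with a minimal two-level sketch, in which fine-grained scores combine direct image evidence with a message from the coarse level. All names and the linear form are illustrative assumptions; the actual model uses learned label graphs rather than a single inter-level matrix.

```python
import numpy as np

def stacked_inference(x, W_coarse, W_fine, U):
    """Minimal two-level stacked label prediction sketch: fine-grained
    scores add an inter-level message U @ coarse to their own evidence."""
    coarse = W_coarse @ x            # coarse-level label scores from features
    fine = W_fine @ x + U @ coarse   # fine level reuses coarse evidence
    return coarse, fine
```

The design point is that a confident coarse prediction (e.g. "animal") raises the scores of compatible fine labels before any fine-grained evidence is strong on its own.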