Group Invariance, Stability to Deformations, and Complexity of Deep Convolutional Representations
The success of deep convolutional architectures is often attributed in part
to their ability to learn multiscale and invariant representations of natural
signals. However, a precise study of these properties and how they affect
learning guarantees is still missing. In this paper, we consider deep
convolutional representations of signals; we study their invariance to
translations and to more general groups of transformations, their stability to
the action of diffeomorphisms, and their ability to preserve signal
information. This analysis is carried out by introducing a multilayer kernel based
on convolutional kernel networks and by studying the geometry induced by the
kernel mapping. We then characterize the corresponding reproducing kernel
Hilbert space (RKHS), showing that it contains a large class of convolutional
neural networks with homogeneous activation functions. This analysis allows us
to separate data representation from learning, and to provide a canonical
measure of model complexity, the RKHS norm, which controls both stability and
generalization of any learned model. In addition to models in the constructed
RKHS, our stability analysis also applies to convolutional networks with
generic activations such as rectified linear units, and we discuss its
relationship with recent generalization bounds based on spectral norms.
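The spectral norm mentioned above is simply a weight matrix's largest singular value, which such generalization bounds accumulate across layers. A minimal sketch of estimating it by power iteration (an illustrative standalone computation, not the paper's code; the function name and iteration count are assumptions):

```python
import numpy as np

def spectral_norm(W, n_iters=50):
    """Estimate the largest singular value of W via power iteration."""
    rng = np.random.default_rng(0)
    v = rng.standard_normal(W.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(n_iters):
        u = W @ v            # push through W ...
        u /= np.linalg.norm(u)
        v = W.T @ u          # ... and back through W^T
        v /= np.linalg.norm(v)
    return float(u @ W @ v)  # converges to sigma_max(W)

W = np.diag([3.0, 1.0, 0.5])
print(spectral_norm(W))  # close to 3.0, the largest singular value
```

A product of such per-layer norms is the kind of capacity measure these bounds control, playing a role analogous to the RKHS norm in the kernel analysis.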
Generalization in Deep Learning
This paper provides theoretical insights into why and how deep learning can
generalize well, despite its large capacity, complexity, possible algorithmic
instability, nonrobustness, and sharp minima, responding to an open question in
the literature. We also discuss approaches to provide non-vacuous
generalization guarantees for deep learning. Based on theoretical observations,
we propose new open problems and discuss the limitations of our results.
Comment: To appear in Mathematics of Deep Learning, Cambridge University Press. All previous results remain unchanged.
The Structure Transfer Machine Theory and Applications
Representation learning is a fundamental but challenging problem, especially
when the distribution of data is unknown. We propose a new representation
learning method, termed Structure Transfer Machine (STM), which enables feature
learning process to converge at the representation expectation in a
probabilistic way. We theoretically show that such an expected value of the
representation (mean) is achievable if the manifold structure can be
transferred from the data space to the feature space. The resulting structure
regularization term, named manifold loss, is incorporated into the loss
function of the typical deep learning pipeline. The STM architecture is
constructed to enforce the learned deep representation to satisfy the intrinsic
manifold structure from the data, which results in robust features that suit
various application scenarios, such as digit recognition, image classification
and object tracking. Compared to state-of-the-art CNN architectures, we achieve
better results on several commonly used benchmarks (source code available at
https://github.com/stmstmstm/stm).
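The core idea above, transferring manifold structure from data space to feature space, can be sketched as a regularization term that penalizes mismatch between pairwise distances in the two spaces. This is a hedged illustration of the general concept; the exact form of STM's manifold loss may differ:

```python
import numpy as np

def pairwise_dists(X):
    """Euclidean distance matrix for the rows of X."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.sqrt(np.maximum(d2, 0.0))

def manifold_loss(X_data, X_feat):
    """Mean squared mismatch between data-space and feature-space distances."""
    return float(np.mean((pairwise_dists(X_data) - pairwise_dists(X_feat)) ** 2))

# A feature map that is a pure rotation preserves all pairwise distances,
# so the structure-transfer penalty is (numerically) zero:
rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3))
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # random orthogonal map
print(manifold_loss(X, X @ Q))  # ~0: local geometry is preserved
```

In a training pipeline, a term like this would be added to the task loss with a weighting coefficient, steering the learned representation toward the data's intrinsic geometry.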
Detail-Preserving Pooling in Deep Networks
Most convolutional neural networks use some method for gradually downscaling
the size of the hidden layers. This is commonly referred to as pooling, and is
applied to reduce the number of parameters, improve invariance to certain
distortions, and increase the receptive field size. Since pooling by nature is
a lossy process, it is crucial that each such layer maintains the portion of
the activations that is most important for the network's discriminability. Yet,
simple maximization or averaging over blocks, max or average pooling, or plain
downsampling in the form of strided convolutions are the standard. In this
paper, we aim to leverage recent results on image downscaling for the purposes
of deep learning. Inspired by the human visual system, which focuses on local
spatial changes, we propose detail-preserving pooling (DPP), an adaptive
pooling method that magnifies spatial changes and preserves important
structural detail. Importantly, its parameters can be learned jointly with the
rest of the network. We analyze some of its theoretical properties and show its
empirical benefits on several datasets and networks, where DPP consistently
outperforms previous pooling approaches.
Comment: To appear at CVPR 2018.
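The principle behind detail-preserving pooling, weighting pixels by how strongly they deviate from their local neighborhood so that spatial changes are magnified rather than averaged away, can be sketched on a single-channel image. The weight function and exponent `lam` below are illustrative assumptions, not DPP's exact learned formulation:

```python
import numpy as np

def detail_pool2x2(img, lam=2.0, eps=1e-6):
    """Pool each 2x2 block with weights that grow with deviation
    from the block mean, emphasizing high-contrast detail."""
    h, w = img.shape
    out = np.zeros((h // 2, w // 2))
    for i in range(0, h - 1, 2):
        for j in range(0, w - 1, 2):
            block = img[i:i+2, j:j+2]
            wts = (np.abs(block - block.mean()) + eps) ** lam
            out[i // 2, j // 2] = np.sum(wts * block) / np.sum(wts)
    return out

img = np.array([[0., 0., 1., 1.],
                [0., 9., 1., 1.],
                [2., 2., 3., 3.],
                [2., 2., 3., 3.]])
pooled = detail_pool2x2(img)
print(pooled)  # top-left entry is 6.75: the outlier 9 dominates its block
```

Average pooling would report 2.25 for the top-left block, washing out the strong local change; the deviation-weighted average keeps it prominent, while constant blocks are returned unchanged. In DPP the analogous parameters are learned jointly with the network.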