Deep Learning Meets Sparse Regularization: A Signal Processing Perspective
Deep learning has been wildly successful in practice and most
state-of-the-art machine learning methods are based on neural networks.
Lacking, however, is a rigorous mathematical theory that adequately explains
the amazing performance of deep neural networks. In this article, we present a
relatively new mathematical framework that provides the beginning of a deeper
understanding of deep learning. This framework precisely characterizes the
functional properties of neural networks that are trained to fit to data. The
key mathematical tools that support this framework include transform-domain
sparse regularization, the Radon transform of computed tomography, and
approximation theory, which are all techniques deeply rooted in signal
processing. This framework explains the effect of weight decay regularization
in neural network training, the use of skip connections and low-rank weight
matrices in network architectures, the role of sparsity in neural networks, and
why neural networks can perform well in high-dimensional problems.
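The weight-decay-to-sparsity connection mentioned in this abstract rests on a rescaling argument that can be checked numerically. The following is a minimal sketch (an illustration of that known identity, not the article's derivation): a ReLU neuron a·relu(wᵀx) is unchanged by the rescaling (a, w) → (a/t, t·w), and minimizing the weight-decay penalty over this invariance yields the sparsity-promoting product |a|·‖w‖.

```python
import numpy as np

# A ReLU neuron f(x) = a * relu(w @ x) is unchanged by the rescaling
# (a, w) -> (a / t, t * w) for any t > 0. Minimizing the weight-decay
# penalty (a**2 + ||w||**2) / 2 over this invariance gives |a| * ||w||,
# a lasso-like, sparsity-promoting penalty on the neuron's "path".
a, w = 3.0, np.array([0.5, -1.0])
ts = np.linspace(0.1, 5.0, 1000)
decay = ((a / ts) ** 2 + (ts * np.linalg.norm(w)) ** 2) / 2
print(decay.min())                  # approximately |a| * ||w||
print(abs(a) * np.linalg.norm(w))
```

The minimum is attained at t² = |a|/‖w‖, where the two penalty terms balance; summed over all neurons, this product penalty is what drives trained networks toward sparse solutions.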
G-equivariant convolutional neural networks
Over the past decade, deep learning has revolutionized industry and academic research. Neural networks have been used to solve a multitude of previously unsolved problems and to significantly improve the state of the art on other tasks, in some cases reaching superhuman levels of performance. However, most neural networks have to be carefully adapted to each application and often require large amounts of data and computational resources. Geometric deep learning aims to reduce the amount of information that neural networks have to learn by taking advantage of geometric properties in data. In particular, equivariant neural networks use (local or global) symmetry to reduce the complexity of a learning task. In this thesis, we investigate a popular deep learning model for tasks exhibiting global symmetry: G-equivariant convolutional neural networks (GCNNs). We analyze the mathematical foundations of GCNNs and discuss where this model fits in the broader scheme of equivariant learning. More specifically, we discuss a general framework for equivariant neural networks using notions from gauge theory, and then show how GCNNs arise from this framework in the presence of global symmetry. We also characterize convolutional layers, the main building blocks of GCNNs, in terms of more general G-equivariant layers that preserve the underlying global symmetry.
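To make the global-symmetry idea concrete, here is a minimal numerical sketch (our illustration, not the thesis's construction) of a lifting correlation for the rotation group C4, the first layer of a rotation-equivariant GCNN: the input is correlated with all four 90° rotations of a filter, and rotating the input rotates each feature map while cyclically shifting the group channel.

```python
import numpy as np

def corr2d(x, k):
    """Valid 2-D cross-correlation."""
    H, W = x.shape
    h, w = k.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + h, j:j + w] * k)
    return out

def lift(x, psi):
    # Lifting layer for C4: correlate the input with all four
    # 90-degree rotations of the filter psi, producing one feature
    # map per group element.
    return [corr2d(x, np.rot90(psi, r)) for r in range(4)]

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 6))
psi = rng.standard_normal((3, 3))

y = lift(x, psi)
y_rot = lift(np.rot90(x), psi)

# Equivariance: rotating the input rotates each feature map and
# cyclically shifts the group channel.
for r in range(4):
    assert np.allclose(y_rot[r], np.rot90(y[(r - 1) % 4]))
```

Subsequent group-convolution layers act on the stack of four channels jointly, which is what distinguishes a GCNN from an ordinary CNN with rotated filters.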
From Maxout to Channel-Out: Encoding Information on Sparse Pathways
Motivated by an important insight from neural science, we propose a new
framework for understanding the success of the recently proposed "maxout"
networks. The framework is based on encoding information on sparse pathways and
recognizing the correct pathway at inference time. Elaborating further on this
insight, we propose a novel deep network architecture, called "channel-out"
network, which takes much better advantage of sparse pathway encoding. In
channel-out networks, pathways are not only formed a posteriori, but they are
also actively selected according to the inference outputs from the lower
layers. From a mathematical perspective, channel-out networks can represent a
wider class of piecewise continuous functions, endowing the network
with more expressive power than maxout networks. We test our
channel-out networks on several well-known image classification benchmarks,
setting new state-of-the-art performance on CIFAR-100 and STL-10, which
represent some of the "harder" image classification benchmarks.
Comment: 10 pages including the appendix, 9 figures
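The contrast between the two activations can be sketched in a few lines (our reading of the abstract's description, not the authors' code): maxout collapses each group of k channels to its maximum and discards which channel won, while channel-out passes the winning channel's value through in place, so the identity of the active pathway remains visible to higher layers.

```python
import numpy as np

def maxout(z, k):
    # Maxout: collapse each group of k channels to its maximum,
    # discarding which channel attained it.
    return z.reshape(-1, k).max(axis=1)

def channel_out(z, k):
    # Channel-out (a sketch): within each group of k channels, only
    # the winning channel's value passes through; the rest are zeroed,
    # preserving the identity of the active pathway.
    g = z.reshape(-1, k)
    out = np.zeros_like(g)
    rows = np.arange(g.shape[0])
    winners = g.argmax(axis=1)
    out[rows, winners] = g[rows, winners]
    return out.reshape(-1)

z = np.array([1.0, 3.0, -2.0, 0.5])
print(maxout(z, 2))       # -> [3.0, 0.5]
print(channel_out(z, 2))  # -> [0.0, 3.0, 0.0, 0.5]
```

Note that channel-out keeps the layer's full width, so the downstream weights that connect to the zeroed channels are simply inactive for this input, which is the sparse-pathway selection the abstract refers to.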