Deep Learning Meets Sparse Regularization: A Signal Processing Perspective
Deep learning has been wildly successful in practice and most
state-of-the-art machine learning methods are based on neural networks.
Lacking, however, is a rigorous mathematical theory that adequately explains
the amazing performance of deep neural networks. In this article, we present a
relatively new mathematical framework that provides the beginning of a deeper
understanding of deep learning. This framework precisely characterizes the
functional properties of neural networks that are trained to fit to data. The
key mathematical tools that support this framework include transform-domain
sparse regularization, the Radon transform of computed tomography, and
approximation theory, which are all techniques deeply rooted in signal
processing. This framework explains the effect of weight decay regularization
in neural network training, the use of skip connections and low-rank weight
matrices in network architectures, the role of sparsity in neural networks, and
why neural networks can perform well in high-dimensional problems.
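The weight-decay-to-sparsity connection mentioned in this abstract rests on a rescaling argument that can be checked numerically. The following is a minimal sketch (an illustration of that known identity, not the article's derivation): a ReLU neuron a·relu(wᵀx) is unchanged by the rescaling (a, w) → (a/t, t·w), and minimizing the weight-decay penalty over this invariance yields the sparsity-promoting product |a|·‖w‖.

```python
import numpy as np

# A ReLU neuron f(x) = a * relu(w @ x) is unchanged by the rescaling
# (a, w) -> (a / t, t * w) for any t > 0. Minimizing the weight-decay
# penalty (a**2 + ||w||**2) / 2 over this invariance gives |a| * ||w||,
# a lasso-like, sparsity-promoting penalty on the neuron's "path".
a, w = 3.0, np.array([0.5, -1.0])
ts = np.linspace(0.1, 5.0, 1000)
decay = ((a / ts) ** 2 + (ts * np.linalg.norm(w)) ** 2) / 2
print(decay.min())                  # approximately |a| * ||w||
print(abs(a) * np.linalg.norm(w))
```

The minimum is attained at t² = |a|/‖w‖, where the two penalty terms balance; summed over all neurons, this product penalty is what drives trained networks toward sparse solutions.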
G-equivariant convolutional neural networks
Over the past decade, deep learning has revolutionized industry and academic research. Neural networks have been used to solve a multitude of previously unsolved problems and to significantly improve the state of the art on other tasks, in some cases reaching superhuman levels of performance. However, most neural networks have to be carefully adapted to each application and often require large amounts of data and computational resources. Geometric deep learning aims to reduce the amount of information that neural networks have to learn by taking advantage of geometric properties in data. In particular, equivariant neural networks use (local or global) symmetry to reduce the complexity of a learning task. In this thesis, we investigate a popular deep learning model for tasks exhibiting global symmetry: G-equivariant convolutional neural networks (GCNNs). We analyze the mathematical foundations of GCNNs and discuss where this model fits in the broader scheme of equivariant learning. More specifically, we discuss a general framework for equivariant neural networks using notions from gauge theory, and then show how GCNNs arise from this framework in the presence of global symmetry. We also characterize convolutional layers, the main building blocks of GCNNs, in terms of more general G-equivariant layers that preserve the underlying global symmetry.
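To make the global-symmetry idea concrete, here is a minimal numerical sketch (our illustration, not the thesis's construction) of a lifting correlation for the rotation group C4, the first layer of a rotation-equivariant GCNN: the input is correlated with all four 90° rotations of a filter, and rotating the input rotates each feature map while cyclically shifting the group channel.

```python
import numpy as np

def corr2d(x, k):
    """Valid 2-D cross-correlation."""
    H, W = x.shape
    h, w = k.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + h, j:j + w] * k)
    return out

def lift(x, psi):
    # Lifting layer for C4: correlate the input with all four
    # 90-degree rotations of the filter psi, producing one feature
    # map per group element.
    return [corr2d(x, np.rot90(psi, r)) for r in range(4)]

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 6))
psi = rng.standard_normal((3, 3))

y = lift(x, psi)
y_rot = lift(np.rot90(x), psi)

# Equivariance: rotating the input rotates each feature map and
# cyclically shifts the group channel.
for r in range(4):
    assert np.allclose(y_rot[r], np.rot90(y[(r - 1) % 4]))
```

Subsequent group-convolution layers act on the stack of four channels jointly, which is what distinguishes a GCNN from an ordinary CNN with rotated filters.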
From Maxout to Channel-Out: Encoding Information on Sparse Pathways
Motivated by an important insight from neural science, we propose a new
framework for understanding the success of the recently proposed "maxout"
networks. The framework is based on encoding information on sparse pathways and
recognizing the correct pathway at inference time. Elaborating further on this
insight, we propose a novel deep network architecture, called "channel-out"
network, which takes much better advantage of sparse pathway encoding. In
channel-out networks, pathways are not only formed a posteriori, but they are
also actively selected according to the inference outputs from the lower
layers. From a mathematical perspective, channel-out networks can represent a
wider class of piecewise continuous functions, endowing the network
with more expressive power than maxout networks. We test our
channel-out networks on several well-known image classification benchmarks,
setting new state-of-the-art performance on CIFAR-100 and STL-10, which
represent some of the "harder" image classification benchmarks.
Comment: 10 pages including the appendix, 9 figures
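The contrast between the two activations can be sketched in a few lines (our reading of the abstract's description, not the authors' code): maxout collapses each group of k channels to its maximum and discards which channel won, while channel-out passes the winning channel's value through in place, so the identity of the active pathway remains visible to higher layers.

```python
import numpy as np

def maxout(z, k):
    # Maxout: collapse each group of k channels to its maximum,
    # discarding which channel attained it.
    return z.reshape(-1, k).max(axis=1)

def channel_out(z, k):
    # Channel-out (a sketch): within each group of k channels, only
    # the winning channel's value passes through; the rest are zeroed,
    # preserving the identity of the active pathway.
    g = z.reshape(-1, k)
    out = np.zeros_like(g)
    rows = np.arange(g.shape[0])
    winners = g.argmax(axis=1)
    out[rows, winners] = g[rows, winners]
    return out.reshape(-1)

z = np.array([1.0, 3.0, -2.0, 0.5])
print(maxout(z, 2))       # -> [3.0, 0.5]
print(channel_out(z, 2))  # -> [0.0, 3.0, 0.0, 0.5]
```

Note that channel-out keeps the layer's full width, so the downstream weights that connect to the zeroed channels are simply inactive for this input, which is the sparse-pathway selection the abstract refers to.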