Approximation results for Gradient Descent trained Shallow Neural Networks in 1d
Two aspects of neural networks that have been extensively studied in the
recent literature are their function approximation properties and their
training by gradient descent methods. The approximation problem seeks accurate
approximations with a minimal number of weights. In most of the current
literature these weights are fully or partially hand-crafted, showing the
capabilities of neural networks but not necessarily their practical
performance. In contrast, optimization theory for neural networks heavily
relies on an abundance of weights in over-parametrized regimes.
This paper balances these two demands and provides an approximation result
for shallow networks in 1d with non-convex weight optimization by gradient
descent. We consider finite-width networks and infinite-sample limits, which is
the typical setup in approximation theory. Technically, this problem is not
over-parametrized; however, some form of redundancy reappears as a loss in
approximation rate compared to the best possible rates.
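A minimal numpy sketch of the setup this abstract describes, assuming nothing beyond the text: a finite-width shallow ReLU network fit by plain gradient descent, with a dense grid standing in for the infinite-sample limit. The target function, width, and step size are illustrative choices, not the paper's construction.

    import numpy as np

    rng = np.random.default_rng(0)
    m = 32                                   # finite network width
    x = np.linspace(0.0, 1.0, 512)           # dense grid ~ infinite-sample limit
    y = np.sin(2 * np.pi * x)                # illustrative smooth target

    w = rng.normal(size=m)                   # inner weights
    b = rng.normal(size=m)                   # inner biases
    a = np.zeros(m)                          # outer weights

    lr = 1e-2
    for _ in range(20_000):                  # full-batch gradient descent
        pre = np.outer(x, w) + b             # (n, m) pre-activations
        act = np.maximum(pre, 0.0)           # ReLU
        err = act @ a - y                    # residual of the squared loss
        grad_pre = (pre > 0) * np.outer(err, a)
        a -= lr * act.T @ err / len(x)
        w -= lr * (grad_pre * x[:, None]).mean(axis=0)
        b -= lr * grad_pre.mean(axis=0)

    pred = np.maximum(np.outer(x, w) + b, 0.0) @ a
    print("RMSE:", float(np.sqrt(np.mean((pred - y) ** 2))))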
Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks
Convolutional neural networks (CNNs) have been shown to achieve optimal
approximation and estimation error rates (in minimax sense) in several function
classes. However, previously analyzed optimal CNNs are unrealistically wide and
difficult to obtain via optimization due to sparsity constraints in important
function classes, including the H\"older class. We show that a ResNet-type CNN can
attain the minimax optimal error rates in these classes in more plausible
situations -- it can be dense, and its width, channel size, and filter size are
constant with respect to the sample size. The key idea is that we can replicate the
learning ability of fully-connected neural networks (FNNs) with tailored CNNs, as
long as the FNNs have \textit{block-sparse} structures. Our theory is general
in the sense that we can automatically translate any approximation rate achieved
by block-sparse FNNs into one achieved by CNNs. As an application, we derive
approximation and estimation error rates of the aforementioned type of CNNs for
the Barron and H\"older classes with the same strategy.
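A toy numpy check of the block-sparse idea, under the simplifying assumption (mine, not the paper's) that all blocks share one weight matrix: a block-diagonal FNN layer then coincides with a stride-k convolution, which is the flavor of FNN-to-CNN translation the abstract refers to.

    import numpy as np

    rng = np.random.default_rng(1)
    k, blocks = 4, 3                         # block size, number of blocks
    W = rng.normal(size=(k, k))              # one shared dense block
    x = rng.normal(size=k * blocks)          # input vector

    # FNN view: a block-sparse (block-diagonal) weight matrix on the full input.
    fnn_out = np.kron(np.eye(blocks), W) @ x

    # CNN view: the same block swept across the input with stride k.
    cnn_out = np.concatenate([W @ x[i * k:(i + 1) * k] for i in range(blocks)])

    assert np.allclose(fnn_out, cnn_out)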
On the Universal Approximation Property and Equivalence of Stochastic Computing-based Neural Networks and Binary Neural Networks
Large-scale deep neural networks are both memory-intensive and
computation-intensive, thereby posing stringent requirements on the computing
platforms. Hardware acceleration of deep neural networks has been extensively
investigated in both industry and academia. Specific forms of binary neural
networks (BNNs) and stochastic computing based neural networks (SCNNs) are
particularly appealing to hardware implementations since they can be
implemented almost entirely with binary operations. Despite the obvious
advantages in hardware implementation, these approximate computing techniques
are questioned by researchers in terms of accuracy and universal applicability.
It is also important to understand the relative pros and cons of SCNNs and BNNs
in theory and in actual hardware implementations. To address these
concerns, in this paper we prove that the "ideal" SCNNs and BNNs satisfy the
universal approximation property with probability 1 (due to the stochastic
behavior). The proof is conducted by first proving the property for SCNNs from
the strong law of large numbers, and then using SCNNs as a "bridge" to prove
for BNNs. Based on the universal approximation property, we further prove that
SCNNs and BNNs exhibit the same energy complexity. In other words, they have
the same asymptotic energy consumption as the network size grows. We
also provide a detailed analysis of the pros and cons of SCNNs and BNNs for
hardware implementations and conclude that SCNNs are more suitable for
hardware.
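The strong-law argument for SCNNs is easy to see numerically. In stochastic computing, a value in [0, 1] is carried by a Bernoulli bit stream and a single AND gate multiplies two independent streams; the sketch below (illustrative, not taken from the paper) shows the empirical bit frequency converging to the exact product as the stream length grows.

    import numpy as np

    rng = np.random.default_rng(2)
    p, q = 0.6, 0.3
    for n in (10**2, 10**4, 10**6):
        s1 = rng.random(n) < p               # bit stream encoding p
        s2 = rng.random(n) < q               # independent stream encoding q
        est = (s1 & s2).mean()               # AND gate + counter decodes p*q
        print(f"n={n:>7}  estimate={est:.4f}  error={abs(est - p * q):.4f}")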
A Comprehensive Survey on Functional Approximation
The theory of functional approximation has numerous applications in sciences and industry. This thesis focuses on possible approaches to approximating a continuous function on a compact subset of $\mathbb{R}^2$ using a variety of constructions. The results are presented under four general topics: polynomials, Fourier series, wavelets, and neural networks. Approximation with polynomials on subsets of $\mathbb{R}$ leads to a discussion of the Stone-Weierstrass theorem. Convergence of Fourier series is characterized on the unit circle. Wavelets are introduced following the Fourier transform, and their construction as well as their ability to approximate functions in $L^2(\mathbb{R})$ is discussed. Finally, the universal approximation theorem for artificial neural networks is presented, and the representation and approximation of functions on $\mathbb{R}^2$ with single- and multilayer neural networks is constructed.
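As a small companion to the polynomial part of the survey, here is a sketch of the constructive side of the Weierstrass approximation theorem via Bernstein polynomials on [0, 1]; the example function and degrees are my choices, not the thesis's.

    import numpy as np
    from scipy.stats import binom

    def bernstein(f, n, x):
        # B_n(f)(x) = sum_k f(k/n) * C(n,k) x^k (1-x)^(n-k); binom.pmf is a
        # numerically stable way to evaluate the Bernstein basis.
        k = np.arange(n + 1)
        return binom.pmf(k, n, x[:, None]) @ f(k / n)

    f = lambda t: np.abs(t - 0.5)            # continuous but not differentiable
    x = np.linspace(0.0, 1.0, 201)
    for n in (4, 16, 64, 256):
        print(n, np.max(np.abs(bernstein(f, n, x) - f(x))))  # sup error shrinks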